On Stack Alignment in the ARM Architecture
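C programs for ARM-based processors follow the ATPCS (ARM-Thumb Procedure Call Standard) and its successor, the AAPCS (Procedure Call Standard for the ARM Architecture). The ATPCS specifies a full-descending (FD) data stack and requires stack operations to be 8-byte aligned.
My own lightweight embedded operating system, tqOS, originally ignored the 8-byte alignment of thread stacks, so a stack allocated from the memory pool could start at an address that was either 4-byte or 8-byte aligned. With luck, every thread's stack was 8-byte aligned and nothing went wrong. Without it, a thread got a stack that was only 4-byte aligned and all sorts of errors followed. The worst case was floating-point arithmetic in a thread function: two floating-point values printed as garbage right after initialization, and arithmetic on them gave wrong results as well. A double is 8 bytes, so a stack that is not 8-byte aligned is the likely cause. One plausible mechanism is that variadic library functions such as printf locate 64-bit arguments on the assumption that the stack pointer is 8-byte aligned.
To fix this, I changed the stack allocation in the tqOS task-creation function so that the starting address of the stack is always 8-byte aligned: if the address is not 8-byte aligned, the stack pointer is moved down to the nearest 8-byte boundary.
The ATPCS also requires stack operations to preserve 8-byte alignment, which constrains how the task stack is initialized. Values must be pushed two at a time, or more generally in even numbers, because pushing a single 4-byte value leaves the stack address no longer 8-byte aligned, which is not allowed. For C code the ARM C compiler takes care of 8-byte alignment, but in assembly and operating-system code you have to manage the pushed data yourself. I therefore changed one small part of the stack-initialization function so that the stack-top pointer is still 8-byte aligned once initialization is complete (see the comments in the modified version of tqOS).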
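The two fixes described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the actual tqOS code: the names `align_down_8` and `task_stack_init` are hypothetical, and the 16-word initial frame (8 hardware-stacked words plus R4-R11) assumes a Cortex-M style context switch, which the post does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to a multiple of 8 (no change if already aligned). */
static uintptr_t align_down_8(uintptr_t addr)
{
    return addr & ~(uintptr_t)7u;
}

/*
 * Hypothetical task stack setup (assumed layout, not tqOS's real code).
 * base/size_bytes describe the raw block obtained from the memory pool;
 * base is assumed to be at least 4-byte aligned.
 * Returns the initial stack pointer for the new task.
 */
static uint32_t *task_stack_init(void *base, size_t size_bytes,
                                 uint32_t entry, uint32_t arg)
{
    /* Full-descending stack: start from the top and align it downward. */
    uintptr_t top = align_down_8((uintptr_t)base + size_bytes);
    uint32_t *sp = (uint32_t *)top;

    /* Assumed exception frame: 8 words, an even count. */
    *--sp = 0x01000000u;  /* xPSR with the Thumb bit set */
    *--sp = entry;        /* PC: task entry point */
    *--sp = 0xFFFFFFFFu;  /* LR: placeholder return address */
    *--sp = 0u;           /* R12 */
    *--sp = 0u;           /* R3 */
    *--sp = 0u;           /* R2 */
    *--sp = 0u;           /* R1 */
    *--sp = arg;          /* R0: argument passed to the task */

    /* Assumed software-saved registers R4-R11: 8 more words, still even. */
    for (int i = 0; i < 8; ++i) {
        *--sp = 0u;
    }

    /* An even number of 4-byte pushes keeps the pointer 8-byte aligned. */
    assert(((uintptr_t)sp & 7u) == 0u);
    return sp;
}
```

Calling `task_stack_init` with a block whose end address is only 4-byte aligned wastes at most 4 bytes at the top of the block. If the frame ever needed an odd number of words, one padding word would have to be pushed as well to keep the final pointer on an 8-byte boundary.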
In short, programs must keep the stack pointer 8-byte aligned, and when an operating system is used, every task's working stack must be 8-byte aligned. For details, look up the ATPCS and the AAPCS.
Quoted from: http://bbs.elecfans.com/jishu_468500_1_1.html
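1. When the stack is only word-aligned, library functions that strictly follow the AAPCS, such as those in the C library, may misbehave.
2. The MSP and PSP addresses in a program should be double-word aligned wherever possible (that is, divisible by 8).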
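The compiler keeps the stack double-word aligned in the code it generates, but to cover extreme cases the Cortex-M3 and Cortex-M4 provide automatic alignment in hardware. You enable it by setting bit 9 (STKALIGN) of SCB->CCR to 1 (double-word alignment is the default). When an interrupt occurs, the hardware checks whether the stack is double-word aligned. If it is, nothing extra is done. If it is not, the hardware automatically subtracts 4 from SP to align it and sets bit 9 of the stacked xPSR. The detailed description is as follows: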
Another requirement of the AAPCS is that the stack pointer value should be double-word aligned at function entry or exit boundary. As a result, the Cortex-M3 and Cortex-M4 processors can insert an additional word of padding space in the stack automatically if the stack pointer was not aligned to double-word location when the interrupt happened. In this way, we can guarantee that the stack pointer will be [double-word aligned] at the beginning of the exception handler. This “double-word stack alignment” feature is programmable, and can be turned off if the exception handlers do not need full AAPCS compliance.
The bit 9 of the stacked xPSR is used to indicate if the value of the stack pointer has been adjusted. In Figure 8.2, the stack pointer was aligned to double-word address location, so no padding was inserted and bit 9 of the stack xPSR is set to 0. The same stack frame behavior can also be found when the double-word stack alignment feature is turned off, even if the value of stack pointer wasn’t aligned to double-word boundary.
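To make the role of bit 9 concrete, here is a minimal sketch of how code such as a fault handler or debugger helper might recover the stack pointer value from before the exception. It assumes the basic 8-word frame (R0-R3, R12, LR, PC, xPSR) with no floating-point context stacked; the function and macro names are hypothetical. On a real target, the feature itself would typically be enabled through CMSIS with `SCB->CCR |= SCB_CCR_STKALIGN_Msk;`, which is not exercised here.

```c
#include <stdint.h>

#define XPSR_STACK_PAD_BIT  (1u << 9)  /* bit 9 of the stacked xPSR */
#define BASIC_FRAME_WORDS   8u         /* R0-R3, R12, LR, PC, xPSR */
#define FRAME_XPSR_INDEX    7u         /* xPSR is the last word of the frame */

/*
 * Given a pointer to the start of the stacked frame (the SP value seen on
 * handler entry), return the SP value the interrupted code was using.
 * Assumes a basic frame without floating-point registers.
 */
static uintptr_t sp_before_exception(const uint32_t *frame)
{
    /* Skip the 8 words the hardware pushed. */
    uintptr_t sp = (uintptr_t)(frame + BASIC_FRAME_WORDS);

    /* If bit 9 is set, the hardware also inserted one padding word. */
    if (frame[FRAME_XPSR_INDEX] & XPSR_STACK_PAD_BIT) {
        sp += 4u;
    }
    return sp;
}
```

When bit 9 of the stacked xPSR is 0, the result is simply the frame address plus 32 bytes; when it is 1, the extra 4 bytes account for the padding word the hardware inserted.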