How does ".align" work?

Can anyone help me understand how the “.align” directive in MIPS works? Let's consider the example from the lecture notes.

    .data
    .ascii "Hallo"
x:
    .byte 8, 0, 0, 0

    .text
    .globl main
main:
    lw $t0, x

It says the above will produce an error when executed because the address of x is 0x10000005. My first question is how/why the address of x ends up being that. I thought addresses are always incremented by 4, so I am a bit lost as to why we have 05 at the end, knowing (from the lecture notes) that the data segment starts at 0x10000000.

The above problem was solved by using “.align 2”, but I am also confused because I tried every number from 1 to 10 for “.align”, and all of them except 1 work fine. My second question is then: why use “.align 2” and not “.align 5” if both have the same effect?

Thanks in advance

No, addresses are not always incremented by 4. Every time you place a new “thing”, the address advances by the size (in bytes) of that “thing”.

For example, when I say .byte 42, the assembler places 1 byte in memory at the current address. The current address is then advanced by one.

Note that all instructions are 4 bytes long. So when I place an instruction, the current address advances by 4. The MIPS instruction set in fact requires that all your instructions are aligned to addresses divisible by 4 (otherwise, you get a misaligned pc error).

Now, the MIPS ISA also requires that certain operations only happen on aligned addresses. Importantly, when you load a word using lw, the address needs to be aligned to a multiple of 4; otherwise, your program crashes.

By default, if you use .word 42, the assembler will first check whether the current address is a multiple of 4, and if not, add padding, such that the current address becomes such a multiple. It then emits the 4 bytes, and you don’t have any issues.
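To make the automatic padding concrete, here is a small sketch. The label y and the exit syscall are additions for illustration, and the offsets in the comments follow from the description above rather than from a particular assembler's listing.

```mips
    .data
    .ascii "Hallo"      # 5 bytes, at offsets 0 to 4 from the start of the data segment
y:
    .word 8             # the assembler should pad 3 bytes first, so y lands at offset 8

    .text
    .globl main
main:
    lw $t0, y           # y is a multiple of 4, so this load is aligned
    li $v0, 10          # exit syscall (SPIM/MARS convention), added so the sketch ends cleanly
    syscall
```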

When you use .byte (or .ascii, which just emits repeated bytes), the assembler “checks” that the current address is a multiple of 1, since bytes have size 1. Of course, all addresses are multiples of 1, so there is no padding emitted. You simply put the bytes at whatever address you currently are at.

This leads to issues in that example, because the .byte 8,0,0,0 emits 4 bytes, but does not align them to a 4-byte boundary. When you later load the 4 bytes using lw, you get a crash, since the address is misaligned.
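Annotating the data segment from the question with addresses shows where 0x10000005 comes from. This is only a sketch; it assumes the data segment starts at 0x10000000, as stated in the lecture notes quoted in the question.

```mips
    .data               # assumed start address: 0x10000000
    .ascii "Hallo"      # 'H','a','l','l','o' occupy 0x10000000 to 0x10000004
x:                      # x = 0x10000005, which is not a multiple of 4
    .byte 8, 0, 0, 0    # these 4 bytes occupy 0x10000005 to 0x10000008
```

The string takes 5 bytes with no padding after it, so x simply gets the next free address.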

Finally, .align allows you to emit padding until you are at a multiple of the specified alignment. However, for reasons™, .align n does not align to multiples of n. Instead, it aligns to multiples of 2^n. So .align 0 aligns to multiples of 1 byte, .align 1 to multiples of 2, and .align 2 to multiples of 4.
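Applied to the example from the question, the fix might look like the sketch below. The addresses in the comments again assume a data segment starting at 0x10000000, and the exit syscall is an addition that is not part of the lecture-notes example.

```mips
    .data               # assumed start address: 0x10000000
    .ascii "Hallo"      # 5 bytes at 0x10000000 to 0x10000004
    .align 2            # pad with 3 bytes up to the next multiple of 2^2 = 4
x:
    .byte 8, 0, 0, 0    # x = 0x10000008

    .text
    .globl main
main:
    lw $t0, x           # aligned load, no address error
    li $v0, 10          # exit syscall (SPIM/MARS convention), added for a clean exit
    syscall
```

On a little-endian simulator the four bytes 8, 0, 0, 0 are read back as the word value 8.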

This explains why .align n works whenever n ≥ 2. When you do, for example, .align 5, you align things to addresses divisible by 32. Since such addresses are also divisible by 4, there are no issues. The reason we use .align 2 is that .align 5 wastes space by producing 27 blank bytes just to get to the next multiple of 32. However, emitting 3 unused bytes is sufficient, and we don’t want to waste memory.
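The padding counts can be checked with a little arithmetic. As a sketch, write a for the current offset from the start of the data segment (a = 5 after "Hallo", assuming the segment starts at 0x10000000, which is itself a multiple of 32); the name pad is just notation introduced here.

```latex
\[
\operatorname{pad}(a, n) = \bigl(2^n - (a \bmod 2^n)\bigr) \bmod 2^n
\]
\begin{align*}
\operatorname{pad}(5, 1) &= (2 - 1) \bmod 2 = 1
  &&\Rightarrow x = \texttt{0x10000006},\quad 6 \bmod 4 = 2 \text{ (still misaligned for lw)} \\
\operatorname{pad}(5, 2) &= (4 - 1) \bmod 4 = 3
  &&\Rightarrow x = \texttt{0x10000008},\quad 8 \bmod 4 = 0 \\
\operatorname{pad}(5, 5) &= (32 - 5) \bmod 32 = 27
  &&\Rightarrow x = \texttt{0x10000020},\quad 32 \bmod 4 = 0
\end{align*}
```

The first row is why .align 1 was the only value in the question that still failed: a multiple of 2 is not necessarily a multiple of 4.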

Hope this helps,
Johannes

I see. It’s clearer now.
Thanks