Unaligned memory access is an access of N bytes of data from an address that is not evenly divisible by N. If the address is evenly divisible by N, the access is aligned.

We can check this by computing Address/N, where Address is the memory address and N is the number of bytes accessed: the access is aligned when the result is a whole number. Here are some examples:

 - Two-byte access from address 4: Address/N = 4/2 = 2 (aligned access)
 - Two-byte access from address 3: Address/N = 3/2 = 1.5 (unaligned access)
 - Four-byte access from address 24: Address/N = 24/4 = 6 (aligned access)

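As a practical note, for access sizes of 2, 4, 8 or 16 bytes it is enough to look at the rightmost digit of the address in hexadecimal format: if that digit is divisible by the number of bytes, we have aligned memory access. This shortcut works because 16 is a multiple of each of these sizes.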
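The divisibility rule can be sketched in C as follows. This is a minimal illustration, and the function names is_aligned and is_aligned_pow2 are hypothetical helpers that do not come from any library. The second variant assumes the access size is a power of two, which lets the check look only at the low bits of the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when an access of n bytes at addr is aligned,
 * i.e. when addr is evenly divisible by n. */
static bool is_aligned(uintptr_t addr, size_t n)
{
    assert(n != 0);                 /* a zero-byte access is meaningless */
    return (addr % n) == 0;
}

/* Same check for power-of-two sizes only: the low bits must be zero. */
static bool is_aligned_pow2(uintptr_t addr, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);   /* n must be a power of two */
    return (addr & (n - 1)) == 0;
}
```

For the examples listed earlier, is_aligned(4, 2) and is_aligned(24, 4) hold, while is_aligned(3, 2) does not.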

Fig. 1 Aligned and unaligned memory access based on address and access size

Some microprocessors allow unaligned memory access and others do not. Unaligned access usually has a negative impact on performance, as more operations (instructions) are required to perform it. If unaligned access is not supported by the microprocessor, an exception (e.g. a bus error exception) can be triggered when such an access is attempted.

Software Point of View

From the point of view of the software, memory access is just instructions that read bytes of data from memory or write bytes of data to it.

Let’s look at practical situations where unaligned access may occur when using the C programming language. First, we start with the structure shown below:

struct Example {
   uint16_t data_1;
   uint32_t data_2;
   uint8_t data_3;
};

Let’s say the structure shown in the code above is mapped starting from address 0x00001000 and that no padding is inserted between its members. This means that data_1 occupies addresses 0x00001000 and 0x00001001 (note: each address stores a single byte), data_2 occupies addresses 0x00001002 to 0x00001005, and data_3 is at address 0x00001006. Using the simple calculation Address/N, we can see that data_1 (two bytes) is properly aligned, data_2 (four bytes) is not aligned, and data_3 is aligned. Single-byte variables are always aligned, because every address is evenly divisible by one.

If we want all members of the structure to be properly aligned, the compiler can do that job for us. It can insert so-called “padding” bytes after data_1 so that data_2 is properly aligned; for example, two padding bytes move data_2 to address 0x00001004. Unless instructed otherwise, the compiler places variables and function arguments in an aligned manner, complying with the alignment requirements of the target CPU architecture.
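One way to see the padding that the compiler inserts is to inspect member offsets with offsetof. The sketch below repeats the structure from the text; padding_before_data_2 is a hypothetical helper added for illustration. On common ABIs where uint32_t requires 4-byte alignment, data_2 is expected at offset 4 and the helper returns 2, but the exact layout depends on the target and is not guaranteed by the C language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Example {
    uint16_t data_1;
    uint32_t data_2;
    uint8_t  data_3;
};

/* Number of padding bytes the compiler placed between data_1 and data_2. */
static size_t padding_before_data_2(void)
{
    size_t end_of_data_1   = offsetof(struct Example, data_1) + sizeof(uint16_t);
    size_t start_of_data_2 = offsetof(struct Example, data_2);

    assert(start_of_data_2 >= end_of_data_1);  /* members never overlap */
    return start_of_data_2 - end_of_data_1;
}
```

Whatever the target, the offset of data_2 is a multiple of the alignment of uint32_t, which is exactly the property the padding exists to provide.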

The next possible situation where we can encounter unaligned memory access is when casting pointers from one type to another. Although the C language allows such casts, the result may be undefined behavior:

A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.

The C Standard, ISO/IEC 9899:2011

void test_func(uint8_t *data) {
    /* The rest of the code removed for clarity */
    uint32_t value = *((uint32_t *) data);
}

As we can see from the code above, the function performs a 4-byte read from the memory address passed in the parameter uint8_t *data. This access can be aligned or unaligned, depending on the address held in the data pointer. For example, if we pass address 0x0004 to the function, the access is aligned; if the address is 0x0005, the access is unaligned. The compiler can’t resolve this situation for us, as it does not generate code for run-time alignment checks.
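A common way to avoid the undefined behavior of the pointer cast is to copy the bytes with memcpy, which places no alignment requirement on its arguments. The sketch below is one possible alternative, not the only one. The name read_u32_unaligned is hypothetical, and the value is interpreted in the byte order of the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads four bytes starting at data, whatever the alignment of data. */
static uint32_t read_u32_unaligned(const uint8_t *data)
{
    uint32_t value;

    assert(data != NULL);
    memcpy(&value, data, sizeof value);  /* byte-wise copy, no alignment needed */
    return value;
}
```

Compilers typically turn this memcpy into a single load on architectures that support unaligned access and into several narrower loads elsewhere, so the portable form usually costs little.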

Compiler Specifics

The C programming language classifies unaligned memory access as undefined behavior. The default behavior of the compiler with respect to unaligned access depends on the target CPU architecture. If the architecture does not allow unaligned accesses, the compiler places all variables, functions, etc. at aligned addresses. If the architecture allows unaligned access, the compiler usually provides options to select whether the generated code should take advantage of it. For example, GCC offers the following options for ARM processors: -munaligned-access and -mno-unaligned-access.

If we look at the structure example from the previous section and we don’t want the compiler to insert padding bytes, we can instruct it explicitly by using a compiler-specific language extension. For gcc, if we want to “pack” a structure (remove its padding bytes), the code will look like this:

struct __attribute__((packed)) Example {
   uint16_t data_1;
   uint32_t data_2;
   uint8_t data_3;
};

The packed attribute specifies that a type must have the smallest possible alignment. In GCC it can be applied to structure, union and enumeration types, as well as to individual structure members.
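The effect of packing can be checked with sizeof and offsetof. The sketch below assumes GCC or a compatible compiler such as Clang, since __attribute__((packed)) is an extension. The type name PackedExample and the helper get_data_2 are hypothetical, chosen so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct __attribute__((packed)) PackedExample {
    uint16_t data_1;
    uint32_t data_2;   /* starts at offset 2, so it may be unaligned */
    uint8_t  data_3;
};

/* Accessing the member through the packed type lets the compiler emit
 * code that is safe for a potentially unaligned data_2. */
static uint32_t get_data_2(const struct PackedExample *p)
{
    assert(p != NULL);
    return p->data_2;
}
```

With packing, the structure occupies 2 + 4 + 1 = 7 bytes. Note that taking the address of data_2 and dereferencing it as a plain uint32_t pointer would bring back the pointer alignment problem described earlier.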

Hardware Point of View

Analyzing the hardware point of view will give us a better understanding of why unaligned memory access is a concern in the first place. It is not a design flaw of the microprocessor. The limitations with regard to unaligned memory access are related to the way memories are structured and integrated in CPU-based systems.

Only a limited number of bits can be accessed in a memory in a single read or write cycle. For example, a memory that has an 8bit data bus limits a single read/write access to that size. A memory with a 32bit data bus limits a single access to a maximum of 32 bits. The same logic applies to memories with other data bus sizes.

Fig. 2 Simplified pinout of memory units

In Fig.2 we can see two memory units with their pinout:

  • Memory 2k x 8 – This memory has a total of 2k (2048) addresses, selectable by bus A (lines A0 to A10). Each address can store a single byte accessible by bus D (lines D0 to D7).
  • Memory 2k x 32 – This memory has a total of 2k (2048) addresses, selectable by bus A (lines A0 to A10). Each address can store 4 bytes (32 bits) accessible by bus D (lines D0 to D31).

Looking at these memories as standalone units, there is no such thing as aligned and unaligned access. The available addresses start at 0 and go up to 2047. The issue with unaligned access comes into play when we integrate these memories into a larger system and map them into that system’s address space.

If we use memory with an 8bit data bus in a 32bit system, we will not be able to access the natural 32bit data size using single-cycle access. We will have to access 4 consecutive addresses in the memory (8 bits each) to construct a 32bit value. If we choose a 32bit memory (e.g. the 2k x 32 memory shown in Fig. 2), then we can have 32bit single-cycle accesses. Smaller access sizes (e.g. 8bit, 16bit) are also possible. For write operations, individual byte enable (BW) lines can be used. For read operations, the whole 32bit word can be read and the unnecessary bits discarded.
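The handling of sub-word accesses on a 32bit memory can be modeled in software. The sketch below is only an illustration of the idea and not a description of any particular memory device. The function names are hypothetical, and it assumes that lane 0 is the least significant byte of the word and that bit i of byte_enables represents the byte enable line of lane i.

```c
#include <assert.h>
#include <stdint.h>

/* Read path: fetch the whole 32-bit word, keep one byte, discard the rest.
 * Assumption: lane 0 is the least significant byte of the word. */
static uint8_t read_byte_lane(uint32_t word, unsigned lane)
{
    assert(lane < 4u);
    return (uint8_t)(word >> (8u * lane));
}

/* Write path: only lanes whose enable bit is set take the new data. */
static uint32_t write_byte_lanes(uint32_t old_word, uint32_t new_word,
                                 unsigned byte_enables)
{
    uint32_t mask = 0;

    assert(byte_enables <= 0xFu);
    for (unsigned lane = 0; lane < 4u; lane++) {
        if (byte_enables & (1u << lane)) {
            mask |= (uint32_t)0xFFu << (8u * lane);
        }
    }
    return (old_word & ~mask) | (new_word & mask);
}
```

In this model, a byte write with only one enable bit set leaves the other three bytes of the stored word untouched, which is what makes 8bit and 16bit writes possible on a 32bit memory.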

Unaligned Memory Access Example

Fig. 3 Memory (2k x 32) integrated in a 32bit system

In Fig. 3 we have a simplified example of integrating a 2k x 32 memory in a CPU-based system. The first thing we should note is that the memory is not directly connected to the 32bit CPU bus. This is because memories have specific interfaces (pinouts), and connecting them to a bus requires additional logic. This logic is usually implemented in a memory controller unit that takes care of all low-level timing requirements for read/write operations.

Another very important thing to consider is the mapping of the memory unit into the available memory map (system address space). This is implemented using decoder logic (e.g. the Interconnect unit shown in Fig. 3) that takes an address from the system address space as input and decodes it to an address for the memory.

In Fig. 4 below we can see examples of write accesses issued on the 32bit memory bus that are decoded to write accesses for the 2k x 32bit memory. In our example, the memory is mapped at address 0x00008000 of the system address space. We can see that four addresses from the system address space correspond to one address of the memory unit. This is because the system address space is byte addressable (each address is 1 byte), while the memory in our example holds 4 bytes at a single address. Byte access is made possible by the individual byte enable signals (BW) of the memory. For example, a write of a byte at address 0x00008005 will be interpreted as a write to the second byte (BW2) at address 0x001 of the memory.
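The decoding step can be sketched as simple integer arithmetic on the offset from the base address. The code below models the mapping from the example (base address 0x00008000, 2k words of 4 bytes). The names MEM_BASE, MemLocation and decode are hypothetical, and the byte lane is counted from zero, so lane 1 corresponds to the second byte.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BASE        0x00008000u  /* mapping used in the example */
#define MEM_WORDS       2048u        /* 2k addresses on the memory side */
#define BYTES_PER_WORD  4u           /* 32-bit data bus */

struct MemLocation {
    uint16_t word_addr;  /* address on the memory side (A0 to A10) */
    uint8_t  byte_lane;  /* byte within the word, 0 to 3 */
};

/* Translates a byte address in the system address space into a
 * memory-side word address and a byte lane within that word. */
static struct MemLocation decode(uint32_t sys_addr)
{
    assert(sys_addr >= MEM_BASE);
    uint32_t offset = sys_addr - MEM_BASE;
    assert(offset < MEM_WORDS * BYTES_PER_WORD);

    struct MemLocation loc = {
        (uint16_t)(offset / BYTES_PER_WORD),
        (uint8_t)(offset % BYTES_PER_WORD)
    };
    return loc;
}
```

For system address 0x00008005 the offset is 5, which gives memory address 0x001 and lane 1, matching the second byte described in the text.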

Fig. 4 Aligned memory access of a 2k x 32 memory in a 32bit microprocessor system

All examples in Fig. 4 show aligned memory access. Now let’s look at situations where unaligned access can occur. We are again using the setup shown in Fig. 3. If the start address of the access and its end address (address + transfer size - 1) in the system address space are decoded to two different addresses on the side of the 2k x 32 memory, then we have unaligned memory access. For example, system addresses 0x00008004 to 0x00008007 are decoded to a single address 0x001 on the memory unit side. If, however, we issue a 4-byte write transfer at address 0x00008005 (as shown in Fig. 5), it will be decoded as two separate access operations on the 2k x 32 memory side: one access to address 0x001, where we will write 3 bytes, and one to address 0x002, where we will write one byte.

Fig. 5 Unaligned memory access examples
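The way a transfer is divided between two memory addresses can be computed from the byte offset and the transfer size. The sketch below is a simplified model with hypothetical names (SplitAccess, split_access). It takes the offset from the start of the memory rather than the system address and only handles transfers of up to 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_WORD 4u   /* 32-bit word boundary of the memory */

struct SplitAccess {
    uint32_t first_word;    /* memory-side address of the first access */
    uint32_t first_bytes;   /* bytes transferred at first_word */
    uint32_t second_bytes;  /* bytes transferred at first_word + 1, or 0 */
};

/* Splits a transfer of size bytes at byte offset into per-word accesses. */
static struct SplitAccess split_access(uint32_t offset, uint32_t size)
{
    struct SplitAccess s;

    assert(size >= 1u && size <= BYTES_PER_WORD);
    uint32_t lane = offset % BYTES_PER_WORD;   /* starting byte in the word */
    uint32_t room = BYTES_PER_WORD - lane;     /* bytes left before boundary */

    s.first_word   = offset / BYTES_PER_WORD;
    s.first_bytes  = (size < room) ? size : room;
    s.second_bytes = size - s.first_bytes;
    return s;
}
```

For the 4-byte write at system address 0x00008005 (offset 5), this yields 3 bytes at memory address 0x001 and 1 byte at 0x002. A 4-byte write at offset 4 stays within a single word, so second_bytes is 0.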

For the 2k x 32 memory that we have used in the examples so far, we can say that it has a 32bit word boundary. A read or write across this boundary is not possible in a single cycle, and such an attempt is classified as unaligned memory access. There are two approaches for handling unaligned accesses:

  • Break up the access into multiple accesses (as shown in Fig.5) – This can be done by the memory controller. The downside of this approach is the additional time required for performing the operation.
  • Restrict unaligned access – The memory controller will return an error if such a request is made. The benefit of this approach is the reduced complexity of the hardware.


In conclusion, we should note that unaligned access is not necessarily a bad thing. There are many CPU architectures that support it. Unaligned memory access usually consumes more time, but it allows more efficient use of the available memory, because no padding bytes are needed.

The most common situations where you can encounter unaligned memory access are:

  • Casting pointers to types of a larger size
  • Accessing multiple bytes of data using pointers (especially when casting is involved)
