
Subroutine call convention (AAPCS) in ARM Architecture - Deep Dive

Overview - Subroutine call convention (AAPCS)
What is it?
The Procedure Call Standard for the Arm Architecture (AAPCS) is the set of rules that defines how functions (subroutines) communicate on 32-bit ARM systems. It specifies how arguments are passed, how results are returned, and how registers and the stack are used during function calls. Code that follows it can interoperate even when written by different programmers or compiled by different tools.
Why it matters
Without a standard call convention like AAPCS, programs would not know how to exchange information during function calls, leading to errors and crashes. It allows software components to interact reliably, making complex programs and operating systems possible. This standardization also helps optimize performance and resource use on ARM processors, which are common in many devices.
Where it fits
Learners should first understand basic computer architecture concepts like registers, memory, and the stack. After learning AAPCS, they can study advanced topics like compiler design, operating system internals, and ARM assembly programming.
Mental Model
Core Idea
AAPCS is a shared language that tells functions how to pass data and manage resources so they can work together smoothly on ARM processors.
Think of it like...
It's like a well-organized kitchen where every chef knows exactly where ingredients are kept, how to pass dishes, and who cleans up, so cooking flows without confusion or mistakes.
┌───────────────────────────────┐
│         Function Call         │
├─────────────┬─────────────────┤
│ Arguments   │ Passed in R0-R3 │
│             │ and/or stack    │
├─────────────┼─────────────────┤
│ Return Val  │ R0 (R0-R1: 64b) │
├─────────────┼─────────────────┤
│ Caller Save │ R0-R3, R12      │
│ Callee Save │ R4-R11, SP      │
└─────────────┴─────────────────┘
Build-Up - 7 Steps
1. Foundation: Understanding Registers and Stack Basics
Concept: Introduce the basic hardware elements used in function calls: registers and the stack.
Registers are small, fast storage locations inside the CPU used to hold data temporarily. The stack is a special area in memory that stores information like function arguments, return addresses, and local variables in a last-in, first-out order. Together, they help manage data during program execution.
Result
Learners understand the tools (registers and stack) that functions use to exchange data and keep track of execution.
Knowing what registers and the stack are is essential because AAPCS defines how these are used to pass information between functions.
2. Foundation: What is a Call Convention?
Concept: Explain the purpose of a call convention as a set of rules for function communication.
A call convention is like a contract that all functions agree to follow. It defines how to pass inputs (arguments), where to put outputs (return values), and which parts of the CPU must be saved or restored during calls. This ensures different code pieces can work together without confusion.
Result
Learners grasp why call conventions exist and their role in program correctness.
Understanding the call convention concept helps learners see why AAPCS is necessary for ARM systems.
3. Intermediate: Argument Passing in AAPCS
Before reading on: do you think all function arguments are passed using registers or the stack? Commit to your answer.
Concept: Learn how AAPCS specifies passing the first four arguments in registers and the rest on the stack.
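AAPCS passes the first four word-sized (32-bit) arguments in registers R0 to R3. Any further arguments are placed on the stack, where the callee finds them starting at the address in SP on entry. A 64-bit argument such as long long occupies an even-numbered register pair (R0-R1 or R2-R3), so it can use up two of the four slots. This keeps the common case of few arguments fast while still supporting functions with many parameters.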
Result
Learners can predict where each argument will be found during a function call.
Knowing this rule helps understand performance optimizations and how to read or write ARM assembly that follows AAPCS.
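As a rough illustration of the rule, the sketch below maps an argument's position to the place a callee would look for it. It assumes a non-variadic function whose arguments are all 32-bit integers or pointers; `arg_location` and `locate_word_arg` are hypothetical names made up for this example, not part of any ARM toolchain.

```c
#include <stddef.h>

/* Where a word-sized argument lives at function entry (simplified model). */
typedef struct {
    int in_register;      /* 1 if passed in a core register, 0 if on the stack */
    int reg;              /* register number 0..3 when in_register is 1, else -1 */
    size_t stack_offset;  /* byte offset from SP at entry when on the stack */
} arg_location;

/* Hypothetical helper: index is the zero-based position of a 32-bit
   integer or pointer argument in a non-variadic function. */
arg_location locate_word_arg(size_t index)
{
    arg_location loc;
    if (index < 4) {
        /* First four arguments: R0, R1, R2, R3 in order. */
        loc.in_register = 1;
        loc.reg = (int)index;
        loc.stack_offset = 0;
    } else {
        /* Fifth argument onward: consecutive words starting at [SP]. */
        loc.in_register = 0;
        loc.reg = -1;
        loc.stack_offset = (index - 4) * 4;
    }
    return loc;
}
```

For example, `locate_word_arg(4)` describes the fifth argument: it is not in a register and sits at offset 0 from SP on entry, before the callee has pushed anything.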
4. Intermediate: Return Values and Register Usage
Before reading on: do you think return values use the same registers as arguments or different ones? Commit to your answer.
Concept: Understand how return values are passed back and which registers must be preserved by caller or callee.
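AAPCS places the return value in R0, or in R0 and R1 for a 64-bit value. Registers R0 to R3 and R12 are caller-saved: a call may overwrite them, so the calling function must save any of them it still needs. Registers R4 to R11 and the stack pointer (SP) are callee-saved: the called function must restore them before it returns. The link register (LR) holds the return address on entry; a function that makes further calls must save it, usually on the stack, because each call overwrites LR.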
Result
Learners know how to manage register saving responsibilities to avoid data loss.
This knowledge prevents bugs caused by overwriting important data during nested function calls.
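The split of responsibilities can be written down as a small lookup. This is a sketch under the base AAPCS only; `reg_role` and `aapcs_reg_role` are hypothetical names, and R9 is treated as an ordinary callee-saved register even though some platforms give it a special role. LR gets its own category here because the callee needs the return address it holds but is not required to put that value back in LR for the caller.

```c
#include <assert.h>

/* Roles of the core registers under the base AAPCS (simplified). */
typedef enum {
    REG_CALLER_SAVED,    /* r0-r3, r12: may be overwritten by any call */
    REG_CALLEE_SAVED,    /* r4-r11, sp: callee must restore before returning */
    REG_LINK,            /* r14 (lr): return address, saved by non-leaf functions */
    REG_PROGRAM_COUNTER  /* r15 (pc) */
} reg_role;

/* Hypothetical helper: reg is a core register number from 0 to 15. */
reg_role aapcs_reg_role(int reg)
{
    assert(reg >= 0 && reg <= 15);
    if (reg <= 3 || reg == 12) return REG_CALLER_SAVED;
    if (reg <= 11 || reg == 13) return REG_CALLEE_SAVED;
    if (reg == 14) return REG_LINK;
    return REG_PROGRAM_COUNTER;
}
```

A hand-written assembly routine could use a table like this as a checklist: anything it modifies that maps to `REG_CALLEE_SAVED` has to be pushed on entry and popped before returning.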
5. Intermediate: Stack Frame Structure in AAPCS
Concept: Explore how the stack frame is organized during a function call under AAPCS.
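When a function is called, it typically creates a stack frame that holds the saved return address, saved registers, and local variables. AAPCS does not fix the full layout of that frame, but it does constrain the stack: it is full-descending, so it grows toward lower addresses and the stack pointer (SP) points to the most recently pushed item, and SP must stay aligned. Debuggers and unwinders rely on these rules, together with debug or unwind information, to reconstruct the program state.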
Result
Learners visualize how function calls manage memory on the stack.
Understanding stack frames is key to debugging and writing low-level code that interacts with function calls.
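To make the downward growth concrete, here is a toy model of a descending stack in portable C. It is only a sketch: the array stands in for stack memory, `sp` is an array index rather than a real address, and all names (`toy_stack`, `toy_push`, `toy_pop`) are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model of a full-descending stack: sp indexes the last pushed word. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp;                 /* index into mem; STACK_WORDS means empty */
} toy_stack;

void toy_stack_init(toy_stack *s) { s->sp = STACK_WORDS; }

void toy_push(toy_stack *s, uint32_t value)
{
    assert(s->sp > 0);      /* overflow check for the toy model */
    s->sp -= 1;             /* the stack grows toward lower addresses */
    s->mem[s->sp] = value;  /* sp now points at the new item */
}

uint32_t toy_pop(toy_stack *s)
{
    assert(s->sp < STACK_WORDS);  /* underflow check */
    uint32_t value = s->mem[s->sp];
    s->sp += 1;             /* releasing space moves sp back up */
    return value;
}
```

Pushing two values and popping them returns them in reverse order, and `sp` ends where it started, which is the property a function must guarantee for the real SP when it returns.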
6. Advanced: Handling Variadic Functions and Alignment
Before reading on: do you think variadic functions use the same argument passing rules as fixed-argument functions? Commit to your answer.
Concept: Learn how AAPCS manages functions with variable numbers of arguments and enforces data alignment.
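Variadic functions (like printf) require special handling because the number of arguments is not fixed. Under AAPCS the stack must always be 4-byte aligned and must be 8-byte aligned at every public interface, which in practice means at every function call. Arguments occupy at least 4 bytes each, and 64-bit arguments are aligned to 8 bytes. Variadic calls always follow the base rules, with arguments in R0-R3 and then on the stack, even on platforms that normally pass floating-point values in floating-point registers. These rules keep argument access consistent and avoid alignment faults on some ARM processors.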
Result
Learners understand the extra care needed for flexible argument functions and memory alignment.
Knowing alignment rules avoids subtle bugs and improves compatibility across ARM devices.
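At the C level a variadic function looks the same on every platform; the calling convention decides where `va_arg` finds each value. The sketch below uses only the standard `<stdarg.h>` interface, and `sum_ints` is a made-up example function. On an AAPCS target one would expect `count` to arrive in R0, the next three values in R1-R3, and the rest on the stack, though how the compiler stitches those together for `va_arg` is an implementation detail.

```c
#include <stdarg.h>

/* Sums `count` int arguments. The callee cannot discover how many
   arguments were passed except through a value such as `count`. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* start after the last named parameter */
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);     /* fetch the next argument as an int */
    }
    va_end(ap);
    return total;
}
```

Calling `sum_ints(6, 1, 2, 3, 4, 5, 6)` passes seven values in total, so on a 32-bit ARM target some of them necessarily travel on the stack.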
7. Expert: Optimizations and ABI Extensions in AAPCS
Before reading on: do you think all ARM systems use the exact same AAPCS rules, or are there variations? Commit to your answer.
Concept: Discover how AAPCS has evolved with optimizations and extensions for different ARM architectures and operating systems.
While AAPCS defines a baseline, it also has variants and platform-specific extensions. The VFP (hard-float) variant passes floating-point arguments in floating-point registers, and interworking rules govern calls between ARM and Thumb code. These variations improve performance, but caller and callee must be built for the same variant. Compilers and tools must support the variant in use to generate correct code.
Result
Learners appreciate the complexity and adaptability of AAPCS in real-world ARM development.
Understanding these nuances helps experts write portable, efficient code and troubleshoot platform-specific issues.
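One way to see which variant a build targets is to check the predefined macros described in the Arm C Language Extensions (ACLE). The sketch below assumes a compiler that follows ACLE, such as recent GCC or Clang; `pcs_variant` is a hypothetical helper, and on a non-ARM host none of the macros is defined, so it falls through to the last branch.

```c
/* Reports which procedure call standard the compiler targets, based on
   ACLE predefined macros. Assumes an ACLE-conforming compiler. */
const char *pcs_variant(void)
{
#if defined(__ARM_PCS_AAPCS64)
    return "AAPCS64 (64-bit Arm)";
#elif defined(__ARM_PCS_VFP)
    /* Floating-point arguments travel in floating-point registers. */
    return "AAPCS, VFP variant (hard-float)";
#elif defined(__ARM_PCS)
    /* Base standard: all arguments use core registers and the stack. */
    return "AAPCS, base standard";
#else
    return "not an Arm AAPCS target";
#endif
}
```

A build script or startup log could print this string to catch the case where a library built for one variant is linked into code built for another.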
Under the Hood
At the hardware level, AAPCS assigns specific CPU registers for passing data during function calls, which minimizes memory accesses. A call is made with a branch-with-link instruction (BL or BLX), which stores the return address in the link register (LR). The stack pointer (SP) tracks the stack frame, which holds saved registers and local data. The convention makes caller and callee agree on which registers each must save and restore, preventing data corruption. The instruction set can push and pop several registers in one instruction, which keeps function entry and exit cheap.
Why designed this way?
AAPCS was designed to balance speed and flexibility. Using registers for the first few arguments speeds up common calls, while the stack supports more complex cases. The division of caller-saved and callee-saved registers reduces unnecessary saving and restoring, improving performance. The design also supports debugging and interoperability across different compilers and languages. Alternatives like passing all arguments on the stack were slower, and saving all registers on every call was inefficient, so AAPCS strikes a practical compromise.
┌───────────────┐
│ Caller        │
│               │
│ R0-R3: Args   │
│ BL: LR=return │
│ Call Function │
└───────┬───────┘
        │
┌───────▼───────┐
│ Callee        │
│ Save R4-R11   │
│ Use R0-R3     │
│ Return in R0  │
│ Restore R4-R11│
│ Return to LR  │
└───────┬───────┘
        │
┌───────▼───────┐
│ Caller        │
│ Use R0 result │
│ Continue Exec │
└───────────────┘
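The call and return flow can be traced with a toy CPU model. This is a simplified sketch, not an emulator: it assumes ARM-state code with 4-byte instructions, ignores Thumb, and uses invented names (`toy_cpu`, `toy_bl`, `run_nested_call`) and made-up addresses. It shows why a function that calls another must save LR first: the second branch-with-link overwrites it.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU state: only what is needed to show LR handling. */
typedef struct {
    uint32_t pc, lr;
    uint32_t stack[8];
    int sp;              /* index of last pushed word; 8 means empty */
} toy_cpu;

/* BL: remember the address of the next instruction in LR, then branch. */
static void toy_bl(toy_cpu *c, uint32_t target)
{
    c->lr = c->pc + 4;
    c->pc = target;
}

/* BX LR: return to the address held in LR. */
static void toy_bx_lr(toy_cpu *c) { c->pc = c->lr; }

static void toy_push_lr(toy_cpu *c) { assert(c->sp > 0); c->stack[--c->sp] = c->lr; }
static void toy_pop_pc(toy_cpu *c)  { assert(c->sp < 8); c->pc = c->stack[c->sp++]; }

/* Code at 0x100 calls `outer` at 0x200, which calls `leaf` at 0x300.
   Returns the pc after both functions have returned. */
uint32_t run_nested_call(void)
{
    toy_cpu c = { .pc = 0x100, .lr = 0, .sp = 8 };

    toy_bl(&c, 0x200);   /* caller -> outer, LR = 0x104 */
    toy_push_lr(&c);     /* outer is not a leaf: save LR first */
    c.pc = 0x210;        /* outer runs up to its own call site */
    toy_bl(&c, 0x300);   /* outer -> leaf, LR = 0x214 (old value overwritten) */
    toy_bx_lr(&c);       /* leaf returns to 0x214 */
    toy_pop_pc(&c);      /* outer returns through the saved LR */
    return c.pc;
}
```

In this model `run_nested_call()` ends at 0x104, the instruction after the original call. Without the `toy_push_lr` step, `outer` would have no way to get back there.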
Myth Busters - 4 Common Misconceptions
Quick: Do you think all function arguments are always passed on the stack? Commit to yes or no.
Common Belief: All function arguments are passed on the stack.
Reality: AAPCS passes the first four word-sized arguments in registers R0 to R3; only the remaining arguments go on the stack.
Why it matters: Assuming all arguments are on the stack can lead to incorrect assembly code or debugging confusion, causing bugs and crashes.
Quick: Do you think the called function must always save all registers it uses? Commit to yes or no.
Common Belief: The callee must save all registers it uses during a function call.
Reality: Only callee-saved registers (R4-R11 and SP) must be restored by the callee before it returns; caller-saved registers (R0-R3, R12) are the caller's responsibility. The callee saves LR only when it needs to, for example before making its own calls.
Why it matters: Misunderstanding this leads to redundant saving or missing saves, causing performance loss or data corruption.
Quick: Do you think return values can be placed in any register? Commit to yes or no.
Common Belief: Return values can be placed in any register the function wants.
Reality: In the base standard, AAPCS places return values in R0, or in R0-R1 for 64-bit values. Larger structures are written to memory that the caller provides, with its address passed in R0.
Why it matters: Ignoring this causes caller and callee to disagree on where to find results, breaking program correctness.
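The return rules can be summarised as a small classifier. This is a sketch for the base (soft-float) standard only: it ignores vector types and the VFP variant, where floating-point results come back in floating-point registers, and `ret_location` and `classify_return` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    RET_R0,      /* result fits in one word */
    RET_R0_R1,   /* 64-bit fundamental type, e.g. long long */
    RET_MEMORY   /* caller passes a result address as a hidden argument in r0 */
} ret_location;

/* Hypothetical classifier for the base AAPCS.
   is_composite: nonzero for struct, union, or array-like types. */
ret_location classify_return(size_t size_bytes, int is_composite)
{
    /* This sketch only models fundamental types of up to 8 bytes. */
    assert(is_composite || size_bytes <= 8);

    if (size_bytes <= 4) return RET_R0;
    if (!is_composite && size_bytes == 8) return RET_R0_R1;
    return RET_MEMORY;
}
```

Note the asymmetry: an 8-byte `long long` comes back in R0-R1, while an 8-byte struct is returned through memory.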
Quick: Do you think stack alignment is optional in ARM function calls? Commit to yes or no.
Common Belief: Stack alignment is not important and can be ignored.
Reality: AAPCS requires the stack to be 8-byte aligned at every public interface, which in practice means whenever one function calls another.
Why it matters: Ignoring alignment can cause crashes or incorrect behavior, especially with SIMD or floating-point instructions.
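The alignment rule reduces to two small bit operations, sketched below. The helper names are invented for this example, and SP values are modelled as plain 32-bit integers.

```c
#include <stdint.h>

/* True when an SP value meets the 8-byte rule for public interfaces. */
int sp_is_aligned_for_call(uint32_t sp)
{
    return (sp & 7u) == 0;
}

/* Rounds a frame size up to a multiple of 8, so that subtracting it
   from an aligned SP leaves SP aligned. */
uint32_t round_frame_size(uint32_t bytes)
{
    return (bytes + 7u) & ~7u;
}
```

For instance, a function that needs 12 bytes of locals would reserve `round_frame_size(12)`, that is 16 bytes, which is why compilers often appear to push one more register than strictly necessary.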
Expert Zone
1. When the VFP (hard-float) variant is in use, floating-point arguments and results travel in floating-point registers rather than R0-R3, so code built for it cannot be mixed freely with code built for the base standard.
2. The link register (LR) holds the return address, but in nested calls, and in interrupt handlers that call other functions, it must first be saved on the stack because the next call overwrites it.
3. Thumb code follows the same register rules, but calls between ARM and Thumb code need interworking-aware branch and return instructions such as BLX and BX.
When NOT to use
AAPCS applies to 32-bit ARM code; 64-bit ARM (AArch64) uses the related AAPCS64, and other processors such as x86 or RISC-V have their own calling conventions. In some embedded systems with minimal runtime, custom or simplified conventions may be used for performance or size reasons.
Production Patterns
In real-world ARM software, compilers use AAPCS to generate interoperable code across languages like C, C++, and assembly. Operating systems build on it: system call and interrupt entry code follows platform-specific rules and then hands over to ordinary AAPCS-conforming functions. Developers use it to write optimized assembly routines and to debug low-level issues by reading register and stack states.
Connections
Application Binary Interface (ABI)
AAPCS is a part of the broader ABI that defines how software components interact at the binary level.
Understanding AAPCS helps grasp how ABIs ensure compatibility between compiled code and operating systems.
Stack Data Structure
AAPCS uses the stack as a fundamental data structure to manage function calls and local data.
Knowing stack behavior clarifies how function calls nest and how data is preserved across calls.
Human Communication Protocols
Like AAPCS, communication protocols define rules for exchanging information to avoid misunderstandings.
Recognizing this parallel highlights the importance of agreed rules for successful interaction in both machines and humans.
Common Pitfalls
#1 Passing more than four arguments without using the stack.
Wrong approach:
void func(int a, int b, int c, int d, int e) {
    // Assume all args in R0-R4
    // Access e directly in R4
}
Correct approach:
void func(int a, int b, int c, int d, int e) {
    // First four args in R0-R3
    // Fifth arg 'e' accessed from stack
}
Root cause: Misunderstanding that only the first four arguments are passed in registers; extra arguments go on the stack.
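The "fifth argument" rule of thumb holds when every argument is one word wide. With 64-bit arguments the spill point can come earlier, because such a value must start in an even-numbered register. The sketch below models that for integer arguments of 4 or 8 bytes in a non-variadic call under the base standard; `arg_slot` and `assign_args` are hypothetical names, and composite or floating-point arguments are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int first_reg;        /* 0..3 when passed in registers, -1 when on the stack */
    size_t stack_offset;  /* byte offset from SP at entry; valid when first_reg is -1 */
} arg_slot;

/* Simplified allocator: sizes[i] must be 4 or 8 (bytes). */
void assign_args(const size_t *sizes, size_t count, arg_slot *out)
{
    size_t ncrn = 0;   /* next core register number */
    size_t nsaa = 0;   /* next stacked argument offset */

    for (size_t i = 0; i < count; i++) {
        size_t size = sizes[i];
        assert(size == 4 || size == 8);

        /* A 64-bit value starts at an even register; skip an odd one. */
        if (size == 8 && (ncrn & 1u)) ncrn++;

        if (ncrn + size / 4 <= 4) {
            out[i].first_reg = (int)ncrn;
            out[i].stack_offset = 0;
            ncrn += size / 4;
        } else {
            ncrn = 4;  /* once an argument spills, no later one uses registers */
            if (size == 8) nsaa = (nsaa + 7u) & ~(size_t)7u;  /* 8-byte align */
            out[i].first_reg = -1;
            out[i].stack_offset = nsaa;
            nsaa += size;
        }
    }
}
```

For a signature like `f(int, long long, int)`, this model puts the first argument in R0, the second in R2-R3 (R1 is skipped), and the third on the stack, even though the function has only three parameters.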
#2 Not preserving callee-saved registers in a function that uses them.
Wrong approach:
void func(void) {
    // Overwrites R4 without saving it or telling the compiler
    asm("mov r4, #10");
    // ...
}
Correct approach:
void func(void) {
    // Listing r4 as clobbered lets the compiler save and restore it
    asm volatile("mov r4, #10" : : : "r4");
    // ...
}
Root cause: Confusing caller-saved and callee-saved registers, leading to data corruption.
#3 Ignoring stack alignment requirements.
Wrong approach:
void func(void) {
    // Hand-written prologue pushes an odd number of registers,
    // e.g. push {lr}: SP is only 4-byte aligned at the next call
}
Correct approach:
void func(void) {
    // Push an even number of registers, e.g. push {r4, lr},
    // so SP stays 8-byte aligned at every call
}
Root cause: Not knowing that AAPCS requires 8-byte stack alignment at public interfaces, which called code is entitled to assume.
Key Takeaways
AAPCS defines a clear, standardized way for ARM functions to pass arguments, return values, and manage registers.
Using registers R0 to R3 for the first four arguments speeds up function calls, while the stack handles additional arguments.
Caller-saved and callee-saved registers split responsibilities to optimize performance and prevent data loss.
Stack frames organize function call data and must maintain proper alignment for reliable execution.
Understanding AAPCS is essential for writing, debugging, and optimizing ARM assembly and compiled code.