This article accompanies Lesson 10 (Function Calls) and Lesson 11 (Stack Operations) of the ARM64 assembly tutorial by LaurieWired.

On ARM64 the stack pointer must stay 16-byte aligned, so 64-bit registers are usually stored in pairs; storing a single register leaves an 8-byte gap on the stack. Refer to The first ARM64 assembly program for how to call the Linux (ARM64) exit() system call to end execution.
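As a minimal sketch of the gap mentioned above, the program below pushes a single register while keeping sp 16-byte aligned. The file name pad.s and the value 7 are only illustrative and are not taken from the tutorial.

```asm
// pad.s (hypothetical file name)
.global _start

_start:
    mov x0, #7

    str x0, [sp, #-16]!          // push x0 alone: sp = sp - 16, x0 stored at sp
                                 // the 8 bytes at sp + 8 stay unused (the gap)
    mov x0, #0                   // clobber x0
    ldr x0, [sp], #16            // pop: load x0 from sp, then sp = sp + 16

    mov x8, #0x5d                // Linux exit syscall number (93)
    svc #0                       // exit status is taken from x0
```

Built with the same as and gcc commands used later in this article, this should exit with status 7. Pushing with [sp, #-8]! instead would leave sp misaligned, and on Linux a later sp-based load or store would then typically fault.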

Function call with stack operation

For what [sp, #-16]! and similar operands do, refer to Aarch64 addressing mode.

func.s:

.global _start

_start:
    mov x0, #1
    mov x1, #2

    stp x0, x1, [sp, #-16]!      // store pair x0 x1
                                 // store x0 to sp - 16
                                 // store x1 to sp - 8
                                 // and set sp to sp - 16
    bl add_nums
    ldp x0, x1, [sp], #16        // load pair x0 x1
                                 // load x0 from sp
                                 // load x1 from sp + 8
                                 // and set sp to sp + 16

    mov x8, #0x5d                // Linux exit syscall number (93)
    svc #0                       // exit status is taken from x0

add_nums:
    add x0, x0, x1
    ret

After add_nums returns, x0 holds 3, but the following ldp instruction restores it to its old value 1, so the program exits with status 1.

$ as -o func.o func.s
$ gcc -o func func.o -nostdlib -static
$ ./func; echo $?
1
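If the sum should survive the restore, one option is to copy it into a register that ldp does not touch. The variant below is a sketch of that idea, not part of the tutorial; the choice of x2 as the scratch register is an assumption.

```asm
.global _start

_start:
    mov x0, #1
    mov x1, #2

    stp x0, x1, [sp, #-16]!      // save x0 and x1 as before
    bl add_nums                  // x0 = 3 on return
    mov x2, x0                   // copy the sum before x0 is restored
    ldp x0, x1, [sp], #16        // x0 = 1 and x1 = 2 again
    mov x0, x2                   // use the sum as the exit status

    mov x8, #0x5d                // Linux exit syscall number (93)
    svc #0

add_nums:
    add x0, x0, x1
    ret
```

Assembled and linked the same way, this version should exit with status 3 instead of 1.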

Generic prologue and clean-up code around function call

If function calls are nested, each function that calls another must save the link register lr on the stack, along with the frame pointer fp, because bl overwrites lr. Aarch64 code walkthrough gives generic prologue and clean-up code for a function, for example:

Prologue:

stp     x19, x20, [sp,#-0x20]!
str     x21, [sp,#0x10]
stp     fp, lr, [sp,#-0x10]!
mov     fp, sp

Clean-up:

ldp     fp, lr, [sp], #0x10
ldr     x21, [sp, #0x10]
ldp     x19, x20, [sp], #0x20
ret

In the above example, on entry to the function, before any of its other code runs, the prologue pushes the following onto the stack (higher addresses first):

[empty 64-bit]
x21
x20
x19
lr
fp                <--- the new stack pointer sp

The stack grows downward: after the line stp fp, lr, [sp,#-0x10]!, the stack pointer sp points to the lowest address in use (the top of the stack), which holds the old frame pointer fp. Then the line mov fp, sp sets the frame pointer fp to the current stack pointer sp. All of this is reversed right before the function returns.

In this way, each function in a chain of calls keeps its own return address on the stack, and simply pops it back into the link register lr before executing ret.
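To see this at work in a complete program, here is a small sketch with one level of nesting. The file name nested.s and the function add_twice are hypothetical; x29 and x30 are the architectural names of fp and lr.

```asm
// nested.s (hypothetical file name)
.global _start

_start:
    mov x0, #1
    mov x1, #2
    bl add_twice                 // lr = address of the next instruction
    mov x8, #0x5d                // Linux exit syscall number (93)
    svc #0                       // exit status is taken from x0

// add_twice: returns (x0 + x1) + x1 by calling add_nums twice
add_twice:
    stp x29, x30, [sp, #-16]!    // prologue: save fp (x29) and lr (x30)
    mov x29, sp                  // set the new frame pointer
    bl add_nums                  // overwrites lr; x0 = 1 + 2 = 3
    bl add_nums                  // x0 = 3 + 2 = 5 (add_nums leaves x1 alone)
    ldp x29, x30, [sp], #16      // clean-up: restore fp and lr
    ret                          // return to _start through the restored lr

add_nums:
    add x0, x0, x1
    ret
```

With the inputs 1 and 2 this should exit with status 5. Without the stp/ldp pair, lr would still hold the return address set by the second bl, so the ret in add_twice would jump back into add_twice rather than to _start.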

References

ARM64 assembly tutorial, LaurieWired, https://www.youtube.com/playlist?list=PLn_It163He32Ujm-l_czgEBhbJjOUgFhg.

Introduction to Aarch64 architecture, 8. The Stack, https://hrishim.github.io/llvl_prog1_book/stack.html.

Aarch64 part 3: addressing mode, https://devblogs.microsoft.com/oldnewthing/20220728-00/?p=106912.

Aarch64 part 24: code walkthrough, https://devblogs.microsoft.com/oldnewthing/20220829-00/?p=107066.

First ARM64 assembly program, /2025/07/13/first-arm64-code.html.

Arm Compiler armasm User Guide. On https://developer.arm.com, search for “armasm user guide”. In the result list, find the latest version of “Arm Compiler armasm User Guide”.