Shellcode Basics & Assembly Guide for Beginners

by x32x01
Shellcode Development for Beginners:
Understanding Assembly, CPU Registers & Building Your First Shellcode
💥🐚
If you’ve ever explored the world of binary exploitation, you’ve almost certainly come across the term shellcode. Shellcode is a short sequence of machine instructions used as the payload when exploiting a vulnerability. Most beginners simply download ready-made shellcode from sites like shell-storm.org; this series aims to teach you how to build your own from scratch.

In Part 1, we’ll take a beginner-friendly journey into:
  • Understanding CPU registers
  • Writing our first assembly program
  • Assembling & linking using NASM
  • Extracting raw shellcode bytes
  • Fixing NULL bytes
  • Executing shellcode in C
  • Preparing for more advanced shellcode techniques

Let’s jump in! 🚀



Understanding CPU Registers 🧠⚙️

Assembly is the human-readable form of the machine instructions a CPU executes. But before we write a single instruction, we must understand the foundation of assembly programs: CPU registers.

A modern x86-64 CPU has sixteen 64-bit general-purpose registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, and R8 through R15). Registers are extremely fast - much faster than RAM or disk - and are used for arithmetic, memory access, system calls, and more.

General Purpose Registers (GPRs)

Register    Usage
RAX         Accumulator - arithmetic & I/O operations
RBX         Base register - indexing data in memory
RCX         Counter - loop counters and shifts
RDX         Data register - I/O and multiprecision arithmetic

Pointer Registers

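  • RIP → Instruction Pointer (address of the next instruction to execute)
  • RSP → Stack Pointer (address of the top of the stack)
  • RBP → Base Pointer (base of the current stack frame, used to reach a function’s local variables and arguments)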

Index Registers

  • RSI → Source index
  • RDI → Destination index
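
One place the source, destination, and counter roles appear together is the string-copy instruction rep movsb, which copies RCX bytes from the address in RSI to the address in RDI. The C sketch below models that behaviour, assuming the direction flag is clear. The struct and function names are made up for illustration and are not a real API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three registers involved; not a real API. */
struct regs {
    const unsigned char *rsi; /* source index: where bytes are read from */
    unsigned char *rdi;       /* destination index: where bytes are written */
    size_t rcx;               /* counter: how many bytes are left to copy */
};

/* Model of `rep movsb` with the direction flag clear:
   copy one byte, advance both pointers, decrement the counter, repeat. */
void rep_movsb(struct regs *r)
{
    assert(r != NULL);
    while (r->rcx != 0) {
        *r->rdi++ = *r->rsi++;
        r->rcx--;
    }
}
```

After the call, rcx is 0 and both pointers sit just past the copied bytes, which matches the register state the real instruction leaves behind.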

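Flags Register

Individual bits of the RFLAGS register record the outcome of the last arithmetic or logic instruction and steer program flow:
  • CF – Carry Flag (an unsigned operation carried or borrowed out of the top bit)
  • PF – Parity Flag (the low byte of the result has an even number of 1 bits)
  • ZF – Zero Flag (the result was zero)
Conditional jumps such as JZ and JNZ test these flags to decide whether to branch.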
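
To make the three flags concrete, here is a rough C model of what an 8-bit ADD does to CF, PF and ZF, and how JZ would react. It is only a sketch: struct flags, add8_flags and jz_taken are hypothetical names, and the real RFLAGS register holds more bits than these three.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three flags discussed here. */
struct flags {
    bool cf; /* carry: the unsigned sum did not fit in 8 bits */
    bool pf; /* parity: the result byte has an even number of 1 bits */
    bool zf; /* zero: the result byte is exactly 0 */
};

/* Model how an 8-bit ADD sets CF, PF and ZF. */
struct flags add8_flags(uint8_t a, uint8_t b)
{
    unsigned wide = (unsigned)a + (unsigned)b; /* keep the carry-out bit */
    uint8_t result = (uint8_t)wide;            /* what the register would hold */
    struct flags f;
    int ones = 0;

    for (int bit = 0; bit < 8; bit++)
        ones += (result >> bit) & 1;

    f.cf = wide > 0xFF;
    f.zf = result == 0;
    f.pf = (ones % 2) == 0;

    /* Sanity check: a zero result has no 1 bits, so parity must be even. */
    assert(!f.zf || f.pf);
    return f;
}

/* JZ branches when ZF is set; JNZ branches when it is clear. */
bool jz_taken(struct flags f)
{
    return f.zf;
}
```

For example, add8_flags(0xFF, 0x01) wraps around to zero, so all three flags come back set and a following JZ would be taken.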

Now that you understand the CPU register landscape, let’s write our first assembly program. 📝



Writing Your First Assembly Program (“Hello World”) 🌍👋

A typical assembly program is divided into three sections:
  • .text → program instructions
  • .data → initialized data
  • .bss → uninitialized data
On Linux you will meet two syntax styles: AT&T (the default for the GNU assembler and objdump) and Intel (used by NASM). We’ll use Intel syntax because it’s cleaner and widely used in reverse engineering.

Step 1: Program Skeleton

Code:
global _start

section .data
    message db "Hello World", 0xa
    msg_len equ $ - message

section .text
_start:
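    ; write(1, message, msg_len)
    mov rax, 1          ; syscall number 1 = write
    mov rdi, 1          ; arg 1: file descriptor 1 = stdout
    mov rsi, message    ; arg 2: address of the string
    mov rdx, msg_len    ; arg 3: number of bytes to write
    syscall

    ; exit(0)
    mov rax, 60         ; syscall number 60 = exit
    mov rdi, 0          ; arg 1: exit status 0
    syscall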
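
For comparison, the same write call can be issued from C through libc's generic syscall() wrapper, which makes the mapping between registers and arguments easier to see. This is a sketch assuming Linux with glibc; write_hello is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue the same write(1, message, msg_len) the assembly performs,
   through libc's generic syscall() wrapper. Returns the bytes written. */
long write_hello(void)
{
    static const char message[] = "Hello World\n";
    const long msg_len = (long)(sizeof message - 1); /* drop the trailing '\0' */

    /* "Hello World" plus the newline is 12 bytes, like msg_len in the assembly. */
    assert(msg_len == 12);

    /* SYS_write is 1 on x86-64 Linux: the value the assembly loads into RAX. */
    return syscall(SYS_write, 1, message, msg_len);
}
```

Calling write_hello() and then syscall(SYS_exit, 0) from main mirrors the two system calls in the assembly program.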



Assembling and Linking the Program 🛠️🐧

Save the program as 1.asm, then assemble, link, and run it:
Code:
nasm -f elf64 1.asm -o 1.o
ld 1.o -o 1
./1

You should see: Hello World
Congrats - you just built your first assembly program! 🎉



Extracting Shellcode from the Binary 🔍📦

Now we extract raw machine instructions using objdump:
Code:
objdump -d 1 -M intel
This shows:
  • Assembly instructions
  • Corresponding hex opcodes
These opcodes are what become the shellcode payload.

To extract only the opcode bytes (replace ./PROGRAM with the path to your binary, e.g. ./1):
Code:
objdump -d ./PROGRAM | grep -Po '\s\K[a-f0-9]{2}(?=\s)' | sed 's/^/\\x/g' | perl -pe 's/\r?\n//' | sed 's/$/\n/'

This produces something like:
Code:
\x48\x31\xc0\x48\x83...
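
If you already have the raw bytes in a buffer, the same \x-escaped form can be produced in a few lines of C. This is a minimal sketch, not a replacement for the objdump pipeline; format_shellcode is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `len` bytes as "\xNN" escapes into `out`.
   Returns the number of characters written, or -1 if `out` is too small. */
int format_shellcode(const unsigned char *bytes, size_t len,
                     char *out, size_t out_size)
{
    assert(bytes != NULL && out != NULL);

    /* Each byte becomes four characters, plus one terminating '\0'. */
    if (out_size < len * 4 + 1)
        return -1;

    out[0] = '\0'; /* keep the output valid even when len is 0 */
    for (size_t i = 0; i < len; i++)
        snprintf(out + i * 4, 5, "\\x%02x", (unsigned)bytes[i]);

    return (int)(len * 4);
}
```

Passing the three bytes 48 31 c0 fills the output buffer with the 12 characters \x48\x31\xc0.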



Removing NULL Bytes (Avoiding Shellcode Breakage) ⚠️🚫

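NULL bytes (\x00) are a problem because shellcode is often delivered through string functions such as strcpy, which stop copying at the first \x00. Everything after that byte is lost, and the shellcode breaks.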

Example trick:

Instead of:
Code:
mov rax, 1

We use:
Code:
xor rax, rax
add rax, 1
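This avoids NULL bytes: mov rax, 1 stores the value 1 as a 32-bit immediate (01 00 00 00), while xor rax, rax (48 31 c0) and add rax, 1 (48 83 c0 01) encode without any zero bytes.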
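
The difference between the two versions can be checked mechanically. The sketch below holds one valid encoding of each and scans it for 0x00. The byte arrays are written out by hand, an assembler may choose a different but equivalent encoding for mov rax, 1, and has_null_byte is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One valid encoding of `mov rax, 1`: REX.W + C7 /0 with a 32-bit immediate. */
const unsigned char mov_rax_1[] = {0x48, 0xc7, 0xc0, 0x01, 0x00, 0x00, 0x00};

/* `xor rax, rax` followed by `add rax, 1` (8-bit immediate). */
const unsigned char xor_add_rax_1[] = {0x48, 0x31, 0xc0, 0x48, 0x83, 0xc0, 0x01};

/* True if any byte in the buffer is 0x00. */
bool has_null_byte(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL);
    return memchr(bytes, 0x00, len) != NULL;
}
```

has_null_byte reports true for mov_rax_1 and false for xor_add_rax_1, even though both sequences leave 1 in RAX.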

Another trick:

Use LEA with a RIP-relative operand instead of MOV with an absolute address, so the code does not depend on where it is loaded:
Code:
lea rsi, [rel message]
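The 32-bit displacement that LEA encodes can still contain NULL bytes, so always re-check the extracted bytes. Tricks like these are a routine part of real-world shellcode crafting.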



Sample Shellcode Produced 🐚🔥

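After applying these changes, the extracted shellcode looks like this. Note that the LEA instruction (48 8d 35 eb 0f 00 00) still carries two NULL bytes in its displacement and points at a message stored outside these bytes, so this version is not yet self-contained: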
Code:
\x48\x31\xc0\x48\x83\xc0\x01\x48\x31\xff\x48\x83\xc7\x01\x48\x8d\x35\xeb\x0f\x00\x00\x48\x31\xd2\x48\x83\xc2\x0c\x0f\x05\x48\x31\xc0\x48\x83\xc0\x3c\x48\x31\xff\x0f\x05
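
A quick way to see what a string function makes of these bytes is to compare their real size with what strlen reports. The sketch below copies the sample verbatim; sample_size and sample_strlen are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The sample bytes, copied verbatim and split across lines for readability. */
const char sample[] =
    "\x48\x31\xc0\x48\x83\xc0\x01\x48\x31\xff\x48\x83\xc7\x01"
    "\x48\x8d\x35\xeb\x0f\x00\x00\x48\x31\xd2\x48\x83\xc2\x0c"
    "\x0f\x05\x48\x31\xc0\x48\x83\xc0\x3c\x48\x31\xff\x0f\x05";

/* Total number of shellcode bytes, excluding the literal's own terminator. */
size_t sample_size(void)
{
    return sizeof sample - 1;
}

/* Number of bytes a C string function sees before it stops at 0x00. */
size_t sample_strlen(void)
{
    size_t n = strlen(sample);
    assert(n <= sizeof sample - 1); /* strlen can never run past the array */
    return n;
}
```

sample_size() returns 42 while sample_strlen() returns 19: the count stops at the first zero byte inside the LEA displacement, so a strcpy-style copy would deliver less than half of the payload.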



Executing Shellcode in C 💻🐍

To test shellcode, we use a small C wrapper program:
C:
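#include <stdio.h>
#include <string.h>

/* Paste the extracted bytes here; the "..." is a placeholder. */
char code[] =
"\x48\x31\xc0\x48\x83..."; // your shellcode

int main(void) {
    /* strlen stops at the first NULL byte, so a length shorter than
       expected means the shellcode still contains \x00. */
    printf("len: %zu bytes\n", strlen(code));
    /* Cast the buffer to a function pointer and call it. */
    ((void (*)(void))code)();
    return 0;
}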

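Compile with the stack protector disabled and the stack marked executable. On recent kernels the data section that holds code[] may still be non-executable, in which case the call faults, so an older system or VM is the easiest place to test: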
Bash:
gcc exec.c -o exec -fno-stack-protector -z execstack
Run: ./exec
If your shellcode spawns a shell or prints “Hello World,” then it works 🎉



Conclusion 🏁✨

In this first part, you learned:
  • How CPU registers work
  • How to write your first assembly program
  • How to assemble & link using NASM
  • How to extract opcodes from binaries
  • How to remove NULL bytes
  • How to execute shellcode inside a C program

In the next article, we’ll dive deeper into JMP-CALL-POP, stack-based shellcode, and position-independent code (PIC) - key skills for writing advanced exploit payloads.
Stay tuned for Part 2! 🔥💻
 