Assembly Language Crash Course 1
Hey. So today we will work on assembly language.
Assembly language is a human-readable way of writing machine code, the binary instructions that a CPU actually runs.
For example, the number 5 is 0101 in binary, but in assembly you can simply write 5.
It helps to have some understanding of computer science, but I will hold your hand through all of that.
Anyway sit back, relax, and get ready to learn.
To start off we will learn a bit about the CPU.
At its core the CPU consists of some basic parts.
- CPU Registers
- Arithmetic and Logic Unit
- Clock
Roughly speaking, every clock cycle the CPU fetches an instruction from RAM and executes it.
RAM is random access memory.
What is memory?
Memory, like the registers, is just a place that holds data.
This data is stored in binary.
Anyway let’s get our programming environment set up.
To program you can use this website or a real assembler.
A good real assembler is NASM, and it’s my assembler of choice.
Today we will be using 32-bit x86 NASM, running either on the website or on your computer.
To install NASM go here and install the latest version for your operating system.
If you are not on Linux, look up the corresponding NASM tutorial for your OS.
Here is a quick tutorial for Linux.
Type this into your terminal to assemble (compile) your code:
nasm -f elf *.asm
Replace the * with your file name.
Then type this to link:
ld -m elf_i386 *.o -o *
Replace the *s with your file name.
Then type this to run the program from your current directory:
./*
Replace the * with your file name.
Anyway we can finally start coding!
If you are using NASM, open a file with a .asm extension in your text editor of choice. Then in a terminal type cd <path to the folder containing your file> if you are on Mac or Linux. If you are on Windows type chdir <path to the folder containing your file>.
If you are on the website, this is all handled for you. But you must clear the pre-generated code on the website.
Section 1: moving values into registers.
Type this in either your text file or the website:
global _start
_start:
    mov eax, 1
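If you run it, the program itself prints nothing. (On Linux it will likely end with a segmentation fault, because we never tell the operating system that the program is finished. Section 2 fixes that.)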
Why?
Well let’s break down the program.
The global declaration at the top makes _start visible to the linker, which uses it as the entry point for our program.
The _start: is a label, used to split up our code.
The mov is an instruction that tells the computer we want to move a value into a register.
The eax is a general purpose register.
And lastly 1, well, you know what 1 is.
So the syntax for mov is:
mov <destination>, <source>
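As a small sketch of that syntax, here are a few forms mov can take in NASM. The values and the choice of registers are only for illustration and are not part of the program above. The source can be a number or another register, and the destination is always written first.

```nasm
global _start

_start:
    mov ecx, 5      ; put the number 5 into ecx
    mov edx, ecx    ; copy ecx into edx, so edx is now 5 as well
    mov ecx, 0101b  ; the same 5 written in binary (the b suffix marks binary)

    ; The three lines below end the program cleanly; Section 2 explains them.
    mov eax, 1
    mov ebx, 0
    int 0x80
```

Like the program above, this prints nothing when you run it. All of the work happens inside the registers.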
For just a second, let’s talk about CPU registers.
Take a look at the figure below:
It is a cut-down map of the CPU’s registers.
I don’t expect you to grasp most of this yet, but what I do want you to understand is that we will be using 4 general purpose registers quite often.
- eax
- ebx
- ecx
- edx
Anyway back to the code!
The mov instruction is very useful.
It can be combined with other instructions to do a lot more.
That brings us to our next section.
Section 2: basic interrupts
Type out this code in a new file, or clear out the code you wrote on the website and write this instead.
global _start
_start:
    mov eax, 1
    mov ebx, 0
    int 0x80
What does int 0x80 mean?
The int instruction tells the CPU that we want to trigger an interrupt (int is short for interrupt, not integer, and it does not create a variable).
The 0x80 is the interrupt number, written in hexadecimal. On Linux, interrupt 0x80 asks the operating system to perform a system call.
Values starting with 0x are hexadecimal.
Why are we moving values into eax and ebx?
The value in eax is the code for the kind of operation we want the interrupt to perform. Other interrupt numbers do not necessarily work this way.
It just so happens that 1 is the code to exit the program.
This exit operation depends on ebx, because ebx holds the exit code. An exit code of 0 means that everything ran fine. The exit code doesn’t affect the program itself, but it can be useful for debugging.
Anyway, that is it for the most basic of interrupts.
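To see the exit code in action, here is a minimal variation of the program above that exits with 7 instead of 0. The value 7 is an arbitrary choice for illustration.

```nasm
global _start

_start:
    mov eax, 1      ; 1 is the code for "exit the program"
    mov ebx, 7      ; the exit code we want to report (arbitrary example value)
    int 0x80        ; 0x80 is hexadecimal for 128; this hands control to Linux
```

On Linux, after running the program you can type echo $? in the same terminal to print the exit code of the last command, which here should be 7.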
Section 3: the text and data sections.
An assembly program typically has 3 sections (.text, .data, and .bss), and one of them is the data section.
Take a look at this program.
section .text
global _start
_start:
    mov eax, 1
    mov ebx, 0
    int 0x80
section .data
msg db "Hello World", 0x0a
The code we have written so far goes in the text section, which is where our instructions live.
The definitions of constant pointers go in the data section.
A constant is a value that cannot change.
A pointer is not quite a variable: it is a name that points to an address in memory.
What is the db?
db stands for "define byte". It is a data size, and each value defined with it takes up 1 byte.
There are many more sizes, but we will worry about those later.
msg is the name of the pointer.
And 0x0a is a hexadecimal code used to represent a newline character, or '\n'.
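As a rough sketch of how data sizes differ, the program below defines a few extra pieces of data next to msg. The labels one_byte, two_bytes, and four_bytes are made-up names for illustration. db, dw, and dd are standard NASM directives that define values of 1, 2, and 4 bytes each.

```nasm
section .text
global _start

_start:
    mov ecx, msg            ; ecx now holds the address that msg points to
    mov eax, 1              ; exit the program...
    mov ebx, 0              ; ...with exit code 0
    int 0x80

section .data
msg         db "Hello World", 0x0a  ; 11 characters plus a newline, 1 byte each
one_byte    db 0x2a                 ; db: define byte (1 byte)
two_bytes   dw 0x1234               ; dw: define word (2 bytes)
four_bytes  dd 0x12345678           ; dd: define doubleword (4 bytes)
```

This still prints nothing. It only shows that the name msg stands for an address, and that the directive before each value decides how many bytes the value takes up.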
Anyway, that is it for part 1 of the crash course. Please upvote and give feedback.
Farewell until part 2!
YEAS!
An example of this would be to write out 5 in binary you would write 0101, but in assembly, it’s just 5.
But isn't binary often stored in ASCII, by bytes? Thus 5 would be 5, not 4 separate bytes to represent that number? That would mean -1u128 would take up 128 bytes, and not fit all into 16 bytes or less?
With NASM on Windows, I believe you also need GCC installed. Theoretically, you'd run nasm -f win32 *.asm then run gcc *.obj. Haven't tried it yet, just thought I'd mention it.
Ws only
why you made it.
this is cool.
btw i am making like assembly programming language.
Could you elaborate?
@TheForArkLD I will admit that it seems a bit sudden to learn interrupts in the very first tutorial, but I mean it works, so idk if that's even a helpful statement. Anyways, that's my feedback. Thanks for making the series! :)
No problem! Thanks for the feedback :D
@Highwayman darn. I was gonna learn it and then teach it, oh well.
YAYYYYYYYYY!!!!!!!! :D AWESOME!!
I've always wanted to learn assembly, but it was too hard
This tutorial makes it easy. It will still be a challenge, but, you should have a much easier time.
@ApoorvAgrawal I also speak Assembly "beep bop boop beep"
:D lol
@DynamicSquid