OSDev, part 2
Welcome back! This tutorial is going to be a bit different, and I will explain why!
I have done some thinking, and over the past few days (maybe a week, I haven't kept track of my days), I have come to a simple conclusion on how I am going to go about this series.
In the earlier tutorials of my OSDev series, I described the tools to use to write a 16-bit OS. However, I have noticed that many people following the series are more interested in writing an actual, functional OS than an OS with the bare minimum of even the simplest OS ever written (whatever that might be).
So, rather than continue the 16-bit OSDev series, I am going to build on the previous tutorials and work up to the good ol' 32-bit OSDev, but with a few twists. We've already got a start on the bootloader, which we will indeed complete in this tutorial. Then I am going to follow up from that and teach you guys how to jump into 32-bit protected mode, where we can write C code.
Perhaps I will continue the 16-bit OSDev series as a sub-series. The sole intention of that series was the learning experience of quite literally doing everything... from scratch.
However, those ways of doing things are quite literally obsolete. It is, in fact, fun to do such a thing, but why do that when you can:
- Write your own bootloader, then jump into 32-bit protected mode
- Create your own file format that your specific kernel supports
- Create your own commands with simplicity, without the need to write quite literally hundreds, if not thousands, of lines of assembly?
Yes, you read that correctly, creating your OWN file format. And yes, I will be going over that in this series.
My reasoning is that I want my OSDev series to be useful. Writing a 16-bit OS is quite hard and not so approachable, and 16-bit OSDev does not have much to offer. So, to restate my point, I want to teach you guys how to write a functional, lightweight OS instead of a bare-minimum hello world kernel in 16-bit mode, or one built with crappy compilers that barely work on Linux and rather require development on Windows.
I have planned this out throughout the past few days, perhaps the past week. This series will be split into a few sub-series.
- 1st sub-series:
  - Going off of the last tutorial, jumping from the good ol' 16-bit bootloader to 32-bit protected mode
  - Starting off with the simple needs of any kernel
  - Creating your own file format
  - Making this file format "executable" via your own package manager (to say the least)
  - Making this file format fully functional in your OS
- 2nd sub-series:
  - Further enhancing this file format so it is capable of "booting" your OS
  - Perhaps creating an application where this file format "replaces" your current OS terminal with the terminal of the OS
  - Perhaps creating our own "assembly" language to "assemble" down to this file format
- 3rd sub-series:
  - Ever been interested in emulators? Perhaps we can learn how to create our very own
This series will, without a doubt, take me some time to A. learn and B. post.
Let's get started
Forgive me, I am a bit rusty. I have been working hard on implementing my own "executable" file format, so that I can make my OS capable of running its own file format.
Here is where we left off in the last tutorial:
Now, this program is capable of booting our OS. But, we want our OS to do more than just boot. Our OS will also require way more memory than just 512 bytes. So, what do we do about this? Simple!
Let's read some more sectors off the disk!
Yes, in 16-bit assembly, we can, in fact, read sectors from the disk. As stated in my previous tutorial, each sector consists of 512 bytes. And, I think we should all know by now that the boot sector resides in the first sector, meaning we will start the reading from the second sector.
The BIOS gives us a special interrupt for reading from the disk. With this interrupt, we can read as many sectors as we want.
Quick side note: in the installation of qemu for Linux, I was wrong about how you install it. Instead of sudo apt-get install qemu, it is sudo apt-get install qemu-system-i386. Apologies!
Before we continue, let's refactor the code a bit. To be safer, let's perform a far jump so we know that we are, indeed, where we are meant to be in memory. In assembly, there are these things called "labels". You can loosely think of labels as functions: they name a position in the assembly code that we can jump to or call from anywhere in our project.
Although we have the [org 0x7C00] directive, the segment we work in starts from 0x0000. So the far jump uses segment 0x0000, and the first thing we do once we land is initialize our data registers.
The code will look like so.
This is rather classified as "safe" assembly code. Long subject, perhaps I will go over it in a later tutorial.
Let's get into reading more sectors from the disk, which is also where understanding memory segmentation comes in.
When we read sectors from the disk, we're going to want to store the data we read somewhere. But, how exactly do we do that?
Lock yourselves in, you're in for a ride ;)
Memory segmentation, as seen before as jmp start:end, is more complex than what you're seeing. Here, start is a memory segment and end is an offset inside that segment. The base address of a segment is calculated by segment x 16.
So, let's say we have a memory segment 0x01C, which is 28 in decimal. When converting it to a linear address, it will be calculated as 0x01C x 16, which will thus be 0x1C0. Let's bring this knowledge to memory segmentation in assembly.
In assembly, a segmented address is written as segment:offset, for example es:bx. You can think of this as start:end. The segment is multiplied by 16 and the offset is added to get the final linear address. So, let's say we have jmp 0x01C0:0x01E0. This will "jump" to the linear address 0x01C0 x 16 + 0x01E0 = 0x1DE0, or in other words, 7648 in decimal. This is crucial to know, because with one wrong value, your program won't work.
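If you want to double-check this arithmetic without firing up an emulator, here is a minimal sketch in Python. It is only a calculator for the segment x 16 + offset rule, not part of the bootloader, and the function name linear_address is made up for this example.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment x 16, plus the offset."""
    return (segment << 4) + offset


# Segment 0x01C with no offset.
print(hex(linear_address(0x01C, 0x0000)))    # 0x1c0
# The jmp 0x01C0:0x01E0 example.
print(hex(linear_address(0x01C0, 0x01E0)))   # 0x1de0
# Two different segment:offset pairs can point at the same byte.
print(linear_address(0x0000, 0x7C00) == linear_address(0x07C0, 0x0000))  # True
```

The last line is the reason a far jump at the start of a bootloader is useful: it pins down which of the many equivalent segment:offset pairs we are actually running with.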
Using this knowledge, knowing that the memory segment is segment x 16, we're going to have to be cautious of the memory address we assign the sectors to reside in.
Let's load the sectors!
So, the BIOS loads the boot sector in at 0x7C00, so why not place our kernel just above it at 0x7E00? Why ask? That's exactly what we're doing. Applying the knowledge of segment x 16, we'll have to set the segment for the sectors to be loaded at to 0x07E0. We do this by setting the es and bx registers. A segment register can't be set directly with a value, however, by doing things like mov es, 0xwhatever. So, this is where our beloved register ax comes in! We can assign the value to ax, then copy ax into es. bx will be zero, to make sure that we are, in fact, loading in at 0x7E00.
Zeroing a register with xor is "faster" than using mov. I don't know all the technical reasons as to why, but one of them is size: xor bx, bx assembles to two bytes, while mov bx, 0 takes three. So, with that code, we can now safely move on to reading sectors from the disk and storing them at the memory address we provided, being 0x7E00 (segment x 16).
We are going to do a few things when reading from the disk:
- First, we are going to "reset" the disk
- Secondly, we are going to verify the sectors
- Then, we'll read them
Why do we want to do all that before reading? Well, first of all, we want to "reset" the disk to make sure the read goes through properly, and then we want to make sure that we'll actually be reading valid data from the sectors.
Through my journey of OSDev, I have had a bit of trouble when it came to this subject. I followed the documentation and just couldn't seem to figure out why my bootloader wasn't reading the sectors. Turns out, there is a special value that has to be set to choose between reading from a floppy, a hard drive, etc. For this tutorial, we will be reading from the hard drive. Don't worry, this won't wipe your hard drive. It's just an emulator running the code.
The BIOS offers a special interrupt, 0x13, which allows us to read from the disk. If you remember, ah is the register that stores the "function call". For each interrupt, ah selects a specific function. For interrupt 0x13, the function call selects which disk operation to perform.
We will be using the BIOS interrupt 0x13 for each step of reading from the disk. The function for each step is as follows:
- To verify the sectors, ah = 0x04
- To "reset" the disk, ah = 0x00
- To read, ah = 0x02
I personally like to use variables when writing assembly; it helps me understand my code more when looking back at it. So, let's quickly set up some variables to store some vital information we'll need when reading from the disk.
SECTOR_START is where we want to start reading from, that being the second sector.
SECTORS_TO_READ will be the total amount of sectors we want to read, being 32.
The drive number will be 0x80. In the BIOS drive numbering, numbers with the highest bit set refer to hard disks, so 0x80 is the first hard disk, while 0x00 would be the first floppy drive.
A note on the data directives: db defines a 1-byte value (8 bits), dw a 2-byte value and dd a 4-byte value, and x86 stores multi-byte values with the lowest byte first. So the width of a definition has to match the width of the register we load it into. Otherwise, when we want 0x80, we'll end up with a different value.
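To see how the same value ends up in memory for each directive width, here is a small Python sketch using the standard struct module. It only mimics the byte layout that an assembler would emit for db, dw and dd; the variable names are made up for this example.

```python
import struct

# "<" means little-endian, the byte order x86 uses.
as_db = struct.pack("<B", 0x80)   # like: db 0x80  (1 byte)
as_dw = struct.pack("<H", 0x80)   # like: dw 0x80  (2 bytes)
as_dd = struct.pack("<I", 0x80)   # like: dd 0x80  (4 bytes)

print(as_db.hex(), as_dw.hex(), as_dd.hex())   # 80 8000 80000000

# Leading zeros in the source literal do not change the value.
print(0x0080 == 0x80)   # True

# The lowest byte comes first, so the first byte is 0x80 in every case.
print(as_db[0] == as_dw[0] == as_dd[0] == 0x80)   # True
```

The extra bytes of the wider definitions are zeros that follow the 0x80 in memory, which is why loading a value with the wrong width can pick up bytes you did not intend.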
I have gone the simple way with reading from the disk before, but in this tutorial I will be approaching it more broadly. We will read a single sector at a time.
So, with the BIOS interrupt 0x13, there are multiple different function codes that go along with it. So, let's take a look at what we need to read the sectors:
- al is the number of sectors to read
- cl is the sector to start reading from
- dh - head number (normally zero)
- dl - drive number, which in our case will be 0x80
So, knowing we will be reading a sector at a time, and that cl is the sector we want to read, we will go ahead and set those right under where we defined where we want the sectors to load in at.
Afterwards, since this will have to loop until cl reaches the total amount of sectors we want to read, we are going to have to write the loop manually. So, after setting cl, go ahead and put a "sub-label" (it's what I like to call them, NASM calls them local labels) underneath, beginning with a dot (.). I named mine .loop. You can name yours however you like.
Now, the first thing we want to do is reset the disk. This is simple. All we need is to set ah to 0x00 and dl to the drive number, in our case being 0x80.
Afterwards, we want to verify the sectors we are about to read. We don't want to think we are reading valid sectors and end up with an error later on in our code.
We verify the sectors by setting ah to 0x04. Verifying the sectors takes the same steps as reading sectors. We have already set cl, so we only need to set the remaining registers. ch and dh are both zero, and will be when it comes to reading the sectors.
When verifying the sectors, we have to have some way to know whether they are valid or not. The return status of the verification is stored in ah. If ah is not zero, then verification failed. It is probably good practice to go ahead and re-loop. However, we will just use the BIOS teletype function, ah = 0x0e, to simply display an "E" if an error occurred.
We do this by comparing ah to the value 0x00. This is done through cmp reg, val. After this line of code, we can use a variety of different conditional jump instructions:
- jne - Jump if not equal
- je - Jump if equal
- jg - Jump if greater
- jl - Jump if less
- jge - Jump if greater or equal
- jle - Jump if less or equal
and probably a few others that I am missing, but that's the gist of it. So, we're going to want to compare ah to zero, then use jne to jump if it is not zero. The same concept goes for reading the sectors.
Note, the label that we jump to, which displays 'E' if there is an error verifying the sectors, is placed underneath the jmp $. That is because, if we were to place it above, execution would fall through into the label and its code would run with or without our consent.
Now, let's read the sectors. To read, we do the same as we just did to verify: set ch and dh to zero, then set ah to 0x02.
After calling the BIOS interrupt 0x13, we are going to have the following line of code: jc failed. What this line does is jump to failed if the carry flag is set, which means there was an error. This is crucial, because sometimes ah may show success while the carry flag is set.
Another thing we are going to add, just to be sure we have read the sectors: the total number of sectors read is saved in al, so we will check al and use je to jump to failed if it is, in fact, zero. If it is zero, that means we read zero sectors.
Alright, sweet! But wait, we're forgetting something. We're reading one sector at a time, meaning we need to add in some code to A. check the value of cl against the total sectors we want to read, and B. increment cl if the value is, in fact, not equal to the total amount of sectors.
Let's go ahead and add this. It's going to use the same logic as comparing ah to zero. Only, we are comparing cl to the total sectors to read, meaning that when it reaches that amount, we want to jump to another label that will finish off the bootloader. It's harder than it sounds though, and you'll find out why here shortly ;)
We can use the add instruction to increment cl by one each time through the loop, until it reaches the amount of sectors we want to read. I set SECTORS_TO_READ to 0x0021 to try to make sure that we are reading 32 sectors. I tested 0x0020 and it stops at 30 sectors. The likely reason is an off-by-one in the way I am using the cmp instruction: cl is a sector number that starts at 2, not a count that starts at 0, so comparing it against 0x0020 ends the loop after sectors 2 through 31, which is only 30 sectors. By the same logic, 0x0021 gets us 31 sectors, and a full 32 would need 0x0022. Nevertheless, it is what it is, and it seems to work. That's the main thing you can ask for when it comes to assembly, "Please freaking work!".
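The sector count puzzle is easier to see with a tiny simulation. The sketch below is in Python and only models the loop counter, not the disk. It assumes the loop reads sector cl, adds one to cl, and then compares cl against the stop value, which is an assumption on my part, chosen because it reproduces the 30 sectors observed with 0x0020. The function name sectors_read is made up for this example.

```python
def sectors_read(start_sector: int, stop_value: int) -> list:
    """Simulate: read sector cl, add 1 to cl, stop once cl == stop_value."""
    if stop_value <= start_sector:
        raise ValueError("stop_value must be greater than start_sector")
    cl = start_sector
    read = []
    while True:
        read.append(cl)          # int 0x13 would read sector number cl here
        cl += 1                  # add cl, 1
        if cl == stop_value:     # cmp cl, stop_value, then je done
            break
    return read


for stop in (0x20, 0x21, 0x22):
    done = sectors_read(2, stop)
    print(hex(stop), "->", len(done), "sectors, last one is", done[-1])
# 0x20 -> 30 sectors, last one is 31
# 0x21 -> 31 sectors, last one is 32
# 0x22 -> 32 sectors, last one is 33
```

Because counting starts at sector 2, the stop value is always two more than the number of sectors read under this assumed loop order.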
Sweet, we now have a "function" that A. resets the drive on each loop, B. verifies the sectors we are about to read and C. reads the sectors and re-loops if cl is not the same amount as SECTORS_TO_READ. Now, we have to add in the label done that we jump to when cl has reached SECTORS_TO_READ.
For now, let's just make the done "function" display "D", then halt.
Now, one last thing: when we read sectors, the disk image has to actually contain them, so we have to pad the image out to the total bytes read. If we try to read sectors that are not there in the image, the read will fail with an error. So, take 32 * 512, and you will (or should) get 16384. For the time being, we will just add times 16384 db 0 at the end of our assembly code. Later on, for efficiency, we will want to move this times instruction into the assembly file where we run 32-bit assembly. But for now, having it at the end of our current assembly code will work fine.
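To make the size arithmetic concrete, here is a rough Python sketch that builds the same layout in memory: one boot sector followed by zero-filled sectors. It is not how the image in this tutorial is produced (the assembler does that for us). The function name build_image and the two-byte "boot code" are made up for this example, and it assumes the usual 0x55 0xAA signature at the end of the boot sector.

```python
SECTOR_SIZE = 512


def build_image(boot_code: bytes, extra_sectors: int) -> bytes:
    """Return a boot sector followed by extra_sectors zero-filled sectors."""
    if len(boot_code) > SECTOR_SIZE - 2:
        raise ValueError("boot code does not fit in one sector")
    # Pad the boot sector and finish it with the 0x55 0xAA boot signature.
    boot = boot_code.ljust(SECTOR_SIZE - 2, b"\x00") + b"\x55\xaa"
    # Same effect as: times 16384 db 0 (when extra_sectors is 32)
    padding = bytes(extra_sectors * SECTOR_SIZE)
    return boot + padding


image = build_image(b"\xeb\xfe", 32)   # 0xEB 0xFE is the encoding of jmp $
print(32 * SECTOR_SIZE)                # 16384
print(len(image))                      # 16896
```

The total comes out to 33 sectors: the boot sector itself plus the 32 sectors the bootloader wants to read.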
Remember when I said 512 bytes is plenty for the bootloader? I meant it. We are only 112 bytes in with the program above.
You can go ahead and assemble the code and test your output by using qemu. I am going to go ahead and move on to the next thing.
Well, we have now read our sectors, and we have 16 thousand bytes available to us (that's a lot). Now we want to enter this amazing world known as 32-bit protected mode. But that's the problem: we're in 16-bit land. The mere thought of 32-bit programs is like thinking of heaven (jokes). So, the question is, how on earth are we going to be able to transition from 16-bit to 32-bit?
It's not as simple as just creating another assembly file and assembling it as 32-bit. We have to transition over from the bootloader to the amazing world of 32-bit programming. This is where the idea of the GDT comes in, or in other words, the Global Descriptor Table.
What is the GDT?
You can very simply think of the GDT as a data structure that stores information about the various memory areas used throughout the program, such as:
- The base address
- The size
- Access rights
- And rights over execution and writing
The "memory areas" are known as segments. This is where the segment descriptor comes in, which is really the "logic" behind the GDT.
Having a good understanding of the segment descriptor helps to further understand the functionality behind the GDT, as well as its logic. When it comes to this low-level stuff, it pays to understand everything you do and why you do it, as well as each concept that you are implementing. In our case that is the GDT, which leads us into needing to understand the segment descriptor.
What is the segment descriptor?
A segment descriptor, in simple terms, can be thought of as a data structure holding values critical to describing the memory segment referred to in the logical address.
The segment descriptor can get confusing, but the gist of it is that it stores critical information about a segment of memory that a program uses. From what I have seen, a segment described in the GDT can cover up to 4 GB. I am guessing we all have a PC newer than the era of the 80286, meaning that each segment descriptor is 8 bytes long.
A segment descriptor has the following critical values:
- The segment base address
- The segment limit
- Access rights
- Control bits
Let's take a visual look at what this "data structure" would look like:
Let's look over the above "data structure". Keep in mind, the above and the below are both one structure; I am just splitting them up to explain each part one at a time.
- The base address is the 32-bit memory address of the segment
- G stands for "Granularity". If it is clear, the limit is measured in bytes, with a max of 1048576 bytes (1 MB). If set, the limit is in units of 4096-byte pages, with a max of 4294967296 bytes (4 GB).
- D (Default operand size): if clear, this is a 16-bit code segment. If set, it is a 32-bit code segment
- B (Big): if set, the maximum offset size for data is increased to the 32-bit 0xffffffff. If clear, the maximum offset size for data is the 16-bit 0x0000ffff.
- L (Long): if set, this will be a 64-bit segment, and D has to be clear (set to zero).
  - L cannot be set if D/B is set
- AVL is available for software use. It is not used by hardware
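The effect of the granularity bit is easiest to see with numbers. Here is a minimal Python sketch, assuming the limit field is 20 bits wide and stores the last valid unit (so the covered size is the limit plus one). The function name segment_size is made up for this example.

```python
def segment_size(limit: int, granularity: bool) -> int:
    """Size in bytes covered by a 20-bit limit field."""
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    if granularity:
        # Limit counts 4096-byte pages.
        return (limit + 1) * 4096
    # Limit counts single bytes.
    return limit + 1


print(segment_size(0xFFFFF, False))   # 1048576
print(segment_size(0xFFFFF, True))    # 4294967296
```

With the granularity bit set, the same 20-bit limit stretches from 1 MB to the full 4 GB that a 32-bit address can reach.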
Let's review the above:
- P (Present): if clear, a "segment not present" exception is generated on any reference to this segment.
- DPL (Descriptor Privilege Level) is the privilege level needed to access the current descriptor
- Type: if set, it will be a code segment descriptor.
  - If it is clear, it will be a data/stack segment descriptor
  - This will have "D" replaced by "B", "C" replaced by "E" and "R" replaced by "W"
- C (Conforming): code in this segment can be called from less-privileged levels
- E (Expand down)
  - If this is clear, the segment expands from the base address up to base + limit
  - If set, the segment expands from the max offset down to the limit
- R (Readable): if clear, the segment can be executed but not read from
- W (Writable): if clear, the data segment may be read but not written to
- A (Accessed): set to 1 by hardware when the segment is accessed. It is cleared by software
It's a lot to take in, but don't worry. We'll go a bit deeper into this subject in the next tutorial.
So, we need the GDT in order to allow ourselves to jump into the amazing world of writing and using 32-bit code and executables. The segments we describe in the GDT will have a base of zero and a segment limit of 4 GB. So, we need some assembly code that will create this GDT to open up this available 4 GB of memory.
Let's apply this to our assembly code. Let's take a deeper look at this, shall we?
The first thing we need to do is set up the segments, which is done by loading in our GDT. There are two required segments, which are the data segment and the code segment. Let's look more specifically at the information we'll need:
- First double word
  - Bits 0-15 will define the segment limit
  - Bits 16-31 will be the first 16 bits of the base address
- Second double word
  - Bits 0-7 will be the next 8 bits of the base address
  - Bits 8-12 will be the segment type/attributes
  - Bits 13-14 will be the privilege level
    - 0 is the highest privilege level (OS)
    - 3 is the lowest privilege level (user application)
  - Bit 15 - Present flag
    - 1 if segment is present
  - Bits 16-19 will be the last 4 bits of the segment limit
  - Bits 20-22 will be the attributes
  - Bit 23 will be the granularity
  - Bits 24-31 will be the last 8 bits of the base address
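The layout above can be sketched as a small packing function. This is Python, used purely to check the bit positions; the function name pack_descriptor and its parameter names are made up for this example. The access and flags values used for code_seg are the ones this tutorial works out further down for the code segment.

```python
import struct


def pack_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 8-byte segment descriptor.

    base   - 32-bit base address
    limit  - 20-bit segment limit
    access - second double word, bits 8-15 (type, privilege level, present)
    flags  - second double word, bits 20-23 (attributes and granularity)
    """
    first = (limit & 0xFFFF) | ((base & 0xFFFF) << 16)
    second = (
        ((base >> 16) & 0xFF)              # bits 0-7:   base bits 16-23
        | ((access & 0xFF) << 8)           # bits 8-15:  type, privilege, present
        | (((limit >> 16) & 0xF) << 16)    # bits 16-19: limit bits 16-19
        | ((flags & 0xF) << 20)            # bits 20-23: attributes, granularity
        | (((base >> 24) & 0xFF) << 24)    # bits 24-31: base bits 24-31
    )
    # Two little-endian double words, 8 bytes in total.
    return struct.pack("<II", first, second)


null_seg = pack_descriptor(0, 0, 0, 0)
code_seg = pack_descriptor(0, 0xFFFFF, 0b10011010, 0b1100)
print(null_seg.hex())   # 0000000000000000
print(code_seg.hex())   # ffff0000009acf00
```

Notice how the base address and the limit are both split into pieces that sit in different places of the descriptor, which is what makes writing these by hand so error-prone.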
Let's get the GDT going!
First things first, we want to set the first segment of our GDT to zero. This is known as the "null segment". It is required: the CPU reserves the first entry of the GDT and never uses it to address memory. We set the first segment to zero by filling in 64 bits (or 8 bytes) with the value zero (simple enough).
First, before we implement the code to do so, let's define a few labels that will help us refer to our GDT later on when assigning the segments. I will go ahead and give the main label the name "gdt". You can name it however you please. Then, after this "main" label, I will define a label for the first segment (being the "null segment") called "null_seg". Again, you can name it however you please. After that, we can use dq to fill in 64 bits with zero.
Assembler directives db, dw, dd and dq
db stands for define byte (8 bits). dw stands for define word (16 bits). dd stands for define double word (32 bits). dq stands for define quad word (64 bits).
Now, after we set the "null segment", let's go ahead and define the code segment. In the code segment, the first 16 bits will define the limit. Since we are aiming to access 4 GB of memory, we will set the code segment limit to 0xffff. Then, the next 16 bits define the base address, which we will set to zero, which will be the start of memory.
The first 8 bits of the second double word continue the base address, so, for simplicity, we'll set the first 8 bits to zero.
Now, we have the type bits. The eighth bit is the accessed flag. We don't have any use for this, so the eighth bit will be set to zero. The next bit sets whether the segment should be readable. We'll set this to 1 for now. The tenth bit is conforming. If this bit is set, less privileged code segments are allowed to jump to or call this segment. We don't want that, so we'll set this bit to zero. The last bit of the segment type specifies whether this is code or data. We'll set this to 1 because this is the code segment, and we'll be setting up the data segment later.
With this, we should have the following in binary:
Bit 12 is set if the segment is either a data or code segment, so we'll set this bit to 1. Next is the privilege level, whose two bits contain a value 0-3. Since this segment is part of the OS, we'll set the two bits to zero. Then, the last bit is the present flag, which we'll set to one because this segment is present.
This will then have the following binary:
With this, we can apply that to our code segment in binary.
Note, the "b" at the end of the binary is needed to let the assembler know that those values are binary values.
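To check the bit choices described above, here is a short Python sketch that assembles bits 8-15 of the second double word for a code segment, one flag at a time. The function name code_access_byte and its parameter names are made up for this example.

```python
def code_access_byte(accessed: int, readable: int, conforming: int,
                     privilege: int, present: int) -> int:
    """Build bits 8-15 of the second double word for a code segment."""
    # Bits 8-11: accessed, readable, conforming, and 1 for "this is code".
    type_bits = accessed | (readable << 1) | (conforming << 2) | (1 << 3)
    # Bits 12-15: code/data segment flag, privilege level, present.
    upper_bits = 1 | (privilege << 1) | (present << 3)
    return (upper_bits << 4) | type_bits


value = code_access_byte(accessed=0, readable=1, conforming=0,
                         privilege=0, present=1)
print(format(value & 0xF, "04b") + "b")   # 1010b
print(format(value >> 4, "04b") + "b")    # 1001b
print(format(value, "08b") + "b")         # 10011010b
```

Reading the final byte from right to left gives the flags in the order they were described: accessed, readable, conforming, code, then the code/data flag, the two privilege bits and the present flag.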
Now, time for the leftover 16 bits in the second double word.
Bits 0-3 are the last bits of the segment limit, which is 0x0F. However, we have to combine this value with the next four bits. So, the fourth bit represents the flag of "Available to System Programmers". You can use this bit for whatever purpose you have; I will ignore it in this tutorial. The fifth bit is reserved, so we'll set it to zero. The next bit is the Size bit and will be set in our case. This tells the CPU we have 32-bit code (wow, this is making more sense now on how we're jumping into 32-bit land!). The seventh bit is granularity. If set, the limit is multiplied by 4 KB. This is what we want, so we will set this to one.
Put this together with the value of 0x0f, which in binary is 1111b, and we should have 11001111b. Let's put this into our assembly code, setting the last 8 bits of the base address to zero.
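The same kind of check works for this byte. The Python sketch below combines the four flag bits with the top four bits of the limit; the function name limit_and_flags_byte and its parameter names are made up for this example.

```python
def limit_and_flags_byte(limit_high: int, available: int, reserved: int,
                         size_32bit: int, granularity: int) -> int:
    """Build bits 16-23 of the second double word."""
    # Upper nibble: available, reserved, size and granularity flags.
    flags = available | (reserved << 1) | (size_32bit << 2) | (granularity << 3)
    # Lower nibble: the last four bits of the segment limit.
    return (flags << 4) | (limit_high & 0xF)


value = limit_and_flags_byte(0xF, available=0, reserved=0,
                             size_32bit=1, granularity=1)
print(format(value, "08b") + "b")   # 11001111b
print(hex(value))                   # 0xcf
```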
Now, onto the data segment. For this one, we'll have the first double word exactly the same as the code segment:
The first 8 bits of the second double word are the same as the code segment, as well. Now, let's see what we need. The eighth bit stands for accessed. This is the same as the code segment, so we'll set it to zero accordingly. The ninth bit is different from the code segment. Rather than enabling read access, it enables write access, so we'll set it to one. The tenth bit handles the expand direction. We want the segment to expand up from its base, not down, so we'll set this bit to zero. The eleventh bit has the same meaning as in the code segment, but since we want a data segment, this bit will be set to zero. Bits 12-15 will be the same as the code segment, which should lead to the following binary:
The last 16 bits are the same as the code segment. Let's implement this into our code:
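Going the other way, from a finished byte back to its fields, is a handy way to double-check a descriptor. The Python sketch below decodes bits 8-15 of the second double word. The value 10010010b is what the description above works out to for the data segment (accessed 0, writable 1, expand direction 0, code 0, then 1001 as in the code segment). The function name decode_access_byte is made up for this example.

```python
def decode_access_byte(value: int) -> dict:
    """Split bits 8-15 of the second double word into named fields."""
    executable = (value >> 3) & 1
    fields = {
        "accessed": value & 1,
        "executable": executable,
        "code_or_data": (value >> 4) & 1,
        "privilege": (value >> 5) & 0b11,
        "present": (value >> 7) & 1,
    }
    # The two middle type bits mean different things for code and data.
    if executable:
        fields["readable"] = (value >> 1) & 1
        fields["conforming"] = (value >> 2) & 1
    else:
        fields["writable"] = (value >> 1) & 1
        fields["expand_down"] = (value >> 2) & 1
    return fields


data = decode_access_byte(0b10010010)
print(data["writable"], data["expand_down"], data["executable"])   # 1 0 0
```

The code and data versions of this byte differ in a single bit, the one that says whether the segment is executable.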
Sweet! We now have the "null segment", the code segment and the data segment. Now, we just need to load in the GDT, so the CPU can find the segments. This is where we will use an instruction called LGDT. This instruction takes the address of a GDT descriptor, which tells the CPU where the GDT can be found. Let's add in a few more lines of code.
First, let's define a new label, I'll call it "end". Again, name it however you please. After this label, we'll have another label that I will call "desc". This will contain the size of the GDT (stored as the size in bytes minus one), followed by the address of the GDT.
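Here is a small Python sketch of what the bytes behind such a descriptor look like: a 16-bit size field followed by a 32-bit address. The function name gdt_descriptor is made up for this example, and the address 0x7C80 is purely hypothetical, since the real one depends on where the gdt label ends up in memory.

```python
import struct


def gdt_descriptor(gdt_address: int, entry_count: int) -> bytes:
    """Pack the 6 bytes that lgdt reads: 16-bit limit, then 32-bit address."""
    size_in_bytes = entry_count * 8   # every segment descriptor is 8 bytes
    limit = size_in_bytes - 1         # the CPU expects the size minus one
    return struct.pack("<HI", limit, gdt_address)


# Hypothetical address; three entries: null, code and data segments.
desc = gdt_descriptor(0x7C80, 3)
print(len(desc))    # 6
print(desc.hex())   # 1700807c0000
```

In assembly the size is usually computed from labels (the end of the table minus its start, minus one), so it stays correct if more segments are added later.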
Time to load it in! First, let's make sure that we're the only one executing. So, we'll clear interrupts. We do this with cli. Then, we need to clear out the ax register and use it to set the ds (data segment) register accordingly (to zero).
Then, after that, we'll use the lgdt instruction to load the GDT. We defined the desc label to tell the CPU where it can locate the GDT, so we'll use that with the lgdt instruction.
Last thing, we need to set bit 0 of the CR0 register. We are still in 16-bit mode at this point, but on a 32-bit CPU we already have access to eax. We'll copy cr0 into eax, then set the bit by using an or instruction on eax with the value 1. Then, we'll copy eax back to cr0.
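The reason for the read, or, write-back dance rather than writing 1 straight into cr0 is that the other bits of the register must be kept as they are. A minimal Python sketch of the idea follows; the function name enable_protected_mode and the starting values are made up for this example.

```python
CR0_PE = 1 << 0   # bit 0 of cr0 switches protected mode on


def enable_protected_mode(cr0: int) -> int:
    """Mirror of: mov eax, cr0, then or eax, 1, then mov cr0, eax."""
    return cr0 | CR0_PE


# Hypothetical starting values, only to show that other bits survive.
print(hex(enable_protected_mode(0x00000010)))   # 0x11
print(hex(enable_protected_mode(0x60000010)))   # 0x60000011
```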
Now, the current instruction pipeline is filled with 16-bit mode garbage. We'll need to make a far jump into the segment right after the null segment, which is our code segment. Its position in the GDT is 1, which we multiply by eight to get our segment identifier, 0x08. And since we are, officially, in 32-bit mode after the jump, we want to notify the assembler that we are, in fact, running 32-bit instructions. This is done by putting bits 32 in our code.
The jump instruction, if you remember how memory segmentation works, will look like
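Now, after making the jump, we have just a small thing to do. We have to fill the data segment registers with the proper value. We'll set the ax register to 0x10, the identifier of the data segment (the second segment after the null segment, so 2 x 8), then assign that value to the data segment registers.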
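The "multiply by eight" rule for segment identifiers can be sketched in a few lines of Python. The function name selector and its parameter names are made up for this example. It assumes the GDT order used in this tutorial: null segment first, then the code segment, then the data segment.

```python
def selector(index: int, table: int = 0, privilege: int = 0) -> int:
    """Build a segment selector: index x 8, plus table and privilege bits."""
    # The low three bits hold the table indicator (0 for the GDT) and the
    # requested privilege level, so the index starts at bit 3.
    return (index << 3) | (table << 2) | privilege


print(hex(selector(0)))   # 0x0   null segment
print(hex(selector(1)))   # 0x8   code segment
print(hex(selector(2)))   # 0x10  data segment
```

Each descriptor is 8 bytes long, so with the low three bits at zero the selector is also the byte offset of the descriptor inside the GDT.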
Looking back at the assembly code we had beforehand, we'll need to modify it a bit:
First, where do we add the assembly code we discussed? We'll add it under the done label we created earlier, along with all the other code that boots us into 32-bit mode. So, put simply, we just take the code we went over above and put it under the done label.
With this, we should now be booted into 32-bit mode. The change from doing 16-bit OSDev to 32-bit OSDev was a hard choice to make, but I figured making a series about something useful would be better.
I enjoyed making this tutorial. I put a lot of hard work into it. If you have any questions, don't mind asking!
Perhaps you should be looking to add more C into this, the GDT can be initialized with almost plain C, with assembly only needed to execute the instructions to load it.
What is going to be the end goal in terms of programs it can run and features?
I have had plans for this, but for the time being I want to focus purely on getting things going from the ground up. Meaning, making the boot loader from scratch, making the jump from 16 bit to 32 bit within assembly, and making the transition from assembly to C. Perhaps I will make a sub-series where I go over the things I'm doing with more C implementation.