JPEG file format
Hi!
In this tutorial we will be going over the JPEG file format.
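I am planning on making this into a series:
Understanding the JPEG file format header
Decoding a JPEG file format header in C, Go and Rust
Writing a JPEG file in C, Go and Rust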
This is going to be challenging, so stick with me. I am still getting the hang of it, but I have a pretty good understanding of the header. So, without further ado, let's get into it.
JPEG file format
The JPEG file format is quite a hefty format to work with. Not only is it hefty, but the layout differs from image to image: which segments a file contains, and how long each one is, depends on how the file was written.
But that is not the worry in this tutorial. The worry is understanding the header format of the JPEG file.
So, let's see this!
JPEG header format
The first two bytes of the header just say, "Hey, the image is starting!". This is the start of image flag.
The first two bytes are 0xFF and 0xD8, in that order.
Let's add this to the stream:
0xFF 0xD8
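Here is a minimal sketch in C of how checking for those two bytes could look. The function name starts_with_soi is made up for this example and does not come from any library. It only looks at the first two bytes, so it says nothing about whether the rest of the file is valid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when the buffer begins with the two
   "image is starting" bytes, 0xFF then 0xD8. */
bool starts_with_soi(const uint8_t *data, size_t len)
{
    return len >= 2 && data[0] == 0xFF && data[1] == 0xD8;
}
```

Passing the length in next to the pointer keeps the check safe on files shorter than two bytes.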
Note: Throughout this tutorial, I will be referencing JPEG images that support the format I am going over. I will not be going over any odd formats. They're a pain.
The next two bytes represent what is called the "marker id". This tells us to be ready to read the header segment, which starts with a 5-byte identifier saying, "this is a JPEG file".
These two bytes are 0xFF and 0xE0, in that order.
Note: JPEG files support what I am calling "flags" (the JPEG documentation calls them "markers"). A flag is two bytes: a 0xFF, then a byte that says which flag it is. So, anytime you see a 0xFF, you'll most likely be reading a flag. I will go over flags later on in this tutorial.
Let's add those two bytes to the stream:
0xFF 0xD8 0xFF 0xE0
Simple enough!
Next, we get the "header length". Now, this whole thing is the header of the file, but the "header" within the header just gives information about the JPEG image.
The next two bytes tell us the length of the header, stored with the high byte first. The length counts these two bytes themselves, plus a built-in of 5 bytes (you'll see why later).
I am going to reference an image in this tutorial that has a header length of 16 bytes: 2 bytes for the length itself, the 5 bytes assigned to "JFIF", and 9 bytes of information after that.
"JFIF" just tells whoever reads the file, "Hey, we're a JPEG image".
I'm getting too far ahead of myself. Let's take a look at what these two bytes will be if we have a header length of 16: 0010, or 0x00 0x10.
Let's add this:
0xFF 0xD8 0xFF 0xE0 0x00 0x10
Now, we have the first two bytes that tell us that the image is starting. We have the marker id that tells us the header is starting. Then we have two bytes that give us the length of the header.
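To make the two length bytes concrete, here is a small C sketch. The helper names read_be16 and payload_size are made up for this example. It assumes the length is stored high byte first and that the value counts the two length bytes themselves, which is how JPEG segment lengths work as far as I can tell.

```c
#include <stdint.h>

/* Combine two bytes, high byte first, into one 16-bit value.
   0x00 0x10 becomes 16. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Number of bytes that follow the length field. The stored length
   includes its own 2 bytes, so those are subtracted here. */
uint16_t payload_size(uint16_t segment_length)
{
    return segment_length >= 2 ? (uint16_t)(segment_length - 2) : 0;
}
```

For the header in this tutorial, read_be16 gives 16 and payload_size gives 14: the 5 bytes of "JFIF" plus 9 bytes of information.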
Sweet! Onto the next step.
Remember I said there is a built-in of 5 bytes? Those five bytes go to "JFIF". But wait, that's just 4 characters Moca!
YES! It is. There is a zero byte after "JFIF" that ends the string. So that makes 5 bytes for "JFIF", and the rest are used to give information to the header.
What does "JFIF" have to do with anything?
Well, when I was studying the file format of a JPEG image, I ran into "JFIF" in almost every image. It stands for JPEG File Interchange Format, and all it does is identify the file as a JPEG of that kind. The segment it sits in is also known as the "JFIF app segment".
Besides that, the bytes are as follows:
0x4a 0x46 0x49 0x46 0x00
-> "JFIF" followed by the zero byte. Let's add this:
0xFF 0xD8 0xFF 0xE0 0x00 0x10 0x4a 0x46 0x49 0x46 0x00
Now, let's cover the remaining 9 bytes. Don't worry what these values are. They are just information needed for the image. Let's add it!
0xFF 0xD8 0xFF 0xE0 0x00 0x10 0x4a 0x46 0x49 0x46 0x00 0x01 0x01 0x00 0x00 0x01 0x00 0x01 0x00 0x00
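For the curious, here is a rough C sketch that reads this whole header in one go. The struct and function names (jfif_info, parse_jfif) are made up for this example. The field meanings are my reading of the usual JFIF layout: a two-byte version, a units byte, two 16-bit densities, and a thumbnail width and height. It assumes the header sits right after the first two bytes of the file, like in the image I am referencing, and it will reject files laid out differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the 9 information bytes after "JFIF". */
struct jfif_info {
    uint8_t  version_major;
    uint8_t  version_minor;
    uint8_t  units;        /* 0 = aspect ratio only, 1 = dots per inch, 2 = dots per cm */
    uint16_t x_density;
    uint16_t y_density;
    uint8_t  thumb_width;
    uint8_t  thumb_height;
};

/* Parse the first 20 bytes of a file: start of image, marker id,
   length, "JFIF" plus its zero byte, then the information bytes. */
bool parse_jfif(const uint8_t *d, size_t len, struct jfif_info *out)
{
    if (len < 20)
        return false;
    if (d[0] != 0xFF || d[1] != 0xD8)       /* image is starting */
        return false;
    if (d[2] != 0xFF || d[3] != 0xE0)       /* marker id */
        return false;
    unsigned seg_len = ((unsigned)d[4] << 8) | d[5];
    if (seg_len < 16)                       /* too short to hold every field */
        return false;
    if (memcmp(d + 6, "JFIF", 5) != 0)      /* compares 'J','F','I','F' and the zero byte */
        return false;

    out->version_major = d[11];
    out->version_minor = d[12];
    out->units         = d[13];
    out->x_density     = (uint16_t)((d[14] << 8) | d[15]);
    out->y_density     = (uint16_t)((d[16] << 8) | d[17]);
    out->thumb_width   = d[18];
    out->thumb_height  = d[19];
    return true;
}
```

Fed the 20 bytes of the stream so far, this reports version 1.01, units 0, a density of 1 by 1, and no thumbnail.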
Sweet! But wait Moca. What comes after the header?
Another flag definition. The 16 bytes of the header are all accounted for, so the next two bytes start something new. Which is where the next part of this tutorial is headed! But, while we're here, let's add this flag:
Note: I am referencing a JPEG image as I am doing this tutorial
0xFF 0xD8 0xFF 0xE0 0x00 0x10 0x4a 0x46 0x49 0x46 0x00 0x01 0x01 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0xFF 0xDB
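Since every header like this carries its own length, a decoder can hop from one flag to the next without understanding what is in between. Below is a minimal C sketch of that hop. The name next_flag_offset is made up for this example, and it assumes the flag at the given offset is one that is followed by a two-byte length (the very first flag, 0xFF 0xD8, is not).

```c
#include <stddef.h>
#include <stdint.h>

/* Given the offset of a flag's 0xFF byte, return the offset where the
   next flag should start, or 0 if the buffer is too short or the
   bytes do not look like a flag followed by a length. */
size_t next_flag_offset(const uint8_t *d, size_t len, size_t off)
{
    if (off + 4 > len || d[off] != 0xFF)
        return 0;
    size_t seg_len = ((size_t)d[off + 2] << 8) | d[off + 3];
    if (seg_len < 2)                     /* the length always counts its own 2 bytes */
        return 0;
    size_t next = off + 2 + seg_len;     /* skip the 2 flag bytes, then the segment */
    return next <= len ? next : 0;
}
```

Starting at offset 2 (the 0xFF 0xE0 flag) with a length of 16, the next flag lands at offset 20, which is exactly where 0xFF 0xDB sits in the stream above.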
Tables
In a JPEG image, there are two special kinds of tables. These are called the Quantization Table and the Huffman Table.
These tables hold the information needed to decode the image data.
A table definition starts with 4 bytes: 2 bytes for the flag definition, and 2 bytes for the length of the table.
But first, let's see which "flag" defines which "table":
0xFF 0xDB defines a Quantization Table
0xFF 0xC4 defines a Huffman Table
Alright, now that that's out of the way, let's add the two bytes for the length of this table:
0xFF 0xD8 0xFF 0xE0 0x00 0x10 0x4a 0x46 0x49 0x46 0x00 0x01 0x01 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0xFF 0xDB 0x00 0x84
This table segment is 132 bytes long. Wow. Quite a large table. Like the header length, that number counts its own two bytes, so 130 bytes of table data follow. That would fit two tables of 65 bytes each (1 byte of table info plus 64 values).
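Here is a hedged C sketch of how one could count the tables packed behind a single 0xFF 0xDB flag. The name count_quant_tables is made up for this example. It assumes each table starts with one info byte whose upper 4 bits say whether the 64 values take 1 byte each (0) or 2 bytes each (1), which is my understanding of the layout. It takes only the bytes after the two length bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the tables in a quantization segment's data (the bytes that
   follow the two length bytes). Returns -1 if the data does not
   divide cleanly into whole tables. */
int count_quant_tables(const uint8_t *payload, size_t len)
{
    int count = 0;
    size_t pos = 0;

    while (pos < len) {
        uint8_t precision = payload[pos] >> 4;       /* 0 = 1 byte per value, 1 = 2 bytes */
        if (precision > 1)
            return -1;
        size_t table_size = 1 + 64 * (precision + 1u);
        if (table_size > len - pos)                  /* table would run past the end */
            return -1;
        pos += table_size;
        count++;
    }
    return count;
}
```

With 130 bytes of data and 1-byte values, this counts two tables, which lines up with the 132-byte length once the two length bytes are taken off.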
The same goes for a Huffman table: a Huffman table is defined using 0xFF 0xC4, and the two bytes after it define the table's length.
In the image I am referencing, the first table is a Quantization Table. Normally, that is the first table you will see in a JPEG file, although a file only has to define its tables before the image data that uses them.
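Now, this is all for this tutorial, but before I go I am going to leave you with some information for the next tutorial when we dive a bit deeper:
0xFF 0xC0 defines the start of frame (baseline)
0xFF 0xC1 defines the start of frame (extended sequential)
0xFF 0xDA defines the start of scan (the compressed image data)
0xFF 0xD9 defines the end of the image
We will be putting the above information to use in the next tutorial when we dive a bit deeper into the JPEG file format and see how the tables work, and how the flags work within the file.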
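As a cheat sheet until then, here is a small C sketch that turns the second byte of a flag into a readable name. The function name marker_name is made up for this example, and it only covers the flags mentioned in this tutorial. Anything else comes back as "unknown".

```c
#include <stdint.h>

/* Map the byte that follows 0xFF to a short description.
   Only the flags covered in this tutorial are listed. */
const char *marker_name(uint8_t second_byte)
{
    switch (second_byte) {
    case 0xD8: return "start of image";
    case 0xE0: return "JFIF app segment";
    case 0xDB: return "quantization table";
    case 0xC4: return "Huffman table";
    case 0xC0: return "start of frame (baseline)";
    case 0xC1: return "start of frame (extended sequential)";
    case 0xDA: return "start of scan";
    case 0xD9: return "end of image";
    default:   return "unknown";
    }
}
```

A switch keeps the mapping easy to extend as more flags come up in the next tutorial.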
Until then, MocaCDeveloper logging out :)
Cool! (I've always been too scared to work with the jpeg format; I heard somewhere that it's even harder than png)
@ANDREWVOSS
Oh it is. Especially since the format is different from image to image. I don't think there are two images with the exact same JPEG format as one another.
I hate it, but I also enjoy the thrill of coding a decoder and watching it actually work. There's nothing better lol
@MocaCDeveloper
Ok that part's just kinda stupid
@ANDREWVOSS
It is really stupid. According to the documentation over the JPEG file format, it's supposed to have the marker ID after 0xFF 0xD8, which every file should have if it's a JPEG since that's what starts the image array primarily.
But nope. Some files totally ignore the marker ID, as well as the JFIF which tells the computer, "Hey I'm a JPEG image".
It's probably the most confusing format to ever exist. I mean it has to be lol. But the format did come out in 1992 so I am sure there are multiple multiple newer versions which is why the format changes from image to image.
@MocaCDeveloper The same kinda happens with PNGs. You can totally mangle the PNG's header data, but most programs are smart enough to still detect it.
@ANDREWVOSS
Well png images should be easier for our project. And then, just maybe we can add support for JPEG.
@MocaCDeveloper Agreed.
@ANDREWVOSS
I'm taking a look at some documentation over the format rn.
I stayed up all night creating a programming language in Rust to see how it would go and it was pretty easy. Easily got about 400 lines of code and I haven't even gotten to the AST yet lol
@MocaCDeveloper Dang. I'm still pretty unfamiliar with non-PEG parsers. (Rust has the best PEG library tho)
@ANDREWVOSS
I like building things from scratch. Ohh the pros of coming from the C family, lol
@MocaCDeveloper My general mantra with projects is to use at most 2 libraries unless it is absolutely necessary to use more
@ANDREWVOSS
I just use whatever is needed lol. Especially with Rust since there are more built-ins than there are C, you can't really stay away from it. But it's also low level so I enjoy it. Being able to use built-ins without the worry of slow runtimes(lets act like Rust doesn't suck at compiling)
@MocaCDeveloper Rust's speed is comparable to C at runtime, but rustc is the slowest LLVM compiler I've used (LLVM's main strength is that it's quite fast)
@ANDREWVOSS
Agreed. Rust is extremely slow, sadly. But hey, it's a powerful language for being low-level. Can't complain.
Also, I found this documentation over PNG images. It has some pretty good explanations.
If you're to where you can access a linux terminal, type in:
xxd -g -1 img_name.png
to see the raw data of the image
@MocaCDeveloper Yeah I use xxd a lot (I even made a terrible clone of it)
@ANDREWVOSS
Oh wow. That's pretty interesting!
I wouldn't even be able to do as good as you did on it lol. I would've probably given up. That and because I don't really have the common knowledge to do something like that
@MocaCDeveloper It was actually pretty easy. You just load the file into heap memory as an array of bytes (char* works well for this), then just print out the bytes in rows of 16, with the offset at the start of each line and decoded text/numbers at the end.
@ANDREWVOSS
Not gonna lie, I might attempt that in Rust sometime.
@MocaCDeveloper It was a fun little 1-day project, and it's also really easy to expand on (I added integer decoding and exporting in C/C++ header format)
@ANDREWVOSS
We should attempt at it using Rust
@MocaCDeveloper I'd give that a shot
@ANDREWVOSS
We have 3 projects on our bucket list to work on together. Sweet!
@MocaCDeveloper Great, because I have literally nothing else to do :D
@ANDREWVOSS
Same I am literally just creating my own security application out of boredom. It has absolutely no use and I am treating it like a JPEG file. :)
I am ready to work whenever you're available
@MocaCDeveloper I'm available tonight (although I could probably write some code right now)
@ANDREWVOSS
I can too lol it's up to you honestly. I am available whenever
@MocaCDeveloper Send me a repl invite ig. I'll probably be on discord in about 8 hours
@ANDREWVOSS
Name of project?
@MocaCDeveloper hmm...
I'm terrible with names. If you've looked at my repls, you can see I'm a big fan of just naming a repl after what it does or the language it's in (I have repls named c and rust, and I used to have more like that)
@ANDREWVOSS
I'll just name it image lol
@MocaCDeveloper that's what i would have done :p
@ANDREWVOSS
work tonight?
@MocaCDeveloper Sure, I can work now.
@ANDREWVOSS
Alright give me a few i'll be on in a minute
@ANDREWVOSS
Lets talk on discord: MocaCDeveloper#5328