NOTE: Ext4 has been available in the Linux kernel since version 2.6.19 (initially as the development filesystem ext4dev; it was marked stable in 2.6.28).
The ext3 filesystem was designed around an indirect block mapping scheme. This is efficient for small files, but it causes high metadata overhead and poor performance with large files, especially during delete or truncate operations. The mapping keeps an entry for every single block, and big files have many blocks, which leads to huge mappings that are slow to handle.
A new feature called extents has been added to the ext4 filesystem. An extent is a single descriptor that represents a range of contiguous blocks. With a 4KB block size, a single extent in ext4 can represent up to 128 MB.
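With the indirect block mapping scheme and a block size of 4KB, a 100MB file needs 100*1024/4 = 25600 blocks, and the mapping keeps an entry for each of them.
With extents, 1 extent can represent up to 128MB, so a single extent is enough to map the same file, provided its blocks are contiguous.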
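The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation: the function names are made up for illustration, and it assumes a 4KB block size and a fully contiguous file.

```python
BLOCK_SIZE = 4 * 1024                  # 4 KB block, as in the text
MAX_EXTENT_BYTES = 128 * 1024 * 1024   # one extent covers up to 128 MB

def blocks_needed(file_size):
    """Number of 4 KB blocks an indirect block map has to track."""
    return (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE  # ceiling division

def extents_needed(file_size):
    """Minimum number of extents, assuming the file is fully contiguous."""
    return (file_size + MAX_EXTENT_BYTES - 1) // MAX_EXTENT_BYTES

file_size = 100 * 1024 * 1024          # the 100 MB file from the text
print(blocks_needed(file_size))        # 25600
print(extents_needed(file_size))       # 1
```

A fragmented file needs more extents than this minimum, one per contiguous run of blocks.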
Extents bring about a 25% throughput gain in large sequential I/O workloads compared with ext3, which improves the overall performance of the filesystem.
Large FileSystem Support
One of the most important limitations of ext3 was the 16TB maximum filesystem size, a consequence of 32-bit block numbers combined with the default 4KB block size. Ext4 overcomes this and theoretically supports a maximum filesystem size of 1EB (about 1 million TB: 1 EB = 1024 PB, 1 PB = 1024 TB, 1 TB = 1024 GB). This was made possible by the extent patches, which use 48-bit physical block numbers. Other metadata changes, such as in the super-block structure, were also made to support 48-bit block numbers.
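Both limits follow directly from the width of the block number and the block size. The sketch below just multiplies them out; the helper name is hypothetical and the 4KB block size is the default mentioned above.

```python
BLOCK_SIZE = 4 * 1024   # default 4 KB block size
TB = 1024 ** 4
EB = 1024 ** 6

def max_fs_bytes(block_number_bits, block_size=BLOCK_SIZE):
    """Largest addressable filesystem: number of blocks times block size."""
    return (2 ** block_number_bits) * block_size

print(max_fs_bytes(32) // TB, "TB")   # 16 TB  (ext3, 32-bit block numbers)
print(max_fs_bytes(48) // EB, "EB")   # 1 EB   (ext4, 48-bit block numbers)
```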
A journaling filesystem maintains a special file called a journal, which is used to repair any inconsistencies that result from an improper shutdown of the computer.
In order to support more than 32-bit block numbers in the journaling block layer (JBD), JBD2 was forked from JBD at the same time that ext4 was cloned from ext3.
Ext4 also lets you disable journaling altogether, which can slightly improve performance for users with special requirements and lighter workloads.
Multiple Block Allocation
The block allocator decides which free blocks will be used to write the data. The ext3 allocator allocates one block at a time, so writing a 100MB file takes 25600 separate allocations (see the calculation above), with a corresponding cost in CPU and time.
Ext4 uses a multiblock allocator, which allows many blocks to be allocated to a file in a single operation and dramatically reduces the CPU time spent searching for free blocks. Also, because many of a file's blocks are allocated at the same time, the allocator can make a much better decision about where to find a chunk of free space in which all of the blocks fit.
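To make the difference concrete, here is a toy model in Python. It is not how the kernel allocators are implemented; the free-block list and both function names are assumptions chosen only to contrast "one block per call" with "a contiguous run per call".

```python
def allocate_one(free_map):
    """Toy one-block-at-a-time allocator: take the first free block."""
    for i, free in enumerate(free_map):
        if free:
            free_map[i] = False
            return i
    raise RuntimeError("no free block")

def allocate_run(free_map, count):
    """Toy multiblock allocator: find `count` contiguous free blocks in one call."""
    run_start, run_len = 0, 0
    for i, free in enumerate(free_map):
        if free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                for j in range(run_start, run_start + count):
                    free_map[j] = False
                return list(range(run_start, run_start + count))
        else:
            run_len = 0
    raise RuntimeError("no contiguous run of that size")

# True = free. Blocks 1 and 3 are in use, so the start of the disk is fragmented.
layout = [True, False, True, False, True, True, True, True]

one_by_one = layout.copy()
print([allocate_one(one_by_one) for _ in range(4)])  # [0, 2, 4, 5]

in_one_call = layout.copy()
print(allocate_run(in_one_call, 4))                  # [4, 5, 6, 7]
```

Four separate calls scatter the file across the holes, while the single request, which knows it needs four blocks, can pick a region where they all fit.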
Delayed allocation is a feature where allocating blocks for newly written data is deferred for as long as possible, whereas ext3 starts looking for free blocks and allocates them as soon as data is written.
Combined with multiblock allocation, a large number of blocks can be allocated at the same time: because the number of blocks required is known by then, a suitable chunk of free space can be found and allocated in one go instead of picking a single free block every time.
This reduces the CPU time spent in block allocation and improves performance.
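The idea of deferring allocation until the total size is known can be sketched as follows. The class `DelayedFile` and its methods are purely illustrative and do not correspond to any kernel interface; the buffer stands in for the page cache.

```python
class DelayedFile:
    """Toy model: writes are buffered, blocks are chosen only at flush time."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.pending = bytearray()    # data waiting in the pretend page cache
        self.allocation_calls = 0
        self.blocks = []

    def write(self, data):
        self.pending += data          # no block is allocated here

    def flush(self, next_free_block):
        # The total size is known now, so one request covers every block.
        count = -(-len(self.pending) // self.block_size)  # ceiling division
        self.blocks = list(range(next_free_block, next_free_block + count))
        self.allocation_calls += 1
        self.pending.clear()
        return self.blocks

f = DelayedFile()
for _ in range(10):
    f.write(b"x" * 4096)              # ten separate 4 KB writes
blocks = f.flush(next_free_block=100)
print(len(blocks), blocks[0], blocks[-1])  # 10 100 109
print(f.allocation_calls)                  # 1
```

Ten writes result in a single allocation request for ten contiguous blocks, rather than ten requests for one block each.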
Ext4 supports online defragmentation, which is performed by creating a temporary inode, using multiple block allocation to allocate contiguous blocks to the inode, reading all data from the original file to the page cache, then flushing the data to disk and migrating the newly allocated blocks over to the original inode.
Why do we need defragmentation?
A filesystem holds many files, and each file's data is stored in blocks. Suppose you have a single 1GB data file: the kernel will try to place all of its blocks together, but as the filesystem fills up, the blocks near that file get occupied too. If you later append to the file and it grows to 3GB, the kernel cannot find free blocks next to it and has to assign free blocks scattered elsewhere on the filesystem, which slows down I/O. Defragmentation rearranges the blocks of each file on the filesystem so that they are contiguous again.
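One simple way to picture fragmentation is to count how many contiguous runs a file's block list breaks into. The function below is a hypothetical illustration with made-up block numbers, not a tool that reads a real filesystem.

```python
def count_fragments(blocks):
    """Count runs of consecutive block numbers in a file's block list."""
    if not blocks:
        return 0
    fragments = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:   # a jump means a new fragment starts here
            fragments += 1
    return fragments

contiguous = list(range(1000, 1008))
scattered = [1000, 1001, 1002, 5000, 5001, 42, 43, 9000]
print(count_fragments(contiguous))  # 1
print(count_fragments(scattered))   # 4
```

A defragmented file is one whose count is back to 1, or as close to it as the available free space allows.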
Inode related tweaks
In ext3 the default inode size is 128 bytes. In ext4 the default is 256 bytes, and the inode size can be set to any power of two (256, 512, 1024, etc.) up to the filesystem block size. The larger inode provides space for new fields such as nanosecond timestamps and inode versioning.
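Nanosecond timestamps are visible from user space. As a small illustration, Python's `os.stat` exposes them through `st_mtime_ns`; whether the sub-second digits are meaningful depends on the filesystem the temporary file happens to live on, so no particular value is assumed here.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile() as tmp:
    info = os.stat(tmp.name)
    # Modification time as an integer number of nanoseconds since the epoch.
    print(info.st_mtime_ns)
    # Sub-second part only; its resolution depends on the filesystem.
    print(info.st_mtime_ns % 10**9)
```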
To increase directory scalability, the directory indexing feature, available in ext3, is turned on by default in ext4. Directory indexing uses a specialized B-tree-like structure to store directory entries, rather than a linked list with linear access times. This significantly improves performance for certain applications with very large directories.
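The gain from indexing can be illustrated with a toy comparison between scanning a list and searching a sorted hash index. This is only a stand-in: the CRC32 hash and the sorted list of pairs are assumptions made for the example, not the on-disk structure ext4 uses.

```python
import bisect
import zlib

names = [f"file{i:05d}" for i in range(10000)]

def linear_lookup(entries, name):
    """Unindexed directory: scan entries in order, return comparisons made."""
    for steps, entry in enumerate(entries, start=1):
        if entry == name:
            return steps
    return None

# Toy index: (hash, name) pairs kept sorted by hash and searched by bisection.
index = sorted((zlib.crc32(n.encode()), n) for n in names)

def indexed_lookup(index, name):
    """Binary-search the sorted hash index; True if the name is present."""
    key = zlib.crc32(name.encode())
    pos = bisect.bisect_left(index, (key, ""))
    while pos < len(index) and index[pos][0] == key:  # walk hash collisions
        if index[pos][1] == name:
            return True
        pos += 1
    return False

print(linear_lookup(names, "file09999"))   # 10000
print(indexed_lookup(index, "file09999"))  # True
```

The linear scan touches every entry before finding the last one, while the bisection needs on the order of log2(10000), about 14, probes.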
Faster Repair and Recovery
In ext4 unallocated block groups and sections of the inode table are marked as such. This enables e2fsck to skip them entirely and greatly reduces the time it takes to check the file system. Linux 2.6.24 implements this feature.
Unlimited subdirectory limit
Ext3 limits a directory to 32,000 subdirectories. Ext4 overcomes this limit: with the B-tree indexing feature, the number of subdirectories in a directory is effectively unlimited.
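A brief comparison chart between ext3 and ext4
Maximum filesystem size: 16TB in ext3, 1EB in ext4 (1 EB = 1024 PB, 1 PB = 1024 TB)
Default inode size: 128 bytes in ext3, 256 bytes in ext4
Sub Directory Limit: limited in ext3 (roughly 32,000), unlimited in ext4
Multiple Block Allocation: one block at a time in ext3, many blocks per operation in ext4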