File systems 101: NTFS
Tuesday, July 11, 2006 01:42 PM
Scott Lowe continues his tutorial series on file systems with this look at NTFS.
In my previous article on FAT file systems, I went over the details of the ubiquitous FAT file system. In this article, I'll delve into Microsoft's other--and better utilized--file system, NTFS.
Remember, I'm not going to delve so deeply into the concepts that your eyes glaze over! If you're a file system programmer, this article series is not for you.
NTFS stability and security
Microsoft gets a bad rap for
a lot of the things it does. But one thing it has done extremely well is create
a robust, scalable, efficient file system. NTFS (New Technology File System)
was introduced along with Windows NT in 1993 with significant improvements over
Microsoft's older FAT file system. NTFS is a journaling file system, which
means that, in addition to writing information to the disk, the file system
also maintains a log of all changes made. This feature makes NTFS particularly
robust when it comes to recovering from various kinds of failures, such as a
power loss or a system crash. After one of these events, the file system can
quickly return to a consistent state, usually with no loss of data. In very rare
instances--Microsoft claims that less than 1 percent of outages fall into this
category--a crash requires that you run the CHKDSK repair program to restore volume integrity.
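To make the journaling idea concrete, here is a minimal Python sketch of the "log first, then apply" pattern. It is a toy model and not NTFS's actual log format: the `ToyJournaledStore` class, its method names, and the in-memory structures are all assumptions made up for illustration.

```python
class ToyJournaledStore:
    """Toy write-ahead log (hypothetical): record the change, then apply it."""

    def __init__(self):
        self.journal = []  # stands in for the on-disk log of changes
        self.state = {}    # stands in for the on-disk structures being changed

    def write(self, key, value, crash_before_apply=False):
        # Step 1: log the change before touching the real structures.
        self.journal.append((key, value))
        if crash_before_apply:
            return  # simulate a power loss between the log write and the update
        # Step 2: apply the change.
        self.state[key] = value

    def recover(self):
        # Replay every logged change in order; re-applying a change that
        # already completed leaves the state unchanged.
        for key, value in self.journal:
            self.state[key] = value


store = ToyJournaledStore()
store.write("report.txt", "v1")
store.write("report.txt", "v2", crash_before_apply=True)  # interrupted update
store.recover()
print(store.state["report.txt"])  # v2
```

Because the change was logged before the simulated crash, replaying the log after the restart brings the store back to a consistent state without a full scan, which is the general benefit the paragraph above describes.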
One of NTFS' other significant features--security--is the primary reason that NTFS is the file system of choice on most Windows-based networks. NTFS' security system is quite robust and allows or denies access to file system objects on a very granular basis. The NTFS Master File Table (MFT) contains a record for every file on an NTFS partition. Attached to each MFT entry is a piece of metadata called a "security descriptor" (SD) which, appropriately enough, contains information about who is allowed access to a file or folder. Each SD includes an Access Control List (ACL): a list of the users and groups that are allowed to access the object.
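As a rough illustration of how a security descriptor and its ACL fit together, here is a small Python sketch. The `ToySecurityDescriptor` class, the `is_allowed` function, and the right names are hypothetical; real NTFS ACLs are considerably richer than this allow-only model.

```python
from dataclasses import dataclass, field


@dataclass
class ToySecurityDescriptor:
    """Hypothetical stand-in for an SD: an owner plus an access control list."""
    owner: str
    acl: dict = field(default_factory=dict)  # user name -> set of allowed rights


def is_allowed(sd, user, right):
    # Access is granted only if the ACL lists this right for this user.
    return right in sd.acl.get(user, set())


sd = ToySecurityDescriptor(
    owner="alice",
    acl={"alice": {"read", "write"}, "bob": {"read"}},
)
print(is_allowed(sd, "bob", "read"))    # True
print(is_allowed(sd, "bob", "write"))   # False
print(is_allowed(sd, "carol", "read"))  # False
```

The point of the sketch is the lookup path: the file's record leads to its descriptor, and the descriptor's list decides whether a given user's request succeeds.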
Note the use of the word "metadata" in the previous paragraph. NTFS is very metadata-driven. In fact, when you first create an NTFS partition, a number of metadata files are created, each designed to help track one particular aspect of the file system. I mentioned in the previous paragraph that an NTFS partition has a Master File Table--the associated file is named $MFT. In fact, NTFS creates two MFT files. The first one--$MFT--is stored at the beginning of an NTFS partition. In order to improve reliability, NTFS partitions also have a second MFT file, named $MFTMirr. This file, under Windows NT 4.0 and later, is stored at the end of the partition. Under Windows NT 3.51 and earlier, this MFT mirror is stored right in the middle of the partition. This file exists just in case the primary MFT becomes corrupt, which is one of the reasons that NTFS stores this backup as far away from the primary as possible.
NTFS features
NTFS has a remarkable
feature set--and some drawbacks. The list below is a quick view of some of NTFS'
more common features.
Encryption: Recent versions of NTFS include the ability to transparently encrypt files on the disk without requiring the end user to intervene. This feature, the Encrypting File System (EFS), can help to prevent access to files in the event that a laptop is stolen, for example. Depending on the version of NTFS/Windows, up to three encryption algorithms are provided. Windows 2000 and later support DESX encryption; Windows XP and Windows Server 2003 support 3DES; while Windows XP SP1+ and Windows Server 2003 also support AES.
Quotas: Even though NTFS has contained a file named $QUOTA--the name of the metadata file that manages disk quotas--since Windows NT 3.5, quotas were not natively introduced until Windows 2000 (NTFS 5) was released. Disk quotas are used to monitor and limit the use of disk space by users. Quotas in NTFS 5 can be handled on a per-user or per-volume basis and support both hard-limit and warning-level type scenarios.
Volume Shadow Services: Windows Server 2003 (NTFS 5.1) introduced the ability to quickly create snapshots of data through what is formally called the Volume Shadow Copy Service (VSS), allowing easier and more reliable backups and recovery of data, even while files are open.
Reparse Points: A reparse point is a collection of user-defined data that enables a significant amount of functionality in NTFS 5.0 and above, including Volume Mount Points and Junction Points.
Volume Mount Points: Volume mount points allow separate volumes to be mounted as subdirectories on another volume.
Junction Points: For UNIX users in the crowd, symbolic links are old hat--they are basically references in the file system that point to files elsewhere. For Windows, it's a slightly different story. While NTFS has supported a type of symbolic link called a Junction Point since Windows 2000, the feature is still not well-supported in the GUI, and add-ons are often needed to make use of it. Note that shortcuts in Windows are not akin to symbolic links, since file operations that reference the shortcut will actually affect the shortcut (.lnk) file and not the target file. Under Windows 2000, Windows XP, and Windows Server 2003, an NTFS Junction Point can be used to reference only folders and volumes (not individual files). There is word that Vista and Longhorn Server are supposed to support true symbolic linking like UNIX and Linux. I should note that NTFS also supports the concept of "hard links", which allow a single file to be referenced from many directories. Unlike a symbolic link, however, a hard link is a full name for the file itself: the file's contents stay on disk as long as at least one hard link remains and are deleted when the last hard link is removed.
Sparse Files: NTFS 5.0 with Windows 2000 introduced the concept of sparse files, which allow programs to create very large files in which only the regions that actually contain data consume disk space; long runs of zeros are not physically stored. Some people describe the use of sparse files as being similar to volume compression, but without the performance hit.
File Compression: NTFS provides transparent file compression services that can help reduce the amount of space consumed by files. However, compression can introduce a significant additional load on a system, and so should be carefully evaluated before implementing. File compression does not function on NTFS volumes on which the cluster size is greater than 4KB.
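The hard-link behavior described under Junction Points in the list above can be tried out with a short Python sketch. It uses the standard library's `os.link`, works in a temporary directory, and assumes the underlying file system supports hard links; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    alias = os.path.join(tmp, "alias.txt")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, alias)  # add a second hard link to the same file
    links_before = os.stat(original).st_nlink  # number of names for the file

    os.remove(original)  # remove one name; the contents are still reachable
    with open(alias) as f:
        surviving = f.read()

    os.remove(alias)  # remove the last link; now the file is gone
    exists_after = os.path.exists(alias)

print(links_before, surviving, exists_after)
```

Removing `original.txt` does not destroy the data because `alias.txt` is an equal name for the same file; only removing the last remaining link deletes it.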
Summary
I'm sure that Microsoft has
some surprises (real symbolic links, anyone?) in store for NTFS in future
versions of Windows. I've summarized below the high-level information I outlined
in this article.
- Maximum volume size (theoretical): 16 EB with 64 KB clusters; 16 TB with 4 KB clusters
- Maximum volume size (implemented): 2 TB / 256 TB (volumes larger than 2 TB require dynamic volumes)
- Maximum file size (theoretical): 16 EB
- Maximum file size (implemented): 16 TB
- Maximum files per volume: 4,294,967,295
| Common version | -- | NTFS 4.0 | NTFS 5.0 | NTFS 5.1 |
| --- | --- | --- | --- | --- |
| Windows version | Windows NT 3.1 | Windows NT 3.51/4.0 | Windows 2000 | Windows XP & Windows 2003 |
| Compression | | X | X | X |
| Volume mount points | | | X | X |
| Reparse points | | | X | X |
| Volume Shadow Services | | | | X |
| Quotas | | | X | X |
| DESX Encryption | | | X | X |
| 3DES Encryption | | | | X |
| AES Encryption | | | | X* (XP SP1+, W2K3) |
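Several of the size limits in the summary list line up with simple powers of two. The Python sketch below is only arithmetic under stated assumptions that the article itself does not spell out: binary units (1 KB = 1,024 bytes) and a 32-bit cluster count behind the 16 TB and 256 TB figures.

```python
KB = 1024
TB = 1024 ** 4
EB = 1024 ** 6

clusters_32bit = 2 ** 32  # assumption: cluster numbers limited to 32 bits

print(clusters_32bit * 4 * KB // TB)   # 16  (TB with 4 KB clusters)
print(clusters_32bit * 64 * KB // TB)  # 256 (TB with 64 KB clusters)
print(2 ** 64 // EB)                   # 16  (EB addressable with 64 bits)
print(2 ** 32 - 1)                     # 4294967295 (files per volume)
```

Read this way, moving from 4 KB to 64 KB clusters multiplies the reachable volume size by 16, which matches the jump from 16 TB to 256 TB.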
Back on metadata--NTFS' use of metadata and metadata files to describe components of the file system makes it much easier to add features to the file system and to maintain backward compatibility with older versions of NTFS.
I'm not going to delve much deeper into the inner workings of NTFS in this article, but, if there is interest from the readership, I will do so in a future article.
NTFS cluster size
NTFS is extremely efficient
when it comes to making use of disk space. With FAT, for example, clusters
range in size from 2 KB to 32 KB, depending on the size of the disk. NTFS
cluster sizes also grow as the size of the disk increases, but, in Windows NT
3.51 and later, those sizes top out at a default of 4 KB. Unfortunately, small
cluster sizes can result in a performance hit due to the need to read so many
clusters to pull data from and write data to the disk. While this isn't a huge
problem with today's super-fast disk systems, in environments in which speed is
king (e.g., heavy transactional data processing), the cluster size can be
increased up to 64 KB. While this will result in more wasted disk space due to
cluster slack, and the inability to make use of certain features such as
compression, performance may be increased. Note that, under NTFS 1.0 (Windows
NT 3.1), the cluster size did not top out at a default of 4 KB and, instead,
continued to grow beyond 4 KB as disk size grew--up to 32 KB.
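The trade-off between cluster size and wasted space is easy to quantify. The following Python sketch computes the slack for a single file under the simplifying assumption that every file occupies a whole number of clusters; the function names and the 10,000-byte example file are made up for illustration, and any file-system-specific special cases are ignored.

```python
def allocated_bytes(file_size, cluster_size):
    """Space a file occupies when it must use whole clusters."""
    clusters = (file_size + cluster_size - 1) // cluster_size  # round up
    return clusters * cluster_size


def slack_bytes(file_size, cluster_size):
    """Bytes allocated to the file but not used by its contents."""
    return allocated_bytes(file_size, cluster_size) - file_size


file_size = 10_000  # hypothetical 10,000-byte file
print(slack_bytes(file_size, 4 * 1024))   # 2288
print(slack_bytes(file_size, 64 * 1024))  # 55536
```

With 4 KB clusters the file needs three clusters and wastes a little over 2 KB; with 64 KB clusters it needs only one cluster, but most of that cluster is slack. Fewer, larger clusters mean fewer reads and writes at the cost of more unused space.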
The list below gives you a look at the default cluster size of NTFS volumes. As you can imagine, just about every disk in use today uses the 4KB default cluster size.
| Drive size | Cluster size |
| --- | --- |
| 7 MB–512 MB | 512 bytes |
| 513 MB–1,024 MB | 1 KB |
| 1,025 MB–2 GB | 2 KB |
| 2 GB–2 TB | 4 KB |
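For readers who prefer code to tables, the default cluster sizes can be expressed as a small lookup. This Python sketch is only a restatement of the ranges listed here, with the 4 KB value for the largest range taken from the article's statement that default sizes top out at 4 KB; the function name and the use of binary units are assumptions, and it is not how Windows itself picks a cluster size.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# (upper bound of the drive-size range, default cluster size in bytes)
DEFAULT_CLUSTER_SIZES = [
    (512 * MB, 512),
    (1024 * MB, 1024),
    (2 * GB, 2 * 1024),
    (2 * TB, 4 * 1024),
]


def default_cluster_size(volume_bytes):
    """Return the default cluster size in bytes for a volume of this size."""
    if volume_bytes < 7 * MB:
        raise ValueError("volume is below the 7 MB minimum listed")
    for upper_bound, cluster_size in DEFAULT_CLUSTER_SIZES:
        if volume_bytes <= upper_bound:
            return cluster_size
    raise ValueError("volume is above the 2 TB maximum listed")


print(default_cluster_size(100 * MB))  # 512
print(default_cluster_size(80 * GB))   # 4096
```

As the article notes, nearly any disk in use today falls into the last range and so ends up with 4 KB clusters.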


