File systems 101: NTFS

 

Summary

Scott Lowe continues his tutorial series on file systems with this look at NTFS.


In my previous article, I went over the details of the ubiquitous FAT file system. In this article, I'll delve into Microsoft's other, and better utilized, file system: NTFS.

Remember, I'm not going to delve so deeply into the concepts that your eyes glaze over! If you're a file system programmer, this article series is not for you.

NTFS stability and security
Microsoft gets a bad rap for a lot of the things it does. But one thing it has done extremely well is create a robust, scalable, efficient file system. NTFS (New Technology File System) was introduced along with Windows NT in 1993 with significant improvements over Microsoft's older FAT file system. NTFS is a journaling file system, which means that, in addition to writing information to the disk, the file system also maintains a log of the changes it makes. This feature makes NTFS particularly robust when it comes to recovering from various kinds of failures, such as a power loss or a system crash. After one of these events, the file system can use the log to finish or undo any half-completed changes and quickly return to a consistent state. Keep in mind that the log covers changes to file system structures (metadata) rather than the contents of your files, so data that was being written at the moment of the crash can still be lost. In very rare instances (Microsoft claims that less than 1 percent of outages fall into this category), a crash requires that you run the CHKDSK repair program to restore volume integrity.
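To make the journaling idea a little more concrete, here is a toy sketch in Python. It is not the real NTFS log format; the ToyJournal class, its method names, and the crash_after switch are all made up for illustration. It only shows the general pattern: write the intended change to a log first, apply it, mark it committed, and on recovery redo committed changes and undo unfinished ones.

```python
_MISSING = object()  # marks "this key did not exist before the change"


class ToyJournal:
    """Hypothetical in-memory stand-in for a volume and its change log."""

    def __init__(self):
        self.disk = {}  # pretend on-disk metadata: name -> location
        self.log = []   # journal records, written before the disk is touched

    def update(self, txid, changes, crash_after=None):
        # 1. Log the old and new values before modifying anything.
        old = {key: self.disk.get(key, _MISSING) for key in changes}
        self.log.append(("begin", txid, old, dict(changes)))
        # 2. Apply the changes; crash_after simulates a power loss
        #    after that many individual writes.
        for written, (key, value) in enumerate(changes.items()):
            if crash_after is not None and written >= crash_after:
                raise RuntimeError("simulated power loss")
            self.disk[key] = value
        # 3. Only now record that the transaction finished.
        self.log.append(("commit", txid, None, None))

    def recover(self):
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for kind, txid, old, new in self.log:
            if kind != "begin":
                continue
            if txid in committed:
                self.disk.update(new)  # redo: make sure it is fully applied
            else:
                for key, value in old.items():  # undo the half-applied change
                    if value is _MISSING:
                        self.disk.pop(key, None)
                    else:
                        self.disk[key] = value
        self.log.clear()


journal = ToyJournal()
journal.update(1, {"a.txt": "cluster 10"})
try:
    journal.update(2, {"a.txt": "cluster 20", "b.txt": "cluster 30"}, crash_after=1)
except RuntimeError:
    pass  # the "machine" went down halfway through transaction 2
journal.recover()
print(journal.disk)
```

After recovery, the half-finished second update is rolled back and the first one is still in place, which is the "consistent state" the paragraph above refers to.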

One of NTFS' other significant features, security, is the primary reason that NTFS is the file system of choice on most Windows-based networks. NTFS' security system is quite robust and allows or denies access to file system objects on a very granular basis. The NTFS Master File Table (MFT) contains a record for each and every file on an NTFS partition. Attached to each MFT entry is a particular kind of metadata called a "security descriptor" (SD) which, appropriately enough, contains information about who is allowed access to a file or folder. Each SD includes an Access Control List (ACL), a list of entries naming the users and groups that are allowed or denied access to the object.
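If it helps to picture how a list of allow and deny entries turns into a yes-or-no answer, here is a deliberately simplified model in Python. This is not the Windows API; the AccessControlEntry class, the access_allowed function, and the READ/WRITE/DELETE bits are names I've invented for the sketch, and it assumes entries are checked in order with deny entries listed first.

```python
from dataclasses import dataclass

# Hypothetical right bits, for illustration only.
READ, WRITE, DELETE = 1, 2, 4


@dataclass(frozen=True)
class AccessControlEntry:
    principal: str  # user or group name
    allow: bool     # True = allow entry, False = deny entry
    rights: int     # bitmask of rights this entry covers


def access_allowed(acl, principals, requested):
    """Walk the ACL in order and decide whether `requested` is granted.

    `principals` is the set of names the caller holds (user plus groups).
    """
    granted = 0
    for ace in acl:
        if ace.principal not in principals:
            continue  # entry does not apply to this caller
        if not ace.allow:
            # A deny entry blocks any requested right not already granted.
            if ace.rights & requested & ~granted:
                return False
        else:
            granted |= ace.rights
            if granted & requested == requested:
                return True
    # Anything not explicitly granted is refused.
    return False


acl = [
    AccessControlEntry("guests", allow=False, rights=WRITE | DELETE),
    AccessControlEntry("staff", allow=True, rights=READ | WRITE),
]
print(access_allowed(acl, {"alice", "staff"}, READ))
print(access_allowed(acl, {"bob", "staff", "guests"}, WRITE))
```

The point of the sketch is the granularity: the same file can say yes to one right and no to another for the same person, depending on which groups that person belongs to.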

Note the use of the word "metadata" in the previous paragraph. NTFS is very metadata-driven. In fact, when you first create an NTFS partition, a number of metadata files are created, each designed to help track one particular aspect of the file system. I mentioned in the previous paragraph that an NTFS partition has a Master File Table--the associated file is named $MFT. In fact, NTFS creates two MFT files. The first one--$MFT--is stored at the beginning of an NTFS partition. In order to improve reliability, NTFS partitions also have a second MFT file, named $MFTMirr. This file, under Windows NT 4.0 and later, is stored at the end of the partition. Under Windows NT 3.51 and earlier, this MFT mirror is stored right in the middle of the partition. This file exists just in case the primary MFT becomes corrupt, which is one of the reasons that NTFS stores this backup as far away from the primary as possible.

NTFS features
NTFS has a remarkable feature set--and some drawbacks. The list below is a quick view of some of NTFS' more common features.

Encryption: Recent versions of NTFS include the ability to transparently encrypt files on the disk without requiring the end user to intervene. This feature, the Encrypting File System (EFS), can help to prevent access to files in the event that a laptop is stolen, for example. Depending on the version of NTFS/Windows, up to three encryption algorithms are provided. Windows 2000 and later support DESX encryption; Windows XP and Windows Server 2003 support 3DES; and Windows XP SP1+ and Windows Server 2003 also support AES.

Quotas: Even though NTFS has contained a file named $QUOTA--the name of the metadata file that manages disk quotas--since Windows NT 3.5, quotas were not natively introduced until Windows 2000 (NTFS 5) was released. Disk quotas are used to monitor and limit the use of disk space by users. Quotas in NTFS 5 can be handled on a per-user or per-volume basis and support both hard-limit and warning-level type scenarios.

Volume Shadow Services: Windows Server 2003 (NTFS 5.1) introduced the ability to quickly create snapshots of data, allowing easier and more reliable backups and recovery of data, even while files are open.

Reparse Points: A reparse point is a collection of user-defined data that enables a significant amount of functionality in NTFS 5.0 and above, including Volume Mount Points and Junction Points.

Volume Mount Points: Volume mount points allow separate volumes to be mounted as subdirectories on another volume.

Junction Points: For UNIX users in the crowd, symbolic links are old hat; they are basically references in the file system that point to files elsewhere. For Windows, it's a slightly different story. While NTFS has supported a type of symbolic link called a Junction Point since Windows 2000, the feature is still not well-supported in the GUI, and add-ons are often needed to make use of it. Note that shortcuts in Windows are not akin to symbolic links, since file operations that reference the shortcut will actually affect the shortcut (.lnk) file and not the target file. Under Windows 2000, Windows XP, and Windows Server 2003, an NTFS Junction Point can be used to reference only folders and volumes (not individual files). There is word that Vista and Longhorn Server are supposed to support true symbolic linking like UNIX and Linux. I should note that NTFS also supports the concept of "hard links", which allow a single file to be referenced from many directories. Unlike a symbolic link, a hard link is not a pointer to another name: each hard link is an equal name for the same file, and the file's data is deleted only when the last hard link to it is removed.

Sparse Files: NTFS 5.0 with Windows 2000 introduced the concept of sparse files, which allow programs to create very large files that take up little space on the disk. Disk space is allocated only for the parts of the file that contain real data; long runs of zeros are recorded but not actually stored. Some people describe the use of sparse files as being similar to volume compression, but without the performance hit.

File Compression: NTFS provides transparent file compression services that can help reduce the amount of space consumed by files. However, compression can introduce a significant additional load on a system, and so should be carefully evaluated before implementing. File compression does not function on NTFS volumes on which the cluster size is greater than 4KB.
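The hard-link behavior mentioned under Junction Points is easy to try for yourself. The short Python sketch below uses the standard os.link call in a temporary directory; the file and folder names are placeholders, and it assumes the temporary directory lives on a file system that supports hard links (NTFS does, as do most UNIX file systems).

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it from another folder, delete the original name."""
    with tempfile.TemporaryDirectory() as root:
        dir_a = os.path.join(root, "a")
        dir_b = os.path.join(root, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)

        original = os.path.join(dir_a, "report.txt")
        second_name = os.path.join(dir_b, "report-link.txt")

        with open(original, "w") as handle:
            handle.write("quarterly numbers")

        # Give the same file a second name in a different directory.
        os.link(original, second_name)
        links_before = os.stat(original).st_nlink

        # Removing one name does not remove the data while another name remains.
        os.remove(original)
        with open(second_name) as handle:
            surviving_text = handle.read()
        links_after = os.stat(second_name).st_nlink

        return links_before, surviving_text, links_after


print(hard_link_demo())
```

The file's contents are still readable through the second name after the first is deleted; only removing the last remaining name would free the data.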

Summary
I'm sure that Microsoft has some surprises (real symbolic links, anyone?) in store for NTFS in future versions of Windows. I've summarized below the high-level information I outlined in this article.

  • Maximum volume size (theoretical): 16 EB with 64 KB clusters; 16 TB with 4 KB clusters
  • Maximum volume size (implemented): 2 TB/256 TB (more than 2 TB requires dynamic volumes)
  • Maximum file size (theoretical): 16 EB
  • Maximum file size (implemented): 16 TB
  • Maximum files per volume: 4,294,967,295
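The 16 TB and 256 TB figures in the list fall out of simple arithmetic if you assume, as I do here, that a volume is limited to roughly 2^32 clusters. That cluster count is my assumption, not something stated above, and the function name is just for illustration.

```python
KB = 1024
TB = 1024 ** 4


def max_volume_bytes(cluster_size_bytes, cluster_count_bits=32):
    """Largest volume for a given cluster size, assuming 2**bits clusters."""
    return (2 ** cluster_count_bits) * cluster_size_bytes


for cluster_kb in (4, 64):
    size_tb = max_volume_bytes(cluster_kb * KB) // TB
    print(f"{cluster_kb} KB clusters -> about {size_tb} TB")
```

In other words, the cluster size you pick when formatting sets the ceiling on how large the volume can grow.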

Common version          --              NTFS 4.0             NTFS 5.0      NTFS 5.1
Windows version         Windows NT 3.1  Windows NT 3.51/4.0  Windows 2000  Windows XP & Windows 2003
Compression                             X                    X             X
Volume mount points                                          X             X
Reparse points                                               X             X
Volume Shadow Services                                                     X
Quotas                                                       X             X
DESX Encryption                                              X             X
3DES Encryption                                                            X
AES Encryption                                                             X*

* AES: Windows XP SP1+ and Windows Server 2003 only.

Back on metadata--NTFS' use of metadata and metadata files to describe components of the file system makes it much easier to add features to the file system and to maintain backward compatibility with older versions of NTFS.

I'm not going to delve much deeper into the inner workings of NTFS in this article, but, if there is interest from the readership, I will do so in a future article.

NTFS cluster size
NTFS is extremely efficient when it comes to making use of disk space. With FAT, for example, clusters range in size from 2 KB to 32 KB, depending on the size of the disk. NTFS cluster sizes also grow as the size of the disk increases, but, in Windows NT 3.51 and later, those sizes top out at a default of 4 KB. Unfortunately, small cluster sizes can result in a performance hit, because more clusters must be read or written to move the same amount of data. While this isn't a huge problem with today's fast disk systems, in environments in which speed is king (e.g., heavy transactional data processing), the cluster size can be increased up to 64 KB. Larger clusters waste more disk space to cluster slack and rule out certain features such as compression, but performance may improve. Note that, under NTFS 1.0 (Windows NT 3.1), the default cluster size did not top out at 4 KB and, instead, continued to grow as disk size grew, up to 32 KB.
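To see how much space cluster slack can cost, here is a small Python calculation. It is a simplification: it assumes every file occupies a whole number of clusters and ignores the fact that NTFS can store very small files directly inside the MFT record. The function names are mine.

```python
KB = 1024


def allocated_bytes(file_size, cluster_size):
    """Space a file occupies when rounded up to whole clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack_bytes(file_size, cluster_size):
    """Bytes allocated to the file but not used by its contents."""
    return allocated_bytes(file_size, cluster_size) - file_size


file_size = 10_000  # a 10,000-byte file
for cluster_kb in (4, 64):
    wasted = slack_bytes(file_size, cluster_kb * KB)
    print(f"{cluster_kb} KB clusters: {wasted} bytes of slack")
```

For this one file, moving from 4 KB to 64 KB clusters raises the slack from 2,288 bytes to 55,536 bytes. Multiply that by thousands of small files and the trade-off against performance becomes clear.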

The list below gives you a look at the default cluster size of NTFS volumes. As you can imagine, just about every disk in use today uses the 4KB default cluster size.

Drive Size         Cluster Size
7 MB–512 MB        512 bytes
513 MB–1,024 MB    1 KB
1,025 MB–2 GB      2 KB
2 GB–2 TB          4 KB
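If you'd rather look the default up in code, the table of default cluster sizes translates into a short Python lookup. The function name is hypothetical, and the 4 KB value for the 2 GB to 2 TB range is taken from the 4 KB default described in the text above.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# (largest volume size in the range, default cluster size in bytes)
DEFAULT_CLUSTER_SIZES = [
    (512 * MB, 512),
    (1024 * MB, 1024),
    (2 * GB, 2048),
    (2 * TB, 4096),  # 4 KB default, per the article text
]


def default_cluster_size(volume_bytes):
    """Return the default NTFS cluster size for a volume, per the table."""
    if volume_bytes < 7 * MB:
        raise ValueError("volume is smaller than the table covers")
    for upper_bound, cluster_size in DEFAULT_CLUSTER_SIZES:
        if volume_bytes <= upper_bound:
            return cluster_size
    raise ValueError("volume is larger than the table covers")


print(default_cluster_size(500 * GB))
```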
