What is Mac OS X?

© Amit Singh. All Rights Reserved. Written in December 2003

Mac OS X Filesystems

Like most modern day operating system implementations, Mac OS X uses an object-oriented vnode layer. xnu's VFS layer is based on FreeBSD's, although there are numerous minor differences (for example, while FreeBSD uses mutexes, xnu uses simple locks; XNU's unified buffer cache is integrated with Mach's virtual memory layer, and so on).

Local Filesystems

HFS

HFS (Hierarchical File System) was the primary filesystem format used on the Macintosh Plus and later models, until Mac OS 8.1, when HFS was replaced by HFS Plus.

This section briefly describes the various filesystems supported by "stock" Mac OS X.

HFS+

HFS+ is the preferred filesystem on Mac OS X. It supports journaling, quotas, byte-range locking, Finder information in metadata, multiple encodings, hard and symbolic links, aliases, support for hiding file extensions on a per-file basis, etc. HFS+ uses B-Trees heavily for many of its internals.

Like most current journaling filesystems, HFS+ only journals meta-data. Journaling support was retrofitted into HFS+ via a simple VFS journaling layer in XNU that's actually filesystem independent. The journal files on an HFS+ volume are called .journal and .journal_info_block (type jrnl and creator code hfs+). HFS+, although not a cutting-edge filesystem, supports some unique features and has worked well for Apple.

Similar to HFS

HFS+ is architecturally similar to HFS, with several important improvements such as:

Aliases

Aliases are similar to symbolic links in the sense that it allows multiple references to a file or directory. However, if you move the target (without replacing it), a symlink would break, while an alias would not. This is possible because under HFS+, each file/directory has a unique, persistent identity, which is stored along with the pathname. If one of the two (pathname or unique identity) is wrong (the file cannot be found using it), the alias updates it with the "right" one (using which the file could be found). This feature is the reason why you can keep moving applications to different places on your disk without having to worry about breaking their Dock "shortcuts".

In order to make use of aliases, an application must use either Carbon or Cocoa APIs, since this feature is not available through the POSIX API.

Optimizations

HFS+ also has a few specific optimizations. When a file is opened on an HFS+ volume, the following conditions are tested:

If all the above are satisfied, the file is relocated (de-fragmented) - on-the-fly.

Another optimization is "Hot File Clustering". This is a multi-staged (the stages being DISABLED, IDLE, BUSY, RECORDING, EVALUATION, EVICTION and ADOPTION) clustering scheme that records "hot" files (except journal files, and ideally quota files) on a volume, and moves "hot" files to the "hot" space on the disk (0.5% of the total filesystem size located at the end of the default metadata zone - at the start of the volume). The scheme uses an on-disk B-Tree file for tracking (/.hotfiles.btree on a volume):

# ls -l /.hotfiles.btree -rw------- 1 root admin 196608 17 Dec 10:09 /.hotfiles.btree

At most 5000 files, and only files less than 10 MB in size are "adopted" under this scheme.

Multiple Forks

HFS+/HFS files have two "forks" - traditionally called the data and resource forks, either of which may be empty. Historically, a resource fork has been used for various things such as custom icons, preferences, license information, etc. As expected, this is incompatible with traditional Unix filesystems, and care must be taken while moving files across filesystems. The article Command Line Archival in Mac OS X takes a brief look at the issue and some solutions. From a BSD command line, a file's resource fork can be accessed thus:

% ls -l Icon 128 -rwxrwx--- 1 amit amit 0 11 Jun 2003 Icon # this means the data fork is empty % ls -l Icon/rsrc 128 -rwxrwx--- 1 amit amit 65535 11 Jun 2003 Icon/rsrc # the resource fork has is 65535 bytes

Thus, unlike aliases, multiple forks can be accesses via the POSIX API.

Device Driver Partitions

Although this is not related to HFS+, Mac OS X can load block device drivers from various places: the ROM, a USB or FireWire device and special partitions on a fixed disk. In order to support multiple operating systems or other features, a disk can have more than one device driver installed, each in its own partition. For example, viewing your disk drive's information in /Applications/Utilities/Disk Utility.app might tell you that Mac OS 9 drivers are installed on this disk. A "standard" partition scheme on the PowerBook might look like the following:

# pdisk /dev/rdisk0 -dump /dev/rdisk0 map block size=512 #: type name length base ( size ) 1: Apple_partition_map Apple 63 @ 1 2: Apple_Driver43*Macintosh 56 @ 64 3: Apple_Driver43*Macintosh 56 @ 120 4: Apple_Driver_ATA*Macintosh 56 @ 176 5: Apple_Driver_ATA*Macintosh 56 @ 232 6: Apple_FWDriver Macintosh 512 @ 288 7: Apple_Driver_IOKit Macintosh 512 @ 800 8: Apple_Patches Patch Partition 512 @ 1312 9: Apple_HFS Untitled 156299656 @ 1824 ( 74.5G) 0: Apple_Free 0+@ 156301480

In the above dump, the Apple_partition_map is a meta-data partition describing the partitions on the disk. The patch partition is a meta-data partition containing patches that must be applied to the system before it can boot. The various "Driver" partitions contain drivers (Apple_Driver43 contains SCSI Manager 4.3, for example).

Insensitive!

Note that HFS+ is a case preserving, case insensitive filesystem, which can be rather jarring in situations such as the following:

# tar -tf freebsd.tar FreeBSD.txt freebsd.txt # The tar file contains two files # tar -xvf freebsd.tar FreeBSD.txt freebsd.txt # ls *.txt freebsd.txt

The Apple Technical Note titled HFS Plus Volume Format describes HFS+ internals in great detail.

ISO9660

ISO9660 is a system-independent file system for read-only data CDs. Apple has its own set of ISO9660 extensions. Moreover, you would likely run into Mac HFS/ISO9660 hybrid discs that contain both a valid HFS and a valid ISO9660 filesystem. Both filesystems can be read on a Mac, while on "other" systems, you would typically read the ISO9660 data. Note that this doesn't mean there has to be redundant data on the disc: usually the data that needs to be accessed from both Macs and PCs is kept on the ISO9660 volume, and is aliased on the HFS volume.

MSDOS

Mac OS X includes support for MSDOS filesystem (FAT12, FAT16 and FAT32).

NTFS

Mac OS X includes read-only support for NTFS.

UDF

UDF (Universal Disk Format) is the filesystem used by DVD-ROM (including DVD-video and DVD-audio) discs, and by many CD-R/RW packet-writing programs. Note that at the time of this writing, Mac OS X "Panther" only supports UDF 1.5, and not UDF 2.0.

UFS

Darwin's implementation of UFS is similar to that on *BSD, as was NEXTSTEP's, but they are not really compatible. Currently, only NetBSD supports it. Apple's UFS is big endian (as was NeXT's) - even on x86 hardware. It includes the new Directory Allocation Algorithm for FFS (DirPref). The author of the algorithm offers more details, including some test results, on his site.

Network Filesystems

AFP

The Apple Filing Protocol (AFP) is an Apple proprietary protocol for file sharing over the network. A comparison of AFP and NFS is outside of the scope of this document, but there exists software that enables the two to co-exist (AFP shares can be made to look like NFS shares and vice-versa).

/usr/sbin/AppleFileServer is the AFP daemon, which is launched when you select the "Personal File Sharing" checkbox under System Preferences/Sharing.

FTP

The mount_ftp command mounts locally a directory on an FTP server. Note that this functionality is read-only currently. It works transparently through the Finder and the Web Browser.

mount_ftp ftp://user:password@hostname/directory/path node

NFS

Mac OS X includes NFS client and server support (version 3) from BSD, including the "NQ" (NFS with leases) extensions. The usual supporting daemons (rpc.lockd, rpc.statd, nfsiod, etc.) are present as well.

SMB/CIFS

Mac OS X "Panther" includes Samba 3.0 to support SMB/CIFS.

WebDAV

A WebDAV enabled directory located at a server specified by an appropriate URL can be mounted as a filesystem via the mount_webdav command. Since a .Mac account's iDisk is available through WebDAV, it can be mounted this way.

Note that Mac OS X has a (preliminary) system level event notification framework built around FreeBSD's kqueue/kevent. This can allow graceful mounts and unmounts of network volumes based on changes in network connectivity.

Other/Pseudo Filesystems

cddafs

The cdda filesystem is used to make the tracks of an audio CD appear as aiff files. Moreover, if the track names can be looked up successfully, the track "files" have corresponding names.

When you insert an audio CD in the drive, Mac OS X "mounts" it using cddafs by default, or you can manually do so as follows:

# mount_cddafs /dev/disk<N> /tmp/audiocd

deadfs

When the underlying filesystem is disassociated from a vnode (in the vclean() operation), its vnode operations vector is set to that of the dead filesystem. All operations in the dead filesystem fail, except for close().

deadfs essentially facilitates revocation (of access to the controlling terminal, to a forcibly unmounted filesystem, etc.) Consider a situation where you want a backgrounded job of a logged out user to finish what it's doing, and yet have no access to its (erstwhile) controlling terminal. This is achieved by detaching the terminal from the vnode and replacing it with deadfs.

devfs

devfs, the device filesystem, provides access to the kernel's device namespace in the global filesystem namespace. devfs is typically mounted on /dev and allows entries in there to be built automatically.

devfs, if enabled, is mounted from within the Mac OS X kernel during BSD initialization, although instances of it can be mounted later on using mount_devfs.

# mount -t devfs devfs /tmp/dev

fdesc

The fdesc filesystem is typically mounted on /dev/fd. It's functionality is similar to /proc/<pid>/fd (or simply /proc/self/fd) on Linux, that is, it provides a list of all active file descriptors for the currently running process. Note that a typical Linux system has /dev/fd symbolically linked to /proc/self/fd.

/etc/rc mounts the fdesc filesystem during system startup:

# mount -t fdesc -o union stdin /dev

fifofs

The purpose of fifofs is similar to specfs

loop*

Functionality similar to Linux's "loop" mounts (or "lofi" on Solaris) is available via the Finder (or simply on the Desktop) - simply double-clicking on a disk image file mounts its filesystem (if supported). The command line utility hdid can be used for a finer grained control of this functionality:

# hdid floppy.img /dev/disk3 # hdid http://127.0.0.1/disk.img /dev/disk4

The "disks" disk3 and disk4 can be accessed as regular disks. Note that if the disk image to be mounted using HTTP is a dual-fork file, then it is trickier to use it.

Moreover, hdid can be directed to use only a subset (a range of sectors) of a disk image. There is also support for encryption, and more importantly, shadowing, wherein a "shadow" file can be used to which all writes can be redirected. When a read occurs in such a case, blocks present in the shadow file have precedence over the ones in the image.

nullfs

The null mount filesystem is a stackable filesystem in 4.4BSD. It allows mounting of one part of the filesystem in a different location. This can be used to join together multiple directories into a new directory tree. Thus, filesystem hierarchies on various disks can be presented as one directory tree, subtrees of a writable filesystem can be made read-only, and so on.

Note that this is slightly different (less seamless) from a union mount (see below). While the latter essentially combines seamlessly the filesystems of the mount point and the mounted, nullfs simply intercepts VFS/vnode operations and passes them through (with the exception of vop_getattr(), vop_lock(), vop_unlock(), vop_inactive(), vop_reclaim(), and vop_print() to the original filesystem (of the mount point).

Note that the null filesystem layer also serves as a prototype filesystem, and new layers can be implemented by using the null layer as a template.

Finally, it should be noted that although nullfs is present in the bsd subtree of Darwin's kernel source, null mounts are not really used by Mac OS X.

ramfs

A ram filesystem can be created under Mac OS X as follows:

# hdid -nomount ram://1024 /dev/disk3

The above command creates a ram disk with 1024 sectors (sector size being 512), and prints the name of the resultant device on the standard output. Thereafter, a filesystem can be created on this device (the corresponding raw device, technically) as follows:

# newfs_msdos /dev/rdisk3 /dev/rdisk3: 985 sectors in 985 FAT12 clusters \ (512 bytes/cluster) \ bps=512 spc=1 res=1 nft=2 rde=512 sec=1024 mid=0xf0 \ spf=3 spt=32 hds=16 hid=0

The disk can be mounted as usual:

# mount -t msdos /dev/disk3 /tmp/msdos # mount /dev/disk3 on /private/tmp/msdos (local) # df /tmp/msdos Filesystem 512-blocks Used Avail Capacity Mounted on /dev/disk3 987 2 985 0% /private/tmp/msdos

Finally, you can get rid of the ram disk as follows:

# hdiutil detach /dev/disk3 "disk3" unmounted. "disk3" ejected.

specfs

Devices (the so called "special" files) and FIFOs can reside on any arbitrary filesystem (that can house such files). This means their names and attributes are maintained by this "host" filesystem. However, their operations cannot be handled by this filesystem - accesses to such device files need to be mapped to their underlying devices (more specifically, the respective device drivers). Moreover, device aliases (for example, same major/minor numbers, but different pathnames on a filesystem, or maybe even different filesystems) need to be detected and handled appropriately.

The specfs layer facilitates the above. Note that specfs is not a user visible filesystem, and it's not "mounted" anywhere.

synthfs

synthfs is a pseudo (in-memory) filesystem used to create arbitrary directory trees (if you wanted to "synthesize" mount points for random things, for example). synthfs is not derived from FreeBSD.

A synthfs mount is similar to a typical pseudo filesystem mount:

# mount -t synthfs synthfs /tmp/synthfs

union

A detailed description of 4.4BSD's "union" mounts, including a short history of similar filesystems, can be found in the USENIX paper titled Union Mounts in 4.4BSD-Lite. In the simplest terms, the union mount filesystem extends the null filesystem by not hiding the files in the "mounted on" directory. It merges the two directories (and their trees) into a single view. Note that duplicate names are suppressed and a lookup locates the logically topmost entity with that name.

Consider the following sequence of commands that illustrates the basic concepts of union mounts:

# hdiutil create /tmp/msdos1 -volname one \ -megabytes 1 -fs MS-DOS ... created: /tmp/msdos1.dmg # hdiutil create /tmp/msdos2 -volname two \ -megabytes 1 -fs MS-DOS ... created: /tmp/msdos2.dmg # hdid -nomount /tmp/msdos1.dmg /dev/disk3 # hdid -nomount /tmp/msdos2.dmg /dev/disk4 # mount -t msdos /dev/disk3 /tmp/union # echo "msdos1: a" > /tmp/union/a.txt # umount /dev/disk3 # mount -t msdos /dev/disk4 /tmp/union # echo "msdos2: a" > /tmp/union/a.txt # echo "msdos2: b" > /tmp/union/b.txt # umount /dev/disk4 # mount -t msdos -o union /dev/disk3 /tmp/union # mount -t msdos -o union /dev/disk4 /tmp/union # ls /tmp/union a.txt b.txt # cat /tmp/union/a.txt msdos2: a # umount /dev/disk4 # ls /tmp/union a.txt # cat /tmp/union/a.txt msdos1: a # umount /dev/disk3 # mount -t msdos -o union /dev/disk4 /tmp/union # mount -t msdos -o union /dev/disk3 /tmp/union # cat /tmp/union/a.txt msdos1: a

As a real-life example, /etc/rc mounts the "descriptor" filesystem as a union mount:

# mount -t fdesc -o union stdin /dev

volfs

volfs, the "volume" filesystem, is a virtual filesystem that exists over the HFS+ (or a filesystem that supports volfs) VFS and serves the needs of two differing APIs (POSIX/Unix pathnames and Mac OS <Volume ID><Directory><File Name>). It is there to support the Carbon File Manager APIs on top of the BSD filesystem.

The filesystems that support volfs are HFS+, HFS, ISO9660 and UDF.

Consider the following example:

# mount /dev/disk0s9 on / (local, journaled) devfs on /dev/ (local) fdesc on /dev (union) <volfs> on /.vol ... # ls -l /.vol total 0 dr--r--r-- 2 root wheel 64 25 Dec 12:45 234881033

The entry in /.vol is nothing but a representation of /. Each mounted volume (a partition, if you will), such as those on external storage devices, would have a representation under /.vol. Consider a file, say /mach_kernel:

# ls -li /mach_kernel 1045670 -rw-r--r-- 1 root wheel 3824080 11 Dec 16:20 /mach_kernel

This file would be accessed under /.volfs as follows (note that 1045670 is the file's inode number):

# ls -li /.vol/234881033/1045670 1045670 -rw-r--r-- 1 root wheel 3824080 11 Dec 16:20 /.vol/234881033/1045670

volfs is mounted during system startup (in /etc/rc):

# mkdir -p -m 0555 /.vol && chmod 0555 /.vol && mount_volfs /.vol

What about XYZ?

While Mac OS X supports many filesystems, you might run into some that are not supported. Linux's ext2/3 and Reiser, for example, are not supported, although you can find an open source implementation of ext2 for Mac OS X.

Interestingly, BootX, the Mac OS X bootloader, does understand the ext2 filesystem, and can load kernels from it.

An important (though not necessarily critical) omission is that of the proc filesystem. This issue, including description of a minimal port of /proc from FreeBSD to Mac OS X, is discussed in /proc on Mac OS X.

<<< Mac OS X System Startup main Programming on Mac OS X >>>