Splitting filesystems and drives for performance

There are several reasons to split a Linux system’s virtual filesystem (VFS) across many separate filesystems and drives. It’s common in some industries to ensure compliance with security or privacy guidelines. For example, the Security Technical Implementation Guides (STIGs) for Linux distributions require that /var, /var/log, and /var/log/audit each be on separate filesystems.

However, there are some good performance reasons to split filesystems, as well.

Tuned I/O schedulers

If all I/O is hitting a single drive, most systems will have a wide variety of competing workloads: process management, system logs, and application I/O will all be contending for the same drive. The BFQ scheduler is often the best option in such circumstances.

On the other hand, splitting those different workloads onto different drives allows the administrator to select the perfect scheduler for each workload and achieve better performance, even if the combined throughput or IOPS of the new drives is less than the original.

For example, if you’re running a PostgreSQL database, you might split your drives and filesystems along these lines:

The drive holding the root filesystem will see a mixed but (likely) light I/O load; its scheduler could probably be left at the default.

The drive holding the logs will see mostly appending writes, and the scheduler can be matched to your syslogger’s behavior: synchronous versus asynchronous writes, and single-threaded versus multi-threaded writing.

The general database files will see a mix of mostly synchronous reads and asynchronous writes, all multi-process; the deadline scheduler (mq-deadline on current kernels) might be ideal for it.

The WAL will largely receive single-threaded synchronous writes; the none scheduler might be perfect, here.
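As a sketch of how those choices get applied, the active scheduler is a per-device setting in sysfs. The device names below are hypothetical (substitute your own), and the writes require root:

```shell
# Show the schedulers available for a device; the active one is in brackets,
# e.g. "[mq-deadline] kyber bfq none".
cat /sys/block/sda/queue/scheduler

# Select a scheduler per device, mirroring the examples above
# (device names are illustrative):
echo bfq         > /sys/block/sda/queue/scheduler      # mixed root-filesystem load
echo mq-deadline > /sys/block/sdb/queue/scheduler      # general database files
echo none        > /sys/block/nvme0n1/queue/scheduler  # WAL drive
```

Note that these settings don’t survive a reboot; to persist them, set the same `queue/scheduler` attribute from a udev rule.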

You can read about the available I/O schedulers in the kernel’s block-layer documentation (Documentation/block/ in the kernel source tree).

Localize encryption or integrity requirements

If only some portions of your filesystem need protection by encryption or data-integrity verification (e.g., dm-verity), then splitting those parts out into separate filesystems, and perhaps onto separate drives, makes a lot of sense.

Although Linux’s dm-crypt encryption (typically managed through LUKS) is amazingly fast, it is not free; applying encryption to all files will slow down all I/O. Split the files that need encryption out to a separate filesystem.

Verity is also very fast, though not free in terms of CPU or of disk I/O, as the integrity hashes require additional read requests. Splitting files needing verity out to a separate filesystem makes sense, too.
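A sketch with veritysetup, assuming a read-only filesystem image fs.img (file and device names hypothetical): the hash tree is built once, and every subsequent read is verified against it:

```shell
# Build the hash tree; this prints a root hash that must be stored securely,
# since it is the trust anchor for all later verification.
veritysetup format fs.img fs.hashes

# Open the verified device using that root hash, then mount it read-only.
veritysetup open fs.img verified fs.hashes <root-hash-from-format>
mount -o ro /dev/mapper/verified /mnt/verified
```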

Tuned filesystem and mount options

Some filesystem and mount options can also be changed based on the needs of the filesystem’s files and controlling applications.

For example, although a file-based mail store might need correct access times (atimes), a relational database might not need atimes at all, saving some writes to file metadata for what could otherwise be a read-only operation.
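In /etc/fstab terms, that difference is a single mount option; the devices and mount points below are illustrative:

```
# Mail store keeps (relaxed) atime updates; database filesystem skips them.
/dev/vg0/mail    /var/mail       ext4  defaults,relatime  0 2
/dev/vg0/pgdata  /var/lib/pgsql  ext4  defaults,noatime   0 2
```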

As another example, a filesystem with lots of small files could see reduced I/O pressure with a small allocation size, while one with predominantly large files would see reduced pressure with a large allocation size.
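With ext4, for instance, that allocation (block) size is fixed at mkfs time. The devices below are hypothetical, and mkfs is destructive:

```shell
# Many small files: 1 KiB blocks waste less space per file.
mkfs.ext4 -b 1024 /dev/sdc1

# Predominantly large files: 4 KiB blocks (the common default) mean
# fewer blocks to track per file.
mkfs.ext4 -b 4096 /dev/sdd1
```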

Finally, while most filesystems benefit from journaling, a filesystem that’s read-only, or that could easily be rebuilt after a failure, doesn’t need one. Some applications do their own journaling in a way that guarantees crash recovery without help from a filesystem journal; for their filesystems, omitting the journal can significantly cut write I/O pressure.
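As a sketch, ext4 can be created without a journal by disabling the has_journal feature (device name hypothetical; mkfs is destructive):

```shell
# Create an ext4 filesystem with no journal.
mkfs.ext4 -O ^has_journal /dev/sdf1

# Confirm: "has_journal" should be absent from the feature list.
tune2fs -l /dev/sdf1 | grep '^Filesystem features'
```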