Using the XFS file system on CentOS / RHEL 7.x


With the release of CentOS / RHEL 7.x, it is important to understand the XFS file system.

XFS is a highly scalable, high-performance file system which was originally designed at Silicon Graphics, Inc. XFS is the default file system for Red Hat Enterprise Linux 7.

Main Features

XFS supports metadata journaling, which facilitates quicker crash recovery. The XFS file system can also be defragmented and enlarged while mounted and active. In addition, Red Hat Enterprise Linux 7 supports backup and restore utilities specific to XFS.
Allocation Features
XFS features the following allocation schemes:
  • Extent-based allocation
  • Stripe-aware allocation policies
  • Delayed allocation
  • Space pre-allocation
Delayed allocation and other performance optimizations affect XFS the same way that they do ext4. Namely, a program’s writes to an XFS file system are not guaranteed to be on-disk unless the program issues an fsync() call afterwards.
  • XFS is designed for large file systems and large file handling.
  • XFS has on-line defragmentation tools.
  • XFS dramatically reduces start-up time by avoiding fsck delays.
  • XFS has fast file system creation.
  • XFS formatted disk capacity is greater than that of ext3/4, even after removing the reserved blocks from the ext3/4 file system.
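
The delayed allocation caveat above applies to shell scripts as well as to programs calling fsync() directly. A minimal sketch, assuming GNU dd (whose conv=fsync flag flushes the output file before dd exits); the function name and the paths in the usage comment are hypothetical:

```shell
# durable_write SRC DST: copy SRC to DST and do not return until the
# data has been flushed to disk (dd calls fsync() on DST before exiting).
durable_write() {
    dd if="$1" of="$2" bs=1M conv=fsync 2>/dev/null
}

# Usage (hypothetical paths):
#   durable_write /tmp/report.csv /srv/data/report.csv
```

A plain cp would return as soon as the data reached the page cache, which is exactly the window in which a crash could lose it.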
Other XFS Features
The XFS file system supports:
Extended attributes (xattr)
This allows the system to associate several additional name/value pairs per file. It is enabled by default.
Quota journaling
This avoids the need for lengthy quota consistency checks after a crash.
Project/directory quotas
This allows quota restrictions over a directory tree.
Subsecond timestamps
This allows timestamps to be recorded with subsecond resolution.

Common Commands for ext3/4 and XFS

Task                             ext3/4                    XFS
Create a file system             mkfs.ext4 or mkfs.ext3    mkfs.xfs
File system check                e2fsck                    xfs_repair
Resizing a file system           resize2fs                 xfs_growfs
Save an image of a file system   e2image                   xfs_metadump and xfs_mdrestore
Label or tune a file system      tune2fs                   xfs_admin
Backup a file system             dump and restore          xfsdump and xfsrestore

Creating an XFS File System with Tuning options

To create an XFS file system, use the mkfs.xfs /dev/device command. In general, the default options are optimal for common use.
When using mkfs.xfs on a block device containing an existing file system, use the -f option to force an overwrite of that file system.
su=value
Specifies a stripe unit or RAID chunk size. The value must be specified in bytes, with an optional k, m, or g suffix.
sw=value
Specifies the number of data disks in a RAID device, or the number of stripe units in the stripe.
The following example specifies a chunk size of 64k on a RAID device containing 4 stripe units:
# mkfs.xfs -d su=64k,sw=4 /dev/device
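
The su and sw values follow from the RAID layout: su is the chunk size and sw is the number of disks that carry data. A small sketch of that arithmetic, assuming the parity count is known (1 for RAID5, 2 for RAID6, 0 for RAID0); the function name xfs_stripe_opts is hypothetical:

```shell
# xfs_stripe_opts CHUNK_KB TOTAL_DISKS PARITY_DISKS
# Prints the su/sw pair for mkfs.xfs -d. sw counts only data-bearing
# disks, so parity disks are subtracted from the total.
xfs_stripe_opts() {
    chunk_kb=$1 total=$2 parity=$3
    data=$((total - parity))
    if [ "$data" -lt 1 ]; then
        echo "need at least one data disk" >&2
        return 1
    fi
    echo "su=${chunk_kb}k,sw=${data}"
}

# A 5-disk RAID5 array with a 64k chunk has 4 data disks:
xfs_stripe_opts 64 5 1    # prints su=64k,sw=4
```

The printed string can be passed directly after -d, as in the example above.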
For an XFS file system smaller than 1 TB:
mkfs.xfs -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=4 -L VolumeName <dev>
For an XFS file system larger than 1 TB:
mkfs.xfs -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=16 -L VolumeName <dev>
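
The size rule above can be scripted so that the right agcount is picked automatically. This is only a sketch: it prints the command instead of running it, the function names and the data01 label are hypothetical, and treating "1TB" as 2^40 bytes is an assumption, since the text does not say which unit is meant.

```shell
# pick_agcount SIZE_BYTES: 4 allocation groups below 1 TiB, 16 otherwise.
# ASSUMPTION: the 1TB threshold is taken to be 2^40 bytes.
pick_agcount() {
    one_tib=1099511627776
    if [ "$1" -lt "$one_tib" ]; then echo 4; else echo 16; fi
}

# mkfs_cmd SIZE_BYTES LABEL DEVICE: print (do not run) the mkfs.xfs command.
mkfs_cmd() {
    echo "mkfs.xfs -l lazy-count=1,version=2,size=128m -i attr=2" \
         "-d agcount=$(pick_agcount "$1") -L $2 $3"
}

# On a real system the size could come from: blockdev --getsize64 /dev/device
mkfs_cmd 500000000000 data01 /dev/sdX1
```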

NOTE

After an XFS file system is created, its size cannot be reduced. However, it can still be enlarged using the xfs_growfs command.
For striped block devices (for example, RAID5 arrays), the stripe geometry can be specified at the time of file system creation. Using proper stripe geometry greatly enhances the performance of an XFS filesystem.
When creating filesystems on LVM or MD volumes, mkfs.xfs chooses an optimal geometry. This may also be true on some hardware RAIDs that export geometry information to the operating system.

Mounting XFS with Tuning options

Further performance optimizations can be gained by specifying some additional mount options for your XFS file systems. By default, XFS uses write barriers to ensure file system integrity even when power is lost to a device with write caches enabled. For devices without write caches, or with battery-backed write caches, disable the barriers by using the nobarrier option.

To manually mount an XFS file system with these tuning options, use the following:

mount -t xfs -o noatime,osyncisosync,logbsize=256k,logbufs=8 <dev> <mtpt>

The corresponding ‘/etc/fstab’ entry:

UUID=xxxxxxxxxxx...x <mtpt> xfs noatime,osyncisosync,logbsize=256k,logbufs=8 0 2
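
Typing the UUID by hand is error-prone, so the fstab line can be generated instead. A minimal sketch using the same options as the entry above; the function name, the example UUID, and the /srv/data mount point are made up for illustration:

```shell
# fstab_line UUID MOUNTPOINT: print an fstab entry with the options above.
fstab_line() {
    printf 'UUID=%s %s xfs noatime,osyncisosync,logbsize=256k,logbufs=8 0 2\n' "$1" "$2"
}

# On a real system the UUID could be read with:
#   blkid -s UUID -o value /dev/device
fstab_line 01234567-89ab-cdef-0123-456789abcdef /srv/data
```

Review the printed line before appending it to /etc/fstab, since a bad entry can stop the system from booting cleanly.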

The logbsize and logbufs options address the often-cited limitation of XFS when handling lots of small files and large numbers of file deletions.

Every time a file is accessed (read or written), the traditional behavior of most file systems is to update the access time in that file's metadata. Thus, even read operations incur the overhead of a write to the file system. This can lead to a significant degradation in performance in some usage scenarios. Adding noatime to the options in the fstab line for a file system stops this from happening. The above assumes you don't require atime; not recording it can provide a significant performance benefit.

Access time is not the same as the last-modified time. Disabling access time will still enable you to see when files were last modified by a write operation.

Default atime behavior is relatime

Relatime is on by default for XFS. It has almost no overhead compared to noatime while still maintaining sane atime values.
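
To see which atime behavior a mounted file system is actually using, its option string can be inspected. A rough sketch, assuming the Linux /proc/mounts layout (device, mount point, type, options); both function names are hypothetical:

```shell
# atime_mode OPTIONS: classify a comma-separated mount option string.
atime_mode() {
    case ",$1," in
        *,noatime,*)  echo noatime ;;
        *,relatime,*) echo relatime ;;
        *)            echo unspecified ;;   # neither flag is listed
    esac
}

# mount_atime MOUNTPOINT: look up the options in /proc/mounts and classify them.
mount_atime() {
    opts=$(awk -v m="$1" '$2 == m { o = $4 } END { print o }' /proc/mounts)
    atime_mode "$opts"
}

atime_mode rw,noatime,attr2,inode64    # prints noatime
```

Running mount_atime /mount/point before and after editing fstab and remounting is a quick way to confirm the change took effect.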

Increasing the size of an XFS file system

An XFS file system may be grown while mounted using the xfs_growfs command:
# xfs_growfs /mount/point -D size
The -D size option grows the file system to the specified size (expressed in file system blocks). Without the -D size option, xfs_growfs will grow the file system to the maximum size supported by the device.
Before growing an XFS file system with -D size, ensure that the underlying block device is large enough to hold the file system at its new size.
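
Because -D takes file system blocks rather than bytes, a target size has to be converted first. A small sketch of the arithmetic, assuming a 4096-byte block size for the example (the real value is the bsize= figure reported by xfs_info); the function name is hypothetical:

```shell
# size_to_blocks SIZE_GIB BLOCK_SIZE: convert a size in GiB to a block count.
size_to_blocks() {
    echo $(( $1 * 1024 * 1024 * 1024 / $2 ))
}

# 100 GiB expressed in 4096-byte blocks:
size_to_blocks 100 4096    # prints 26214400

# The result would then be used as:
#   xfs_growfs /mount/point -D 26214400
```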

Repairing an XFS File System

# xfs_repair /dev/device

The xfs_repair utility is designed to repair even large file systems with many inodes efficiently. Unlike the checkers for other Linux file systems, xfs_repair does not run at boot time, even when an XFS file system was not cleanly unmounted. In the event of an unclean unmount, XFS simply replays the log at mount time, ensuring a consistent file system.

WARNING

The xfs_repair utility cannot repair an XFS file system with a dirty log. To clear the log, mount and unmount the XFS file system. If the log is corrupt and cannot be replayed, use the -L option (“force log zeroing”) to clear the log, that is, xfs_repair -L /dev/device. Be aware that this may result in further corruption or data loss.
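
The order of operations described in the warning can be sketched as a script. This version only prints the commands unless DRY_RUN=0 is set; the run and repair_xfs helpers and the /mnt/tmp mount point are hypothetical. It also adds a read-only check with xfs_repair -n (no-modify mode) before the real repair, which is a precaution of this sketch rather than something the text above requires.

```shell
# run CMD...: print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# repair_xfs DEVICE MOUNTPOINT
repair_xfs() {
    # Replay a dirty log by mounting and unmounting once.
    run mount -t xfs "$1" "$2" && run umount "$2" || return 1
    # -n reports problems without modifying anything.
    run xfs_repair -n "$1"
    # Actual repair. xfs_repair -L is deliberately left out: last resort only.
    run xfs_repair "$1"
}

repair_xfs /dev/device /mnt/tmp
```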

Suspending an XFS File System

To suspend or resume write activity to a file system, use xfs_freeze. Suspending write activity allows hardware-based device snapshots to be used to capture the file system in a consistent state.

NOTE

The xfs_freeze utility is provided by the xfsprogs package, which is only available on x86_64.
To suspend (that is, freeze) an XFS file system, use:
# xfs_freeze -f /mount/point
To unfreeze an XFS file system, use:
# xfs_freeze -u /mount/point
When taking an LVM snapshot, it is not necessary to use xfs_freeze to suspend the file system first. Rather, the LVM management tools will automatically suspend the XFS file system before taking the snapshot.
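
For hardware-based snapshots, where nothing freezes the file system for you, the important detail is that the file system is always unfrozen again, even if the snapshot step fails. A minimal sketch under that assumption; snapshot_frozen and the FREEZE override variable are hypothetical names, and the snapshot command itself is whatever your storage vendor provides:

```shell
# FREEZE can be overridden (for example with "echo xfs_freeze") to rehearse
# the sequence without touching a real file system.
FREEZE=${FREEZE:-xfs_freeze}

# snapshot_frozen MOUNTPOINT SNAPSHOT_CMD [ARGS...]
# Freezes MOUNTPOINT, runs the snapshot command, then unfreezes.
# Returns the snapshot command's exit status.
snapshot_frozen() {
    mnt=$1; shift
    $FREEZE -f "$mnt" || return 1
    "$@"; status=$?
    $FREEZE -u "$mnt" || return 1
    return $status
}

# Usage (the snapshot tool name is a placeholder):
#   snapshot_frozen /mount/point vendor-snapshot-tool --volume data01
```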

Backup and Restore of XFS File System

XFS file system backup and restoration involves two utilities: xfsdump and xfsrestore.
To back up or dump an XFS file system, use the xfsdump utility. Red Hat Enterprise Linux 7 supports backups to tape drives or regular file images, and also allows multiple dumps to be written to the same tape. The xfsdump utility also allows a dump to span multiple tapes, although only one dump can be written to a regular file. In addition, xfsdump supports incremental backups, and can exclude files from a backup using size, subtree, or inode flags to filter them.
In order to support incremental backups, xfsdump uses dump levels to determine a base dump to which a specific dump is relative. The -l option specifies a dump level (0-9). To perform a full backup, perform a level 0 dump on the file system (that is, /path/to/filesystem), as in:
# xfsdump -l 0 -f /dev/device /path/to/filesystem

NOTE

The -f option specifies a destination for a backup. For example, the /dev/st0 destination is normally used for tape drives. An xfsdump destination can be a tape drive, regular file, or remote tape device.
In contrast, an incremental backup will only dump files that changed since the most recent dump of a lower level. A level 1 dump is the first incremental dump after a full dump; the next incremental dump would be level 2, and so on, to a maximum of level 9. So, to perform a level 1 dump to a tape drive:
# xfsdump -l 1 -f /dev/st0 /path/to/filesystem
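
One way to use the dump levels is a weekly rotation: a full dump on Sunday and one higher level on each following day. The sketch below only prints the command. The dump_cmd name and the label values are hypothetical; -L (session label) and -M (media label) are supplied so that xfsdump does not stop to prompt for them.

```shell
# dump_cmd DAY_OF_WEEK FILESYSTEM DEST
# DAY_OF_WEEK is 0 (Sunday) to 6 (Saturday), as printed by `date +%w`.
# Sunday maps to a level 0 (full) dump, Monday..Saturday to levels 1..6.
dump_cmd() {
    level=$1
    echo "xfsdump -l $level -L weekly-day$1 -M media0 -f $3 $2"
}

dump_cmd 0 /path/to/filesystem /dev/st0
dump_cmd 3 /path/to/filesystem /dev/st0

# From cron, today's command would be:
#   dump_cmd "$(date +%w)" /path/to/filesystem /dev/st0
```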
Conversely, the xfsrestore utility restores file systems from dumps produced by xfsdump. The xfsrestore utility has two modes: a default simple mode, and a cumulative mode. Specific dumps are identified by session ID or session label. As such, restoring a dump requires its corresponding session ID or label.
To display the session ID and labels of all dumps (both full and incremental), use the -I option:
# xfsrestore -I
The simple mode allows users to restore an entire file system from a level 0 dump. After identifying a level 0 dump’s session ID (that is, session-ID), restore it fully to /path/to/destination using:
# xfsrestore -f /dev/st0 -S session-ID /path/to/destination

NOTE

The -f option specifies the location of the dump, while the -S or -L option specifies which specific dump to restore. The -S option is used to specify a session ID, while the -L option is used for session labels. The -I option displays both session labels and IDs for each dump.
The cumulative mode of xfsrestore allows file system restoration from a specific incremental backup, for example, level 1 to level 9. To restore a file system from an incremental backup, simply add the -r option:
# xfsrestore -f /dev/st0 -S session-ID -r /path/to/destination
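
When several dumps have to be applied in cumulative mode, the commands can be generated from the session IDs listed by xfsrestore -I. A sketch that prints one command per session; it assumes the IDs are passed oldest first, starting with the level 0 dump, and the restore_chain name and placeholder IDs are hypothetical:

```shell
# restore_chain DUMP_LOCATION DESTINATION SESSION_ID...
# Prints one cumulative-mode xfsrestore command per session ID, in the
# order given (ASSUMPTION: level 0 first, then each incremental).
restore_chain() {
    src=$1 dest=$2
    shift 2
    for id in "$@"; do
        echo "xfsrestore -f $src -S $id -r $dest"
    done
}

restore_chain /dev/st0 /path/to/destination level0-session-ID level1-session-ID
```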
The xfsrestore utility also allows specific files from a dump to be extracted, added, or deleted. To use xfsrestore interactively, use the -i option, as in:
# xfsrestore -f /dev/st0 -i /destination/directory
The interactive dialogue will begin after xfsrestore finishes reading the specified device. Available commands in this dialogue include cd, ls, add, delete, and extract.

XFS userspace tools

Once the OS is installed, XFS userspace tools are installed by using yum.

yum install xfsdump xfsprogs

Other XFS File System Utilities

xfs_fsr
Used to defragment mounted XFS file systems. When invoked with no arguments, xfs_fsr defragments all regular files in all mounted XFS file systems. This utility also allows users to suspend a defragmentation at a specified time and resume from where it left off later.
In addition, xfs_fsr also allows the defragmentation of only one file, as in xfs_fsr /path/to/file. Red Hat advises not to periodically defrag an entire file system because XFS avoids fragmentation by default. System wide defragmentation could cause the side effect of fragmentation in free space.
xfs_bmap
Prints the map of disk blocks used by files in an XFS filesystem. This map lists each extent used by a specified file, as well as regions in the file with no corresponding blocks (that is, holes).
xfs_info
Prints XFS file system information.
xfs_admin
Changes the parameters of an XFS file system. The xfs_admin utility can only modify parameters of unmounted devices or file systems.
xfs_copy
Copies the contents of an entire XFS file system to one or more targets in parallel.
The following utilities are also useful in debugging and analyzing XFS file systems:
xfs_metadump
Copies XFS file system metadata to a file. Red Hat only supports using the xfs_metadump utility to copy unmounted file systems or read-only mounted file systems; otherwise, generated dumps could be corrupted or inconsistent.
xfs_mdrestore
Restores an XFS metadump image (generated using xfs_metadump) to a file system image.
xfs_db
Debugs an XFS file system.

De-fragmenting XFS

To view the fragmentation of an XFS file system, use xfs_db with its frag command.
The closer the fragmentation factor is to 0% the better.
Defragmenting XFS file systems can be done on a live system.

sudo xfs_db -c frag -r /dev/sda3

The file system reorganizer for XFS is xfs_fsr. Typically, I instruct xfs_fsr to reorganize /dev/sda3 with a timeout (-t), specified in seconds, of 6 hours (60 * 60 * 6 = 21600). For the purposes of this example I used a timeout of 15 minutes (60 * 15 = 900).
sudo xfs_fsr -v -t 900 /dev/sda3

When the defrag is finished, check how well the file system was reorganized:

sudo xfs_db -c frag -r /dev/sda3
actual 2155648, ideal 254512, fragmentation factor 88.19%
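
The fragmentation factor is the share of extents beyond the ideal count: (actual - ideal) / actual. A quick sketch that reproduces the figure above from the two extent counts; the function name is hypothetical, and it does not guard against an actual count of zero:

```shell
# frag_factor ACTUAL IDEAL: percentage of extents in excess of the ideal count.
frag_factor() {
    awk -v a="$1" -v i="$2" 'BEGIN { printf "%.2f\n", (a - i) * 100 / a }'
}

frag_factor 2155648 254512    # prints 88.19
```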

Defragmenting for 15 minutes doesn’t help much; xfs_fsr needs several hours or more.

A better solution is to schedule a cron job to run periodically.

It is also possible to defragment a single file. To determine whether a file needs defragmenting, run the following:

xfs_bmap -v /srv/A320/PGQAR.DAT | wc -l

This outputs a number that approximates the number of extents the file is using (the count includes the two header lines that xfs_bmap -v prints).

95280

This number should be close to 1. If it is not, defragment the file:

sudo xfs_fsr -v /srv/A320/PGQAR.DAT
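
Counting lines with wc -l slightly overstates the extent count, because the output includes header lines and may include holes. A small sketch that counts only data extents; the function name is hypothetical and the sample input is illustrative, not taken from a real file:

```shell
# count_extents: read `xfs_bmap -v FILE` output on stdin and count data
# extents, ignoring the header lines and any holes.
count_extents() {
    awk '/^[ \t]*[0-9]+:/ && $3 != "hole" { n++ } END { print n + 0 }'
}

# Illustrative sample in the xfs_bmap -v layout (two extents and one hole):
count_extents <<'EOF'
/srv/example.dat:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..7]:          96..103           0 (96..103)            8
   1: [8..15]:         hole                                     8
   2: [16..23]:        200..207          0 (200..207)           8
EOF
```

On a real file this would be used as xfs_bmap -v /path/to/file | count_extents, before and after running xfs_fsr on it.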
