Overview
File system performance encompasses the speed and efficiency of operations that read from and write to persistent storage. These operations form a critical bottleneck in many applications because disk access operates orders of magnitude slower than memory access. A typical hard disk seek time ranges from 5-15 milliseconds, while RAM access completes in nanoseconds. Even modern SSDs, despite eliminating mechanical seek time, still take tens of microseconds per access, which is hundreds to thousands of times slower than RAM.
The performance characteristics of file operations affect applications across domains. Database systems rely on optimized file I/O for transaction logs and data files. Web servers must efficiently serve static assets. Data processing pipelines depend on high-throughput file operations to handle large datasets. Build systems perform thousands of file reads during compilation. Backup utilities require maximum throughput for large file transfers.
File system performance optimization operates at multiple levels: hardware characteristics (HDD vs SSD vs NVMe), operating system I/O schedulers and caching, file system implementations (ext4, XFS, APFS), application-level buffering strategies, and access pattern optimization. Each level presents opportunities for performance improvements.
Ruby applications interact with file systems through the File and IO classes, which provide abstractions over operating system I/O calls. The Ruby interpreter adds its own buffering layer on top of OS buffering, creating multiple caching levels. Understanding these layers helps developers write performant file I/O code.
Key Principles
Buffering reduces system call overhead by batching small operations into larger chunks. Each system call involves a transition between user space and kernel space, which incurs overhead. Reading a 1KB file one byte at a time with unbuffered reads takes about 1,000 system calls. Reading it through a buffer takes a single call, cutting the call count by three orders of magnitude. Ruby's IO class maintains internal buffers that default to 8KB, reducing the number of underlying read/write system calls.
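The effect of request size on call count can be made concrete with a small sketch. It uses IO#sysread, which bypasses Ruby's internal buffer, so each call corresponds to one read request to the OS. The helper name count_unbuffered_reads is hypothetical and exists only for this illustration.

```ruby
require 'tempfile'

# Counts the sysread calls that return data when a file is consumed
# with the given request size. The final call that only detects end
# of file is not counted.
def count_unbuffered_reads(path, request_size)
  calls = 0
  File.open(path, 'rb') do |file|
    begin
      loop do
        file.sysread(request_size)
        calls += 1
      end
    rescue EOFError
      # sysread signals end of file by raising EOFError
    end
  end
  calls
end

Tempfile.create('demo') do |file|
  file.write('x' * 1024)
  file.flush
  puts count_unbuffered_reads(file.path, 1)     # prints 1024
  puts count_unbuffered_reads(file.path, 8192)  # prints 1
end
```

Ordinary IO#read and IO#each_line go through Ruby's buffer, so application code rarely needs sysread; it appears here only to expose the call count.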
Sequential access patterns outperform random access because storage devices optimize for sequential operations. Hard drives eliminate seek time when reading consecutive blocks. SSDs achieve higher throughput with sequential reads through internal parallelism. The operating system's read-ahead mechanism pre-fetches data for sequential patterns but provides no benefit for random access. Applications that process files linearly achieve significantly better performance than those jumping between file positions.
Operating system page cache serves as a transparent performance layer by keeping recently accessed file data in RAM. The kernel maintains this cache using available memory, evicting old pages when memory pressure increases. After an application reads a file, the OS serves subsequent reads of the same data from cache without touching disk. Write operations go to the page cache first, with the OS flushing dirty pages to disk asynchronously. This mechanism lets repeatedly accessed files be served at close to RAM speed, limited mainly by system call and copy overhead.
I/O scheduling algorithms in the kernel optimize disk access patterns by reordering and merging requests. The elevator algorithm services requests in disk sector order, minimizing seek time. Deadline schedulers prevent request starvation by setting time limits. CFQ (Completely Fair Queuing) allocated bandwidth fairly across processes; Linux 5.0 removed it together with the legacy block layer, and current kernels offer mq-deadline, BFQ, Kyber, or no scheduler at all. These schedulers operate transparently but affect application performance, especially under concurrent I/O load.
File system metadata operations carry distinct performance costs separate from data operations. Creating, deleting, or stat-ing files requires reading or updating directory entries, inodes, and file system metadata structures. Updates often require synchronous disk writes for consistency. Applications creating thousands of temporary files experience metadata bottlenecks even with small file sizes. Modern file systems like XFS reduce these costs through techniques such as delayed logging, which batches metadata updates in memory before writing them to the journal.
Direct I/O bypasses the page cache for applications that manage their own caching. Database systems often use direct I/O to prevent double-buffering, where the same data sits once in the page cache and again in the database's buffer pool. Direct I/O requires memory buffers, file offsets, and transfer sizes to be aligned to the device block size, but provides predictable performance without cache eviction effects. The trade-off involves losing the kernel's sophisticated caching algorithms.
Memory-mapped files enable treating file contents as memory by mapping file regions into the process address space. The OS handles paging data between RAM and disk automatically. This approach eliminates explicit read/write calls and buffering, relying instead on page faults and the page cache. Memory mapping excels for random access patterns and file modifications but introduces complexity around synchronization and error handling.
Write-ahead logging makes operations durable with a single fast sequential write and defers the slower updates to primary data structures. Applications append each change to the log and flush it, then acknowledge the operation before applying the change to the primary data structures. After a crash, replaying the log restores any changes that had not yet been applied. This pattern appears in databases (PostgreSQL's WAL), file systems (ext4's journal), and application-level implementations. The trade-off involves recovery complexity and storage overhead for the log.
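The write-ahead pattern can be sketched in a few lines of Ruby. This is a toy illustration under stated assumptions, not how PostgreSQL or ext4 implement logging: the class name TinyWal, the JSON line format, and the in-memory hash standing in for the primary data structure are all hypothetical.

```ruby
require 'json'

# Toy key-value store: every change is appended to a log and fsynced
# before the in-memory hash (the "primary data") is updated.
class TinyWal
  def initialize(log_path)
    @log_path = log_path
    @data = {}
    replay
    @log = File.open(@log_path, 'a')
  end

  def set(key, value)
    @log.puts(JSON.generate([key, value])) # 1. append to the sequential log
    @log.fsync                             # 2. make the entry durable
    @data[key] = value                     # 3. apply to primary data
  end

  def get(key)
    @data[key]
  end

  def close
    @log.close
  end

  private

  # Rebuilds state from the log and drops a torn final entry, if any.
  def replay
    return unless File.exist?(@log_path)

    valid_bytes = 0
    File.foreach(@log_path) do |line|
      entry = parse_entry(line)
      break if entry.nil?

      @data[entry[0]] = entry[1]
      valid_bytes += line.bytesize
    end
    File.truncate(@log_path, valid_bytes)
  end

  # Returns [key, value], or nil for an incomplete or unparseable line.
  def parse_entry(line)
    return nil unless line.end_with?("\n")

    JSON.parse(line)
  rescue JSON::ParserError
    nil
  end
end
```

A real implementation would also checkpoint and truncate the log periodically so that recovery time and log size stay bounded.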
Ruby Implementation
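Ruby provides file system access through the File class, which inherits from IO. The IO class implements buffering internally, maintaining read and write buffers that reduce system call frequency. The internal buffer is about 8KB, and Ruby offers no public setting to resize it; applications control I/O granularity through the sizes they pass to read and write, and can turn off write buffering with sync = true.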
# Standard buffered reading
File.open('data.txt', 'r') do |file|
file.each_line do |line|
process(line)
end
end
# Ruby maintains 8KB buffer, reads ahead from disk
The each_line method reads data into internal buffers and returns lines incrementally. This approach balances memory usage with I/O efficiency. For files smaller than the buffer size, a single read system call usually returns all the data, followed by one more call that detects end of file.
# Reading entire file into memory
content = File.read('data.txt')
# Single read operation for small files
# Multiple buffered reads for large files
# Binary reading with explicit buffer size
File.open('large.bin', 'rb') do |file|
while chunk = file.read(65536)
process_binary(chunk)
end
end
# Reads 64KB chunks, good for large binary files
The read method with a size argument controls buffer granularity. Larger chunks reduce system call overhead but increase memory usage. Binary mode ('rb') disables text encoding conversion and newline translation, improving performance for non-text data.
# Buffered writing
File.open('output.txt', 'w') do |file|
1000.times do |i|
file.puts "Line #{i}"
end
end
# Writes buffered in memory, flushed automatically at close
# Explicit flushing for durability
File.open('critical.log', 'a') do |file|
file.puts log_entry
file.flush # Push Ruby's buffer to the OS
file.fsync # Ask the OS to flush the data to physical storage
end
The flush method writes Ruby's buffer to the operating system, while fsync ensures the OS flushes data to physical storage. Database applications and logging systems use fsync for durability guarantees, accepting the performance cost. Each fsync call blocks until the disk completes the write operation.
# Memory-mapped files using external gem
require 'mmap'
mmap = Mmap.new('data.dat', 'r')
# Access file as byte array
byte_at_1000 = mmap[1000]
# OS handles paging automatically
mmap.munmap # Release the read-only mapping before creating another
mmap = Mmap.new('shared.dat', 'rw')
mmap[0, 8] = 'new data' # Range length matches the 8-byte replacement
# Modifies file through memory operations
mmap.munmap
The mmap gem provides memory mapping functionality. Random access to mapped files performs efficiently because the OS handles caching through its page management. Multiple processes can map the same file for shared memory communication.
# Directory iteration with low memory overhead
Dir.foreach('/var/log') do |filename|
next if filename == '.' || filename == '..'
path = File.join('/var/log', filename)
stat = File.stat(path)
puts "#{filename}: #{stat.size} bytes"
end
# Efficient iteration, one entry at a time
Directory operations involve metadata reads. The stat system call retrieves file information including size, permissions, and timestamps. Calling stat on thousands of files becomes a bottleneck. Modern file systems cache metadata, but the cache has limits.
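One way to keep metadata traffic down is to call stat once per entry and reuse the returned File::Stat object for every attribute, rather than calling File.size, File.mtime, and File.file? separately, each of which performs its own lookup. The following is a minimal sketch; the method name summarize_directory is hypothetical.

```ruby
# Collects name, size, and modification time for the regular files
# in a directory using one stat call per entry.
def summarize_directory(dir)
  Dir.each_child(dir).filter_map do |name|
    stat = File.stat(File.join(dir, name)) # single metadata lookup
    next unless stat.file?                 # reuse it for the type check

    { name: name, size: stat.size, mtime: stat.mtime }
  end
end
```

Dir.each_child skips the '.' and '..' entries, and filter_map (Ruby 2.7 and later) drops the nil returned for entries that are not regular files.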
# Optimized batch file reading
def read_files_efficiently(pattern)
Dir.glob(pattern).each do |path|
# Skip files of 1 MB or more to bound memory use
next unless File.size(path) < 1_000_000
content = File.read(path)
process(content)
end
end
# Sequential processing optimizes OS read-ahead
files = Dir.glob('logs/*.log').sort
files.each do |path|
File.open(path, 'r') do |file|
file.each_line do |line|
analyze(line)
end
end
end
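Sorting gives a deterministic, predictable processing order, and it can improve cache locality when file names follow creation order, although name order does not guarantee on-disk order. The OS read-ahead mechanism works best with predictable access patterns. Checking file size before reading prevents loading unexpectedly large files into memory.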
# Lazy enumeration for large files
File.open('huge.log', 'r') do |file|
file.each_line.lazy.grep(/ERROR/).first(10).each do |line|
puts line
end
end
# Stops reading after finding 10 matches
# Does not load entire file into memory
The lazy enumerator prevents materializing the entire file in memory. This pattern excels for early-exit scenarios where processing stops after finding specific data. Without lazy, a chained call such as each_line.grep(/ERROR/).first(10) scans the whole file and builds the full array of matches before first(10) takes effect.
Performance Considerations
Sequential vs random access demonstrates dramatic performance differences. Sequential reads on hard drives achieve 100-200 MB/s, while random reads drop to 1-2 MB/s due to seek time overhead. SSDs reduce this gap but still favor sequential access—500-550 MB/s sequential versus 300-400 MB/s random. Applications should structure data for sequential access when performance matters.
# Poor: Random access pattern
file = File.open('data.bin', 'rb')
offsets = [1000, 500, 5000, 2000, 100]
offsets.each do |offset|
file.seek(offset)
data = file.read(100)
process(data)
end
file.close
# Each seek causes disk repositioning
# Better: Sequential access after sorting
file = File.open('data.bin', 'rb')
offsets.sort.each do |offset|
file.seek(offset)
data = file.read(100)
process(data)
end
file.close
# Sorted seeks minimize disk head movement
Buffer size tuning balances memory usage against system call overhead. Small buffers increase system call frequency. Large buffers waste memory and reduce cache effectiveness. The optimal size depends on file size distribution and available memory. For reading multiple small files, keep buffers small. For processing large files sequentially, use larger buffers.
# Measuring impact of buffer size
require 'benchmark'
def read_with_buffer(path, buffer_size)
File.open(path, 'rb') do |file|
while chunk = file.read(buffer_size)
# Processing
end
end
end
Benchmark.bm do |x|
x.report('4KB:') { read_with_buffer('large.bin', 4096) }
x.report('64KB:') { read_with_buffer('large.bin', 65536) }
x.report('1MB:') { read_with_buffer('large.bin', 1048576) }
end
# Results show diminishing returns above 64KB-256KB
Write buffering and durability create a trade-off between throughput and data safety. Buffered writes batch small operations but risk data loss on crashes. The fsync system call forces physical writes but reduces throughput significantly. Applications must choose based on durability requirements.
# High-throughput logging without durability guarantees
logger = File.open('app.log', 'a')
logger.sync = false # Disable auto-flushing
10000.times { logger.puts "Log entry" }
logger.close
# Fast but data loss risk on crash
# Durable logging with performance cost
logger = File.open('critical.log', 'a')
10000.times do
logger.puts "Transaction committed"
logger.fsync # Ensure disk write
end
logger.close
# Slow but guaranteed durability
Operating system caching makes the second read of a file dramatically faster than the first. Applications processing the same files repeatedly benefit enormously from the page cache. The cache operates system-wide, so other processes reading the same files contribute to cache warming.
require 'benchmark'
path = 'test.bin'
File.write(path, 'x' * 10_000_000)
# The file was just written, so its pages are already in the page cache.
# Drop the cache first to measure a genuinely cold read (Linux, requires root).
system('sync; echo 3 > /proc/sys/vm/drop_caches')
Benchmark.bm do |x|
  x.report('Cold read:') { File.read(path) }
  x.report('Warm read:') { File.read(path) }
end
# When the cache drop succeeds, the warm read is typically 10-100x faster
# Without root the cache is not dropped and both reads are served from cache
Metadata operation costs dominate when working with many small files. Creating, deleting, or stat-ing files involves filesystem metadata updates. These operations often require synchronous disk writes for consistency. Applications creating thousands of temporary files should batch operations or use alternative designs.
# Poor: Creating many small files
10000.times do |i|
File.write("temp/file_#{i}.txt", "data")
end
# Each write requires directory update
# Better: Single file holding all records
File.open('combined.dat', 'wb') do |file|
10000.times do |i|
file.write("data for record #{i}\n")
end
end
# Single file creation, minimal metadata overhead
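Combining records into one file only helps if individual records can still be found. One possible approach, sketched here under the assumption that the index fits in memory, is to record each record's byte offset and length while writing. The helper names write_combined and read_record are hypothetical.

```ruby
# Writes all records into one file and returns an index of
# [offset, length] pairs, one per record.
def write_combined(path, records)
  index = []
  offset = 0
  File.open(path, 'wb') do |file|
    records.each do |record|
      index << [offset, record.bytesize]
      file.write(record)
      # Track the offset manually so the write buffer is not flushed
      # just to ask the OS for the current position.
      offset += record.bytesize
    end
  end
  index
end

# Reads record number i back using the index.
def read_record(path, index, i)
  offset, length = index.fetch(i)
  File.open(path, 'rb') do |file|
    file.seek(offset)
    file.read(length)
  end
end
```

In practice the index would itself be persisted, for example in a second file or a trailer at the end of the data file.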
Concurrent I/O from multiple threads or processes can improve throughput on modern storage, especially SSDs with high I/O parallelism. However, contention for the same files creates bottlenecks. Operating system locking and cache thrashing reduce benefits. Partitioning data across files enables parallel processing without contention.
# Parallel file processing
require 'parallel'
files = Dir.glob('data/*.txt')
results = Parallel.map(files, in_threads: 4) do |path|
File.read(path).scan(/pattern/).count
end
# Multiple threads reading different files
# OS can parallelize I/O operations
# Avoiding contention on shared files
threads = 4.times.map do |i|
Thread.new do
File.open("output_#{i}.log", 'w') do |file|
# Each thread writes to separate file
process_and_log(file, i)
end
end
end
threads.each(&:join)
Practical Examples
Large log file analysis requires streaming processing to handle files larger than available RAM. Reading the entire file with File.read causes memory exhaustion. Line-by-line iteration with proper buffering maintains constant memory usage regardless of file size.
def analyze_large_log(path)
error_count = 0
warning_count = 0
File.open(path, 'r') do |file|
file.each_line do |line|
error_count += 1 if line.include?('ERROR')
warning_count += 1 if line.include?('WARN')
end
end
{ errors: error_count, warnings: warning_count }
end
# Processing 10GB log file with constant memory usage
stats = analyze_large_log('/var/log/application.log')
puts "Found #{stats[:errors]} errors, #{stats[:warnings]} warnings"
Batch file export generates multiple output files from processed data. Naive implementations open and close files repeatedly, incurring overhead. Keeping files open during batch processing and buffering writes improves throughput.
# Inefficient: Opening files repeatedly
records.each do |record|
File.open("output/#{record.category}.csv", 'a') do |file|
file.puts record.to_csv
end
end
# File open/close overhead on every record
# Efficient: Keep files open, buffer writes
output_files = {}
begin
records.each do |record|
category = record.category
output_files[category] ||= File.open("output/#{category}.csv", 'a') # Append, matching the version above
output_files[category].puts record.to_csv
end
ensure
output_files.each_value(&:close)
end
# Opens each file once, buffers automatically
Configuration file reloading monitors files for changes and reloads when modified. Checking file modification time on every operation creates excessive stat calls. Caching mtime and checking periodically reduces overhead.
require 'yaml'
class ConfigLoader
def initialize(path)
@path = path
@mtime = nil
@config = nil
@check_interval = 5 # seconds
@last_check = Time.now - 10
end
def load
now = Time.now
# Throttle stat calls
if now - @last_check > @check_interval
current_mtime = File.mtime(@path)
if @mtime.nil? || current_mtime > @mtime
@config = YAML.load_file(@path)
@mtime = current_mtime
end
@last_check = now
end
@config
end
end
# Usage avoids excessive filesystem checks
config = ConfigLoader.new('app.yml')
loop do
settings = config.load
# Uses cached config between check intervals
process_with_settings(settings)
sleep 0.1
end
Binary file parsing reads structured binary data efficiently by reading fixed-size chunks and unpacking. This approach minimizes system calls and leverages buffering.
def parse_binary_records(path)
records = []
record_size = 64 # bytes per record
File.open(path, 'rb') do |file|
while chunk = file.read(record_size)
break if chunk.size < record_size
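# Unpack the first 16 bytes: two big-endian uint32 values and a big-endian double
# ('G' is the big-endian double directive; '>' is not valid after 'd')
id, timestamp, value = chunk.unpack('L>L>G')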
records << { id: id, time: timestamp, value: value }
end
end
records
end
# Processes binary file with minimal overhead
data = parse_binary_records('sensor_data.bin')
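A round trip helps confirm that a record layout and its pack format agree. The sketch below assumes a layout compatible with the parser above: two big-endian 32-bit integers, one big-endian double, and 48 bytes of padding to fill the 64-byte record. The constant and method names are hypothetical.

```ruby
RECORD_FORMAT = 'L>L>Gx48' # uint32, uint32, double, then 48 padding bytes
RECORD_SIZE = 64

# Writes each record hash as one fixed-size 64-byte binary record.
def write_records(path, records)
  File.open(path, 'wb') do |file|
    records.each do |record|
      file.write([record[:id], record[:time], record[:value]].pack(RECORD_FORMAT))
    end
  end
end

# Yields one hash per complete record without accumulating them.
def each_record(path)
  return enum_for(:each_record, path) unless block_given?

  File.open(path, 'rb') do |file|
    while (chunk = file.read(RECORD_SIZE))
      break if chunk.bytesize < RECORD_SIZE # ignore a truncated tail

      id, time, value = chunk.unpack(RECORD_FORMAT)
      yield({ id: id, time: time, value: value })
    end
  end
end
```

Yielding records instead of collecting them in an array keeps memory use constant for large files; callers that need an array can use each_record(path).to_a.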
Directory tree traversal walks large directory hierarchies efficiently by avoiding redundant stat calls and processing files in batches.
require 'find'
def find_large_files(root, min_size)
  large_files = []
  Find.find(root) do |path|
    begin
      # One stat call per entry serves both the type check and the size
      stat = File.stat(path)
      large_files << [path, stat.size] if stat.file? && stat.size > min_size
    rescue Errno::EACCES, Errno::ENOENT
      # Skip entries that are unreadable or vanished (e.g. broken symlinks)
    end
  end
  large_files.sort_by { |_, size| -size }
end
# Scan filesystem once, collect results
files = find_large_files('/home/user', 100_000_000)
files.first(10).each do |path, size|
puts "#{path}: #{size / 1_000_000}MB"
end
Common Pitfalls
Reading entire files into memory causes out-of-memory errors for large files. Developers often use File.read without considering file size. The solution involves streaming processing or checking size before loading.
# Problematic pattern
def process_file(path)
content = File.read(path) # Crashes on 5GB file
content.lines.each { |line| analyze(line) }
end
# Correct approach
def process_file_safe(path)
File.open(path, 'r') do |file|
file.each_line { |line| analyze(line) }
end
end
Ignoring buffer flushing leads to data loss when processes terminate abnormally. Developers assume writes complete immediately, but buffering delays actual disk writes. Critical data requires explicit flushing.
# Data loss risk
file = File.open('transactions.log', 'a')
file.puts "Transaction: #{tx_id}"
# Process crashes here - transaction not on disk
# Safe pattern
file = File.open('transactions.log', 'a')
file.puts "Transaction: #{tx_id}"
file.flush # Ensure write to OS
file.fsync # Ensure write to disk
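Excessive fsync calls destroy write performance. Developers sometimes fsync after every write operation, which can reduce throughput by a factor of 100 or more. Batching writes before syncing balances durability and performance, at the cost that entries written since the last sync can be lost in a crash.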
# Performance disaster
1000.times do |i|
File.open('log.txt', 'a') do |file|
file.puts "Entry #{i}"
file.fsync # 1000 fsync calls
end
end
# Reasonable approach
File.open('log.txt', 'a') do |file|
1000.times do |i|
file.puts "Entry #{i}"
end
file.fsync # Single fsync
end
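Between syncing every entry and syncing once at the end lies a middle ground: sync after every N entries, which bounds how much can be lost in a crash. The sketch below illustrates the idea; the class name BatchedLog and the sync_every parameter are hypothetical, and the counter exists only to make the behavior observable.

```ruby
# Appends entries to a log and calls fsync once per batch.
class BatchedLog
  attr_reader :sync_count

  def initialize(path, sync_every: 100)
    @file = File.open(path, 'a')
    @sync_every = sync_every
    @pending = 0     # entries written since the last fsync
    @sync_count = 0  # number of fsync calls made so far
  end

  def append(entry)
    @file.puts(entry)
    @pending += 1
    sync if @pending >= @sync_every
  end

  # Forces pending entries to disk; does nothing if there are none.
  def sync
    return if @pending.zero?

    @file.fsync # also flushes Ruby's own buffer first
    @pending = 0
    @sync_count += 1
  end

  def close
    sync
    @file.close
  end
end
```

With sync_every: 100, at most 99 acknowledged entries are at risk at any moment, while the number of fsync calls drops by roughly a factor of 100 compared with syncing each entry.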
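Not handling EAGAIN errors causes failures in non-blocking I/O. Non-blocking reads on devices, pipes, and sockets fail with EAGAIN when no data is available; regular files on disk are always treated as ready, so the non-blocking flag has no effect on them. In Ruby, read_nonblock surfaces this condition as an exception that includes IO::WaitReadable, whereas plain read waits internally until data arrives. Applications must retry or use select/poll mechanisms.
# Missing error handling
file = File.open('/dev/input/event0', File::RDONLY | File::NONBLOCK)
data = file.read_nonblock(100) # Raises IO::EAGAINWaitReadable if no data
# Proper handling
file = File.open('/dev/input/event0', File::RDONLY | File::NONBLOCK)
begin
  data = file.read_nonblock(100)
rescue IO::WaitReadable
  IO.select([file]) # Block until the descriptor becomes readable
  retry
end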
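The same retry logic can be written without exceptions by passing exception: false to read_nonblock, which then returns :wait_readable instead of raising. The sketch below adds a timeout and works on any readable IO such as a pipe or socket; the method name read_with_timeout and the :timeout return value are hypothetical choices for this illustration.

```ruby
# Returns a String with up to max_bytes of data, nil at end of file,
# or :timeout if nothing became readable within the timeout (seconds).
def read_with_timeout(io, max_bytes, timeout)
  loop do
    result = io.read_nonblock(max_bytes, exception: false)
    return result unless result == :wait_readable

    # Wait for readability; IO.select returns nil when the timeout expires.
    return :timeout unless IO.select([io], nil, nil, timeout)
  end
end

reader, writer = IO.pipe
p read_with_timeout(reader, 100, 0.05) # no data yet, so this times out
writer.write('hello')
p read_with_timeout(reader, 100, 0.05) # data is available immediately
writer.close
p read_with_timeout(reader, 100, 0.05) # writer closed, so end of file
reader.close
```

Returning distinct values for data, end of file, and timeout lets the caller decide whether to keep waiting, which a bare retry loop does not allow.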
Memory mapping excessive file regions exhausts virtual address space on 32-bit systems. Developers map entire large files without considering address space limits. The solution involves mapping smaller windows or using 64-bit systems.
Forgetting to close files exhausts file descriptors. Each process has a file descriptor limit (the soft limit is often 1024 on Linux). Applications processing many files must close them properly or use block form.
# Leaks file descriptors
1000.times do |i|
file = File.open("data_#{i}.txt", 'r')
process(file.read)
# Missing file.close
end
# May eventually raise Errno::EMFILE ("Too many open files"),
# unless garbage collection happens to close the handles first
# Correct pattern
1000.times do |i|
File.open("data_#{i}.txt", 'r') do |file|
process(file.read)
end
# Automatic close at block end
end
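Assuming atomic writes leads to partial file reads. File writes are not guaranteed to be atomic with respect to concurrent readers, and File.write truncates the target before writing, so a reader can observe an empty or partially written file. Atomic file replacement requires the write-then-rename pattern, with the temporary file on the same filesystem as the target so that the rename is atomic.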
# Non-atomic update
File.write('config.json', new_config.to_json)
# Readers may see partial JSON during write
# Atomic update
temp_file = 'config.json.tmp'
File.write(temp_file, new_config.to_json)
File.rename(temp_file, 'config.json')
# Readers see either old or new file, never partial
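A somewhat more defensive variant of the write-then-rename pattern is sketched below. It creates the temporary file in the target's directory so both are on the same filesystem, calls fsync before renaming so the new contents reach storage first, and removes the temporary file on failure. The method name atomic_write is hypothetical, and the sketch does not sync the containing directory, which strict crash safety may also require.

```ruby
require 'tempfile'

# Replaces the file at path with content so that readers see either
# the old file or the complete new one.
def atomic_write(path, content)
  dir = File.dirname(path)
  # Without a block, Tempfile.create returns an open File that is not
  # deleted automatically. It is created with restrictive permissions,
  # which the renamed file keeps.
  temp = Tempfile.create(['.atomic-', '.tmp'], dir)
  begin
    temp.binmode
    temp.write(content)
    temp.fsync # flush Ruby's buffer and force the data to disk
    temp.close
    File.rename(temp.path, path) # atomic within one filesystem
  rescue StandardError
    temp.close unless temp.closed?
    File.unlink(temp.path) if File.exist?(temp.path)
    raise
  end
end
```

A call such as atomic_write('config.json', new_config.to_json) would replace the two-step version above, assuming new_config is defined as in that example.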
Tools & Ecosystem
strace traces system calls on Linux, revealing actual I/O patterns and helping identify performance problems. The tool shows every read, write, open, and close call with timing information.
# Trace Ruby script I/O
strace -e trace=open,openat,read,write,close -T ruby script.rb
# Count system calls
strace -c ruby script.rb
# Shows calls/time breakdown by syscall type
# Filter file-related calls (open, openat, stat, and similar) for one path
strace -e trace=%file ruby script.rb 2>&1 | grep config.yml
iotop monitors per-process I/O bandwidth on Linux, showing which processes consume disk I/O. This tool helps identify I/O bottlenecks in production systems.
# Real-time I/O monitoring
iotop -o # Show only processes doing I/O
# Batch mode for logging
iotop -b -n 10 > io_usage.txt
ruby-prof profiles Ruby code including I/O operations. The flat and graph profiles show time spent in file operations.
require 'ruby-prof'
RubyProf.start
# Code to profile
process_files('data/*.txt')
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)
# Shows time in File.read, File.open, etc.
The Benchmark module from Ruby's standard library measures execution time for comparing different file I/O approaches. It provides convenient methods for A/B testing performance optimizations.
require 'benchmark'
Benchmark.bmbm do |x|
x.report('read all:') do
File.read('large.txt')
end
x.report('read chunks:') do
File.open('large.txt', 'r') do |f|
while chunk = f.read(65536)
# Process
end
end
end
end
iostat reports system-wide I/O statistics including throughput, IOPS, and latency. The tool helps understand storage device performance and bottlenecks.
# Monitor I/O statistics
iostat -x 1 # Extended stats every second
# Focus on specific device
iostat -x sda 2
inotify (Linux) and FSEvents (macOS) provide efficient file system change notification. These APIs avoid polling by getting event notifications from the kernel. Ruby gems like listen and rb-inotify wrap these mechanisms.
require 'listen'
listener = Listen.to('config/') do |modified, added, removed|
puts "Modified: #{modified}"
puts "Added: #{added}"
puts "Removed: #{removed}"
reload_configuration
end
listener.start
sleep
mmap gem enables memory-mapped file access in Ruby. Memory mapping provides efficient random access and shared memory between processes.
require 'mmap'
# Read-only mapping
map = Mmap.new('data.bin', 'r')
byte = map[1000] # Random access
map.munmap
# Read-write mapping with shared access
map = Mmap.new('shared.dat', 'rw', Mmap::MAP_SHARED)
map[0, 4] = [1234].pack('L') # Write integer
map.msync # Force sync to disk
Disk cache monitoring tools show how much memory the page cache uses and how much block I/O reaches the devices, which helps evaluate caching effectiveness. Commands like vmstat and free report these figures; neither reports cache hit rates directly.
# View cache memory usage
free -h
# Shows buffers and cache size
# Monitor block I/O that reaches the devices
vmstat 1
# Watch 'bi' (blocks in) and 'bo' (blocks out)
fio (Flexible I/O Tester) benchmarks storage devices with configurable access patterns. The tool generates realistic workloads for capacity planning.
# Random read test (--direct=1 bypasses the page cache so the device itself is measured)
fio --name=random-read --ioengine=libaio --direct=1 --rw=randread \
--bs=4k --numjobs=4 --size=1g --runtime=60
# Sequential write test
fio --name=seq-write --ioengine=libaio --direct=1 --rw=write \
--bs=1m --size=10g --runtime=60
Reference
File System Performance Characteristics
| Storage Type | Sequential Read | Random Read | Sequential Write | Random Write | Latency |
|---|---|---|---|---|---|
| HDD (7200 RPM) | 100-200 MB/s | 1-2 MB/s | 100-160 MB/s | 1-2 MB/s | 5-15 ms |
| SATA SSD | 500-550 MB/s | 300-400 MB/s | 450-520 MB/s | 250-350 MB/s | 50-150 μs |
| NVMe SSD | 2000-3500 MB/s | 1500-3000 MB/s | 1500-3000 MB/s | 1000-2500 MB/s | 10-50 μs |
| RAM | 20000+ MB/s | 20000+ MB/s | 15000+ MB/s | 15000+ MB/s | <100 ns |
Ruby File I/O Methods
| Method | Operation | Buffering | Use Case |
|---|---|---|---|
| File.read | Read entire file | Automatic | Small files only |
| File.open with block | Stream access | 8KB default | Large files |
| IO#read(size) | Read fixed bytes | Custom size | Binary files |
| IO#each_line | Line iteration | Automatic | Text processing |
| IO#flush | Write buffer to OS | N/A | Before fsync |
| IO#fsync | Force disk write | N/A | Durability needs |
| File.write | Write entire content | Automatic | Small files |
System Call Performance
| System Call | Cost (μs) | Operation | Optimization |
|---|---|---|---|
| read (cached) | 1-2 | Read from page cache | Automatic |
| read (uncached) | 5000-15000 | Read from HDD | Use SSD |
| write (buffered) | 1-2 | Write to page cache | Default behavior |
| fsync | 5000-50000 | Force physical write | Batch updates |
| stat | 1-5 | Get file metadata | Cache results |
| open/close | 5-20 | File handle creation | Keep files open |
Optimal Buffer Sizes
| File Size | Buffer Size | Rationale |
|---|---|---|
| < 100 KB | 8-16 KB | Minimize memory, default works |
| 100 KB - 10 MB | 64-256 KB | Balance memory and syscalls |
| 10 MB - 1 GB | 256 KB - 2 MB | Reduce syscall overhead |
| > 1 GB | 1-4 MB | Maximum throughput |
| Binary formats | 64 KB - 1 MB | Match record/block size |
| Network files | 16-64 KB | Network latency dominates |
File Access Pattern Optimization
| Pattern | Good Practice | Poor Practice |
|---|---|---|
| Sequential read | Read full buffer size | Read single bytes |
| Random read | Sort offsets before seeking | Seek in arbitrary order |
| Sequential write | Buffer writes, flush at end | Flush after each write |
| Random write | Memory map or buffered | Direct write per update |
| Many small files | Combine into single file | Create thousands of files |
| Repeated access | Keep file open | Open and close repeatedly |
Durability vs Performance Trade-offs
| Approach | Throughput | Durability | Use Case |
|---|---|---|---|
| Buffered writes | High | Low | Logs, temporary data |
| Flush after batch | Medium | Medium | Application data |
| fsync per operation | Very low | High | Financial transactions |
| Journaled filesystem | Medium | High | System files |
| Write-ahead log | Medium-high | High | Databases |
| Async replication | High | Medium | Distributed systems |
Common Performance Bottlenecks
| Bottleneck | Symptom | Solution |
|---|---|---|
| Small reads | High syscall overhead | Increase buffer size |
| Excessive fsync | Low write throughput | Batch before syncing |
| Random access | High latency | Sort access pattern |
| Metadata operations | Slow file creation | Reduce file count |
| Memory exhaustion | Process crash | Stream processing |
| Cache misses | Inconsistent performance | Sequential access |
| File descriptor leak | Open file errors | Use block form |
Profiling Commands
| Command | Purpose | Example Output |
|---|---|---|
| strace -c | Count system calls | calls/time by syscall |
| strace -T | Time each syscall | read(3, ...) = 8192 <0.001> |
| time | Overall execution time | real 0m2.5s user 0m1.2s sys 0m0.8s |
| iostat -x | Device I/O stats | await, util%, throughput |
| iotop | Process I/O bandwidth | Per-process MB/s read/write |
| vmstat | Memory and block I/O activity | Blocks in/out per second |