Overview
Ruby's IO class methods offer direct access to file system operations and stream handling without requiring explicit IO object instantiation. These class-level methods interact directly with the operating system's file descriptors, providing efficient mechanisms for reading, writing, and manipulating files and streams.
The IO class methods fall into several categories: file reading operations (IO.read, IO.readlines, IO.foreach), file writing operations (IO.write, IO.binwrite), and advanced operations (IO.copy_stream, IO.pipe, IO.select). File testing operations such as File.exist? and File.size are defined on File, a subclass of IO, and are not available on IO itself. These methods handle encoding conversion, buffering, and system-level error reporting.
Ruby implements these operations as thin wrappers over the operating system's file APIs, which makes them suitable for file processing and system integration tasks. The methods accept options for encoding, and the reading methods also accept a length and an offset.
# Direct file reading without IO object creation
content = IO.read('/etc/hosts')
# => "127.0.0.1\tlocalhost\n..."
# Conditional file operations
lines = IO.readlines('data.txt') if File.exist?('data.txt')
# => ["first line\n", "second line\n"]
# Binary data handling
IO.binwrite('output.dat', "\x00\x01\x02\x03")
# => 4
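The overview names IO.pipe and IO.select without showing them in use. The following is a minimal single-process sketch, not a definitive pattern; the helper name `pipe_roundtrip` and its one-second default timeout are assumptions made for illustration.

```ruby
# Hypothetical helper: send a message through a pipe and read it back
# once IO.select reports the read end as ready.
def pipe_roundtrip(message, timeout = 1)
  reader, writer = IO.pipe
  writer.write(message)
  writer.close # closing the write end lets the reader see end-of-file

  # IO.select returns [readable, writable, errored], or nil on timeout
  ready = IO.select([reader], nil, nil, timeout)
  return nil if ready.nil?

  ready[0].first.read
ensure
  # Closing an already closed IO is a no-op, so this is safe on every path
  reader&.close
  writer&.close
end

puts pipe_roundtrip("ping")
```

In real use the two ends of the pipe usually live in different threads or processes; keeping both in one method only works here because the message is small enough to fit in the pipe's buffer.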
Basic Usage
IO class methods handle the most common file operations through direct class method calls. The IO.read method provides complete file reading with optional encoding, offset, and length parameters. This method loads entire file contents into memory, making it suitable for smaller files and configuration data.
# Complete file reading
require 'json'
config = IO.read('config.json')
parsed_config = JSON.parse(config)
# Partial file reading with length and offset (the offset counts from the start of the file)
header = IO.read('binary_file.dat', 512, 0) # First 512 bytes
footer = IO.read('binary_file.dat', 256, File.size('binary_file.dat') - 256) # Last 256 bytes
# Encoding-specific reading
utf8_content = IO.read('unicode.txt', encoding: 'UTF-8')
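To make the length and offset arguments concrete, here is a small sketch that runs against a temporary file. The helper `read_tail` is a hypothetical name; it assumes the offset is measured from the start of the file, so the tail position is derived from File.size.

```ruby
require 'tmpdir'

# Hypothetical helper: return the last `count` bytes of a file.
# The count is clamped so that short files do not produce a negative offset.
def read_tail(path, count)
  size = File.size(path)
  count = [count, size].min
  IO.read(path, count, size - count)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.dat')
  IO.binwrite(path, 'HEADER-body-FOOTER')

  puts IO.read(path, 6, 0) # first 6 bytes
  puts read_tail(path, 6)  # last 6 bytes
end
```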
The IO.readlines method splits file content into an array of lines, using the default line separator ("\n") unless another separator is given. It preserves line terminators unless chomp: true is passed.
# Line-by-line file processing
log_lines = IO.readlines('/var/log/app.log')
error_lines = log_lines.select { |line| line.include?('ERROR') }
# Custom line separator handling
records = IO.readlines('data.csv', chomp: true) # Remove line endings
paragraphs = IO.readlines('document.txt', "\n\n") # Paragraph separator
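The effect of the separator and chomp arguments is easiest to see side by side. This sketch writes a small temporary file and reads it three ways; `line_views` is a hypothetical helper name used only for this illustration.

```ruby
require 'tmpdir'

# Hypothetical helper: read the same file with three readlines variants
def line_views(path)
  {
    raw: IO.readlines(path),                 # terminators kept
    chomped: IO.readlines(path, chomp: true), # terminators removed
    paragraphs: IO.readlines(path, "\n\n")   # split on blank lines
  }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'notes.txt')
  IO.write(path, "alpha\nbeta\n\ngamma\n")
  line_views(path).each { |name, lines| puts "#{name}: #{lines.inspect}" }
end
```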
File writing operations use IO.write for text content and IO.binwrite for binary data. These methods create files if they don't exist and truncate existing files by default.
# Text file writing with encoding
IO.write('output.txt', "Hello, World!\n", encoding: 'UTF-8')
# Binary file writing
binary_data = [0xFF, 0xD8, 0xFF, 0xE0].pack('C*')
IO.binwrite('image_header.bin', binary_data)
# Append mode writing
IO.write('log.txt', "New entry\n", mode: 'a')
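The truncation rule has two exceptions worth seeing in action: append mode and an explicit offset. The sketch below uses a hypothetical helper, `write_modes_demo`, and a temporary file to show how each call changes the content.

```ruby
require 'tmpdir'

# Hypothetical demo: returns the file content after four successive writes
def write_modes_demo(path)
  IO.write(path, "first\n")            # creates the file
  IO.write(path, "second\n")           # default mode 'w' truncates first
  IO.write(path, "third\n", mode: 'a') # append mode keeps existing content
  IO.write(path, 'S', 0)               # an offset overwrites in place, no truncation
  IO.read(path)
end

Dir.mktmpdir do |dir|
  print write_modes_demo(File.join(dir, 'demo.txt'))
end
```

Each IO.write call returns the number of bytes written, which is useful for verifying short writes in scripts.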
Stream copying operations handle efficient data transfer between files, network sockets, and other IO objects without loading complete content into memory.
# File-to-file copying
bytes_copied = IO.copy_stream('source.txt', 'destination.txt')
puts "Copied #{bytes_copied} bytes"
# Partial stream copying with offset
IO.copy_stream('large_file.dat', 'extract.dat', 1024, 512) # Copy 1KB from offset 512
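IO.copy_stream also accepts IO-like objects in place of file names. As a minimal sketch, assuming in-memory StringIO objects stand in for real streams, the hypothetical helper `copy_prefix` below copies a bounded number of bytes and reports how many were transferred.

```ruby
require 'stringio'

# Hypothetical helper: copy at most `length` bytes from a string source
# into a fresh in-memory target, returning the byte count and the result.
def copy_prefix(text, length)
  source = StringIO.new(text)
  target = StringIO.new
  copied = IO.copy_stream(source, target, length)
  [copied, target.string]
end

p copy_prefix('0123456789', 4) # => [4, "0123"]
```

The source offset argument is left out here on purpose: it is intended for real IO objects and file names, so this sketch only relies on the length limit.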
Error Handling & Debugging
IO class methods raise specific exception types that indicate different failure modes. Understanding these exceptions enables proper error recovery and user feedback mechanisms.
File access errors generate Errno::ENOENT for missing files, Errno::EACCES for permission issues, and Errno::EISDIR when attempting file operations on directories. These system-level exceptions include errno codes and descriptive messages.
# Comprehensive file reading with error handling
def safe_read_file(filename)
begin
IO.read(filename)
rescue Errno::ENOENT
puts "File not found: #{filename}"
nil
rescue Errno::EACCES
puts "Permission denied: #{filename}"
nil
rescue Errno::EISDIR
puts "Cannot read directory as file: #{filename}"
nil
rescue SystemCallError => e
puts "System error reading #{filename}: #{e.message}"
nil
end
end
# Usage with fallback behavior
content = safe_read_file('config.txt') || safe_read_file('default_config.txt')
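Because every Errno class descends from SystemCallError, a single rescue clause can capture both the specific class and its numeric code. The sketch below illustrates this with a hypothetical helper, `classify_read_failure`; the file name in the example is made up and assumed not to exist.

```ruby
# Hypothetical helper: report how a read failed instead of raising
def classify_read_failure(path)
  IO.read(path)
  :ok
rescue SystemCallError => e
  # e.class is the specific Errno subclass; e.errno is the OS error number
  { error: e.class, errno: e.errno }
end

result = classify_read_failure('no_such_file_for_this_example.txt')
p result[:error]                       # the specific Errno class
p result[:errno] == Errno::ENOENT::Errno
```

Specific rescue clauses must come before a SystemCallError clause, as in safe_read_file above, otherwise the general clause catches everything first.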
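Encoding errors occur when file content doesn't match the specified encoding. Reading with a single encoding only tags the bytes, so invalid data goes unnoticed until the string is validated or transcoded. During transcoding, Ruby raises Encoding::InvalidByteSequenceError for malformed byte sequences and Encoding::UndefinedConversionError for characters that cannot be represented in the target encoding.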
# Encoding fallback: accept the first encoding in which the bytes are valid
def read_with_encoding_fallback(filename)
  encodings = ['UTF-8', 'ISO-8859-1', 'ASCII-8BIT']
  encodings.each do |encoding|
    content = IO.read(filename, encoding: encoding)
    # IO.read does not raise on invalid bytes, so validate explicitly
    return content if content.valid_encoding?
  end
  # Last resort: binary mode (every byte sequence is valid here)
  IO.read(filename, encoding: 'ASCII-8BIT')
rescue StandardError => e
  puts "Failed to read #{filename} with any encoding: #{e.message}"
  nil
end
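The two encoding exceptions are raised by transcoding, not by tagging a string with an encoding. The following sketch shows each case on in-memory strings; `transcode_or_error` is a hypothetical helper introduced for the illustration.

```ruby
# Hypothetical helper: transcode a string and report which error, if any, occurred
def transcode_or_error(string, target)
  string.encode(target)
rescue Encoding::InvalidByteSequenceError, Encoding::UndefinedConversionError => e
  e.class
end

latin1_bytes = "caf\xE9".b # "café" as ISO-8859-1 bytes
mislabeled = latin1_bytes.dup.force_encoding('UTF-8')

p mislabeled.valid_encoding?                  # tagging alone raised nothing
p transcode_or_error(mislabeled, 'UTF-16LE')  # malformed input bytes
p transcode_or_error("caf\u00E9", 'US-ASCII') # no ASCII equivalent for the accent
p latin1_bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
```

Labeling the bytes with their real encoding before transcoding, as in the last line, is what turns the failure into a correct conversion.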
Disk space and file system errors manifest as Errno::ENOSPC for insufficient space and Errno::EROFS for read-only file systems. These conditions require different recovery strategies.
# Write operation with disk space monitoring
def safe_write_file(filename, content)
begin
IO.write(filename, content)
rescue Errno::ENOSPC
# Check available space and clean temporary files
available_space = `df -Pk #{File.dirname(filename)}`.split("\n")[1].split[3].to_i # -Pk: POSIX format, 1KB blocks
puts "Insufficient disk space. Available: #{available_space}KB"
false
rescue Errno::EROFS
puts "Cannot write to read-only file system"
false
rescue SystemCallError => e
puts "Write failed: #{e.message} (errno: #{e.errno})"
false
end
end
# Validation before write operations
def validate_write_operation(filename, content)
directory = File.dirname(filename)
unless File.directory?(directory)
puts "Target directory does not exist: #{directory}"
return false
end
unless File.writable?(directory)
puts "No write permission for directory: #{directory}"
return false
end
# Estimate required space (content size + metadata overhead)
required_space = content.bytesize + 4096 # Add filesystem metadata overhead
available_space = `df -Pk #{directory}`.split("\n")[1].split[3].to_i * 1024 # -Pk reports 1KB blocks
if required_space > available_space
puts "Insufficient space. Required: #{required_space}, Available: #{available_space}"
return false
end
true
end
Performance & Memory
IO class methods exhibit different performance characteristics based on file size, system resources, and access patterns. Understanding these patterns enables optimization for specific use cases.
Reading strategies impact memory usage significantly. The IO.read method loads complete file content into memory, while IO.foreach processes files line-by-line with minimal memory overhead. Large file processing requires streaming approaches to avoid memory exhaustion.
# Memory-efficient large file processing
def process_large_file(filename)
line_count = 0
error_count = 0
IO.foreach(filename) do |line|
line_count += 1
error_count += 1 if line.include?('ERROR')
# Process line immediately, don't accumulate
if line_count % 10000 == 0
puts "Processed #{line_count} lines, #{error_count} errors found"
end
end
{ total_lines: line_count, errors: error_count }
end
# Memory comparison: whole file vs streaming
def compare_memory_usage(filename)
# High memory usage - loads complete file
start_memory = `ps -o rss= -p #{$$}`.to_i
content = IO.read(filename)
after_read_memory = `ps -o rss= -p #{$$}`.to_i
puts "Memory after IO.read: #{after_read_memory - start_memory}KB increase"
# Low memory usage - streaming processing
start_memory = after_read_memory
line_count = 0
IO.foreach(filename) { |line| line_count += 1 }
after_foreach_memory = `ps -o rss= -p #{$$}`.to_i
puts "Memory after IO.foreach: #{after_foreach_memory - start_memory}KB increase"
puts "Lines processed: #{line_count}"
end
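Called without a block, IO.foreach returns an Enumerator, which combines well with lazy evaluation when only the first few matches are needed. This is a sketch under that assumption; `first_matches` is a hypothetical helper name.

```ruby
require 'tmpdir'

# Hypothetical helper: return the first `limit` lines containing `pattern`.
# The lazy chain stops reading the file as soon as enough lines are found.
def first_matches(path, pattern, limit)
  IO.foreach(path, chomp: true)
    .lazy
    .select { |line| line.include?(pattern) }
    .first(limit)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'app.log')
  IO.write(path, "INFO start\nERROR disk\nINFO retry\nERROR net\nERROR auth\n")
  p first_matches(path, 'ERROR', 2)
end
```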
Binary operations typically outperform text operations due to reduced encoding overhead. The IO.binread and IO.binwrite methods bypass encoding conversion, providing maximum throughput for binary data.
# Performance comparison: text vs binary operations
require 'benchmark'
def performance_comparison(filename, data)
Benchmark.bm(15) do |x|
x.report("Text write:") { IO.write(filename, data) }
x.report("Binary write:") { IO.binwrite(filename, data) }
x.report("Text read:") { IO.read(filename) }
x.report("Binary read:") { IO.binread(filename) }
end
end
# Stream copying between open file handles
def optimized_copy_stream(source, destination)
  File.open(source, 'rb') do |src|
    File.open(destination, 'wb') do |dst|
      # IO.copy_stream manages its own buffering internally
      IO.copy_stream(src, dst)
    end
  end
end
# Batch file operations for better performance
def batch_file_operations(file_list)
results = {}
# Batch existence checks
existing_files = file_list.select { |f| File.exist?(f) }
# Batch size calculations
file_sizes = existing_files.each_with_object({}) do |filename, sizes|
sizes[filename] = File.size(filename) if File.exist?(filename)
end
# Process files sorted by size for memory optimization
existing_files.sort_by { |f| file_sizes[f] }.each do |filename|
if file_sizes[filename] < 1_000_000 # Files under 1MB
results[filename] = IO.read(filename)
else
# Stream process large files
results[filename] = process_large_file(filename)
end
end
results
end
Write size affects I/O performance. Ruby's internal buffering works well for most cases, and File.open has no option for setting a buffer size, but the size of each write an application issues can be tuned for specific workloads.
# Write chunk size testing
def test_buffer_sizes(filename, data)
  buffer_sizes = [4096, 8192, 16384, 32768, 65536]
  buffer_sizes.each do |size|
    time = Benchmark.realtime do
      File.open(filename, 'wb') do |f|
        f.sync = true # Skip Ruby's write buffer so each chunk reaches the OS
        offset = 0
        while offset < data.bytesize
          f.write(data.byteslice(offset, size))
          offset += size
        end
      end
    end
    puts "Buffer size #{size}: #{time.round(4)}s"
  end
end
Production Patterns
Production environments require robust file handling patterns that account for concurrent access, system failures, and resource constraints. IO class methods integrate with monitoring, logging, and deployment workflows.
Configuration file management uses atomic write operations to prevent corruption during updates. This pattern ensures configuration consistency across application restarts and deployments.
# Atomic configuration updates
require 'json'
class ConfigManager
def self.update_config(filename, new_config)
temp_filename = "#{filename}.tmp.#{Process.pid}"
begin
# Write to temporary file first
IO.write(temp_filename, new_config.to_json)
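# Verify written content against a JSON round-trip of the new config
# (a direct comparison fails when new_config uses symbol keys)
verification = JSON.parse(IO.read(temp_filename))
raise "Configuration verification failed" unless verification == JSON.parse(new_config.to_json)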
# Atomic rename operation
File.rename(temp_filename, filename)
puts "Configuration updated successfully"
true
rescue StandardError => e
# Clean up temporary file
File.unlink(temp_filename) if File.exist?(temp_filename)
puts "Configuration update failed: #{e.message}"
false
end
end
def self.load_config_with_fallback(primary_config, fallback_config)
if File.exist?(primary_config)
JSON.parse(IO.read(primary_config))
elsif File.exist?(fallback_config)
puts "Using fallback configuration: #{fallback_config}"
JSON.parse(IO.read(fallback_config))
else
raise "No configuration file found"
end
end
end
Log file rotation and monitoring patterns handle growing log files without interrupting application operation. These patterns integrate with system log rotation tools and monitoring systems.
# Production log management
require 'time' # Provides Time.strptime
class LogManager
MAX_LOG_SIZE = 100 * 1024 * 1024 # 100MB
MAX_LOG_FILES = 10
def self.write_log_entry(log_file, entry)
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
log_line = "[#{timestamp}] #{entry}\n"
# Check if rotation is needed
if File.exist?(log_file) && File.size(log_file) > MAX_LOG_SIZE
rotate_log_file(log_file)
end
IO.write(log_file, log_line, mode: 'a')
end
def self.rotate_log_file(log_file)
# Move existing log files
(MAX_LOG_FILES - 1).downto(1) do |i|
old_file = "#{log_file}.#{i}"
new_file = "#{log_file}.#{i + 1}"
if File.exist?(old_file)
File.rename(old_file, new_file)
end
end
# Move current log file
File.rename(log_file, "#{log_file}.1") if File.exist?(log_file)
end
def self.analyze_recent_logs(log_file, hours_back = 24)
return [] unless File.exist?(log_file)
cutoff_time = Time.now - (hours_back * 3600)
recent_entries = []
IO.foreach(log_file) do |line|
if line =~ /^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]/
entry_time = Time.strptime($1, '%Y-%m-%d %H:%M:%S')
recent_entries << line if entry_time >= cutoff_time
end
end
recent_entries
end
end
Data export and backup operations require efficient file handling patterns that minimize resource usage and provide progress feedback for long-running operations.
# Production data export patterns
class DataExporter
def self.export_to_csv(data_source, output_file, batch_size = 1000)
total_records = data_source.count
processed_records = 0
File.open(output_file, 'w') do |file|
# Write CSV header
file.write("id,name,created_at,status\n")
data_source.find_in_batches(batch_size: batch_size) do |batch|
csv_lines = batch.map do |record|
"#{record.id},\"#{record.name}\",#{record.created_at},#{record.status}"
end
file.write(csv_lines.join("\n") + "\n")
processed_records += batch.size
# Progress reporting
progress = (processed_records.to_f / total_records * 100).round(1)
puts "Export progress: #{progress}% (#{processed_records}/#{total_records})"
end
end
puts "Export completed: #{output_file} (#{File.size(output_file)} bytes)"
end
def self.verify_export(original_count, export_file)
return false unless File.exist?(export_file)
line_count = 0
IO.foreach(export_file) { |line| line_count += 1 }
# Subtract header line from count
exported_count = line_count - 1
if exported_count == original_count
puts "Export verification successful: #{exported_count} records"
true
else
puts "Export verification failed: expected #{original_count}, found #{exported_count}"
false
end
end
end
Common Pitfalls
IO class methods exhibit behavior that frequently causes issues in production applications. Understanding these pitfalls prevents common errors and performance problems.
File encoding assumptions cause data corruption when files contain different character encodings than expected. Ruby's default encoding behavior may not match file content, leading to garbled text or encoding exceptions.
# Encoding detection and handling
def robust_file_reading(filename)
  # Attempt to detect encoding from a byte order mark (BOM)
  first_bytes = IO.binread(filename, 4).to_s # nil for an empty file
  encoding =
    if first_bytes.start_with?("\xEF\xBB\xBF".b) # UTF-8 BOM
      'UTF-8'
    elsif first_bytes.start_with?("\xFF\xFE".b) # UTF-16 LE BOM
      'UTF-16LE'
    elsif first_bytes.start_with?("\xFE\xFF".b) # UTF-16 BE BOM
      'UTF-16BE'
    elsif IO.read(filename, encoding: 'UTF-8').valid_encoding?
      'UTF-8' # No BOM, but the content is valid UTF-8
    else
      Encoding.default_external.to_s # Fall back to the system encoding
    end
  content = IO.read(filename, encoding: encoding)
  # IO.read does not raise on invalid bytes, so validate explicitly
  return content if content.valid_encoding?
  puts "Encoding error in #{filename}: invalid #{encoding} byte sequence"
  # Last resort: replace invalid byte sequences
  content.scrub
end
# Common mistake: assuming UTF-8 everywhere
def demonstrate_encoding_pitfall
  # IO.read tags the bytes as UTF-8 without validating them
  content = IO.read('mixed_encoding.txt', encoding: 'UTF-8')
  unless content.valid_encoding?
    puts "UTF-8 assumption failed - file contains different encoding"
    # Correct approach: replace invalid byte sequences before further processing
    content = content.scrub('?')
  end
  content
end
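For files that may or may not begin with a UTF-8 byte order mark, Ruby's 'BOM|UTF-8' mode string strips the mark during the read. The sketch below pairs it with String#scrub; `read_utf8_lossy` is a hypothetical helper, and replacing bad bytes with U+FFFD is one assumed policy among several.

```ruby
require 'tmpdir'

# Hypothetical helper: read as UTF-8, drop a leading BOM if present,
# and replace any invalid byte sequences with U+FFFD.
def read_utf8_lossy(path)
  text = IO.read(path, mode: 'r:BOM|UTF-8')
  text.valid_encoding? ? text : text.scrub("\uFFFD")
end

Dir.mktmpdir do |dir|
  with_bom = File.join(dir, 'bom.txt')
  IO.binwrite(with_bom, "\xEF\xBB\xBFhello\n".b)
  p read_utf8_lossy(with_bom)

  damaged = File.join(dir, 'damaged.txt')
  IO.binwrite(damaged, "ok\xFFok".b)
  p read_utf8_lossy(damaged)
end
```

Replacement is lossy by design. When the original bytes matter, keeping the binary content and recording the failure is the safer choice.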
File system assumptions about path separators, case sensitivity, and filename restrictions cause cross-platform compatibility issues. Code that works on Unix systems may fail on Windows and vice versa.
# Cross-platform file operations
def platform_safe_file_ops(base_path, filename)
# Replace characters that are invalid in Windows filenames
safe_filename = filename.gsub(/[<>:"|?*]/, '_')
full_path = File.join(base_path, safe_filename)
# Handle case sensitivity differences
if File.exist?(full_path)
IO.read(full_path)
else
# Case-insensitive search on case-sensitive systems
directory = File.dirname(full_path)
target_name = File.basename(full_path).downcase
if File.directory?(directory)
matching_file = Dir.entries(directory).find do |entry|
entry.downcase == target_name
end
if matching_file
actual_path = File.join(directory, matching_file)
IO.read(actual_path)
else
raise Errno::ENOENT, "File not found: #{full_path}"
end
end
end
end
# Handling long paths and special characters
def safe_path_handling(path)
# Windows path length limitation
if RUBY_PLATFORM =~ /mswin|mingw|cygwin/
if path.length > 260
puts "Warning: Path exceeds Windows MAX_PATH limit"
return false
end
end
# Check for problematic characters
problematic_chars = /[^\w\s\-\.\/\\]/
if path.match(problematic_chars)
puts "Warning: Path contains special characters: #{path}"
end
true
end
Memory exhaustion occurs when processing large files with methods that load complete content into memory. This mistake is common when upgrading from small test files to production data volumes.
# Demonstrating memory pitfalls
def memory_pitfall_example
# WRONG: Will fail with large files
def process_large_log_wrong(filename)
all_lines = IO.readlines(filename) # Loads entire file into memory
error_lines = all_lines.select { |line| line.include?('ERROR') }
error_lines.each { |line| puts line }
end
# CORRECT: Streaming approach
def process_large_log_correct(filename)
IO.foreach(filename) do |line|
puts line if line.include?('ERROR') # Process immediately
end
end
# Memory usage comparison
filename = 'large_log.txt'
puts "Wrong approach (high memory):"
memory_before = `ps -o rss= -p #{$$}`.to_i
process_large_log_wrong(filename) if File.size(filename) < 10_000_000 # Safety check
memory_after = `ps -o rss= -p #{$$}`.to_i
puts "Memory used: #{memory_after - memory_before}KB"
puts "Correct approach (low memory):"
memory_before = memory_after
process_large_log_correct(filename)
memory_after = `ps -o rss= -p #{$$}`.to_i
puts "Memory used: #{memory_after - memory_before}KB"
end
Race conditions occur in concurrent file access scenarios. Multiple processes writing to the same file simultaneously can cause data corruption or loss.
# File locking for concurrent access
def concurrent_safe_append(filename, content)
File.open(filename, 'a') do |file|
begin
file.flock(File::LOCK_EX) # Exclusive lock
file.write("#{Time.now}: #{content}\n")
file.flush # Ensure immediate write
ensure
file.flock(File::LOCK_UN) # Release lock
end
end
end
# Atomic file replacement pattern
def atomic_file_update(filename, new_content)
temp_file = "#{filename}.tmp.#{Process.pid}.#{Thread.current.object_id}"
begin
IO.write(temp_file, new_content)
File.rename(temp_file, filename) # Atomic operation on same filesystem
rescue StandardError => e
File.unlink(temp_file) if File.exist?(temp_file)
raise e
end
end
# Directory creation race condition
def safe_directory_creation(directory_path)
begin
Dir.mkdir(directory_path) unless Dir.exist?(directory_path)
rescue Errno::EEXIST
# Another process created the directory - this is fine
puts "Directory already exists: #{directory_path}"
end
end
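The check-then-create race in directory creation can also be avoided by using a call that treats an existing directory as success. A minimal sketch, assuming the standard library's FileUtils is acceptable in the project; `ensure_directory` is a hypothetical helper name.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: create the directory and any missing parents.
# FileUtils.mkdir_p does not raise if the directory already exists,
# so no separate existence check is needed.
def ensure_directory(path)
  FileUtils.mkdir_p(path)
  File.directory?(path)
end

Dir.mktmpdir do |dir|
  target = File.join(dir, 'reports', '2024')
  p ensure_directory(target) # creates both levels
  p ensure_directory(target) # second call is a no-op
end
```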
Reference
Core Reading Methods
Method | Parameters | Returns | Description |
---|---|---|---|
IO.read(name, length=nil, offset=0, **opts) | name (String), length (Integer), offset (Integer), options (Hash) | String | Reads entire file content into memory, or length bytes starting at offset |
IO.binread(name, length=nil, offset=0) | name (String), length (Integer), offset (Integer) | String | Reads binary data without encoding conversion |
IO.readlines(name, sep=$/, **opts) | name (String), sep (String), options (Hash) | Array<String> | Reads file lines into array |
IO.foreach(name, sep=$/, **opts) {block} | name (String), sep (String), options (Hash), block | nil (Enumerator when no block is given) | Iterates through file lines |
Core Writing Methods
Method | Parameters | Returns | Description |
---|---|---|---|
IO.write(name, string, offset=nil, **opts) | name (String), string (String), offset (Integer), options (Hash) | Integer | Writes text content to file and returns the number of bytes written |
IO.binwrite(name, string, offset=nil) | name (String), string (String), offset (Integer) | Integer | Writes binary data to file; the file is not truncated when an offset is given |
File Information Methods
Method | Parameters | Returns | Description |
---|---|---|---|
File.exist?(name) | name (String) | Boolean | Tests file existence |
File.size(name) | name (String) | Integer | Returns file size in bytes |
File.empty?(name) | name (String) | Boolean | Tests if file is empty |
Stream Operations
Method | Parameters | Returns | Description |
---|---|---|---|
IO.copy_stream(src, dst, copy_length=nil, src_offset=nil) | src (IO/String), dst (IO/String), copy_length (Integer), src_offset (Integer) | Integer | Copies data between streams |
IO.pipe(**opts) | options (Hash) | Array<IO> | Creates connected read/write pipe |
IO.select(read_array, write_array=nil, error_array=nil, timeout=nil) | Arrays of IO objects, timeout (Numeric) | Array or nil | Monitors multiple IO objects |
Common Options Hash Keys
Option | Type | Default | Description |
---|---|---|---|
:encoding | String/Encoding | System default | Character encoding for text operations |
:mode | String | 'r' for reads, 'w' for writes | File access mode ('r', 'w', 'a', etc.) |
offset (positional argument, not a hash key) | Integer | 0 | Starting position for read operations |
length (positional argument, not a hash key) | Integer | nil | Maximum bytes to read |
:chomp | Boolean | false | Remove line endings from readlines |
:binmode | Boolean | false | Use binary mode |
Exception Hierarchy
Exception Class | Condition | Recovery Strategy |
---|---|---|
Errno::ENOENT | File not found | Check file path, create file, use fallback |
Errno::EACCES | Permission denied | Check file permissions, run with elevated privileges |
Errno::EISDIR | Target is directory | Use directory-specific operations |
Errno::ENOSPC | No disk space | Clean temporary files, alert administrators |
Errno::EROFS | Read-only filesystem | Use temporary location, alert user |
Encoding::InvalidByteSequenceError | Invalid encoding | Try different encoding, use replacement characters |
Encoding::UndefinedConversionError | Character conversion failed | Use replacement characters, change target encoding |
File Mode Strings
Mode | Description | File Position | Truncates | Creates |
---|---|---|---|---|
'r' | Read only | Beginning | No | No |
'w' | Write only | Beginning | Yes | Yes |
'a' | Write only | End | No | Yes |
'r+' | Read/Write | Beginning | No | No |
'w+' | Read/Write | Beginning | Yes | Yes |
'a+' | Read/Write | End | No | Yes |
Encoding Names Reference
Encoding | Aliases | Description |
---|---|---|
'UTF-8' | 'CP65001' | Unicode 8-bit encoding |
'ASCII-8BIT' | 'BINARY' | Binary data, no encoding |
'ISO-8859-1' | 'ISO8859-1' | Western European encoding |
'UTF-16' | none | Unicode 16-bit encoding |
'UTF-32' | none | Unicode 32-bit encoding |
'Shift_JIS' | none ('SJIS' is an alias of Windows-31J) | Japanese encoding |
Performance Characteristics
Operation | Memory Usage | CPU Usage | Disk I/O | Best For |
---|---|---|---|---|
IO.read | High | Low | Single | Small files |
IO.foreach | Low | Medium | Streaming | Large files |
IO.readlines | High | Low | Single | Line processing |
IO.binread | High | Low | Single | Binary data |
IO.copy_stream | Low | Low | Streaming | File copying |