Overview
Tempfile creates temporary files that are deleted automatically when the Tempfile object is garbage collected or the program exits, or earlier when they are explicitly closed and unlinked. It wraps a File object by delegation rather than subclassing File. Ruby's implementation handles generating unique filenames, placing files in the appropriate system directory, and managing cleanup.
The class creates files in the system's temporary directory (/tmp on Unix-like systems, determined by Dir.tmpdir) with automatically generated unique names. Each Tempfile instance maintains a reference to both the file handle and the filesystem path, allowing standard File operations while tracking cleanup responsibilities.
require 'tempfile'
# Create a temporary file
temp = Tempfile.new('myapp')
temp.write('temporary data')
temp.rewind
puts temp.read # => "temporary data"
temp.close
Tempfile delegates to an underlying File object, so all standard file operations are available, including reading, writing, seeking, and truncating. The key distinction lies in lifecycle management: each Tempfile registers a finalizer that deletes the file when the object is garbage collected or the interpreter exits, and the class provides explicit cleanup methods.
# File operations work normally
temp = Tempfile.new(['prefix_', '.txt'])
temp.puts "Line 1"
temp.puts "Line 2"
temp.rewind
temp.each_line { |line| puts "Read: #{line}" }
temp.close
The constructor accepts either a simple basename string or an array containing a prefix and suffix. Ruby generates the unique portion of the filename automatically, ensuring no conflicts with existing files. The temporary directory location follows system conventions but can be overridden through the constructor's directory parameter.
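One practical consequence of the delegation design can be checked directly. The following is a minimal sketch, assuming a standard CRuby install; the 'prefix_' and '.txt' values are only illustrative. It shows that a Tempfile answers File methods without being a File instance, and that the generated name keeps the requested prefix and suffix.

```ruby
require 'tempfile'

temp = Tempfile.new(['prefix_', '.txt'])
begin
  name = File.basename(temp.path)

  # Tempfile wraps a File through delegation; it is not a File subclass.
  puts "is_a?(File): #{temp.is_a?(File)}"
  # File methods are still reachable through the delegate.
  puts "responds to each_line: #{temp.respond_to?(:each_line)}"
  # The unique part of the name sits between the prefix and the suffix.
  puts "keeps prefix: #{name.start_with?('prefix_')}"
  puts "keeps suffix: #{name.end_with?('.txt')}"
ensure
  temp.close!
end
```

Code that type-checks with is_a?(File) will reject a Tempfile, so duck typing (respond_to?) is usually the safer test.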
Basic Usage
Create a temporary file by calling Tempfile.new, usually with a basename. The basename can be a string for simple naming or an array specifying both prefix and suffix components. Ruby handles the unique identifier generation automatically.
require 'tempfile'
# Simple basename
basic_temp = Tempfile.new('logfile')
puts basic_temp.path # => "/tmp/logfile20241130-12345-abcdef"
# Prefix and suffix
named_temp = Tempfile.new(['data_', '.json'])
puts named_temp.path # => "/tmp/data_20241130-12345-ghijkl.json"
# Custom directory
custom_temp = Tempfile.new('cache', '/var/tmp')
puts custom_temp.path # => "/var/tmp/cache20241130-12345-mnopqr"
Writing data to temporary files follows standard File patterns. The file remains open for operations until explicitly closed or the program terminates. Ruby buffers write operations according to standard I/O buffering rules.
temp = Tempfile.new('output')
temp.write('Initial content')
temp.puts 'Additional line'
temp.print 'More ', 'content'
# Force buffer flush
temp.flush
# Read back the data
temp.rewind
content = temp.read
puts content
# write adds no newline, so the first two pieces share a line:
# => Initial contentAdditional line
# => More content
Reading operations require positioning the file pointer appropriately. The rewind method returns to the beginning, while seek provides precise positioning. Ruby maintains the file position across read operations like any standard file handle.
temp = Tempfile.new('data')
temp.puts 'First line'
temp.puts 'Second line'
temp.puts 'Third line'
# Read from beginning
temp.rewind
first_line = temp.gets.chomp # => "First line"
# Read remaining content
remaining = temp.read # => "Second line\nThird line\n"
# Position-based reading
temp.rewind
temp.seek(11, IO::SEEK_SET) # Skip "First line\n"
second_line = temp.gets.chomp # => "Second line"
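Seeking is not limited to absolute offsets. The sketch below, with a hypothetical helper named seek_demo, uses the three standard whence constants (IO::SEEK_SET, IO::SEEK_CUR, IO::SEEK_END) on a small three-line file; the byte counts in the comments assume single-byte characters.

```ruby
require 'tempfile'

# Hypothetical helper: returns the start position, the last line found by
# seeking from the end, and the line reached by two forward seeks.
def seek_demo
  temp = Tempfile.new('seek_demo')
  temp.write("alpha\nbeta\ngamma\n")

  temp.rewind
  start_pos = temp.pos            # position right after rewind

  temp.seek(-6, IO::SEEK_END)     # 6 bytes back from the end: "gamma\n"
  last_line = temp.read

  temp.seek(6, IO::SEEK_SET)      # skip "alpha\n" (6 bytes) from the start
  temp.seek(5, IO::SEEK_CUR)      # then skip "beta\n" (5 bytes) from here
  third_line = temp.gets

  [start_pos, last_line, third_line]
ensure
  temp&.close!
end

p seek_demo
```

Both seek strategies land on the same line, which is a quick way to confirm that the offsets were counted correctly.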
Cleanup operations provide both automatic and manual control over file deletion. The close method closes the file handle but keeps the file on disk, while close! (or unlink after closing) removes the file immediately. As a fallback, a finalizer deletes the file when the Tempfile object is garbage collected or the program exits.
temp = Tempfile.new('temp_data')
temp.write('some data')
path = temp.path
# File exists and is accessible
puts File.exist?(path) # => true
# Close but keep file
temp.close
puts File.exist?(path) # => true
# Delete the file
temp.unlink
puts File.exist?(path) # => false
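When the file is only needed for the duration of one operation, the block form of Tempfile.create handles both steps. The sketch below uses a hypothetical helper, scratch_round_trip, to show that the block receives a plain File and that the file is gone once the block returns.

```ruby
require 'tempfile'

# Hypothetical helper: writes data to a scratch file, reads it back,
# and returns the path that was used together with the content.
def scratch_round_trip(data)
  path = nil
  content = Tempfile.create('scratch') do |file|
    path = file.path
    file.write(data)
    file.rewind
    file.read                     # the block's value becomes the return value
  end
  [path, content]
end

path, content = scratch_round_trip('temporary data')
puts content
puts "still on disk: #{File.exist?(path)}"
```

Because the cleanup is tied to the block, there is no close or unlink call to forget, even when the block raises.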
Error Handling & Debugging
Tempfile operations raise exceptions for various filesystem and permission issues. The most common exceptions include Errno::EACCES for permission problems, Errno::ENOSPC for insufficient disk space, and Errno::EROFS for read-only filesystems.
Directory permission issues occur when the temporary directory lacks write access or when specifying custom directories without appropriate permissions. Ruby raises Errno::EACCES in these scenarios, requiring fallback strategies or permission corrections.
def create_temp_file_safely(basename, preferred_dir = nil)
  dirs_to_try = [preferred_dir, Dir.tmpdir, '/tmp', '.'].compact
  dirs_to_try.each do |dir|
    begin
      return Tempfile.new(basename, dir)
    rescue Errno::EACCES => e
      puts "Cannot write to #{dir}: #{e.message}"
    rescue Errno::ENOENT, Errno::ENOTDIR => e
      # ENOENT: the directory does not exist.
      # ENOTDIR: a component of the path is not a directory.
      puts "Unusable directory #{dir}: #{e.message}"
    end
  end
  raise "No writable directory found for temporary file"
end
# Usage with fallback
begin
temp = create_temp_file_safely('myapp', '/restricted/tmp')
rescue => e
puts "Failed to create temporary file: #{e.message}"
end
Disk space exhaustion during write operations raises Errno::ENOSPC, requiring error handling that accounts for partial writes and cleanup of unusable files. The temporary file may exist but contain incomplete data.
def write_with_space_check(tempfile, data)
begin
tempfile.write(data)
tempfile.flush # Push Ruby's buffer to the OS so write errors surface here
rescue Errno::ENOSPC => e
# Clean up partial file
tempfile.close!
raise "Insufficient disk space: #{e.message}"
rescue => e
tempfile.close! if tempfile && !tempfile.closed?
raise
end
end
temp = Tempfile.new('large_data')
begin
write_with_space_check(temp, "x" * 1_000_000)
rescue => e
puts "Write failed: #{e.message}"
end
File descriptor exhaustion becomes problematic when creating many temporary files without proper cleanup. The operating system limits the number of open file descriptors per process; the soft limit is often 1,024 on Linux and can be inspected with ulimit -n. Exceeding the limit raises Errno::EMFILE.
def process_multiple_files_safely(count)
files = []
begin
count.times do |i|
temp = Tempfile.new("batch_#{i}")
temp.write("data for file #{i}")
files << temp
# Periodically close files to manage descriptors
if files.length > 100
files.shift.close!
end
end
rescue Errno::EMFILE => e
puts "Too many open files: #{e.message}"
# Clean up all opened files
files.each(&:close!)
raise
ensure
# Final cleanup
files.each { |f| f.close! unless f.closed? }
end
end
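The descriptor limit itself can be read from inside Ruby. This is a small sketch, assuming a POSIX platform (Process.getrlimit is not available on Windows); fd_limits is a hypothetical helper name.

```ruby
# Hypothetical helper: soft and hard limits on open file descriptors
# for the current process, as reported by the operating system.
def fd_limits
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { soft: soft, hard: hard }
end

limits = fd_limits
puts "Soft limit: #{limits[:soft]}, hard limit: #{limits[:hard]}"
```

A batch size such as the 100 used above can then be chosen relative to the soft limit instead of being hard-coded.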
Debugging temporary file issues requires tracking file paths, permissions, and cleanup status. The path method returns the filesystem location (nil once the file has been unlinked), while closed? indicates file handle status. Combining these with filesystem checks provides comprehensive debugging information.
def debug_tempfile_state(tempfile)
  path = tempfile.path # nil once the file has been unlinked
  puts "Path: #{path.inspect}"
  puts "Closed?: #{tempfile.closed?}"
  exists = !path.nil? && File.exist?(path)
  puts "File exists?: #{exists}"
  if exists
    stat = File.stat(path)
    puts "File size: #{stat.size} bytes"
    puts "Permissions: #{sprintf('%o', stat.mode & 0777)}"
    puts "Owner: #{stat.uid}"
    puts "Modified: #{stat.mtime}"
  end
rescue => e
  puts "Debug error: #{e.message}"
end
Thread Safety & Concurrency
Tempfile creation itself is thread-safe. Ruby builds each filename from the basename, the current date, the process ID, and a random component, then opens the file with File::CREAT | File::EXCL and retries under a new name if it already exists. Multiple threads can therefore create temporary files simultaneously without filename conflicts.
require 'tempfile' # Thread and Mutex are core classes and need no require
threads = 10.times.map do |i|
Thread.new do
temp = Tempfile.new("thread_#{i}")
temp.write("Data from thread #{i}")
puts "Thread #{i}: #{temp.path}"
temp.close!
end
end
threads.each(&:join)
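The uniqueness guarantee holds even when every thread asks for the same basename. A minimal sketch, using a hypothetical helper named concurrent_tempfiles:

```ruby
require 'tempfile'

# Hypothetical helper: each thread creates one Tempfile with the same
# basename and hands it back to the caller through Thread#value.
def concurrent_tempfiles(thread_count)
  threads = thread_count.times.map do
    Thread.new { Tempfile.new('same_basename') }
  end
  threads.map(&:value)
end

files = concurrent_tempfiles(10)
paths = files.map(&:path)
puts "all paths distinct: #{paths.uniq.size == paths.size}"
files.each(&:close!)
```

The files are kept open until all paths have been collected, so every name is in use at the same moment when the comparison is made.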
However, sharing Tempfile instances across threads requires synchronization for write operations. Multiple threads writing to the same file handle can interleave data unpredictably, corrupting the file contents. Ruby's file I/O operations are not atomic at the application level.
require 'tempfile' # Thread and Mutex are core classes and need no require
temp = Tempfile.new('shared')
mutex = Mutex.new
threads = 5.times.map do |i|
Thread.new do
10.times do |j|
mutex.synchronize do
temp.puts "Thread #{i}, iteration #{j}"
temp.flush # Ensure immediate write
end
end
end
end
threads.each(&:join)
temp.rewind
puts temp.read
temp.close!
Cleanup operations in concurrent environments require careful coordination. If one thread closes and unlinks a Tempfile while another thread is still using it, the second thread gets an IOError (closed stream) when it touches the handle, or Errno::ENOENT if it tries to reopen the file by path. Proper synchronization prevents these race conditions.
class ThreadSafeTempfile
def initialize(basename)
@tempfile = Tempfile.new(basename)
@mutex = Mutex.new
@closed = false
end
def write(data)
@mutex.synchronize do
raise "File already closed" if @closed
@tempfile.write(data)
end
end
def read
@mutex.synchronize do
raise "File already closed" if @closed
@tempfile.rewind
@tempfile.read
end
end
def close!
@mutex.synchronize do
return if @closed
@tempfile.close!
@closed = true
end
end
def path
@tempfile.path
end
end
# Usage across threads
safe_temp = ThreadSafeTempfile.new('concurrent')
writer = Thread.new do
100.times { |i| safe_temp.write("Line #{i}\n") }
safe_temp.close!
end
reader = Thread.new do
sleep 0.1 # Let some writing happen
begin
content = safe_temp.read
puts "Read #{content.lines.count} lines"
rescue => e
puts "Read error: #{e.message}"
end
end
[writer, reader].each(&:join)
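It helps to know which exception an unsynchronized late access would produce. The sketch below runs in a single thread for determinism and uses a hypothetical helper, late_access_errors, to capture the error raised through the closed handle and the error raised when the deleted file is reopened by its old path.

```ruby
require 'tempfile'

# Hypothetical helper: returns the class names of the errors raised by
# the two kinds of access after close! has run.
def late_access_errors
  temp = Tempfile.new('late_access')
  path = temp.path
  temp.close!                     # closes the handle and deletes the file

  handle_error = begin
    temp.write('too late')        # goes through the closed handle
    nil
  rescue IOError => e
    e.class.name
  end

  path_error = begin
    File.read(path)               # reopens the deleted file by path
    nil
  rescue Errno::ENOENT => e
    e.class.name
  end

  [handle_error, path_error]
end

p late_access_errors
```

A wrapper like ThreadSafeTempfile replaces both failure modes with one predictable error raised under the mutex.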
Background cleanup threads can monitor and remove abandoned temporary files, but require careful lifecycle management to avoid removing files still in use. Implementing reference counting or explicit registration prevents premature deletion.
class TempfileManager
def initialize
@files = {}
@mutex = Mutex.new
start_cleanup_thread
end
def create_temp(basename)
temp = Tempfile.new(basename)
@mutex.synchronize do
@files[temp.path] = { file: temp, created: Time.now }
end
temp
end
def remove_temp(tempfile)
@mutex.synchronize do
@files.delete(tempfile.path)
end
tempfile.close!
end
private
def start_cleanup_thread
@cleanup_thread = Thread.new do
loop do
sleep 60 # Check every minute
cleanup_old_files
end
end
end
def cleanup_old_files
cutoff = Time.now - 3600 # 1 hour old
@mutex.synchronize do
@files.select { |_, info| info[:created] < cutoff }.each do |path, info|
begin
info[:file].close!
@files.delete(path)
puts "Cleaned up old temp file: #{path}"
rescue => e
puts "Cleanup error for #{path}: #{e.message}"
end
end
end
end
end
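The manager above relies on explicit registration. Reference counting, the other option mentioned, might look like the following sketch; CountedTempfile and its acquire/release methods are hypothetical names, not part of the standard library.

```ruby
require 'tempfile'

# Hypothetical helper: deletes the file only when the last holder releases it.
class CountedTempfile
  attr_reader :path

  def initialize(basename)
    @tempfile = Tempfile.new(basename)
    @path = @tempfile.path        # remembered so it stays readable after unlink
    @count = 0
    @mutex = Mutex.new
  end

  # Register one more user of the file.
  def acquire
    @mutex.synchronize do
      raise IOError, 'tempfile already released' if @tempfile.closed?
      @count += 1
    end
    self
  end

  # Drop one user; the last release closes and unlinks the file.
  def release
    @mutex.synchronize do
      @count -= 1 if @count > 0
      @tempfile.close! if @count.zero? && !@tempfile.closed?
    end
  end
end

shared = CountedTempfile.new('counted')
shared.acquire
shared.acquire
shared.release
puts "after first release: #{File.exist?(shared.path)}"
shared.release
puts "after last release: #{File.exist?(shared.path)}"
```

The count and the close happen under the same mutex, so a release can never delete the file while another thread is in the middle of acquiring it.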
Production Patterns
Web applications commonly use temporary files for upload processing, report generation, and data transformation tasks. Proper lifecycle management becomes critical in production environments where memory leaks and disk space exhaustion can impact service availability.
class FileUploadProcessor
def initialize(max_size: 100.megabytes, cleanup_age: 1.hour)
@max_size = max_size
@cleanup_age = cleanup_age
@active_files = {}
setup_cleanup_monitoring
end
def process_upload(uploaded_file)
validate_file_size(uploaded_file)
temp = Tempfile.new(['upload_', '.tmp'], Rails.root.join('tmp'))
@active_files[temp.path] = Time.current
begin
# Process uploaded content
temp.binmode
uploaded_file.rewind
IO.copy_stream(uploaded_file, temp)
temp.flush
# Perform processing operations
result = transform_file_content(temp)
# Store result and cleanup
store_processed_result(result)
ensure
cleanup_temp_file(temp)
end
end
private
def validate_file_size(file)
if file.size > @max_size
raise "File too large: #{file.size} bytes exceeds #{@max_size} bytes"
end
end
def transform_file_content(tempfile)
tempfile.rewind
# Perform transformations
processed_data = tempfile.read.upcase # Example transformation
processed_data
end
def cleanup_temp_file(tempfile)
@active_files.delete(tempfile.path)
tempfile.close! unless tempfile.closed?
rescue => e
Rails.logger.error "Tempfile cleanup failed: #{e.message}"
end
def setup_cleanup_monitoring
Thread.new do
loop do
sleep 300 # Check every 5 minutes
cleanup_stale_files
end
end
end
def cleanup_stale_files
cutoff = Time.current - @cleanup_age
@active_files.select { |_, created_at| created_at < cutoff }.each do |path, _|
begin
File.unlink(path) if File.exist?(path)
@active_files.delete(path)
Rails.logger.info "Cleaned up stale temp file: #{path}"
rescue => e
Rails.logger.error "Failed to cleanup #{path}: #{e.message}"
end
end
end
end
Report generation systems require careful resource management when creating large temporary files. Implementing streaming writes and memory-conscious processing prevents excessive memory usage while maintaining good performance.
require 'csv' # provides Array#to_csv
class ReportGenerator
def generate_csv_report(query_params)
temp = Tempfile.new(['report_', '.csv'])
begin
# Write CSV headers
temp.puts generate_headers(query_params).to_csv
# Stream data in batches to control memory usage
batch_size = 1000
offset = 0
loop do
records = fetch_records(query_params, limit: batch_size, offset: offset)
break if records.empty?
records.each do |record|
temp.puts format_record_as_csv(record)
end
temp.flush # Hand buffered data to the OS after each batch
offset += batch_size
# Memory management
GC.start if offset % 10000 == 0
end
# Finalize file
temp.rewind
file_size = temp.size
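# Return file info for download. Keep the Tempfile object in the result:
# if it is garbage collected, its finalizer deletes the file at temp.path.
{
file: temp,
path: temp.path,
size: file_size,
filename: "report_#{Time.current.strftime('%Y%m%d_%H%M%S')}.csv"
}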
rescue => e
temp.close! if temp && !temp.closed?
raise "Report generation failed: #{e.message}"
end
end
def cleanup_report_file(file_path)
File.unlink(file_path) if File.exist?(file_path)
rescue => e
Rails.logger.error "Failed to cleanup report file #{file_path}: #{e.message}"
end
private
def generate_headers(params)
['ID', 'Name', 'Created At', 'Status'] # Example headers
end
def fetch_records(params, limit:, offset:)
# Database query with pagination
# This is a placeholder - implement actual query logic
[]
end
def format_record_as_csv(record)
[record.id, record.name, record.created_at, record.status].to_csv.chomp
end
end
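When a report is handed to another component by path, Tempfile.create without a block is worth considering: it returns a plain File with no finalizer, so the file stays on disk until it is deleted explicitly. A minimal sketch, assuming a hypothetical write_report helper and naive comma joining rather than real CSV quoting:

```ruby
require 'tempfile'

# Hypothetical helper: writes rows to a file that outlives this method
# and returns its path. The caller owns the file and must delete it.
def write_report(rows)
  file = Tempfile.create(['report_', '.csv'])
  begin
    # Naive join for illustration only; real CSV output needs quoting.
    rows.each { |row| file.puts(row.join(',')) }
    file.close
    file.path
  rescue StandardError
    # Do not leave a half-written report behind.
    file.close unless file.closed?
    File.unlink(file.path)
    raise
  end
end

path = write_report([%w[id name], %w[1 Ada]])
puts File.read(path)
File.unlink(path) # explicit cleanup, the same job cleanup_report_file does
```

The trade-off is that nothing removes the file automatically, so the explicit cleanup step becomes mandatory.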
Monitoring temporary file usage helps prevent disk space issues and identifies resource leaks. Implementing metrics collection and alerting provides operational visibility into temporary file patterns.
class TempfileMonitor
def self.collect_metrics
temp_dir = Dir.tmpdir
pattern = File.join(temp_dir, '*')
files = Dir.glob(pattern)
ruby_tempfiles = files.select { |f| File.basename(f).match?(/\A\w+\d{8}-\d+-\w+/) }
total_size = ruby_tempfiles.sum { |f| File.size(f) rescue 0 }
file_count = ruby_tempfiles.count
oldest_file_age = if ruby_tempfiles.any?
Time.current - ruby_tempfiles.map { |f| File.mtime(f) rescue Time.current }.min
else
0
end
{
temp_file_count: file_count,
temp_files_size_bytes: total_size,
oldest_temp_file_age_seconds: oldest_file_age.to_i,
temp_directory: temp_dir
}
end
def self.alert_on_excessive_usage(max_files: 1000, max_size_mb: 500)
metrics = collect_metrics
if metrics[:temp_file_count] > max_files
alert("Too many temporary files: #{metrics[:temp_file_count]} > #{max_files}")
end
size_mb = metrics[:temp_files_size_bytes] / 1_048_576
if size_mb > max_size_mb
alert("Temporary files using too much space: #{size_mb}MB > #{max_size_mb}MB")
end
end
def self.alert(message)
Rails.logger.error "[TEMPFILE ALERT] #{message}"
# Send to monitoring system, email, etc.
end
end
Performance & Memory
Tempfile performance depends primarily on the underlying filesystem and disk I/O characteristics. SSD storage provides better random access performance for temporary files compared to traditional spinning disks, especially for workloads involving frequent seeks and small writes.
Text and binary modes can differ in overhead. In text mode Ruby may transcode data (when an internal encoding is set) and convert newlines on Windows, while binary mode (binmode) treats the content as raw bytes and skips those conversions. On Unix-like systems with matching encodings the difference is usually small, so measure before optimizing.
require 'benchmark'
def benchmark_write_modes(data_size)
data = 'x' * data_size
Benchmark.bm(15) do |bm|
bm.report('text mode') do
temp = Tempfile.new('text_test')
temp.write(data)
temp.close!
end
bm.report('binary mode') do
temp = Tempfile.new('binary_test')
temp.binmode
temp.write(data)
temp.close!
end
end
end
# Test with 10MB of data
benchmark_write_modes(10 * 1024 * 1024)
Large file processing benefits from streaming approaches that minimize memory footprint. Reading entire temporary files into memory can cause issues with large datasets, while line-by-line or chunk-based processing maintains consistent memory usage.
class MemoryEfficientProcessor
CHUNK_SIZE = 64 * 1024 # 64KB chunks
def process_large_tempfile(tempfile)
tempfile.rewind
processed_bytes = 0
while chunk = tempfile.read(CHUNK_SIZE)
break if chunk.empty?
# Process chunk without loading entire file
process_chunk(chunk)
processed_bytes += chunk.bytesize
# Periodic memory cleanup
if processed_bytes % (1024 * 1024) == 0 # Every MB
GC.start
puts "Processed #{processed_bytes / 1024 / 1024}MB"
end
end
processed_bytes
end
private
def process_chunk(chunk)
# Example processing - count characters
chunk.each_char.count { |c| c.match?(/[a-zA-Z]/) }
end
end
# Usage with large file
temp = Tempfile.new('large_data')
temp.write('A' * 50_000_000) # 50MB file
temp.flush
processor = MemoryEfficientProcessor.new
bytes_processed = processor.process_large_tempfile(temp)
puts "Total processed: #{bytes_processed} bytes"
temp.close!
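For text data, line-by-line processing gives the same bounded memory profile without manual chunk handling. A short sketch, using a hypothetical count_matching_lines helper:

```ruby
require 'tempfile'

# Hypothetical helper: counts lines matching a pattern while holding
# only one line in memory at a time.
def count_matching_lines(tempfile, pattern)
  tempfile.rewind
  tempfile.each_line.count { |line| line.match?(pattern) }
end

temp = Tempfile.new('lines')
begin
  1_000.times { |i| temp.puts(i.even? ? "even #{i}" : "odd #{i}") }
  puts count_matching_lines(temp, /\Aeven/)
ensure
  temp.close!
end
```

Chunked reads remain the better fit for binary data or for input with very long lines, where a single line could itself be large.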
Buffer management affects both performance and memory usage. Ruby's default buffering behavior works well for most cases, but explicit buffer control can optimize specific scenarios like high-frequency small writes or large sequential transfers.
def compare_buffering_strategies(write_count, data_per_write)
data = 'x' * data_per_write
# Strategy 1: Default buffering
time1 = Benchmark.realtime do
temp1 = Tempfile.new('default_buffer')
write_count.times { temp1.write(data) }
temp1.close!
end
# Strategy 2: Explicit flushing
time2 = Benchmark.realtime do
temp2 = Tempfile.new('flush_each')
write_count.times do
temp2.write(data)
temp2.flush
end
temp2.close!
end
# Strategy 3: Batch writing
time3 = Benchmark.realtime do
temp3 = Tempfile.new('batch_write')
batch_data = data * write_count
temp3.write(batch_data)
temp3.close!
end
puts "Default buffering: #{time1.round(3)}s"
puts "Flush each write: #{time2.round(3)}s"
puts "Batch writing: #{time3.round(3)}s"
end
# Test with many small writes
compare_buffering_strategies(10_000, 100)
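Explicit buffer control can also be switched on per handle with IO#sync=, which makes every write go straight to the operating system. The sketch below uses a hypothetical size_after_write helper to observe the effect through the file's path.

```ruby
require 'tempfile'

# Hypothetical helper: returns the file size seen through the path
# right after a small write, with or without sync mode.
def size_after_write(sync:)
  temp = Tempfile.new('sync_demo')
  temp.sync = sync
  temp.write('abc')
  File.size(temp.path)
ensure
  temp&.close!
end

puts "sync on:  #{size_after_write(sync: true)} bytes visible"
puts "sync off: #{size_after_write(sync: false)} bytes visible"
```

With sync off, a small write typically stays in Ruby's buffer until flush or close, which is why other readers of the path may not see it yet. Sync mode behaves like flushing after every write, so it carries the cost measured by the second strategy above.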
File descriptor management becomes performance-critical in applications creating many temporary files. Each open Tempfile consumes a file descriptor from the system's limited pool. Proper cleanup prevents descriptor exhaustion and associated performance degradation.
class FileDescriptorTracker
def self.current_fd_count
if RUBY_PLATFORM =~ /linux/
Dir['/proc/self/fd/*'].count
elsif RUBY_PLATFORM =~ /darwin/
`lsof -p #{Process.pid} | wc -l`.to_i
else
-1 # Unknown platform
end
rescue
-1
end
def self.monitor_fd_usage
initial_count = current_fd_count
yield
final_count = current_fd_count
if final_count > 0 && initial_count > 0
puts "File descriptor change: #{final_count - initial_count}"
end
end
end
# Monitor FD usage during temp file operations
FileDescriptorTracker.monitor_fd_usage do
temps = 100.times.map { Tempfile.new('fd_test') }
temps.each(&:close!)
end
Reference
Core Methods
Method | Parameters | Returns | Description |
---|---|---|---|
Tempfile.new(basename, tmpdir=Dir.tmpdir, mode: 0, **options) | basename (String/Array), tmpdir (String), mode (Integer), options (Hash) | Tempfile | Creates new temporary file with unique name |
#close(unlink_now=false) | unlink_now (Boolean) | nil | Closes file handle, optionally deletes file |
#close! | None | Unspecified | Closes and immediately deletes temporary file |
#unlink | None | Unspecified | Removes file from filesystem; on POSIX systems the handle can stay open |
#path | None | String (nil after unlink) | Returns full filesystem path to temporary file |
#size | None | Integer | Returns current file size in bytes |
Lifecycle Management
Method | Parameters | Returns | Description |
---|---|---|---|
#rewind | None | 0 | Sets file position to beginning |
#flush | None | File | Flushes Ruby's write buffer to the operating system (not necessarily to disk) |
#fsync | None | 0 | Synchronizes file data and metadata to disk |
#closed? | None | Boolean | Returns true if file handle is closed |
#binmode | None | File | Sets binary mode for file operations |
Class Methods
Method | Parameters | Returns | Description |
---|---|---|---|
Tempfile.create(basename, tmpdir=Dir.tmpdir, **options) {block} | basename (String/Array), tmpdir (String), block | Block result | Creates the file, yields a plain File (not a Tempfile) to the block, then closes and removes it |
Tempfile.open(*args) {block} | Same as new | Block result | Creates a Tempfile via new, yields it, and closes it after the block without unlinking it |
Constructor Options
Option | Type | Default | Description |
---|---|---|---|
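mode: | Integer | 0 | Extra File open flags (for example File::APPEND), combined with File::RDWR, File::CREAT, and File::EXCL; the file is always created with permissions 0600 |
basename (positional) | String or Array | '' | Filename prefix, or [prefix, suffix] to also set an extension |
tmpdir (positional) | String | Dir.tmpdir | Directory for temporary file creation |
Other keywords | Hash | {} | Passed through to File.open (for example encoding:) |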
Basename Format Options
Format | Example | Generated Filename |
---|---|---|
String | 'myapp' | myapp20241130-1234-5678ab |
Array with suffix | ['data_', '.json'] | data_20241130-1234-5678ab.json |
Array without suffix | ['prefix_', ''] | prefix_20241130-1234-5678ab |
Common Exceptions
Exception | Trigger Condition | Typical Cause |
---|---|---|
Errno::EACCES | Permission denied | Insufficient write permissions to temp directory |
Errno::ENOSPC | No space left | Disk full during file creation or write |
Errno::EMFILE | Too many open files | Exceeded process file descriptor limit |
Errno::ENOENT | File not found | Attempting to access unlinked temporary file |
Errno::EROFS | Read-only filesystem | Temp directory on read-only mount |
Cleanup Behavior
Scenario | Automatic Cleanup | Manual Cleanup Required |
---|---|---|
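Program exit | Yes (finalizers run at normal exit) | No |
Exception during processing | Not immediately | Yes (use ensure blocks) |
Long-running processes | Only when the object is garbage collected | Yes (call close! or unlink) |
Thread termination | No | Yes |
Garbage collection | Yes (finalizer, timing not guaranteed) | Recommended rather than relying on GC |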
Performance Characteristics
Operation | Typical Performance | Memory Usage |
---|---|---|
File creation | O(1) + filesystem overhead | Minimal (file handle only) |
Sequential write | O(n) with data size | Buffered (8KB default) |
Random access | O(1) + seek time | Position-dependent |
Cleanup | O(1) + filesystem overhead | None after cleanup |
Thread Safety Matrix
Operation | Thread Safe | Notes |
---|---|---|
Creating new Tempfile | Yes | Unique name generation is atomic |
Writing to same instance | No | Requires external synchronization |
Reading from same instance | No | File position is shared |
Closing/unlinking | No | Race conditions possible |
Path access | Yes | Read-only accessor; returns nil once the file is unlinked |