Overview
StringIO creates an IO-like object that reads from and writes to strings instead of files or network connections. Ruby ships StringIO as part of the standard library, providing largely the same interface as the File and IO classes but operating entirely in memory. The class maintains an internal string buffer and position pointer, mimicking file handle behavior without filesystem interaction.
StringIO does not inherit from IO. It is a separate class that includes Enumerable and duck-types most standard IO operations, including reading, writing, positioning, and mode management. The internal string grows dynamically as data is written, and the position pointer tracks the current read/write location within the buffer.
require 'stringio'
# Create empty StringIO object
io = StringIO.new
io.write("Hello World")
io.rewind
puts io.read # => "Hello World"
Ruby creates StringIO instances in read-write mode by default, but accepts mode strings identical to File.open. The class supports text and binary modes, positioning operations, and buffer manipulation methods that mirror filesystem I/O behavior.
# Initialize with content and mode
io = StringIO.new("Initial content", "r")
io.read(7) # => "Initial"
# Binary mode
binary_io = StringIO.new("".b)
binary_io.write([0xFF, 0xFE].pack('C*'))
StringIO serves three primary purposes: testing I/O operations without files, building strings using IO methods, and providing IO interfaces to string processing pipelines. The class maintains internal state including current position, mode flags, and the underlying string buffer, which is accessible through the string method.
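As a small illustration of the second purpose, the sketch below builds a string with ordinary output methods (<<, printf, puts) and then retrieves it through string. The helper name build_receipt and the sample data are assumptions made for this example only.

```ruby
require 'stringio'

# Hypothetical helper: formats line items using ordinary IO output methods.
def build_receipt(items)
  io = StringIO.new
  io << "Receipt" << "\n"                      # << returns the StringIO, so calls chain
  items.each do |name, price|
    io.printf("%-8s %6.2f\n", name, price)     # same format directives as Kernel#printf
  end
  total = items.sum { |_, price| price }
  io.puts "Total: #{format('%.2f', total)}"    # puts appends the trailing newline
  io.string                                    # the accumulated buffer
end

puts build_receipt([["Coffee", 3.5], ["Bagel", 2.25]])
```

Because the helper only relies on <<, printf, and puts, the same code would work if io were a File or $stdout instead.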
Basic Usage
StringIO construction accepts an optional string argument and mode specification. When no string is provided, Ruby creates an empty internal buffer. The mode parameter controls read/write permissions and text/binary handling using the same format as File operations.
require 'stringio'
# Empty StringIO
empty = StringIO.new
empty.write("First line\n")
empty.write("Second line\n")
empty.rewind
empty.read # => "First line\nSecond line\n"
Writing operations place data at the current position, overwriting any existing content there rather than inserting. The internal string expands automatically when a write extends past the current end. Position tracking behaves identically to file handles, advancing after each read or write operation.
# Writing at specific positions
io = StringIO.new("ABCDEFGH")
io.pos = 3
io.write("XYZ")
io.string # => "ABCXYZGH"
# Position advances after operations
io.pos # => 6
io.write("123")
io.string # => "ABCXYZ123"
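The automatic expansion also applies when the position has been moved past the current end before writing. The following minimal sketch shows what ends up in the buffer in that case; the sample values are arbitrary.

```ruby
require 'stringio'

io = StringIO.new(+"ab")   # unary plus gives a mutable string even under frozen_string_literal
io.seek(5)                 # move past the current end (the string is 2 bytes long)
io.write("Z")

p io.string                # => "ab\u0000\u0000\u0000Z"
p io.string.bytesize       # => 6
```

The gap between the old end and the write position is filled with null bytes, which is worth keeping in mind when the buffer is later treated as text.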
Reading methods extract data from the current position, returning partial content when insufficient data remains. The read method without arguments returns all data from the current position to the end (an empty string if none remains), while read(n) returns up to n bytes, or nil when no data is available.
io = StringIO.new("Hello World Programming")
io.read(5) # => "Hello"
io.read(6) # => " World"
io.read # => " Programming"
io.read # => ""
io.read(1) # => nil
Line-oriented operations support text processing workflows. The gets method reads up to and including the next newline and returns nil at the end of the string, while readline raises EOFError when no more lines exist. The each_line method enumerates the remaining lines.
text = "Line 1\nLine 2\nLine 3\n"
io = StringIO.new(text)
# Reading individual lines
io.gets # => "Line 1\n"
io.gets # => "Line 2\n"
# Enumerating all lines
io.rewind
io.each_line { |line| puts "Processing: #{line.chomp}" }
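The gets method also accepts a custom separator, a byte limit, and a chomp keyword, which the reference table lists only briefly. The sketch below walks through those variants on one sample string; the input is made up for illustration.

```ruby
require 'stringio'

io = StringIO.new("alpha,beta\ngamma\n\ndelta\n")

p io.gets(",")           # => "alpha,"   custom separator instead of newline
p io.gets(3)             # => "bet"      at most 3 bytes
p io.gets(chomp: true)   # => "a"        rest of the line, newline removed
p io.gets(nil)           # => "gamma\n\ndelta\n"   nil separator reads to the end
p io.gets                # => nil        nothing left to read
```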
Position management uses familiar file operations. The rewind method resets the position to zero, seek moves to an absolute or relative position, and tell returns the current position. These operations enable random access patterns within the string buffer.
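A short sketch of these positioning calls, using a ten-character string so that each offset is easy to follow:

```ruby
require 'stringio'

io = StringIO.new("0123456789")

io.seek(4)                    # absolute; same as io.seek(4, IO::SEEK_SET)
p io.read(2)                  # => "45"
p io.tell                     # => 6

io.seek(-3, IO::SEEK_CUR)     # relative to the current position (6 - 3 = 3)
p io.read(1)                  # => "3"

io.seek(-2, IO::SEEK_END)     # relative to the end (10 - 2 = 8)
p io.read                     # => "89"

io.rewind                     # back to the start
p io.pos                      # => 0
```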
Error Handling & Debugging
StringIO raises IOError exceptions when operations conflict with current mode settings. Attempting to write to read-only instances or read from write-only instances triggers immediate errors. The class validates mode permissions before executing operations, providing clear error messages for debugging.
# Read-only mode restrictions
read_only = StringIO.new("Content", "r")
begin
  read_only.write("More")
rescue IOError => e
  puts e.message # => "not opened for writing"
end
# Write-only mode restrictions
write_only = StringIO.new("", "w")
begin
  write_only.read
rescue IOError => e
  puts e.message # => "not opened for reading"
end
Position errors occur when a seek would produce a negative position or uses an invalid whence parameter. StringIO accepts the same seek constants as File (IO::SEEK_SET, IO::SEEK_CUR, IO::SEEK_END) and raises Errno::EINVAL in either case. Seeking past the end of the string is allowed and does not raise.
io = StringIO.new("Short")
begin
  io.seek(-10, IO::SEEK_SET)
rescue Errno::EINVAL
  puts "Invalid seek position"
end
# Valid seek operations
io.seek(0, IO::SEEK_END) # Move to end
io.seek(-2, IO::SEEK_CUR) # Move back 2 positions
EOFError exceptions occur during read operations when no data remains. The readline, readchar, and readbyte methods raise this error at the end of the string, whereas gets, read(n), and readlines signal the same condition by returning nil or an empty array. This lets callers choose between checking return values and handling an exception.
io = StringIO.new("Single line")
io.read # Consume all content
begin
  io.readline
rescue EOFError
  puts "Reached end of string"
end
# Check EOF state
io.eof? # => true
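To make the difference between return-value signalling and EOFError concrete, the following sketch calls several readers on an exhausted StringIO. The sample content is arbitrary.

```ruby
require 'stringio'

io = StringIO.new("only line\n")
io.gets                   # consume the single line

# At the end of the string, these readers return a sentinel value.
p io.gets                 # => nil
p io.read                 # => ""
p io.read(10)             # => nil
p io.readlines            # => []

# readline promises a line, so it raises instead.
begin
  io.readline
rescue EOFError => e
  puts "readline raised #{e.class}"
end
```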
Debugging StringIO operations requires understanding internal state changes. The pos, eof?, and closed? methods provide visibility into current object state. Logging position changes and mode flags helps identify unexpected behavior in complex I/O sequences.
def debug_stringio(io, operation)
  puts "Before #{operation}: pos=#{io.pos}, eof=#{io.eof?}, closed=#{io.closed?}"
  yield
  puts "After #{operation}: pos=#{io.pos}, eof=#{io.eof?}, closed=#{io.closed?}"
  puts "Content: #{io.string.inspect}"
  puts "---"
end
io = StringIO.new("Debug content")
debug_stringio(io, "read(5)") { io.read(5) }
debug_stringio(io, "write('X')") { io.write('X') }
Encoding issues manifest when mixing binary and text operations or handling non-ASCII content. StringIO preserves string encoding throughout operations, but binary writes can corrupt text content. Always specify binary mode for non-text data.
# Encoding preservation
utf8_io = StringIO.new("UTF-8 content: ñoño")
utf8_io.string.encoding # => #<Encoding:UTF-8>
# Binary mode for mixed content
binary_io = StringIO.new("".b)
binary_io.write("Text")
binary_io.write([0x80, 0xFF].pack('C*'))
binary_io.string.encoding # => #<Encoding:ASCII-8BIT>
Performance & Memory
StringIO can be substantially faster than file I/O when working with temporary data or in tests. Memory-based operations eliminate filesystem overhead, system call costs, and disk I/O latency. The size of the gain depends on the platform, the filesystem, and the amount of data, so a benchmark such as the following is the reliable way to quantify it.
require 'benchmark'
require 'stringio'
require 'tempfile'
data = "Line of data\n" * 10000
Benchmark.bm(15) do |x|
  x.report("StringIO write") do
    1000.times do
      io = StringIO.new
      io.write(data)
      io.string
    end
  end
  x.report("File write") do
    1000.times do
      file = Tempfile.new
      file.write(data)
      file.rewind
      file.read
      file.close
      file.unlink
    end
  end
end
Memory consumption scales linearly with content size, and StringIO avoids filesystem buffer overhead. When data is appended at the end, the underlying Ruby string typically grows its capacity geometrically, similar to Array growth patterns, so repeated writes do not reallocate on every call. When the final size is roughly known, reserving capacity in the initial string avoids the intermediate reallocations. Note that truncating a large string to zero length does not reliably preserve its capacity, so reserve it with String.new(capacity:) instead.
require 'objspace'
require 'stringio'
# Reserve capacity up front when the final size is roughly known
buffer = String.new(capacity: 1_000_000, encoding: Encoding::UTF_8) # empty, about 1MB reserved
io = StringIO.new(buffer)
# Inspect the memory held by the underlying string after a large write
io.write("A" * 500_000)
puts "Content bytes: #{io.size}"
puts "Allocated bytes: #{ObjectSpace.memsize_of(io.string)}"
Building output through many small write calls adds per-call overhead and, when each argument is created by interpolation, allocates a short-lived string per call. Collecting data and writing it once reduces the number of calls, although joining allocates its own intermediate array and string, so the gain depends on the workload and is worth measuring.
# Many small writes
def build_content_slow(lines)
  io = StringIO.new
  lines.each { |line| io.write("#{line}\n") }
  io.string
end
# Single large write
def build_content_fast(lines)
  content = lines.map { |line| "#{line}\n" }.join
  io = StringIO.new
  io.write(content)
  io.string
end
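Whether batching pays off is best checked on real data. The sketch below times both strategies with Benchmark.realtime; the method names and the 50,000-line sample are assumptions for this example, and no particular result is implied since timings vary by Ruby version and machine.

```ruby
require 'benchmark'
require 'stringio'

# One write call per line.
def build_with_small_writes(lines)
  io = StringIO.new
  lines.each { |line| io.write(line, "\n") }
  io.string
end

# Join first, then a single write call.
def build_with_single_write(lines)
  io = StringIO.new
  io.write(lines.map { |line| "#{line}\n" }.join)
  io.string
end

lines = Array.new(50_000) { |i| "row #{i}" }

small  = Benchmark.realtime { build_with_small_writes(lines) }
single = Benchmark.realtime { build_with_single_write(lines) }
puts format("small writes: %.4fs, single write: %.4fs", small, single)
```

Both methods produce identical output, so the choice between them can be made purely on the measured numbers.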
Positioning operations within StringIO execute in constant time since no disk seeking occurs. Random access patterns perform consistently regardless of string size, unlike file-based I/O where disk seek times vary with distance. This characteristic makes StringIO ideal for algorithms requiring frequent position changes.
# Performance comparison: sequential vs random access
require 'benchmark'
require 'stringio'
content = "0123456789" * 100000
io = StringIO.new(content)
# Sequential reading (fast for both StringIO and File)
sequential = Benchmark.measure { 1000.times { io.rewind; io.read(100) } }
# Random seeking (fast only for StringIO)
random = Benchmark.measure do
  1000.times do
    pos = rand(content.length - 100)
    io.seek(pos)
    io.read(100)
  end
end
puts "Sequential: #{sequential.real}s, random: #{random.real}s"
Testing Strategies
StringIO serves as a primary tool for testing I/O-dependent code without creating temporary files or network connections. Test isolation improves when replacing actual I/O streams with StringIO objects, eliminating filesystem dependencies and external resource coordination.
class FileProcessor
  def initialize(input_stream, output_stream)
    @input = input_stream
    @output = output_stream
  end
  def process_lines
    @input.each_line do |line|
      processed = line.strip.upcase
      @output.write("#{processed}\n")
    end
  end
end
# Testing without files
def test_file_processor
  input = StringIO.new("hello\nworld\n")
  output = StringIO.new
  processor = FileProcessor.new(input, output)
  processor.process_lines
  expected = "HELLO\nWORLD\n"
  assert_equal expected, output.string
end
Mock object integration uses StringIO to simulate various I/O conditions including partial reads, write failures, and positioning errors. Custom StringIO subclasses can introduce controlled failures for testing error handling paths.
class FailingStringIO < StringIO
  def initialize(string = "", fail_after: nil)
    super(string)
    @fail_after = fail_after
    @operation_count = 0
  end
  def read(*args)
    @operation_count += 1
    raise IOError, "Simulated failure" if @fail_after && @operation_count > @fail_after
    super
  end
end
# Test error handling
def test_read_failure_handling
  failing_io = FailingStringIO.new("content", fail_after: 1)
  # First read succeeds
  result1 = failing_io.read(4)
  assert_equal "cont", result1
  # Second read fails
  assert_raises(IOError) { failing_io.read(4) }
end
Captured output testing involves redirecting STDOUT or STDERR to StringIO objects, enabling verification of program output without polluting test output streams. This technique works particularly well for testing command-line utilities and logging functionality.
# Named SimpleLogger to avoid reopening the standard library's Logger class
class SimpleLogger
  def initialize(output = STDOUT)
    @output = output
  end
  def log(level, message)
    @output.puts "[#{level.upcase}] #{Time.now}: #{message}"
  end
end
def test_logger_output
  captured_output = StringIO.new
  logger = SimpleLogger.new(captured_output)
  logger.log(:info, "Test message")
  logger.log(:error, "Error occurred")
  lines = captured_output.string.lines
  assert lines[0].include?("[INFO]")
  assert lines[0].include?("Test message")
  assert lines[1].include?("[ERROR]")
  assert lines[1].include?("Error occurred")
end
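When the code under test prints directly with puts or print and cannot take an injected stream, the global $stdout can be swapped for a StringIO for the duration of a block. The helper below is a minimal sketch of that idea; the name capture_stdout is an assumption for this example (Minitest provides a comparable capture_io helper).

```ruby
require 'stringio'

# Hypothetical helper: temporarily replaces $stdout with a StringIO and
# returns whatever the block printed.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original   # always restore, even if the block raises
end

output = capture_stdout { puts "hello from the block" }
p output   # => "hello from the block\n"
```

This only redirects output written through Ruby's $stdout; output from child processes or code that writes to the STDOUT constant directly is not captured.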
Data-driven testing scenarios benefit from StringIO's ability to simulate various input formats and sizes. Test cases can programmatically generate input data, process it through StringIO objects, and verify output without external file dependencies.
# CSVProcessor is assumed to be defined by the application under test
def test_csv_processing_with_various_inputs
  test_cases = [
    { input: "name,age\nAlice,30\nBob,25", expected_count: 2 },
    { input: "name,age\n", expected_count: 0 },
    { input: "invalid,data,format\ntest", expected_count: 0 },
  ]
  test_cases.each_with_index do |test_case, index|
    input_io = StringIO.new(test_case[:input])
    output_io = StringIO.new
    processor = CSVProcessor.new(input_io, output_io)
    processor.process
    actual_count = output_io.string.lines.count
    assert_equal test_case[:expected_count], actual_count,
                 "Test case #{index} failed"
  end
end
Production Patterns
Web applications frequently use StringIO for generating downloadable content without creating temporary files. CSV exports, PDF generation, and data serialization benefit from in-memory string construction before streaming to HTTP responses. This pattern reduces disk I/O and simplifies cleanup operations.
require 'csv'
class ReportController < ApplicationController
  def download_csv
    output = StringIO.new
    # Write CSV header
    output.write("Name,Email,Created At\n")
    # Append user rows in batches; CSV.generate_line quotes fields that contain commas
    User.find_in_batches(batch_size: 1000) do |batch|
      batch.each do |user|
        output.write(CSV.generate_line([user.name, user.email, user.created_at]))
      end
    end
    send_data output.string,
              filename: "users_#{Date.current}.csv",
              type: 'text/csv'
  end
end
Template rendering systems utilize StringIO for building complex documents from multiple sources. The pattern allows incremental content construction while maintaining clean separation between data processing and output formatting.
class DocumentBuilder
  def initialize
    @output = StringIO.new
    @section_count = 0
  end
  def add_header(title, level = 1)
    @output.write("#{'#' * level} #{title}\n\n")
  end
  def add_section(content)
    @section_count += 1
    @output.write("## Section #{@section_count}\n\n")
    @output.write("#{content}\n\n")
  end
  def add_code_block(code, language = nil)
    lang = language.to_s
    @output.write("```#{lang}\n#{code}\n```\n\n")
  end
  def build
    @output.string
  end
end
# Usage in production
builder = DocumentBuilder.new
builder.add_header("API Documentation")
builder.add_section("This document describes the REST API endpoints.")
builder.add_code_block("GET /api/users", "http")
document = builder.build
Log aggregation and processing pipelines use StringIO for buffering log entries before batch processing or transmission. The pattern provides memory-efficient buffering while maintaining compatibility with existing I/O-based processing tools.
class LogBuffer
  def initialize(max_size: 64_000, flush_callback: nil)
    @buffer = StringIO.new
    @max_size = max_size
    @flush_callback = flush_callback
  end
  def write_entry(timestamp, level, message)
    entry = "#{timestamp} [#{level}] #{message}\n"
    @buffer.write(entry)
    flush if @buffer.size >= @max_size
  end
  def flush
    return if @buffer.size == 0
    content = @buffer.string.dup
    @buffer.rewind
    @buffer.truncate(0)
    @flush_callback&.call(content)
    content
  end
end
# Production usage
log_buffer = LogBuffer.new(
  max_size: 32_768,
  flush_callback: ->(content) {
    LogShipper.send_to_aggregator(content)
  }
)
Data serialization workflows leverage StringIO for building complex data structures before persistence or transmission. The approach enables incremental construction with rollback capabilities through position management.
require 'json'
class DataExporter
  def initialize(format: :json)
    @format = format
    @buffer = StringIO.new
  end
  def export_dataset(records)
    case @format
    when :json
      export_json(records)
    when :xml
      export_xml(records)
    when :csv
      export_csv(records)
    else
      raise ArgumentError, "unsupported format: #{@format.inspect}"
    end
    @buffer.string
  end
  private
  def export_json(records)
    @buffer.write("[\n")
    records.each_with_index do |record, index|
      @buffer.write(" #{record.to_json}")
      @buffer.write(",\n") unless index == records.length - 1
    end
    @buffer.write("\n]")
  end
  def export_xml(records)
    # XML output is omitted from this example
    raise NotImplementedError, "XML export is not implemented in this example"
  end
  def export_csv(records)
    return if records.empty?
    headers = records.first.keys
    @buffer.write("#{headers.join(',')}\n")
    records.each do |record|
      values = headers.map { |h| record[h] }
      @buffer.write("#{values.join(',')}\n")
    end
  end
end
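The rollback idea mentioned above can be sketched with pos and truncate: record the position before a unit of work and cut the buffer back to it if the work fails. The helper name write_atomically is hypothetical, and the sketch assumes writes are appended at the end of the buffer.

```ruby
require 'stringio'

# Hypothetical helper: runs the block and, if it raises, discards
# everything written since the call started. Returns true on success.
def write_atomically(io)
  checkpoint = io.pos
  yield io
  true
rescue StandardError
  io.truncate(checkpoint)   # drop the partial output
  io.pos = checkpoint       # truncate does not move the position
  false
end

buffer = StringIO.new
buffer.write("header\n")

write_atomically(buffer) { |io| io.write("row 1\n") }
write_atomically(buffer) do |io|
  io.write("row 2 (partial")
  raise "serialization failed"
end

p buffer.string   # => "header\nrow 1\n"
```

Swallowing the error and returning false is a design choice made to keep the example short; re-raising after the truncate would suit callers that need the failure details.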
Reference
Core Methods
Method | Parameters | Returns | Description |
---|---|---|---|
StringIO.new(string="", mode="r+") | string (String), mode (String) | StringIO | Creates new StringIO instance with optional content and mode |
#<<(obj) | obj (Object) | StringIO | Appends object string representation and returns self |
#close | None | nil | Closes StringIO for both reading and writing |
#close_read | None | nil | Closes StringIO for reading only |
#close_write | None | nil | Closes StringIO for writing only |
#closed? | None | Boolean | Returns true if StringIO is closed for both reading and writing |
#each_line(sep=$/,limit=nil) {block} | sep (String), limit (Integer), block | Enumerator or StringIO | Iterates over lines with optional separator and limit |
#eof? | None | Boolean | Returns true if positioned at end of string |
#getc | None | String or nil | Reads next character, returns nil at EOF |
#gets(sep=$/,limit=nil) | sep (String), limit (Integer) | String or nil | Reads next line with optional separator and limit |
#length | None | Integer | Returns current string length (alias for size) |
#pos | None | Integer | Returns current position within string |
#pos=(position) | position (Integer) | Integer | Sets current position within string |
#read(length=nil, outbuf=nil) | length (Integer), outbuf (String) | String or nil | Reads specified length or all remaining data |
#readlines(sep=$/,limit=nil) | sep (String), limit (Integer) | Array<String> | Returns array of all remaining lines |
#rewind | None | Integer | Sets position to 0 and returns 0 |
#seek(offset, whence=IO::SEEK_SET) | offset (Integer), whence (Integer) | Integer | Sets position relative to whence constant |
#size | None | Integer | Returns current string length |
#string | None | String | Returns underlying string object |
#string=(new_string) | new_string (String) | String | Replaces underlying string and resets position |
#tell | None | Integer | Returns current position (alias for pos) |
#truncate(length) | length (Integer) | Integer | Truncates string to specified length |
#write(obj, *objs) | obj (Object), *objs (Array) | Integer | Writes objects to string, returns bytes written |
Mode Specifications
Mode | Read | Write | Position | Truncate | Create |
---|---|---|---|---|---|
"r" | Yes | No | Start | No | No |
"r+" | Yes | Yes | Start | No | No |
"w" | No | Yes | Start | Yes | Yes |
"w+" | Yes | Yes | Start | Yes | Yes |
"a" | No | Yes | End | No | Yes |
"a+" | Yes | Yes | End | No | Yes |
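The effect of the mode string on an existing string can be seen with a few short experiments. This is a minimal sketch with arbitrary sample values; it also shows that StringIO operates on the string object it is given rather than on a copy.

```ruby
require 'stringio'

# "r+": writes start at position 0 and overwrite existing content.
rw = StringIO.new(+"abc", "r+")
rw.write("X")
p rw.string          # => "Xbc"

# "w": the string passed in is truncated when the StringIO is created.
source = +"abc"
w = StringIO.new(source, "w")
w.write("Z")
p source             # => "Z"

# "a": every write goes to the end, regardless of the current position.
a = StringIO.new(+"abc", "a")
a.rewind
a.write("def")
p a.string           # => "abcdef"
```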
Seek Constants
Constant | Value | Description |
---|---|---|
IO::SEEK_SET | 0 | Absolute position from start |
IO::SEEK_CUR | 1 | Relative position from current |
IO::SEEK_END | 2 | Relative position from end |
Common Exceptions
Exception | Condition | Resolution |
---|---|---|
IOError | Writing to read-only or reading from write-only | Check mode permissions |
Errno::EINVAL | Invalid seek parameters | Use valid whence constants and non-negative positions |
EOFError | Reading past end with readline methods | Check eof? before reading |
ArgumentError | Invalid mode string | Use documented mode specifications |
State Query Methods
Method | Returns | Description |
---|---|---|
#closed? | Boolean | True if closed for both read and write |
#eof? | Boolean | True if position is at end of string |
#pos | Integer | Current byte position within string |
#size | Integer | Total string length in bytes |
Binary Mode Operations
Binary mode handles raw byte data without encoding conversion. Use the "b" suffix with mode strings for binary operations.
# Binary mode example
binary_io = StringIO.new("".b, "w+b")
binary_io.write([0x48, 0x65, 0x6C, 0x6C, 0x6F].pack('C*'))
binary_io.string.encoding # => #<Encoding:ASCII-8BIT>
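A common use of binary mode is framing and parsing byte-oriented records. The sketch below assumes a made-up record layout (a 2-byte big-endian length followed by the payload); write_record and read_record are hypothetical helpers, not part of StringIO.

```ruby
require 'stringio'

# Hypothetical layout: 2-byte big-endian length, then the payload bytes.
def write_record(io, payload)
  io.write([payload.bytesize].pack("n"), payload.b)
end

def read_record(io)
  header = io.read(2) or return nil   # nil once the data is exhausted
  length = header.unpack1("n")
  io.read(length)
end

io = StringIO.new("".b, "w+b")
write_record(io, "Hello")
write_record(io, "\xFF\xFE".b)
io.rewind

p read_record(io)          # => "Hello"
p read_record(io).bytes    # => [255, 254]
p read_record(io)          # => nil
```

A production parser would also need to handle truncated input, where read returns fewer bytes than the header announced.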