Overview
Ruby provides multiple mechanisms for creating strings, ranging from literal syntax to constructor methods. The language supports several string literal forms including single-quoted strings, double-quoted strings, heredocs, and percent notation. Each creation method exhibits distinct behavior regarding interpolation, escape sequences, and encoding handling.
String literals are the most common way to create strings. Single-quoted strings ('text') preserve literal characters with minimal escape processing, while double-quoted strings ("text") enable interpolation and comprehensive escape sequence support. Ruby also provides heredoc syntax for multi-line strings and percent notation for strings containing quote characters.
# Single-quoted literal
single = 'Hello world'
# Double-quoted with interpolation
name = 'Ruby'
double = "Hello #{name}"
# => "Hello Ruby"
# Heredoc for multi-line content
heredoc = <<~EOF
This is a multi-line
string with indentation
EOF
The String.new constructor creates empty strings or copies existing strings and accepts an optional encoding parameter. Creation methods also differ in allocation behavior: an ordinary literal allocates a new object each time it is evaluated, while frozen literals with identical content can be deduplicated so that they share one object.
Ruby strings are mutable objects by default, though frozen string literals provide immutable alternatives. The encoding system affects string creation, with source file encoding determining literal string encoding unless explicitly overridden.
Basic Usage
String literals form the foundation of string creation in Ruby. Single-quoted strings treat most characters literally, processing only single quote escapes (\') and backslash escapes (\\). This literal treatment makes single quotes appropriate for strings containing backslashes or special characters.
# Single-quoted strings preserve literal content
path = 'C:\Users\Documents\file.txt'
regex_pattern = '\d+\.\d+'
sql_query = 'SELECT * FROM users WHERE name = \'John\''
puts path
# => C:\Users\Documents\file.txt
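A quick way to confirm which escapes a single-quoted literal processes is to compare character counts. The following is a minimal sketch; the variable names are illustrative only.

```ruby
# Only \' and \\ are processed inside single quotes.
quote     = 'It\'s'   # backslash-quote becomes a single quote
backslash = 'a\\b'    # a doubled backslash becomes one backslash
untouched = 'a\nb'    # \n stays as two characters: backslash and n

puts quote              # It's
puts backslash.length   # 3
puts untouched.length   # 4
```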
Double-quoted strings enable interpolation and comprehensive escape sequence processing. Ruby evaluates expressions within #{} delimiters and replaces them with their string representations. Common escape sequences include \n for newlines, \t for tabs, and \" for embedded quotes.
user_count = 42
status = 'active'
message = "Found #{user_count} #{status} users"
# => "Found 42 active users"
formatted = "Line 1\nLine 2\tTabbed content"
puts formatted
# Line 1
# Line 2 Tabbed content
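The "string representation" used by interpolation is whatever the expression's to_s method returns. The sketch below illustrates this with a hypothetical Temperature class that is not part of the original text.

```ruby
# Interpolation calls #to_s on the result of each expression.
class Temperature # hypothetical class, for illustration only
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} C"
  end
end

reading = Temperature.new(21)
report  = "Now: #{reading}, doubled: #{21 * 2}, nothing: #{nil}"
puts report
# Now: 21 C, doubled: 42, nothing: 
```

Any expression is allowed inside the delimiters, and nil interpolates as an empty string.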
Heredoc syntax accommodates multi-line strings with optional indentation handling. A heredoc opens with << followed immediately by a delimiter identifier and ends at the line containing that identifier. The <<~ form additionally strips leading whitespace based on the least indented line.
# Standard heredoc preserves all whitespace
html = <<HTML
<div class="container">
  <h1>Welcome</h1>
  <p>Content goes here</p>
</div>
HTML
# Squiggly heredoc strips common indentation
indented = <<~TEXT
  This line has leading spaces
    This line is indented more
  Back to base indentation
TEXT
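The difference between the two forms is easiest to see by inspecting the resulting strings. This is a small sketch with made-up content; the indentation inside the heredocs is significant.

```ruby
# Plain heredoc: every leading space is kept.
raw = <<RAW
    kept as written
RAW

# Squiggly heredoc: the smallest indentation (4 spaces here) is removed
# from every line, so relative indentation survives.
text = <<~TEXT
    first
      second
    third
TEXT

p raw    # "    kept as written\n"
p text   # "first\n  second\nthird\n"
```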
Percent notation provides alternatives for strings containing quote characters. The %q form creates single-quoted equivalent strings, while %Q or a bare % creates double-quoted equivalents. Various delimiter characters work with percent notation.
# Percent notation with different delimiters
require 'date' # needed for Date.today
mixed_quotes = %q{String with 'single' and "double" quotes}
interpolated = %Q[Today is #{Date.today}]
parentheses = %(Another string with (parentheses))
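Two properties of percent notation are worth checking directly: paired delimiters may nest when balanced, and only the double-quoted forms interpolate. A minimal sketch, with an illustrative price variable:

```ruby
price = 5

nested  = %q(outer (inner) text)  # balanced parentheses need no escaping
literal = %q(cost: #{price})      # %q does not interpolate
dynamic = %Q(cost: #{price})      # %Q interpolates like double quotes

puts nested    # outer (inner) text
puts literal   # cost: #{price}
puts dynamic   # cost: 5
```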
The String.new constructor creates strings programmatically and accepts an optional capacity hint for performance optimization. Each constructor call returns a new, mutable string object regardless of content, whereas frozen literals with identical content may share one object.
# Constructor creates empty string
empty = String.new
# Constructor with initial content
copy = String.new("original content")
# Constructor with encoding specification
utf8_string = String.new("content", encoding: 'UTF-8')
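The sketch below shows that a constructed copy is independent of its source and that capacity is only an allocation hint, not a length. The strings are illustrative; note that String.new without arguments returns a binary (ASCII-8BIT) string.

```ruby
original = 'original content'
copy = String.new(original)
copy << ' (edited)'                 # mutating the copy leaves the original alone

buffer = String.new(capacity: 64)   # hint for the initial buffer size
buffer << 'chunk 1' << ', chunk 2'

puts original       # original content
puts copy           # original content (edited)
puts buffer         # chunk 1, chunk 2
puts buffer.length  # 16
puts String.new.encoding == Encoding::BINARY  # true
```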
Advanced Usage
Ruby's string creation system provides sophisticated features for complex scenarios. Frozen string literals, enabled per file via the frozen_string_literal magic comment or globally via the --enable-frozen-string-literal command-line flag, create immutable strings that reduce memory allocation and improve performance in many applications.
# File with frozen string literal pragma
# frozen_string_literal: true
name = "Ruby" # This string is frozen
greeting = "Hello #{name}" # Interpolated strings stay mutable (Ruby 3.0+)
puts name.frozen? # => true
puts greeting.frozen? # => false (true on Ruby 2.x)
# Explicit freezing
mutable = "changeable".dup
frozen = "unchangeable".freeze
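Explicit freezing behaves the same way regardless of the magic comment, which makes it convenient for a self-contained illustration. The following sketch uses an illustrative label string and shows two ways to get a mutable string back.

```ruby
label = 'config'.freeze

begin
  label << '.yml'            # mutation of a frozen string
rescue FrozenError => e      # FrozenError was added in Ruby 2.5
  puts "rejected: #{e.class}"
end

editable = label.dup         # dup returns an unfrozen copy
editable << '.yml'

puts label                   # config
puts editable                # config.yml
puts((+label).frozen?)       # false: unary plus yields a mutable string
```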
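Advanced heredoc usage includes holding shell scripts and building structured text such as configuration files. A plain heredoc only builds a string: the $(...) substitutions below run only when the string is later handed to a shell, for example through system or a backtick-quoted heredoc delimiter. Heredocs can use any delimiter string and, unless the delimiter is single-quoted, support all double-quoted string features including interpolation.
# Heredoc holding a shell script (nothing runs until it is passed to a shell)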
system_info = <<~SHELL
echo "System: $(uname -s)"
echo "User: $(whoami)"
echo "Date: $(date)"
SHELL
# Complex heredoc for configuration
config = <<~YAML.strip
  database:
    host: #{ENV.fetch('DB_HOST', 'localhost')}
    port: #{ENV.fetch('DB_PORT', 5432)}
    name: #{Rails.env}_database
YAML
String creation patterns for DSLs and builders demonstrate advanced interpolation techniques. Ruby evaluates interpolation expressions in the context where the string literal appears, enabling complex object interactions.
class QueryBuilder
  def initialize
    @conditions = []
  end
  def where(condition)
    @conditions << condition
    self # return self so calls can be chained
  end
  def build(table)
    conditions_sql = @conditions.join(' AND ')
    <<~SQL
      SELECT * FROM #{table}
      WHERE #{conditions_sql}
    SQL
  end
end
# Illustrative values for the interpolated conditions
min_age = 18
active_status = 'active'
builder = QueryBuilder.new
query = builder
  .where("age > #{min_age}")
  .where("status = '#{active_status}'")
  .build(:users)
Encoding-aware string creation requires understanding Ruby's encoding system. String literals inherit source file encoding, but constructor methods and encoding conversion provide explicit control over character representation.
# Explicit encoding specification
utf8_text = "Résumé café".encode('UTF-8')
latin1_text = String.new("caf\xe9", encoding: 'ISO-8859-1')
# Binary string creation
binary_data = "\x00\x01\x02\x03".b
puts binary_data.encoding # => ASCII-8BIT
# Encoding conversion (String.new with encoding: only relabels the bytes)
converted = utf8_text.encode('UTF-16')
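Explicit control over encoding comes in two flavors that are easy to confuse: transcoding, which rewrites the bytes, and relabeling, which keeps the bytes and changes only how they are interpreted. The sketch below compares them on a single character.

```ruby
utf8 = "\u00e9"   # "é", always UTF-8 because of the \u escape

converted   = utf8.encode('ISO-8859-1')                  # transcodes the bytes
relabeled   = utf8.dup.force_encoding('ISO-8859-1')      # same bytes, new label
constructed = String.new(utf8, encoding: 'ISO-8859-1')   # also only relabels

p utf8.bytes         # [195, 169]
p converted.bytes    # [233]
p relabeled.bytes    # [195, 169]
p constructed.bytes  # [195, 169]
p relabeled.length   # 2 (each byte now counts as one Latin-1 character)
```

String#encode is the call to reach for when the text must stay the same; force_encoding and the encoding: keyword suit bytes that were mislabeled to begin with.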
Dynamic string creation using metaprogramming techniques enables flexible string construction. Ruby's introspection capabilities combine with string interpolation for powerful template systems.
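require 'date'
require 'ostruct'
class TemplateRenderer
  def initialize(template, context)
    @template = template
    @context = context
  end
  def render
    # Evaluates the template as Ruby code: use only with trusted templates
    @context.instance_eval('"' + @template.gsub('"', '\"') + '"')
  end
end
# %q keeps #{...} uninterpolated until render; OpenStruct exposes name and date as methods
template = %q{Hello #{name}, today is #{date.strftime('%A')}}
context = OpenStruct.new(name: 'Alice', date: Date.today)
renderer = TemplateRenderer.new(template, context)
result = renderer.render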
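When a template may come from an untrusted source, evaluating it as Ruby code is risky. One more restricted alternative, shown here as a sketch rather than a drop-in replacement, is Kernel#format with named references, which substitutes values without executing anything. The template text and hash keys are illustrative.

```ruby
require 'date'

# %{name} placeholders are filled from the hash; no code is evaluated.
template = 'Hello %{name}, today is %{day}'
values   = { name: 'Alice', day: Date.today.strftime('%A') }

result = format(template, values)
puts result
```

A placeholder with no matching key raises KeyError, so missing data surfaces immediately instead of rendering as an empty string.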
Common Pitfalls
String literal behavior creates several categories of common mistakes. Single-quoted and double-quoted strings exhibit different escape sequence processing, leading to unexpected results when developers assume consistent behavior across both types.
# Escape sequence differences
single_quoted = 'Line 1\nLine 2' # Literal backslash-n
double_quoted = "Line 1\nLine 2" # Actual newline
puts single_quoted.length # => 14 (includes \n as two characters)
puts double_quoted.length # => 13 (newline as single character)
# Tab characters in single vs double quotes
single_tab = 'Text\tTabbed' # Literal \t
double_tab = "Text\tTabbed" # Actual tab character
Interpolation evaluation timing causes subtle bugs when expressions have side effects or depend on changing state. Ruby evaluates interpolation expressions when creating the string, not when using it, which affects variable binding and method calls.
counter = 0
# Interpolation evaluates immediately
messages = []
3.times do |i|
  messages << "Message #{counter += 1} at iteration #{i}"
end
p messages
# => ["Message 1 at iteration 0", "Message 2 at iteration 1", "Message 3 at iteration 2"]
# Problematic delayed evaluation expectation
template = "Current time: #{Time.now}"
sleep(1)
puts template # Still shows original time, not current time
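When the value really should be computed at the moment of use, one common approach is to wrap the literal in a lambda so that interpolation runs on each call. A minimal sketch with an illustrative counter:

```ruby
counter = 0

eager = "count is #{counter}"          # evaluated once, right here
lazy  = -> { "count is #{counter}" }   # evaluated on every call

counter = 5

puts eager       # count is 0
puts lazy.call   # count is 5
```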
Heredoc terminator placement trips up developers unfamiliar with the syntax rules. With a plain << heredoc, the closing delimiter must start at the beginning of a line with no leading whitespace; the <<- and <<~ forms allow an indented terminator. In every form the terminator must be alone on its line, so a trailing comment prevents Ruby from recognizing it.
# Incorrect: a plain << heredoc does not recognize the indented terminator
def create_sql
  <<SQL
    SELECT * FROM users
    WHERE active = true
  SQL
end
# Correct: the squiggly heredoc accepts an indented terminator
def create_sql
  <<~SQL
    SELECT * FROM users
    WHERE active = true
  SQL
end
String encoding mismatches generate runtime errors when combining strings with different encodings. Ruby raises Encoding::CompatibilityError when operations attempt to mix incompatible encoded strings.
# Encoding compatibility issues
utf8_string = "café".encode('UTF-8')
latin1_string = "caf\xe9".force_encoding('ISO-8859-1')
# This raises Encoding::CompatibilityError
begin
combined = utf8_string + latin1_string
rescue Encoding::CompatibilityError => e
puts "Cannot combine: #{e.message}"
end
# Correct approach with encoding conversion
compatible = utf8_string + latin1_string.encode('UTF-8')
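Not every mix of encodings fails: strings that contain only ASCII characters combine freely with other ASCII-compatible encodings. Encoding.compatible? reports the outcome in advance, as this sketch with illustrative strings shows.

```ruby
utf8         = "caf\u00e9"                                   # UTF-8, non-ASCII
latin1       = "caf\xE9".dup.force_encoding('ISO-8859-1')    # Latin-1, non-ASCII
ascii_latin1 = 'cafe'.dup.force_encoding('ISO-8859-1')       # Latin-1, ASCII only

p Encoding.compatible?(utf8, latin1)         # nil: concatenation would raise
p Encoding.compatible?(utf8, ascii_latin1)   # #<Encoding:UTF-8>
p((utf8 + ascii_latin1).encoding)            # #<Encoding:UTF-8>
```

Checking compatibility first avoids the rescue block when the inputs come from mixed sources.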
Frozen string literal behavior changes string mutability expectations. Code that assumes string mutability fails when frozen string literals are enabled, requiring explicit duplication for mutable strings.
# frozen_string_literal: true
def modify_greeting(name)
  greeting = "Hello" # This string is frozen
  greeting << " #{name}" # Raises FrozenError
end
# Correct approach with explicit duplication
def modify_greeting(name)
  greeting = "Hello".dup # Creates mutable copy
  greeting << " #{name}" # Works correctly
end
Memory allocation patterns differ between literals and constructed strings. Without frozen string literals, each evaluation of a literal allocates a new String, just as String.new does. With frozen string literals enabled, identical literals can share a single frozen object, while String.new still returns a fresh, mutable string. Assuming that literals are always shared, or never shared, leads to surprises in both memory usage and identity checks.
# Literal evaluated in a loop
1000.times do
  str1 = "constant string" # New object each time unless frozen string literals are enabled
end
# Constructor always creates new objects
1000.times do
  str2 = String.new("constant string") # Always creates new object
end
# Check object identity
lit1 = "test"
lit2 = "test"
puts lit1.object_id == lit2.object_id # false by default; typically true with frozen_string_literal: true
con1 = String.new("test")
con2 = String.new("test")
puts con1.object_id == con2.object_id # Always false
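Sharing can also be requested explicitly. On Ruby 2.5 and later, unary minus returns a frozen, deduplicated string, so equal content resolves to one object even when the originals were constructed separately. A short sketch:

```ruby
a = String.new('test')
b = String.new('test')

puts a.equal?(b)           # false: two separate objects
puts a == b                # true: same content
puts((-a).equal?(-b))      # true: both resolve to one interned string
puts((-a).frozen?)         # true
```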
Reference
String Literal Syntax
Syntax | Description | Interpolation | Escape Processing |
---|---|---|---|
'string' | Single-quoted literal | No | Minimal (\', \\) |
"string" | Double-quoted literal | Yes | Full escape sequences |
<<DELIMITER | Heredoc | Yes | Full escape sequences |
<<~DELIMITER | Squiggly heredoc | Yes | Full + indent stripping |
%q{string} | Percent single-quoted | No | Minimal |
%Q{string}, %{string} | Percent double-quoted | Yes | Full escape sequences |
Constructor Methods
Method | Parameters | Returns | Description |
---|---|---|---|
String.new | None | String | Creates empty string |
String.new(str) | str (String) | String | Creates copy of string |
String.new(str, encoding:) | str (String), encoding (String/Encoding) | String | Creates string with specified encoding |
String.new(str, capacity:) | str (String), capacity (Integer) | String | Creates string with capacity hint |
"".b | None | String | Creates binary string (ASCII-8BIT) |
Escape Sequences
Sequence | Description | Single Quote | Double Quote |
---|---|---|---|
\' | Single quote | ✓ | ✓ |
\" | Double quote | No | ✓ |
\\ | Backslash | ✓ | ✓ |
\n | Newline | No | ✓ |
\t | Tab | No | ✓ |
\r | Carriage return | No | ✓ |
\f | Form feed | No | ✓ |
\v | Vertical tab | No | ✓ |
\a | Bell | No | ✓ |
\e | Escape | No | ✓ |
\b | Backspace | No | ✓ |
\s | Space | No | ✓ |
\nnn | Octal value | No | ✓ |
\xnn | Hexadecimal value | No | ✓ |
\unnnn | Unicode codepoint | No | ✓ |
Heredoc Delimiters
# Standard heredoc
<<DELIMITER
content
DELIMITER
# Squiggly heredoc (strips indent)
<<~DELIMITER
content
DELIMITER
# Quoted delimiters
<<"DELIMITER"
content with interpolation
DELIMITER
<<'DELIMITER'
content without interpolation
DELIMITER
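Quoting the delimiter controls how the heredoc body is processed. The sketch below feeds the same body through both quoted forms; the GREETING delimiter and the name variable are illustrative.

```ruby
name = 'Ruby'

# Double-quoted delimiter: behaves like an unquoted one.
cooked = <<~"GREETING"
  Hello #{name}\tdone
GREETING

# Single-quoted delimiter: no interpolation and no escape processing.
raw = <<~'GREETING'
  Hello #{name}\tdone
GREETING

p cooked   # "Hello Ruby\tdone\n"
p raw      # "Hello \#{name}\\tdone\n"
```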
Percent Notation Delimiters
Delimiter | Usage | Example |
---|---|---|
() | Parentheses | %(text) |
[] | Brackets | %[text] |
{} | Braces | %{text} |
<> | Angle brackets | %<text> |
// | Slashes | %/text/ |
!! | Exclamation | %!text! |
\|\| | Pipes | %\|text\| |
## | Hash symbols | %#text# |
Encoding-Related Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#encoding | None | Encoding | Returns string encoding |
#encode(encoding) | encoding (String/Encoding) | String | Converts to new encoding |
#force_encoding(encoding) | encoding (String/Encoding) | String | Changes encoding without conversion |
#valid_encoding? | None | Boolean | Checks if string is valid in its encoding |
#ascii_only? | None | Boolean | Checks if string contains only ASCII characters |
String Pool and Memory
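# Literals are frozen only when frozen string literals are enabled
"literal".frozen? # => false by default, true with frozen_string_literal: true
# Deduplication and mutability helpers
-"string" # Unary minus returns a frozen, possibly deduplicated string
+"string" # Unary plus returns self if mutable, otherwise a mutable copy
# Memory efficient string creation
String.new(capacity: 1000) # Pre-allocates buffer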
Common Patterns
# Safe interpolation with fallback
name = user&.name || 'Anonymous'
greeting = "Hello #{name}"
# Multi-line string with proper indentation
sql = <<~SQL.strip
SELECT users.name, profiles.email
FROM users
INNER JOIN profiles ON users.id = profiles.user_id
WHERE users.active = true
SQL
# Binary string manipulation
data = "\x89PNG\r\n\x1a\n".b
header = data[0, 8] # Extract PNG header
# Encoding-safe string building
parts = ["Hello", " ", "World"]
result = parts.join.encode('UTF-8')