Overview
Ruby provides several string literal syntaxes that create String objects with varying behaviors for interpolation, escape sequences, and delimiter handling. The primary forms include single-quoted strings, double-quoted strings, percent literals, and heredoc syntax.
Single-quoted strings interpret only the escape sequences \' and \\, treating all other characters literally. Double-quoted strings support full escape sequence processing and string interpolation through #{} syntax. Percent literals use customizable delimiters: %Q and bare % follow double-quote rules, while %q follows single-quote rules. Heredocs provide multi-line string creation with configurable indentation and interpolation behavior.
single = 'literal text with minimal escaping'
double = "interpolated text with #{variable} and \n escapes"
percent = %Q{custom delimited string with #{interpolation}}
String literals create new String objects each time they execute, unless frozen with the frozen string literal pragma. Ruby processes these literals at parse time for syntax validation, then creates the actual String objects during execution.
The encoding of string literals depends on the source file encoding, which defaults to UTF-8 and can be overridden with an encoding magic comment. All literal forms produce instances of the same String class; they differ only in how the parser processes escapes and interpolation.
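The encoding rules above can be checked with a small sketch. The helper name `literal_encodings` is hypothetical and exists only for this illustration; the sketch assumes the file has no encoding magic comment, so the source encoding is the UTF-8 default.

```ruby
# Reports the encodings involved when string literals are created.
# `literal_encodings` is a hypothetical helper name used only for illustration.
def literal_encodings
  {
    source: __ENCODING__,            # encoding of this source file
    plain: "plain text".encoding,    # literals inherit the source encoding
    unicode: "caf\u00E9".encoding,   # a non-ASCII \u escape yields a UTF-8 string
    binary: "raw bytes".b.encoding   # String#b returns a binary (ASCII-8BIT) copy
  }
end

p literal_encodings
```

In a default UTF-8 source file, the first three entries should all report UTF-8, while the last reports the binary encoding.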
Basic Usage
Single-quoted strings provide literal text with minimal escape processing. Only backslash-quote (\') and backslash-backslash (\\) sequences receive interpretation, making single quotes ideal for strings containing many backslashes or special characters.
path = 'C:\Users\name\file.txt'
regex = 'pattern with \d+ and \w* without escaping'
message = 'String with \'embedded quotes\' stays readable'
# => "String with 'embedded quotes' stays readable"
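As a quick check of the rule above, the following sketch compares how many characters each quote style stores for the same source text. The helper name `escape_lengths` is hypothetical.

```ruby
# Compares how the same source characters are stored by each quote style.
# `escape_lengths` is a hypothetical helper name used only for illustration.
def escape_lengths
  {
    single_newline: '\n'.length,    # 2: a backslash followed by the letter n
    double_newline: "\n".length,    # 1: a single newline character
    single_backslash: '\\'.length,  # 1: \\ collapses to one backslash
    single_quote: '\''.length       # 1: \' collapses to one quote
  }
end

p escape_lengths
```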
Double-quoted strings enable interpolation and full escape sequence processing. String interpolation executes Ruby expressions within #{} and converts results to strings using to_s.
name = "Alice"
age = 30
greeting = "Hello, #{name}! You are #{age} years old."
# => "Hello, Alice! You are 30 years old."
formatted = "Line 1\nLine 2\tTabbed content"
puts formatted
# Line 1
# Line 2 Tabbed content
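Because interpolation calls to_s on whatever the expression returns, any object can control how it appears inside a string. The sketch below illustrates this with a hypothetical `Temperature` class and a hypothetical `weather_report` method; neither is part of Ruby.

```ruby
# A hypothetical value object; interpolation calls its #to_s.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} degrees"
  end
end

# Interpolates an object, nil, and an arbitrary expression.
def weather_report(temperature)
  "Reading: #{temperature}; nil becomes '#{nil}'; sum: #{1 + 2}"
end

puts weather_report(Temperature.new(21))
# => "Reading: 21 degrees; nil becomes ''; sum: 3"
```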
Percent literals use % followed by an optional type letter and a delimiter character, supporting various quote-like behaviors. The %Q form (and bare %) behaves like double quotes with interpolation, while %q behaves like single quotes without interpolation.
mixed_quotes = %Q{String with "double" and 'single' quotes}
literal_percent = %q!Raw string with #{no_interpolation} preserved!
custom_delimiter = %(Parentheses as delimiters work too)
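To make the difference concrete, the sketch below feeds the same content through each form. The method name `percent_forms` is hypothetical.

```ruby
# Builds the same text with three percent-literal forms.
# `percent_forms` is a hypothetical helper name used only for illustration.
def percent_forms(name)
  {
    upper_q: %Q{Hello, "#{name}"\n},  # interpolates and processes \n
    lower_q: %q{Hello, "#{name}"\n},  # keeps #{name} and \n exactly as written
    bare:    %(Hello, "#{name}"\n)    # bare % behaves like %Q
  }
end

forms = percent_forms("Alice")
puts forms[:upper_q] == forms[:bare]  # true
puts forms[:lower_q]                  # Hello, "#{name}"\n
```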
Heredoc syntax creates multi-line strings using << followed by an identifier. The string content continues until a line containing only the identifier appears (with <<~, the closing identifier may be indented). Heredocs support interpolation by default, unless the identifier appears in single quotes.
sql_query = <<~SQL
  SELECT users.name, profiles.bio
  FROM users
  JOIN profiles ON users.id = profiles.user_id
  WHERE users.active = true
SQL
template = <<~HTML
  <div class="user">
    <h1>#{user.name}</h1>
    <p>#{user.description}</p>
  </div>
HTML
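The effect of quoting the identifier can be seen side by side in the following sketch. The method name `heredoc_samples` and its parameter are hypothetical.

```ruby
# Returns the same heredoc body with and without interpolation.
# `heredoc_samples` is a hypothetical helper name used only for illustration.
def heredoc_samples(user_name)
  interpolated = <<~TEXT
    Hello, #{user_name}!
  TEXT

  # Single quotes around the identifier disable interpolation.
  raw = <<~'TEXT'
    Hello, #{user_name}!
  TEXT

  [interpolated, raw]
end

interpolated, raw = heredoc_samples("Alice")
puts interpolated  # Hello, Alice!
puts raw           # Hello, #{user_name}!
```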
Advanced Usage
Heredoc squiggly syntax (<<~) removes leading whitespace from each line based on the least-indented line, enabling clean multi-line strings within indented code blocks.
class EmailTemplate
  def welcome_message(user)
    <<~MESSAGE
      Dear #{user.name},
      Welcome to our service! Your account has been created with
      the email address #{user.email}.
      Best regards,
      The Team
    MESSAGE
  end
end
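Only the common leading whitespace is removed, so indentation relative to the least-indented line survives. A minimal sketch, using a hypothetical method name `indented_list`:

```ruby
# Shows that <<~ strips the shared indentation but keeps relative indentation.
# `indented_list` is a hypothetical helper name used only for illustration.
def indented_list
  <<~LIST
    Fruits:
      - apple
      - banana
    Done
  LIST
end

puts indented_list
# Fruits:
#   - apple
#   - banana
# Done
```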
Percent notation supports multiple delimiters and alternative forms. Each form serves specific use cases where certain characters appear frequently in the string content.
# Different delimiters for different content types
json_template = %Q{"name": "#{user.name}", "active": #{user.active?}}
regex_pattern = %r{/api/v\d+/users/\d+}
file_path = %q{C:\Program Files\Application\config.ini}
shell_command = %x{ls -la #{directory}}
# Nested delimiters work when balanced
nested = %Q{Outer (contains (nested) parentheses) structure}
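Frozen string literals reduce memory allocation by reusing identical string objects. The frozen_string_literal: true pragma, placed in a comment at the top of the file, applies to every non-interpolated string literal in that file; since Ruby 3.0, interpolated strings are not frozen by it.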
# frozen_string_literal: true
def process_data
  status = "processing" # Same object reused each call
  log_message = "Data processing started at #{Time.now}" # New object each call due to interpolation
end
# Explicit freezing for individual strings
CONSTANT_MESSAGE = "System initialized".freeze
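The observable effects of freezing can be checked directly. The sketch below uses explicit .freeze so that it behaves the same with or without the pragma; `frozen_literal_facts` is a hypothetical helper name. Whether two frozen literals with identical content are the same object is an implementation detail (CRuby deduplicates them), so treat that entry as an observation rather than a guarantee.

```ruby
# Collects a few observable facts about frozen strings.
# `frozen_literal_facts` is a hypothetical helper name used only for illustration.
def frozen_literal_facts
  first = "shared".freeze
  second = "shared".freeze
  copy = first.dup  # dup returns an unfrozen copy

  mutation_error =
    begin
      first << "!"  # mutating a frozen string raises
      nil
    rescue FrozenError => e
      e.class
    end

  {
    frozen: first.frozen?,
    same_object: first.equal?(second),  # implementation-dependent deduplication
    copy_frozen: copy.frozen?,
    mutation_error: mutation_error
  }
end

p frozen_literal_facts
```

When a mutable string is needed in a file that uses the pragma, String#dup or the unary plus operator (+"text") returns an unfrozen string.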
String literals support method chaining directly on the literal syntax, enabling concise string processing pipelines.
processed = " MIXED case text "
  .strip
  .downcase
  .gsub(/\s+/, "_")
  .capitalize
# => "Mixed_case_text"
formatted_list = %w[apple banana cherry]
  .map(&:capitalize)
  .join(", ")
# => "Apple, Banana, Cherry"
Character escape sequences provide precise control over string content, including Unicode codepoints and byte values.
unicode_string = "Unicode: \u{1F600} \u{2764} \u{1F44D}"
# => "Unicode: 😀 ❤ 👍"
byte_string = "Hex bytes: \xFF\x00\x42"
control_chars = "Bell: \a Tab: \t Newline: \n"
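To see what these escapes actually store, the sketch below inspects codepoints and bytes. It assumes a UTF-8 source file; `escape_facts` is a hypothetical helper name.

```ruby
# Inspects the codepoints and bytes produced by several escape forms.
# `escape_facts` is a hypothetical helper name used only for illustration.
# Assumes the source file encoding is UTF-8.
def escape_facts
  {
    emoji_codepoints: "\u{1F600}".codepoints,  # [128512]
    same_codepoint: "\u00E9" == "\u{E9}",      # both forms name U+00E9
    hex_bytes: "\xFF\x00\x42".bytes,           # [255, 0, 66]
    octal: "\101" == "A",                      # octal 101 is 65, the letter A
    valid_utf8: "\xFF".valid_encoding?         # false: 0xFF alone is not valid UTF-8
  }
end

p escape_facts
```

Byte escapes can produce strings that are not valid in the source encoding, so String#valid_encoding? is worth checking before such data is treated as text.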
Common Pitfalls
String interpolation creates new String objects on each execution, even when interpolated expressions return identical values. This behavior impacts performance in tight loops and memory-sensitive applications.
# Allocates a new string on every iteration
1000.times do |i|
  log_message = "Processing item #{i}" # New string object each time
  # process(log_message)
end
# Alternatives that also allocate per iteration, so measure before switching
base_message = "Processing item "
1000.times do |i|
  log_message = base_message + i.to_s # Allocates i.to_s and the concatenated result
  # Or use String#% for formatting
  log_message = "Processing item %d" % i
end
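When the goal is a single accumulated result rather than one string per iteration, appending to one mutable buffer with String#<< avoids creating a separate result string each time. This is a sketch of one possible approach, not a measured recommendation; `build_log` is a hypothetical helper name.

```ruby
# Accumulates all messages into one mutable buffer.
# `build_log` is a hypothetical helper name used only for illustration.
def build_log(count)
  buffer = +""  # unary plus returns a mutable string even when literals are frozen
  count.times do |i|
    # String#<< appends in place and returns the receiver, so calls chain.
    buffer << "Processing item " << i.to_s << "\n"
  end
  buffer
end

puts build_log(2)
# Processing item 0
# Processing item 1
```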
Escape sequence interpretation differs significantly between single and double quotes, leading to unexpected behavior when switching between literal forms.
# Single quotes preserve backslashes literally
single_path = 'C:\new\file.txt'
# => "C:\\new\\file.txt" (literal backslashes)
# Double quotes interpret escape sequences
double_path = "C:\new\file.txt"
# => "C:\new\file.txt" (\n is read as a newline and \f as a form feed)
# Correct approach for file paths
correct_path = "C:\\new\\file.txt" # Escape backslashes in double quotes
unix_style = "C:/new/file.txt" # Use forward slashes
Heredoc indentation behavior changes between << and <<~ forms, affecting string content in unexpected ways.
def indented_content
  if true
    standard_heredoc = <<TEXT
      This content preserves
      the leading spaces
      in the final string
TEXT
    squiggly_heredoc = <<~TEXT
      This content removes
      leading whitespace based
      on the least indented line
    TEXT
  end
end
# standard_heredoc contains leading spaces, and its terminator must start at column 0
# squiggly_heredoc has clean, unindented content
String interpolation evaluates expressions at string creation time, not when the string gets used. This timing affects variable access and method execution.
def create_message
  counter = 0
  message = "Counter value: #{counter += 1}"
  # counter is now 1, expression evaluated immediately
  lambda { message } # Captures already-interpolated string
end
proc_message = create_message
puts proc_message.call # "Counter value: 1"
puts proc_message.call # "Counter value: 1" (same string, no re-evaluation)
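If re-evaluation on each use is what you want, the interpolation has to be deferred. Two possible approaches are sketched below: placing the literal inside a lambda, and using a format template with named placeholders. The method names `make_counter_message` and `render` are hypothetical.

```ruby
# Approach 1: put the literal inside the lambda so it is re-evaluated per call.
# `make_counter_message` is a hypothetical helper name used only for illustration.
def make_counter_message
  counter = 0
  -> { "Counter value: #{counter += 1}" }
end

# Approach 2: keep a plain template and fill it in later with Kernel#format.
# `render` is a hypothetical helper name used only for illustration.
def render(template, values)
  format(template, values)  # fills %{name} placeholders from the hash
end

message = make_counter_message
puts message.call  # "Counter value: 1"
puts message.call  # "Counter value: 2"
puts render("Counter value: %{count}", count: 5)  # "Counter value: 5"
```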
Percent literal delimiter selection affects parsing when the chosen delimiter appears within the string content.
# Problematic - unbalanced delimiters confuse parser
# broken = %(String with ) middle parenthesis) # Syntax error
# Solutions: choose different delimiters or escape
fixed1 = %{String with ) middle parenthesis}
fixed2 = %(String with \) middle parenthesis)
fixed3 = %Q!String with ) middle parenthesis!
Encoding issues arise when string literals contain characters incompatible with the source file encoding or when mixing strings with different encodings.
# Source file encoding affects literal interpretation
# With UTF-8 source encoding:
utf8_string = "Café résumé 🎉" # Works correctly
# With US-ASCII source encoding (for example, via an encoding magic comment):
# ascii_string = "Café résumé" # SyntaxError: invalid multibyte char (US-ASCII)
# Explicit encoding specification
binary_string = "Binary data".force_encoding("ASCII-8BIT")
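The mixing problem can be reproduced in a few lines. The sketch below assumes a UTF-8 source file and treats the raw byte 0xE9 as ISO-8859-1 data purely for illustration; `encoding_mix` is a hypothetical helper name.

```ruby
# Demonstrates an encoding clash and one way to resolve it.
# `encoding_mix` is a hypothetical helper name used only for illustration.
def encoding_mix
  utf8 = "caf\u00E9"       # UTF-8 string containing a non-ASCII character
  latin1_bytes = "\xE9".b  # one raw byte, tagged as binary

  error =
    begin
      utf8 + latin1_bytes  # both sides are non-ASCII with different encodings
      nil
    rescue Encoding::CompatibilityError => e
      e.class
    end

  # Relabel the byte as ISO-8859-1 (an assumption about the data), then transcode.
  converted = latin1_bytes.dup
                          .force_encoding(Encoding::ISO_8859_1)
                          .encode(Encoding::UTF_8)

  { error: error, converted: converted, joined: utf8 + converted }
end

p encoding_mix
```

force_encoding only changes the label on the bytes, while encode converts the bytes themselves, so the two are usually needed together when the original encoding is known.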
Reference
String Literal Syntax
Syntax | Interpolation | Escape Sequences | Use Case |
---|---|---|---|
'text' | No | \' and \\ only | Literal strings, minimal processing |
"text" | Yes | Full escape sequences | General purpose, interpolated content |
%q{text} | No | \\ and escaped delimiters only | Alternative to single quotes |
%Q{text} | Yes | Full escape sequences | Alternative to double quotes |
%(text) | Yes | Full escape sequences | Shorthand for %Q |
<<IDENTIFIER | Yes | Full escape sequences | Multi-line strings |
<<~IDENTIFIER | Yes | Full escape sequences | Multi-line with indent removal |
<<'IDENTIFIER' | No | None (backslashes stay literal) | Literal multi-line strings |
Escape Sequences
Sequence | Result | Description |
---|---|---|
\" | " | Double quote |
\' | ' | Single quote |
\\ | \ | Backslash |
\n | Newline | Line feed character |
\r | Carriage return | Carriage return character |
\t | Tab | Horizontal tab |
\s | Space | Space character |
\a | Bell | Bell/alert character |
\b | Backspace | Backspace character |
\f | Form feed | Form feed character |
\v | Vertical tab | Vertical tab character |
\0 | Null | Null character |
\nnn | Byte value | Octal byte value (1-3 digits) |
\xHH | Byte value | Hexadecimal byte value (1-2 digits) |
\uHHHH | Unicode | Unicode codepoint (4 hex digits) |
\u{HHHHH} | Unicode | Unicode codepoint (1-6 hex digits) |
Percent Literal Forms
Form | Equivalent | Interpolation | Typical Use |
---|---|---|---|
%q | Single quotes | No | Literal strings with special chars |
%Q | Double quotes | Yes | Interpolated strings with special chars |
% | Double quotes | Yes | Shorthand for %Q |
%w | Array of strings | No | Word arrays without interpolation |
%W | Array of strings | Yes | Word arrays with interpolation |
%r | Regular expression | Yes | Regex patterns |
%x | Backtick command | Yes | Shell command execution |
%s | Symbol | No | Symbol creation |
%i | Array of symbols | No | Symbol arrays |
%I | Array of symbols | Yes | Symbol arrays with interpolation |
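The array-producing forms in the table can be compared with a short sketch. The method name `array_literals` and its parameter are hypothetical.

```ruby
# Builds word and symbol arrays with and without interpolation.
# `array_literals` is a hypothetical helper name used only for illustration.
def array_literals(prefix)
  {
    words: %w[alpha beta #{prefix}],               # third word stays literal
    interpolated_words: %W[alpha #{prefix}-beta],  # prefix is substituted
    symbols: %i[alpha beta],
    interpolated_symbols: %I[alpha #{prefix}_beta]
  }
end

p array_literals("x")
```

Elements are separated by whitespace, so an interpolated value becomes part of a single element rather than being split further.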
Delimiter Options
Delimiter Type | Examples | Behavior |
---|---|---|
Paired | () [] {} <> | Must be balanced within content |
Unpaired | ! @ # $ % ^ & * - _ + = | : ; " ' ? / ~ | Opening and closing delimiter identical |
Heredoc Variants
Syntax | Indentation | Interpolation | Common Usage |
---|---|---|---|
<<WORD | Preserved | Yes | SQL queries, templates |
<<~WORD | Removed | Yes | Clean multi-line strings in methods |
<<'WORD' | Preserved | No | Literal multi-line content |
<<~'WORD' | Removed | No | Clean literal multi-line content |
Performance Characteristics
Operation | Relative Cost | Notes |
---|---|---|
Single quote literal | Low | Escapes resolved at parse time |
Double quote without interpolation | Low | Escapes resolved at parse time; same runtime cost as single quotes |
Double quote with interpolation | Moderate | Expression evaluation and to_s calls at run time |
Heredoc | Low to Moderate | Same as the equivalent quoted form; depends on interpolation |
Percent literal | Low to Moderate | Depends on content and form |
Frozen literal | Lowest | Reuses one object, no allocation per execution |
Memory Behavior
Feature | Object Creation | Memory Impact |
---|---|---|
String literals | New object per execution | High in loops |
Frozen literals | Reused objects | Reduced allocation |
Interpolation | Always new object | Not frozen or reused by the pragma |
Heredoc | Single object per execution | Moderate |
Concatenation | New object | Additive memory usage |