CrackedRuby

Here Documents

Overview

Here Documents (heredocs) provide a syntax for multi-line string literals in Ruby. The content starts on the line after the opening delimiter and runs until a line containing the matching end delimiter. Ruby recognizes Here Documents while parsing, so they are ordinary string literals rather than a separate runtime string operation; any interpolated expressions are still evaluated at runtime.

The basic syntax starts with << followed by a delimiter identifier. Ruby reads subsequent lines as string content until it encounters a line containing only the delimiter. The delimiter can be quoted or unquoted, affecting interpolation and escaping behavior.

text = <<EOF
This is a here document
spanning multiple lines
EOF

Ruby supports several Here Document variants through different prefix operators. The <<- operator allows the closing delimiter to be indented, while <<~ strips common leading whitespace from all lines. These variants address formatting concerns in source code while maintaining string content structure.

# Closing delimiter may be indented; body lines keep their leading spaces
content = <<-DELIMITER
  Line one
  Line two
  DELIMITER

# Squiggly heredoc strips leading whitespace
formatted = <<~TEXT
  First line
  Second line
TEXT

Here Documents integrate with Ruby's string interpolation system when the delimiter is unquoted or double-quoted. The literal text is fixed when the source is parsed, but every embedded #{...} expression is evaluated at runtime, each time execution reaches the Here Document, so the same literal can produce different strings on different calls.
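As a minimal sketch of that runtime evaluation, the method below (status_line is a hypothetical name used only for illustration) returns a fresh string built from its argument on every call:

```ruby
# Hypothetical helper: the heredoc is re-evaluated on every call,
# so the interpolated values reflect the current argument.
def status_line(count)
  <<~STATUS
    Processed: #{count}
    Doubled: #{count * 2}
  STATUS
end

first = status_line(1)
second = status_line(21)

puts first   # Processed: 1, then Doubled: 2
puts second  # Processed: 21, then Doubled: 42
```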

Basic Usage

Here Documents create string objects containing literal newline characters, including the newline that ends the last content line. Ruby preserves spacing and line breaks exactly as they appear in the source code. The closing delimiter must appear on its own line with nothing else on it; leading whitespace before it is allowed only with the <<- and <<~ variants.

message = <<GREETING
Hello, World!
This message spans
multiple lines.
GREETING

puts message.length
# => 49
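The length includes one newline per content line, including the last. A small sketch with shorter content makes the trailing newline easier to see:

```ruby
note = <<NOTE
first
second
NOTE

p note               # "first\nsecond\n"
p note.length        # 13 (11 letters plus 2 newlines)
p note.lines.length  # 2
p note.chomp         # "first\nsecond"
```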

Unquoted delimiters follow Ruby identifier rules (letters, digits, and underscores), while quoted delimiters can include additional characters such as spaces. Common conventions use uppercase names, often matching the content's purpose or format. The delimiter appears twice: once to open the Here Document and once to close it.

sql_query = <<SQL
SELECT users.name, orders.total
FROM users
JOIN orders ON users.id = orders.user_id
WHERE orders.date > '2023-01-01'
SQL

html_template = <<HTML
<div class="container">
  <h1>Welcome</h1>
  <p>Content goes here</p>
</div>
HTML

Quoted delimiters control interpolation and escaping behavior. Single quotes disable both interpolation and escape processing, while double quotes behave like an unquoted delimiter and enable both. Backticks execute the content as a shell command and return its output as a string.

require 'date'

name = "Alice"

# Interpolation enabled with unquoted delimiter
greeting = <<WELCOME
Hello, #{name}!
Today is #{Date.today}
WELCOME

# Interpolation disabled with single-quoted delimiter
literal = <<'LITERAL'
Hello, #{name}!
This will not interpolate: #{Date.today}
LITERAL

# Shell execution with backticks
file_list = <<`COMMANDS`
ls -la
pwd
COMMANDS
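The difference between unquoted and single-quoted delimiters can be checked directly. This sketch uses the same body under both forms and skips the backtick form, since its output depends on the host system:

```ruby
name = "Alice"

interpolated = <<TEXT
Hello, #{name}!\tDone
TEXT

raw = <<'TEXT'
Hello, #{name}!\tDone
TEXT

p interpolated                # "Hello, Alice!\tDone\n" (a real tab character)
p raw.include?('#{name}')     # true: the expression was left as literal text
p raw.include?('\t')          # true: backslash and t stay as two characters
```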

The indented Here Document variant allows the closing delimiter to have leading whitespace, improving code formatting in methods and control structures. This addresses indentation conflicts between source code structure and string content formatting.

def generate_config(database_url)
  config = <<-CONFIG
    database:
      url: #{database_url}
      pool: 5
      timeout: 5000
  CONFIG
  
  return config.strip
end
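Because <<- changes only where the closing delimiter may sit, the body keeps its source indentation. The following sketch compares it with <<~ on identical content; both method names are hypothetical:

```ruby
def dash_config(url)
  <<-CONFIG
    url: #{url}
    pool: 5
  CONFIG
end

def squiggly_config(url)
  <<~CONFIG
    url: #{url}
    pool: 5
  CONFIG
end

p dash_config("db://local")        # "    url: db://local\n    pool: 5\n"
p dash_config("db://local").strip  # "url: db://local\n    pool: 5"
p squiggly_config("db://local")    # "url: db://local\npool: 5\n"
```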

Advanced Usage

The squiggly Here Document operator strips common leading whitespace from all lines, enabling proper source code indentation while maintaining relative spacing in the resulting string. Ruby calculates the minimum indentation across all non-empty lines and removes that amount from each line.

class EmailTemplate
  def welcome_message(user)
    <<~MESSAGE
      Dear #{user.name},
      
      Welcome to our platform! Your account has been created
      with the following details:
      
        Email: #{user.email}
        Account ID: #{user.id}
        
      Thank you for joining us.
      
      Best regards,
      The Team
    MESSAGE
  end
end
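A shorter version of the same idea shows the exact result. DemoUser and summary are hypothetical names introduced for this sketch; the point is that the common indentation disappears while the extra two spaces before "Email" survive:

```ruby
# Hypothetical stand-in for a user record.
DemoUser = Struct.new(:name, :email)

def summary(user)
  <<~TEXT
    Dear #{user.name},

      Email: #{user.email}
  TEXT
end

result = summary(DemoUser.new("Ann", "ann@example.com"))
p result  # "Dear Ann,\n\n  Email: ann@example.com\n"
```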

Here Documents can be assigned to variables, passed as method arguments, or used in any context expecting a string value. When a Here Document is a method argument, only the opener sits in the argument list; the body starts on the following line, which keeps the call readable for methods that accept large string parameters.

# Here Document as method argument
execute_sql <<~SQL, user_id, start_date
  SELECT *
  FROM user_activities
  WHERE user_id = ?
    AND created_at >= ?
  ORDER BY created_at DESC
SQL

# Multiple Here Documents in a single expression; bodies follow in opener order
# (css_content is assumed to be defined earlier)
template = <<~HTML + <<~CSS
  <style>
  #{css_content}
  </style>
  <body>
    <h1>Page Title</h1>
  </body>
HTML
  body { margin: 0; }
  h1 { color: blue; }
CSS
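The argument and multi-heredoc forms can be tried without a database. In this sketch, execute_sql is a hypothetical stand-in that only reports what it received:

```ruby
# Hypothetical stand-in for a database call.
def execute_sql(sql, *binds)
  "#{sql.split.join(' ')} -- binds: #{binds.inspect}"
end

# The opener sits in the argument list; the body starts on the next line.
result = execute_sql(<<~SQL, 42, "2023-01-01")
  SELECT *
  FROM user_activities
  WHERE user_id = ?
    AND created_at >= ?
SQL

puts result

# Two openers on one line: their bodies follow in the same order.
banner = <<~TOP + <<~BOTTOM
  == Report ==
TOP
  == End ==
BOTTOM

p banner  # "== Report ==\n== End ==\n"
```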

Complex interpolation patterns work within Here Documents, including nested string literals, method calls, and conditional expressions. Ruby evaluates these expressions in the context where the Here Document appears, accessing local variables, instance variables, and methods.

class ReportGenerator
  def initialize(data)
    @data = data
  end

  def generate_report(format: :text)
    case format
    when :text then text_report
    when :html then html_report
    end
  end

  private

  def text_report
    <<~REPORT
      Data Summary Report
      #{'-' * 50}
      
      Total Records: #{@data.size}
      Generated: #{Time.now.strftime('%Y-%m-%d %H:%M:%S')}
      
      #{@data.empty? ? 'No data available' : detailed_breakdown}
    REPORT
  end

  def detailed_breakdown
    @data.map.with_index do |item, index|
      <<~ITEM.strip
        #{index + 1}. #{item[:title]}
           Status: #{item[:status] || 'Unknown'}
           Value: #{item[:value] ? "$#{'%.2f' % item[:value]}" : 'N/A'}
      ITEM
    end.join("\n")
  end
end

Here Documents support chaining with other string methods, enabling immediate transformation of the multi-line content. This approach combines the readability benefits of Here Documents with Ruby's string processing capabilities.

processed_content = <<~CONTENT.gsub(/\s+/, ' ').strip.downcase
  This    is   some
  poorly formatted    text
  with   extra   whitespace
CONTENT
# => "this is some poorly formatted text with extra whitespace"

normalized_sql = <<~SQL.split.join(' ')
  SELECT users.name,
         users.email,
         COUNT(orders.id) as order_count
  FROM users
  LEFT JOIN orders ON users.id = orders.user_id
  GROUP BY users.id
SQL
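Chained methods attach to the opener, not to the closing delimiter. As a further sketch, the same pattern can turn a Here Document into an array or a single line:

```ruby
# Method calls go on the opening line; the body and terminator are unchanged.
fruits = <<~LIST.lines.map(&:chomp).reject(&:empty?)
  apple
  banana

  cherry
LIST

p fruits  # ["apple", "banana", "cherry"]

one_line = <<~SQL.split.join(" ")
  SELECT name,
         email
  FROM users
SQL

p one_line  # "SELECT name, email FROM users"
```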

Common Pitfalls

Delimiter matching requires exact string equality, including case. The closing line must contain the delimiter and nothing else; otherwise Ruby treats that line as content and keeps reading. The result is either a syntax error at the end of the file or a string that captures unintended content.

# Incorrect - case mismatch
broken = <<text
This will cause problems
TEXT
# SyntaxError (exact wording varies by Ruby version):
# can't find string "text" anywhere before EOF

# Incorrect - extra characters on closing line
broken = <<DELIMITER
Content here
DELIMITER # This comment breaks it
# The heredoc continues past this line until a bare DELIMITER line or end of file

# Correct approach
content = <<DELIMITER
Multi-line content
goes here
DELIMITER
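These cases can be checked without breaking the surrounding file by evaluating small source strings. This is only a sketch: parses? is a hypothetical helper, and eval is used for convenience, so it should be fed trusted snippets only.

```ruby
# Hypothetical helper: true if the snippet parses (and runs), false on SyntaxError.
# SyntaxError is not a StandardError, so it must be rescued by name.
def parses?(source)
  eval(source)
  true
rescue SyntaxError
  false
end

mismatched = "text = <<EOF\nbody\neof\n"            # closing line has the wrong case
trailing   = "text = <<EOF\nbody\nEOF # comment\n"  # extra characters after the delimiter
correct    = "text = <<EOF\nbody\nEOF\n"

p parses?(mismatched)  # false
p parses?(trailing)    # false
p parses?(correct)     # true
```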

Indentation behavior differs significantly between Here Document variants. The standard << operator preserves all whitespace exactly as written, while <<- only affects delimiter placement, and <<~ strips common leading whitespace. Mixing these approaches creates inconsistent string formatting.

# Standard heredoc preserves exact indentation
def method_one
  text = <<CONTENT
This line has no indentation
  This line has two spaces
    This line has four spaces
CONTENT
end

# Squiggly heredoc strips common indentation
def method_two
  text = <<~CONTENT
    This becomes: "This line has no indentation"
      This becomes: "  This line has two spaces"
        This becomes: "    This line has four spaces"
  CONTENT
end

Variable interpolation within Here Documents follows the same scoping rules as regular strings, but the multi-line context can obscure variable availability issues. Variables referenced in Here Documents must be accessible at the point where the Here Document begins, not where individual lines appear.

def generate_message(user)
  # This works - user is available
  message = <<~TEXT
    Welcome, #{user.name}!
    Your account: #{user.id}
  TEXT

  if user.premium?
    # This fails with NameError - premium_features is assigned only after the heredoc
    details = <<~DETAILS
      Premium features: #{premium_features.join(', ')}
    DETAILS
    
    premium_features = ['feature1', 'feature2']
  end
end
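The failure and its fix can be reproduced in isolation. In this sketch both method names are hypothetical, and the first method rescues the error only so that the outcome is visible:

```ruby
def premium_summary
  # premium_features has not been assigned yet when the heredoc is evaluated.
  details = <<~DETAILS
    Premium features: #{premium_features.join(', ')}
  DETAILS
  premium_features = %w[feature1 feature2]
  details
rescue NameError => e
  "NameError: #{e.name}"
end

def premium_summary_fixed
  # Assign first, then interpolate.
  premium_features = %w[feature1 feature2]
  <<~DETAILS
    Premium features: #{premium_features.join(', ')}
  DETAILS
end

p premium_summary        # "NameError: premium_features"
p premium_summary_fixed  # "Premium features: feature1, feature2\n"
```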

Here Documents that contain the source of other Here Documents, as happens when generating code or templates, need distinct delimiter names. The inner Here Document is just text inside the outer body, and Ruby ends the outer Here Document at the first line that qualifies as its terminator. With <<, only an unindented line matches; with <<- or <<~, an indented line matches too, so an indented inner delimiter with the same name closes the outer Here Document early.

# Problematic - delimiter conflict
template = <<-TEMPLATE
def generate_text
  content = <<-TEMPLATE
    This inner delimiter matches the outer one
  TEMPLATE
end
TEMPLATE
# Ruby closes the outer heredoc at the indented inner TEMPLATE line

# Solution - use distinct delimiters
template = <<OUTER
def generate_text
  content = <<INNER
    This works correctly
  INNER
end
OUTER
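When the generated code itself uses interpolation, a single-quoted outer delimiter keeps the inner #{...} from being evaluated too early. The sketch below builds a small snippet and evaluates it; the delimiter names are arbitrary, and eval is used only to show that the generated text is valid Ruby:

```ruby
# The single-quoted OUTER delimiter leaves the inner #{...} untouched.
generated = <<~'OUTER'
  greeting = <<~INNER
    Hello from #{1 + 1} levels deep
  INNER
  greeting.upcase
OUTER

p generated.include?('#{1 + 1}')  # true: still literal text in the outer string
p eval(generated)                 # "HELLO FROM 2 LEVELS DEEP\n"
```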

Allocation cost matters when Here Documents hold large amounts of data or are evaluated frequently in loops. Each evaluation of an interpolated Here Document builds a new string object, which can add memory pressure in high-volume scenarios.

# Memory-intensive pattern
large_strings = []
1000.times do |i|
  large_strings << <<~DATA
    This creates a new string object each iteration
    Data for item #{i}
    #{('x' * 1000)}
  DATA
end

# Alternative: define the template once and fill it per item
# (each iteration still allocates its own result string)
template = <<~TEMPLATE
  This creates a new string object each iteration
  Data for item %d
  %s
TEMPLATE

large_strings = []
1000.times do |i|
  large_strings << (template % [i, 'x' * 1000])
end
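A variation on the template approach uses named references with Kernel#format, which keeps the placeholders readable as the template grows. This is a sketch with hypothetical field names; it still allocates one result string per row:

```ruby
# Template defined once; %<name>d and %<name>s are named format references.
row_template = <<~ROW
  Item %<id>d
  Payload: %<payload>s
ROW

rows = (1..3).map { |i| format(row_template, id: i, payload: "x" * i) }

puts rows.last
# Item 3
# Payload: xxx
```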

Reference

Here Document Syntax Variants

| Syntax | Closing Delimiter | Interpolation | Whitespace Handling |
| --- | --- | --- | --- |
| <<DELIMITER | Must be unindented | Enabled | Preserved exactly |
| <<'DELIMITER' | Must be unindented | Disabled | Preserved exactly |
| <<"DELIMITER" | Must be unindented | Enabled | Preserved exactly |
| <<-DELIMITER | Can be indented | Enabled | Preserved exactly |
| <<-'DELIMITER' | Can be indented | Disabled | Preserved exactly |
| <<~DELIMITER | Can be indented | Enabled | Common indentation stripped |
| <<~'DELIMITER' | Can be indented | Disabled | Common indentation stripped |
| <<`DELIMITER` | Must be unindented | Enabled, then executed as a shell command | Command output returned |

Delimiter Naming Rules

| Rule (unquoted delimiters) | Valid Examples | Invalid Examples |
| --- | --- | --- |
| Must be valid identifier start | EOF, HTML, _CONTENT | 123ABC, -DELIMITER |
| Can contain letters, numbers, underscores | DATA_2023, HTMLv2 | DATA-2023, HTML.v2 |
| Case sensitive | EOF and eof are different delimiters | N/A |
| Avoid Ruby keywords | TEXT, CONTENT | class, def, end |

Interpolation Behavior

| Sequence | Single Quotes 'DEL' | Unquoted DEL | Double Quotes "DEL" |
| --- | --- | --- | --- |
| #{expression} | Literal text | Interpolated | Interpolated |
| \n | Literal backslash-n | Newline | Newline |
| \t | Literal backslash-t | Tab character | Tab character |
| \" | Literal backslash-quote | Double quote | Double quote |
| \\ | Two literal backslashes | Single backslash | Single backslash |

Whitespace Stripping Rules (<<~)

| Line Content | Leading Spaces | Result After Stripping (minimum indentation is 2) |
| --- | --- | --- |
| First line | 4 | 2 spaces remain |
| Second line | 2 | Becomes unindented |
| (empty line) | 0 | Remains empty; ignored when computing the minimum |
| Third line | 6 | 4 spaces remain |
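The stripping rule can be verified with a Here Document whose lines have four, two, zero (empty), and six leading spaces. The least-indented non-empty line has two spaces, so two are removed from every line:

```ruby
text = <<~TEXT
    four spaces
  two spaces

      six spaces
TEXT

p text  # "  four spaces\ntwo spaces\n\n    six spaces\n"
```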

Common Methods with Here Documents

| Method | Usage | Result |
| --- | --- | --- |
| String#strip | <<~TEXT.strip | Remove leading/trailing whitespace |
| String#gsub | <<~TEXT.gsub(/\s+/, ' ') | Normalize whitespace |
| String#split | <<~TEXT.split("\n") | Convert to array of lines (newlines removed) |
| String#chomp | <<~TEXT.chomp | Remove final newline |
| String#lines | <<~TEXT.lines | Array of lines (newlines kept) |
| String#+ | <<~A + <<~B | Concatenate multiple heredocs |

Performance Characteristics

| Scenario | Memory Usage | Performance Notes |
| --- | --- | --- |
| Static content | One string object per evaluation | No interpolation work at runtime |
| Interpolated content | New allocation per evaluation | Expressions evaluated at runtime |
| Large heredocs (>10KB) | Proportional to content size | Consider alternatives for huge content |
| Heredocs in loops | N allocations for N iterations | Cache template strings when possible |
| Nested interpolation | Multiple string operations | Complex expressions add overhead |

Error Patterns

| Error Type | Cause | Solution |
| --- | --- | --- |
| SyntaxError: can't find string "DELIMITER" anywhere before EOF | Missing or mismatched closing delimiter | Check delimiter spelling/case |
| NameError: undefined local variable or method | Variable not in scope | Define variables before heredoc |
| Unexpected content inclusion | Characters after closing delimiter | Keep delimiter line clean |
| Incorrect indentation | Wrong heredoc variant | Use <<~ for indentation stripping |