CrackedRuby - Input Validation

Overview

Input validation in Ruby encompasses the techniques and patterns used to verify, sanitize, and transform user-provided data before processing. Ruby provides multiple approaches to validation through built-in methods, regular expressions, type checking mechanisms, and custom validation frameworks.

Ruby's validation capabilities span from basic type checking using methods like Integer() and Float() to complex pattern matching with regular expressions. The language includes string validation methods such as String#match?, numeric range checking, and length validation through String#size and Array#length. Ruby's duck typing system requires explicit validation when type safety matters.

# Basic type validation
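def validate_age(input)
  age = begin
    Integer(input)
  rescue ArgumentError, TypeError  # TypeError covers nil input
    raise ArgumentError, "Invalid age format"
  end

  # Checked outside the rescue so this message is not replaced by the format error
  raise ArgumentError, "Age must be positive" unless age > 0
  age
end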

# Pattern-based validation
EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i

def validate_email(email)
  return false unless email.is_a?(String)
  email.match?(EMAIL_REGEX)
end

Ruby applications typically implement validation at multiple layers: input sanitization, format validation, business rule validation, and output encoding. The standard library provides URI for URL parsing, Date and Time for temporal validation, and JSON for parsing structured data (it checks syntax, not schema).
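
As a minimal sketch of the standard-library checks mentioned above, the helpers below wrap URI, Date, and JSON parsing in predicate methods. The method names (valid_http_url?, valid_iso_date?, valid_json_object?) are illustrative and not part of any library; each one treats a parsing exception as a failed validation.

```ruby
require 'uri'
require 'date'
require 'json'

# Hypothetical helper: true only for http/https URLs that include a host.
def valid_http_url?(value)
  return false unless value.is_a?(String)

  uri = URI.parse(value)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: true for calendar-valid ISO 8601 dates such as "2024-02-29".
def valid_iso_date?(value)
  return false unless value.is_a?(String)

  Date.iso8601(value)
  true
rescue ArgumentError  # Date::Error is a subclass of ArgumentError
  false
end

# Hypothetical helper: true when the text parses as a JSON object.
def valid_json_object?(value)
  return false unless value.is_a?(String)

  JSON.parse(value).is_a?(Hash)
rescue JSON::ParserError
  false
end
```

Each helper only answers whether the input parses; business rules such as allowed hosts or date ranges would still need their own checks.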

Validation in Ruby serves three primary purposes: preventing application errors from invalid data, enforcing business rules, and reducing exposure to security vulnerabilities like SQL injection and cross-site scripting (parameterized queries and output encoding remain the primary defenses against those). Because Ruby is dynamically typed and rarely coerces values implicitly, nothing rejects a wrongly typed value until the code that uses it runs, so validation has to be explicit.

Basic Usage

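Ruby input validation begins with type checking and format verification. Integer() and Float() perform strict conversion: they raise ArgumentError for strings that are not valid numeric literals and TypeError for nil, which makes them effective validators. String() is far more permissive and converts almost any object through to_str or to_s. Integer() also accepts Ruby literal syntax such as "0x1A", "0b101", and "010" (octal); pass an explicit base, as in Integer(value, 10), to rule out the prefixed and octal forms.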

def validate_numeric_input(value)
  # Integer() raises ArgumentError for non-numeric strings
  number = Integer(value)
  
  # Additional range validation
  raise ArgumentError, "Number out of range" unless (1..100).cover?(number)
  number
end

# Usage examples
validate_numeric_input("42")    # => 42
validate_numeric_input("abc")   # raises ArgumentError: invalid value for Integer(): "abc"
validate_numeric_input("150")   # raises ArgumentError: Number out of range
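
The sketch below illustrates how Integer() and Float() treat a few borderline strings, and how the exception: false keyword (available since Ruby 2.6) turns them into nil-returning checks. parse_decimal is a hypothetical helper name used only for this example.

```ruby
# Hypothetical helper: returns an Integer for decimal strings, nil otherwise.
def parse_decimal(text)
  return nil unless text.is_a?(String)

  # Base 10 rejects "0x1A" and reads "010" as ten rather than octal eight;
  # exception: false returns nil instead of raising.
  Integer(text, 10, exception: false)
end

Integer("0x1A")                 # hexadecimal prefix accepted => 26
Integer("010")                  # leading zero means octal    => 8
parse_decimal("010")            # => 10
parse_decimal("0x1A")           # => nil
parse_decimal("12abc")          # => nil
Float("1e3")                    # scientific notation accepted => 1000.0
Float("abc", exception: false)  # => nil
```

Returning nil keeps the caller free to decide whether a bad value is an error or simply missing.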

String validation commonly uses length checks and pattern matching. Ruby's String#size method returns character count, while regular expressions provide pattern-based validation for formats like emails, phone numbers, and identifiers.

class InputValidator
  USERNAME_PATTERN = /\A[a-zA-Z0-9_]{3,20}\z/
  
  def self.validate_username(username)
    return false unless username.is_a?(String)
    return false if username.empty?
    
    username.match?(USERNAME_PATTERN)
  end
  
  def self.validate_password(password)
    return false unless password.is_a?(String)
    return false if password.length < 8
    return false unless password.match?(/[A-Z]/)  # uppercase
    return false unless password.match?(/[a-z]/)  # lowercase
    return false unless password.match?(/\d/)     # digit
    
    true
  end
end

Array and hash validation requires checking structure and content. Ruby provides methods like Array#all? and Hash#key? for validating collections and their elements.

def validate_user_data(data)
  # Check if data is a hash
  raise ArgumentError, "Data must be a hash" unless data.is_a?(Hash)
  
  # Required fields validation (keys must be symbols, matching the lookups below)
  required_fields = %i[name email age]
  missing_fields = required_fields - data.keys
  raise ArgumentError, "Missing fields: #{missing_fields.join(', ')}" unless missing_fields.empty?
  
  # Individual field validation
  raise ArgumentError, "Invalid name" unless data[:name].is_a?(String) && !data[:name].empty?
  raise ArgumentError, "Invalid email" unless validate_email(data[:email])
  raise ArgumentError, "Invalid age" unless data[:age].is_a?(Integer) && data[:age] > 0
  
  true
end

# Array element validation
def validate_tags(tags)
  return false unless tags.is_a?(Array)
  return false if tags.empty?
  
  tags.all? { |tag| tag.is_a?(String) && tag.match?(/\A[a-z0-9-]{2,30}\z/) }
end

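Date and time validation uses Ruby's Date and Time classes with parsing methods that raise exceptions for invalid formats. The Date.strptime method allows custom format specification, while Date.parse guesses the format heuristically and is therefore more lenient about what it accepts.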

require 'date'

def validate_date(date_string, format = '%Y-%m-%d')
  Date.strptime(date_string, format)
  true
rescue ArgumentError, TypeError  # Date::Error is a subclass of ArgumentError
  false
end

def validate_date_range(start_date, end_date)
  begin
    start_parsed = Date.parse(start_date)
    end_parsed = Date.parse(end_date)
  rescue ArgumentError, TypeError => e
    raise ArgumentError, "Invalid date format: #{e.message}"
  end

  # Raised outside the rescue so it is not relabeled as a format error
  raise ArgumentError, "End date must be after start date" if end_parsed <= start_parsed

  { start: start_parsed, end: end_parsed }
end
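
Where a single fixed format is required, one option is to check the shape with an anchored regex and then confirm the calendar date with Date.valid_date?. The sketch below assumes the YYYY-MM-DD format; strict_iso_date and ISO_DATE_PATTERN are illustrative names.

```ruby
require 'date'

ISO_DATE_PATTERN = /\A(\d{4})-(\d{2})-(\d{2})\z/

# Hypothetical helper: returns a Date for a well-formed, calendar-valid
# YYYY-MM-DD string and nil for anything else.
def strict_iso_date(text)
  return nil unless text.is_a?(String)

  match = ISO_DATE_PATTERN.match(text)
  return nil unless match

  # Base 10 keeps zero-padded parts such as "08" from being read as octal.
  year, month, day = match.captures.map { |part| Integer(part, 10) }
  Date.valid_date?(year, month, day) ? Date.new(year, month, day) : nil
end
```

Splitting the shape check from the calendar check makes it easy to report "wrong format" and "no such date" as separate errors.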

Error Handling & Debugging

Input validation error handling in Ruby requires distinguishing between different failure types and providing meaningful error messages. Ruby's exception hierarchy allows catching specific validation errors while letting system errors propagate.

class ValidationError < StandardError
  attr_reader :field, :value, :constraint
  
  def initialize(field, value, constraint, message = nil)
    @field = field
    @value = value
    @constraint = constraint
    super(message || "Validation failed for #{field}: #{constraint}")
  end
end

class DataValidator
  def self.validate_user_registration(params)
    errors = {}
    
    # Email validation with specific error types
    begin
      validate_email_format(params[:email])
    rescue ValidationError => e
      errors[:email] = e.message
    end
    
    # Password validation with multiple constraints
    begin
      validate_password_strength(params[:password])
    rescue ValidationError => e
      errors[:password] = e.message
    end
    
    # Age validation with range checking
    begin
      validate_age_range(params[:age])
    rescue ValidationError => e
      errors[:age] = e.message
    end
    
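    unless errors.empty?
      # ValidationError#initialize takes (field, value, constraint, message), so build it explicitly
      raise ValidationError.new(:base, nil, "multiple failures", "Multiple validation errors: #{errors}")
    end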
    true
  end
  
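  # `private` has no effect on `def self.` methods, so the helpers are hidden
  # with private_class_method at the end of the class instead

  def self.validate_email_format(email)
    raise ValidationError.new(:email, email, "required") if email.nil? || email.empty?
    raise ValidationError.new(:email, email, "invalid format") unless email.match?(EMAIL_REGEX)
    raise ValidationError.new(:email, email, "too long") if email.length > 254
  end

  def self.validate_password_strength(password)
    raise ValidationError.new(:password, "[hidden]", "required") if password.nil? || password.empty?
    raise ValidationError.new(:password, "[hidden]", "too short") if password.length < 8
    raise ValidationError.new(:password, "[hidden]", "missing uppercase") unless password.match?(/[A-Z]/)
    raise ValidationError.new(:password, "[hidden]", "missing lowercase") unless password.match?(/[a-z]/)
    raise ValidationError.new(:password, "[hidden]", "missing digit") unless password.match?(/\d/)
  end

  # Minimal age check so the call above resolves; the 13..120 bounds are illustrative
  def self.validate_age_range(age)
    raise ValidationError.new(:age, age, "required") if age.nil?
    raise ValidationError.new(:age, age, "out of range") unless age.is_a?(Integer) && age.between?(13, 120)
  end

  private_class_method :validate_email_format, :validate_password_strength, :validate_age_range
end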

Debugging validation failures requires logging both the input values and the validation rules that failed. Ruby's logging capabilities help track validation patterns and identify problematic inputs. Mask secrets such as passwords before they reach the log.

require 'logger'

class DebugValidator
  def initialize(logger = Logger.new(STDOUT))
    @logger = logger
  end
  
  def validate_with_debugging(value, validators)
    @logger.info("Starting validation for value: #{value.inspect}")
    
    validators.each_with_index do |validator, index|
      begin
        result = validator.call(value)
      rescue => e
        @logger.error("Validator #{index + 1} failed: #{e.message}")
        @logger.debug("Validator details: #{validator}")
        @logger.debug("Stack trace: #{e.backtrace.first(3)}")
        raise
      end
      
      # Stop at the first validator that rejects the value; otherwise run the next one
      unless result
        @logger.warn("Validator #{index + 1} rejected the value")
        return false
      end
      @logger.info("Validator #{index + 1} passed: #{validator}")
    end
    
    # Every validator accepted the value
    true
  end
end

# Usage with debugging
validator = DebugValidator.new
validators = [
  ->(v) { raise "Too short" if v.length < 3; true },
  ->(v) { raise "Invalid chars" unless v.match?(/\A[a-z]+\z/); true },
  ->(v) { raise "Reserved word" if %w[admin root].include?(v); true }
]

begin
  validator.validate_with_debugging("admin", validators)
rescue => e
  puts "Validation failed: #{e.message}"
end
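
Because validation logs tend to contain raw user input, a small filtering step before logging is worth considering. The sketch below masks a configurable set of keys; redact_for_log and SENSITIVE_KEYS are hypothetical names, and the key list is an assumption to adapt to your own parameters.

```ruby
require 'logger'
require 'stringio'

# Assumed list of keys that must never appear in logs.
SENSITIVE_KEYS = %i[password password_confirmation token].freeze

# Hypothetical helper: returns a copy of params with sensitive values masked.
# The original hash is left untouched.
def redact_for_log(params)
  params.each_with_object({}) do |(key, value), safe|
    safe[key] = SENSITIVE_KEYS.include?(key.to_sym) ? "[FILTERED]" : value
  end
end

log_output = StringIO.new
logger = Logger.new(log_output)
params = { email: "user@example.com", password: "SecurePass1" }
logger.info("Validating: #{redact_for_log(params).inspect}")
```

Masking by key name is coarse, but it keeps the most obvious secrets out of log files without changing the validators themselves.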

Complex validation scenarios benefit from validation result objects that capture both success state and detailed error information. This approach separates validation logic from error handling.

class ValidationResult
  attr_reader :valid, :errors, :warnings
  
  def initialize(valid = true)
    @valid = valid
    @errors = []
    @warnings = []
  end
  
  def add_error(field, message)
    @errors << { field: field, message: message }
    @valid = false
  end
  
  def add_warning(field, message)
    @warnings << { field: field, message: message }
  end
  
  def valid?
    @valid && @errors.empty?
  end
  
  def error_messages
    @errors.map { |e| "#{e[:field]}: #{e[:message]}" }
  end
end

class ComprehensiveValidator
  def validate_user_profile(profile)
    result = ValidationResult.new
    
    # Required field validation
    %w[name email].each do |field|
      if profile[field.to_sym].nil? || profile[field.to_sym].empty?
        result.add_error(field, "is required")
      end
    end
    
    # Email format validation
    if profile[:email] && !profile[:email].match?(EMAIL_REGEX)
      result.add_error("email", "invalid format")
    end
    
    # Age validation with warnings
    if profile[:age]
      if profile[:age] < 13
        result.add_error("age", "must be at least 13")
      elsif profile[:age] > 120
        result.add_warning("age", "unusually high age")
      end
    end
    
    result
  end
end

Performance & Memory

Input validation performance in Ruby depends heavily on the validation methods chosen and the size of data being processed. Regular expressions, string operations, and type conversions have different performance characteristics that affect large-scale applications.

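Regular expression performance varies significantly based on pattern complexity and input string length. Simple patterns like \A[a-z]+\z perform faster than complex lookahead patterns. Ruby compiles a regex literal without interpolation once, when the code is parsed, so moving it into a constant mainly helps readability and reuse; the cost worth avoiding is rebuilding a pattern on every call with Regexp.new or an interpolated literal.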

# Performance comparison of validation approaches
require 'benchmark'

class PerformanceValidator
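  # Named constants keep patterns reusable; literals are compiled once either way,
  # and .freeze is redundant on Ruby 3.0+, where Regexp literals are already frozen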
  EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i.freeze
  PHONE_REGEX = /\A\+?[\d\s\-\(\)]{10,15}\z/.freeze
  
  def self.benchmark_email_validation(emails)
    Benchmark.bm(15) do |x|
      # Regex validation
      x.report("Regex match") do
        emails.each { |email| email.match?(EMAIL_REGEX) }
      end
      
      # String method validation
      x.report("String methods") do
        emails.each do |email|
          email.include?("@") && 
          email.count("@") == 1 && 
          email.length > 5 && 
          email.length < 255
        end
      end
      
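      # Library-based approach; uri is loaded outside the timed block
      require 'uri'
      x.report("URI parsing") do
        emails.each do |email|
          begin
            # build expects a components array or hash, not a bare string
            URI::MailTo.build(to: email)
            true
          rescue URI::InvalidComponentError, ArgumentError
            false
          end
        end
      end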
    end
  end
end

# Memory-efficient validation for large datasets
class BatchValidator
  def validate_large_dataset(data_stream)
    valid_count = 0
    error_count = 0
    
    data_stream.each_slice(1000) do |batch|
      batch.each do |record|
        if validate_record(record)
          valid_count += 1
        else
          error_count += 1
        end
      end
      
      # Optional: each batch becomes unreachable on its own, so an explicit GC run
      # is rarely needed; measure before keeping this line
      GC.start if (valid_count + error_count) % 10000 == 0
    end
    
    { valid: valid_count, errors: error_count }
  end
  
  private
  
  def validate_record(record)
    # Efficient validation that doesn't create large intermediate objects
    return false unless record.is_a?(Hash)
    return false unless record[:id].to_s.match?(/\A\d+\z/)
    return false unless record[:name].to_s.length.between?(1, 100)
    
    true
  end
end
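
The following sketch shows the difference between a regex literal and a pattern rebuilt at call time. literal_pattern, dynamic_pattern, and PatternCache are illustrative names; the cache is one possible way to reuse dynamically built patterns, not a standard library feature.

```ruby
# A literal without interpolation is compiled once; every call returns the same object.
def literal_pattern
  /\A[a-z0-9_]{3,20}\z/
end

# An interpolated literal is compiled again on every call.
def dynamic_pattern(min, max)
  /\A[a-z0-9_]{#{min},#{max}}\z/
end

# Hypothetical cache so each distinct (min, max) pair is compiled only once.
class PatternCache
  def initialize
    @patterns = {}
  end

  def fetch(min, max)
    @patterns[[min, max]] ||= /\A[a-z0-9_]{#{min},#{max}}\z/
  end
end

cache = PatternCache.new
cache.fetch(3, 20).match?("valid_name")  # compiled on first use
cache.fetch(3, 20).match?("other_name")  # reuses the cached Regexp
```

Whether the cache pays off depends on how many distinct patterns exist and how often each is used, so it is worth benchmarking with realistic input.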

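String validation performance benefits from choosing appropriate methods. String#match? performs faster than String#match when only boolean results are needed, because it skips building a MatchData object and setting $~. Length validation using String#bytesize differs from String#length for multibyte characters, so choose the one that matches the limit being enforced: bytes for storage, characters for display.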

class OptimizedStringValidator
  # Fast string validation methods
  def self.fast_length_check(string, min, max)
    # bytesize counts bytes, not characters; it equals length only for ASCII-only input
    size = string.bytesize
    size >= min && size <= max
  end
  
  def self.validate_ascii_only(string)
    # ASCII validation is faster than full unicode checks
    string.ascii_only? && string.valid_encoding?
  end
  
  def self.validate_numeric_string(string)
    # Avoids the regex engine; benchmark against match?(/\A\d+\z/) before assuming it is faster
    return false if string.empty?
    
    string.each_byte do |byte|
      return false unless byte >= 48 && byte <= 57  # '0' to '9'
    end
    
    true
  end
  
  # Memory-efficient validation for large strings
  def self.validate_large_text(text, max_words = 1000)
    word_count = 0
    
    text.scan(/\S+/) do |word|
      word_count += 1
      return false if word_count > max_words
    end
    
    true
  end
end

# Performance testing framework
class ValidationBenchmark
  def self.compare_validation_methods(dataset)
    puts "Dataset size: #{dataset.size} records"
    
    Benchmark.bm(20) do |x|
      x.report("Regex validation") do
        dataset.each { |item| item.match?(/\A[a-z0-9_]{3,20}\z/) }
      end
      
      x.report("Length + char check") do
        dataset.each do |item|
          item.length.between?(3, 20) && 
          item.chars.all? { |c| c.match?(/[a-z0-9_]/) }
        end
      end
      
      x.report("Byte-level check") do
        dataset.each do |item|
          next false unless item.length.between?(3, 20)
          
          item.each_byte.all? do |byte|
            (byte >= 97 && byte <= 122) ||  # a-z
            (byte >= 48 && byte <= 57) ||   # 0-9
            byte == 95                      # _
          end
        end
      end
    end
  end
end

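Memory management becomes critical when validating large datasets. Ruby's garbage collector can be triggered manually with GC.start, though it normally schedules itself well and forced runs add pause time, so treat them as a measurement aid rather than a default. Validation methods should avoid creating unnecessary intermediate objects.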

class MemoryEfficientValidator
  def validate_csv_stream(file_path, chunk_size = 5000)
    require 'csv'
    
    valid_records = 0
    invalid_records = []
    memory_usage = []
    
    CSV.foreach(file_path, headers: true).each_slice(chunk_size).with_index do |chunk, index|
      # Record memory usage before processing
      memory_before = GC.stat[:heap_live_slots]
      
      chunk_results = chunk.map { |row| validate_csv_row(row) }
      valid_records += chunk_results.count(true)
      
      # Collect only invalid record references, not full data
      chunk.each_with_index do |row, row_index|
        unless chunk_results[row_index]
          invalid_records << { chunk: index, row: row_index, id: row['id'] }
        end
      end
      
      # Force garbage collection and measure memory
      GC.start
      memory_after = GC.stat[:heap_live_slots]
      memory_usage << { chunk: index, before: memory_before, after: memory_after }
      
      # Log progress for large files
      puts "Processed chunk #{index + 1}, valid: #{valid_records}, memory delta: #{memory_after - memory_before}" if index % 10 == 0
    end
    
    {
      valid_count: valid_records,
      invalid_count: invalid_records.size,
      invalid_records: invalid_records,
      memory_profile: memory_usage
    }
  end
  
  private
  
  def validate_csv_row(row)
    # Efficient validation without creating intermediate objects
    return false unless row['id']&.match?(/\A\d+\z/)
    return false unless row['email']&.include?('@')
    return false unless row['name']&.length&.between?(1, 100)
    
    true
  end
end

Testing Strategies

Testing input validation requires comprehensive coverage of boundary conditions, edge cases, and error scenarios. Ruby's testing frameworks provide tools for parameterized testing, exception assertions, and property-based testing approaches.

RSpec and Minitest both support validation testing patterns. Testing strategies should cover valid inputs, invalid inputs, boundary values, and error handling behavior. Property-based testing helps discover edge cases that manual test cases might miss.

# RSpec validation testing patterns
RSpec.describe InputValidator do
  describe '#validate_email' do
    # Valid email test cases
    valid_emails = [
      'user@example.com',
      'test.email+tag@example.co.uk',
      'user123@sub.domain.com'
    ]
    
    valid_emails.each do |email|
      it "accepts valid email: #{email}" do
        # validate_email is the top-level helper defined earlier, so it is called directly
        expect(validate_email(email)).to be true
      end
    end
    
    # Invalid email test cases with specific error messages
    invalid_email_cases = [
      { email: nil, reason: 'nil input' },
      { email: '', reason: 'empty string' },
      { email: 'invalid', reason: 'missing @ symbol' },
      { email: '@example.com', reason: 'missing local part' },
      { email: 'user@', reason: 'missing domain' },
      { email: 'user@.com', reason: 'invalid domain' },
      { email: 'user name@example.com', reason: 'contains space' }
    ]
    
    invalid_email_cases.each do |test_case|
      it "rejects #{test_case[:reason]}: #{test_case[:email].inspect}" do
        # validate_email is the top-level helper defined earlier, so it is called directly
        expect(validate_email(test_case[:email])).to be false
      end
    end
  end
  
  describe '.validate_password' do
    # Boundary testing for password length
    context 'password length validation' do
      it 'rejects passwords shorter than 8 characters' do
        # Seven characters that satisfy every other rule, so only the length can fail
        expect(InputValidator.validate_password('Abcde12')).to be false
      end
      
      it 'accepts passwords exactly 8 characters' do
        expect(InputValidator.validate_password('Abcdef12')).to be true
      end
      
      it 'accepts very long passwords' do
        long_password = 'A' + 'a' * 100 + '1'
        expect(InputValidator.validate_password(long_password)).to be true
      end
    end
    
    # Character requirement testing
    context 'character requirements' do
      it 'requires uppercase letter' do
        expect(InputValidator.validate_password('lowercase123')).to be false
        expect(InputValidator.validate_password('Uppercase123')).to be true
      end
      
      it 'requires lowercase letter' do
        expect(InputValidator.validate_password('UPPERCASE123')).to be false
        expect(InputValidator.validate_password('Uppercase123')).to be true
      end
      
      it 'requires digit' do
        expect(InputValidator.validate_password('NoDigits')).to be false
        expect(InputValidator.validate_password('WithDigit1')).to be true
      end
    end
  end
end

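# Minitest validation testing with helper methods
# Assumes a validator class with instance methods validate_age (accepting 1..120 and
# raising ValidationError otherwise) and validate_username; DataValidator as defined
# earlier exposes only class methods, so adapt the setup to your own class
require 'minitest/autorun'

class ValidationTest < Minitest::Test
  def setup
    @validator = DataValidator.new
  end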
  
  def test_age_validation_boundaries
    # Test boundary values
    assert_raises(ValidationError) { @validator.validate_age(0) }
    assert @validator.validate_age(1)
    assert @validator.validate_age(120)
    assert_raises(ValidationError) { @validator.validate_age(121) }
  end
  
  def test_age_validation_types
    # Test different input types
    assert @validator.validate_age(25)
    assert @validator.validate_age("25")  # String conversion
    
    assert_raises(ValidationError) { @validator.validate_age("abc") }
    assert_raises(ValidationError) { @validator.validate_age(nil) }
    assert_raises(ValidationError) { @validator.validate_age(25.5) }
  end
  
  # Property-based testing helper
  def test_username_validation_properties
    # Generate test data with known properties
    valid_usernames = generate_valid_usernames(100)
    invalid_usernames = generate_invalid_usernames(100)
    
    valid_usernames.each do |username|
      assert @validator.validate_username(username), 
             "Should accept valid username: #{username}"
    end
    
    invalid_usernames.each do |username|
      # inspect keeps nil and empty strings visible in the failure message
      refute @validator.validate_username(username), 
             "Should reject invalid username: #{username.inspect}"
    end
  end
  
  private
  
  def generate_valid_usernames(count)
    count.times.map do
      length = rand(3..20)
      chars = [*'a'..'z', *'A'..'Z', *'0'..'9', '_']
      Array.new(length) { chars.sample }.join
    end
  end
  
  def generate_invalid_usernames(count)
    invalid = []
    
    # Too short
    invalid += Array.new(count / 4) { [*'a'..'z'].sample(rand(0..2)).join }
    
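    # Too long (built per character, because Array#sample(n) never returns more
    # than the 26 letters available)
    invalid += Array.new(count / 4) { Array.new(rand(21..50)) { [*'a'..'z'].sample }.join }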
    
    # Invalid characters
    invalid += Array.new(count / 4) { "valid#{['!', '@', '#', '$'].sample}name" }
    
    # Empty and nil
    invalid += ['', nil] * (count / 8)
    
    invalid.first(count)
  end
end

Mock and stub testing helps isolate validation logic from external dependencies. Testing validation methods that depend on database lookups, API calls, or file system access requires careful mocking.

# Testing validation with external dependencies
class UserValidator
  def initialize(user_repository = UserRepository.new)
    @user_repository = user_repository
  end
  
  def validate_unique_email(email)
    return false unless validate_email_format(email)
    
    !@user_repository.exists_with_email?(email)
  end
  
  def validate_username_availability(username)
    return false unless validate_username_format(username)
    
    !@user_repository.exists_with_username?(username)
  end
  
  private
  
  def validate_email_format(email)
    email&.match?(EMAIL_REGEX)
  end
  
  def validate_username_format(username)
    username&.match?(/\A[a-zA-Z0-9_]{3,20}\z/)
  end
end

# RSpec tests with mocking
RSpec.describe UserValidator do
  let(:mock_repository) { double('UserRepository') }
  let(:validator) { UserValidator.new(mock_repository) }
  
  describe '#validate_unique_email' do
    context 'with valid email format' do
      let(:email) { 'test@example.com' }
      
      it 'returns true when email is not taken' do
        allow(mock_repository).to receive(:exists_with_email?).with(email).and_return(false)
        
        expect(validator.validate_unique_email(email)).to be true
      end
      
      it 'returns false when email is already taken' do
        allow(mock_repository).to receive(:exists_with_email?).with(email).and_return(true)
        
        expect(validator.validate_unique_email(email)).to be false
      end
    end
    
    it 'returns false for invalid email format without checking repository' do
      expect(mock_repository).not_to receive(:exists_with_email?)
      
      expect(validator.validate_unique_email('invalid-email')).to be false
    end
  end
  
  describe '#validate_username_availability' do
    it 'calls repository only after format validation passes' do
      expect(mock_repository).not_to receive(:exists_with_username?)
      
      validator.validate_username_availability('ab')  # Too short
    end
    
    it 'checks repository when format is valid' do
      username = 'validuser'
      expect(mock_repository).to receive(:exists_with_username?).with(username).and_return(false)
      
      expect(validator.validate_username_availability(username)).to be true
    end
  end
end

Integration testing validates the complete validation pipeline including error handling, logging, and side effects. These tests ensure validation works correctly within the broader application context.

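# Integration test for validation pipeline
# Assumes an extended UserValidator that accepts an optional logger and implements
# validate_complete_user (returning a ValidationResult), plus a UserRepository backed
# by the SQLite handle; neither is defined in this article
require 'minitest/autorun'
require 'logger'
require 'stringio'

class ValidationIntegrationTest < Minitest::Test
  def setup
    @temp_db = create_test_database
    @validator = UserValidator.new(UserRepository.new(@temp_db))
  end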
  
  def teardown
    cleanup_test_database(@temp_db)
  end
  
  def test_complete_user_validation_flow
    user_data = {
      email: 'newuser@example.com',
      username: 'newuser123',
      password: 'SecurePass1',
      age: 25
    }
    
    # Should pass all validation
    result = @validator.validate_complete_user(user_data)
    assert result.valid?
    assert_empty result.errors
    
    # Create user to test uniqueness validation
    create_test_user(user_data)
    
    # Should fail uniqueness validation
    result = @validator.validate_complete_user(user_data)
    refute result.valid?
    assert_includes result.error_messages, "email: already taken"
  end
  
  def test_validation_error_logging
    invalid_data = { email: 'invalid', username: 'ab', password: 'weak', age: -5 }
    
    # Capture log output
    log_output = StringIO.new
    validator_with_logging = UserValidator.new(UserRepository.new(@temp_db), Logger.new(log_output))
    
    result = validator_with_logging.validate_complete_user(invalid_data)
    
    refute result.valid?
    log_content = log_output.string
    
    assert_includes log_content, "Validation failed for email"
    assert_includes log_content, "Validation failed for username" 
    assert_includes log_content, "Validation failed for password"
    assert_includes log_content, "Validation failed for age"
  end
  
  private
  
  def create_test_database
    # Setup in-memory SQLite database for testing
    require 'sqlite3'
    db = SQLite3::Database.new(':memory:')
    db.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, username TEXT)')
    db
  end
  
  def cleanup_test_database(db)
    db.close
  end
  
  def create_test_user(user_data)
    @temp_db.execute('INSERT INTO users (email, username) VALUES (?, ?)', 
                     [user_data[:email], user_data[:username]])
  end
end

Common Pitfalls

Input validation in Ruby contains several subtle gotchas that can lead to security vulnerabilities or application errors. Understanding these pitfalls prevents common mistakes in validation logic.

Regular expression anchoring represents a critical security concern. Using ^ and $ anchors instead of \A and \z allows multiline bypass attacks where malicious input includes newline characters.

# DANGEROUS: Vulnerable to newline bypass
BAD_EMAIL_REGEX = /^[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+$/i

def vulnerable_email_validation(email)
  email.match?(BAD_EMAIL_REGEX)
end

# This passes validation but contains malicious content
malicious_input = "valid@example.com\n<script>alert('xss')</script>"
puts vulnerable_email_validation(malicious_input)  # => true

# SAFE: Proper anchoring
SAFE_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i

def secure_email_validation(email)
  email.match?(SAFE_EMAIL_REGEX)
end

puts secure_email_validation(malicious_input)  # => false

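Type coercion pitfalls occur when Ruby's flexible type system allows unexpected conversions. The to_i method converts silently, returning 0 for non-numeric strings and ignoring trailing garbage, while Integer() raises ArgumentError for invalid input. Integer() has a trap of its own: without an explicit base it treats a leading zero as an octal prefix, so Integer("0025") returns 21.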

# Dangerous: Silent conversion loses validation
def bad_age_validation(age_input)
  age = age_input.to_i  # "abc".to_i => 0, "25abc".to_i => 25
  age > 0 && age < 120
end

puts bad_age_validation("abc")     # => false (but should raise error)
puts bad_age_validation("25abc")   # => true (accepts invalid input!)
puts bad_age_validation("0025")    # => true (accepts leading zeros)

# Better: Explicit validation with strict conversion
def proper_age_validation(age_input)
  # First validate string format
  return false unless age_input.is_a?(String)
  return false unless age_input.match?(/\A\d+\z/)  # Only digits
  
  # Then convert and validate range; base 10 keeps "0025" from being parsed as octal (21)
  age = Integer(age_input, 10)  # Raises ArgumentError for invalid format
  age > 0 && age < 120
rescue ArgumentError
  false
end

# Best: Comprehensive validation with detailed errors
class AgeValidator
  def self.validate(input)
    errors = []
    
    errors << "must be a string" unless input.is_a?(String)
    return errors unless errors.empty?
    
    # Report one format problem at a time so an empty string is not flagged twice
    if input.empty?
      errors << "cannot be empty"
    elsif !input.match?(/\A\d+\z/)
      errors << "must contain only digits"
    elsif input.match?(/\A0\d+\z/)
      errors << "cannot have leading zeros"
    end
    return errors unless errors.empty?
    
    age = Integer(input, 10)
    errors << "must be positive" unless age > 0
    errors << "must be realistic (< 120)" unless age < 120
    
    errors
  end
end

String encoding issues create validation bypasses when byte-level and character-level operations produce different results. Ruby's encoding handling requires explicit consideration in validation routines.

# Encoding pitfall demonstration
def length_validation_pitfall(input)
  # These can give different results for multibyte strings
  puts "String: #{input.inspect}"
  puts "Length (chars): #{input.length}"
  puts "Bytesize: #{input.bytesize}"
  puts "Valid encoding: #{input.valid_encoding?}"
end

# Examples showing the differences
length_validation_pitfall("café")        # 4 chars, 5 bytes (UTF-8)
length_validation_pitfall("🚀")          # 1 char, 4 bytes  
length_validation_pitfall("\xff\xfe")    # Invalid UTF-8 sequence

# Proper encoding-aware validation
class SafeStringValidator
  def self.validate_text_input(input, max_chars: 100, max_bytes: 1000)
    errors = []
    
    # Check encoding validity first
    unless input.valid_encoding?
      errors << "contains invalid byte sequences"
      return errors
    end
    
    # Character count validation (for display)
    if input.length > max_chars
      errors << "too many characters (#{input.length}/#{max_chars})"
    end
    
    # Byte count validation (for storage)
    if input.bytesize > max_bytes
      errors << "too many bytes (#{input.bytesize}/#{max_bytes})"
    end
    
    # Check for dangerous characters
    if input.match?(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/)
      errors << "contains control characters"
    end
    
    errors
  end
end

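Regex performance pitfalls arise from catastrophic backtracking in complex patterns. Nested quantifiers such as (a+)+ can cause exponential time complexity with malicious input. Ruby 3.2 and later mitigate many of these cases with a memoizing matcher and add Regexp.timeout as a backstop, but older versions remain exposed, so pattern design still matters.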

# DANGEROUS: Catastrophic backtracking vulnerability
VULNERABLE_REGEX = /^(a+)+b$/

def vulnerable_validation(input)
  # This can take exponential time for inputs like "aaaaaaaaaaaaaaaaaac"
  start_time = Time.now
  result = input.match?(VULNERABLE_REGEX)
  end_time = Time.now
  
  puts "Validation took #{end_time - start_time} seconds"
  result
end

# Safe: More efficient regex design
SAFE_REGEX = /\Aa+b\z/

def safe_validation(input)
  input.match?(SAFE_REGEX)
end

# Testing with potentially problematic input
problematic_input = "a" * 20 + "c"  # No 'b' at end

puts "Testing vulnerable regex..."
vulnerable_validation(problematic_input)

puts "Testing safe regex..."
start_time = Time.now
result = safe_validation(problematic_input)
end_time = Time.now
puts "Safe validation took #{end_time - start_time} seconds: #{result}"
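
As a defensive sketch, input length can be capped before any pattern runs, and on Ruby 3.2 or later a per-pattern timeout can act as a backstop. The names below (MAX_INPUT_LENGTH, build_guarded_pattern, guarded_match?) and the 256-character and 0.5-second limits are assumptions chosen for illustration.

```ruby
MAX_INPUT_LENGTH = 256  # assumed cap; choose one that fits the field

# Builds a pattern with a timeout where the running Ruby supports it (3.2+).
def build_guarded_pattern(source)
  if Regexp.respond_to?(:timeout)
    Regexp.new(source, timeout: 0.5)
  else
    Regexp.new(source)
  end
end

GUARDED_PATTERN = build_guarded_pattern('\Aa+b\z')

# Hypothetical helper: rejects oversized input before matching and treats a
# regex timeout as a failed validation.
def guarded_match?(input, pattern = GUARDED_PATTERN)
  return false unless input.is_a?(String)
  return false if input.length > MAX_INPUT_LENGTH

  pattern.match?(input)
rescue StandardError => e
  # Regexp::TimeoutError only exists on Ruby 3.2+, so compare by name.
  raise unless e.class.name == 'Regexp::TimeoutError'
  false
end
```

The length cap does most of the work here: it bounds the matcher's input regardless of Ruby version, while the timeout only covers patterns that slip through review.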

State-dependent validation errors occur when validation depends on external state that can change between validation and use. This creates race conditions and time-of-check-time-of-use vulnerabilities.

# Problematic: Validation depends on mutable external state
class ProblematicValidator
  def initialize
    @blacklisted_domains = load_blacklist_from_file  # Mutable external state
  end
  
  def validate_email_domain(email)
    domain = email.split('@').last
    !@blacklisted_domains.include?(domain)  # State can change!
  end
  
  private
  
  def load_blacklist_from_file
    # Read once at construction: later edits to the file are never picked up, and a
    # read failure silently yields an empty list that blocks nothing
    File.readlines('blacklist.txt').map(&:strip)
  rescue
    []
  end
end

# Better: Immutable validation with explicit state management
class RobustValidator
  def initialize(blacklist_source = nil)
    @blacklist_source = blacklist_source
    @blacklist_snapshot = create_blacklist_snapshot(blacklist_source)
    @blacklist_loaded_at = Time.now
  end

  def validate_email_domain(email, refresh_blacklist: false)
    refresh_blacklist! if refresh_blacklist || blacklist_expired?

    domain = extract_domain(email)
    return false unless domain

    !@blacklist_snapshot.include?(domain)
  end

  def blacklist_age
    Time.now - @blacklist_loaded_at
  end

  private

  def create_blacklist_snapshot(source)
    # Create immutable, case-normalized snapshot of validation rules
    entries =
      case source
      when Array then source
      when String then File.readlines(source)
      when nil then []
      else raise ArgumentError, "Invalid blacklist source"
      end
    entries.map { |entry| entry.to_s.strip.downcase }.freeze
  rescue SystemCallError => e
    # Only file-system failures are rescued; an invalid source still raises
    warn "Failed to load blacklist: #{e.message}"
    [].freeze
  end

  def blacklist_expired?
    blacklist_age > 3600  # 1 hour expiry
  end

  def refresh_blacklist!
    # Reload from the source given at construction, not a hard-coded path
    @blacklist_snapshot = create_blacklist_snapshot(@blacklist_source)
    @blacklist_loaded_at = Time.now
  end
  
  def extract_domain(email)
    return nil unless email.is_a?(String) && email.include?('@')
    
    parts = email.split('@', -1)  # -1 keeps trailing empty fields, so "a@b@" is rejected
    return nil unless parts.length == 2
    
    domain = parts.last.downcase
    return nil if domain.empty?
    
    domain
  end
end
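
The value of a snapshot is that later changes to the source cannot alter a validator that is already in use. The sketch below illustrates this with a deliberately small class; SnapshotDomainCheck and its allowed? method are hypothetical names introduced here, and the example domains are placeholders.

```ruby
require "set"

# Hypothetical minimal validator: copies and freezes its rules at construction.
class SnapshotDomainCheck
  def initialize(blocked_domains)
    normalized = blocked_domains.map { |domain| domain.strip.downcase }
    @blocked = Set.new(normalized).freeze
  end

  def allowed?(domain)
    domain.is_a?(String) && !@blocked.include?(domain.strip.downcase)
  end
end

source = ["spam.example"]
check = SnapshotDomainCheck.new(source)

# Mutating the original list afterwards does not affect the snapshot.
source << "late.example"

puts check.allowed?("late.example")  # still allowed by this snapshot
puts check.allowed?("SPAM.example")  # blocked, comparison is case-insensitive
```

A new rule set takes effect only when a new validator is built, which makes the moment of change explicit instead of implicit.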

Reference

Core Validation Methods

Method	Parameters	Returns	Description
`Integer(value)`	`value` (Object)	`Integer`	Converts value to integer; raises ArgumentError for malformed strings and TypeError for nil or unconvertible objects
`Float(value)`	`value` (Object)	`Float`	Converts value to float; raises ArgumentError for malformed strings and TypeError for nil or unconvertible objects
`String(value)`	`value` (Object)	`String`	Converts value to string using to_str, then to_s
`String#match?(pattern)`	`pattern` (Regexp/String)	`Boolean`	Returns true if string matches pattern, false otherwise
`String#empty?`	None	`Boolean`	Returns true if string length is zero
`String#length`	None	`Integer`	Returns character count in string
`String#bytesize`	None	`Integer`	Returns byte count in string
`String#valid_encoding?`	None	`Boolean`	Returns true if string contains valid byte sequence
`String#ascii_only?`	None	`Boolean`	Returns true if string contains only ASCII characters

Regular Expression Anchors

Anchor	Behavior	Security Impact
`^`	Start of line (allows multiline bypass)	Dangerous - vulnerable to injection
`$`	End of line (allows multiline bypass)	Dangerous - vulnerable to injection
`\A`	Start of string (absolute)	Safe - prevents multiline bypass
`\z`	End of string (absolute)	Safe - prevents multiline bypass
`\Z`	End of string or before final newline	Potentially unsafe
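
The difference between line anchors and string anchors is easiest to see with a multiline payload. This is a small illustrative sketch; the constant names and the payload are made up for the example.

```ruby
LINE_ANCHORED   = /^[a-z]+$/     # anchors match at every line boundary
STRING_ANCHORED = /\A[a-z]+\z/   # anchors match only at the string boundaries

payload = "safe\n<script>alert(1)</script>"

puts payload.match?(LINE_ANCHORED)    # true: the first line alone satisfies the pattern
puts payload.match?(STRING_ANCHORED)  # false: the whole string must match

# \Z tolerates one trailing newline, \z does not.
puts "safe\n".match?(/\A[a-z]+\Z/)    # true
puts "safe\n".match?(STRING_ANCHORED) # false
```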

Validation Error Classes

Exception	Inheritance	Common Use Cases
`ArgumentError`	`StandardError`	Invalid method arguments, type mismatches
`TypeError`	`StandardError`	Unexpected object types
`RangeError`	`StandardError`	Values outside acceptable ranges
`EncodingError`	`StandardError`	String encoding problems
`Date::Error`	`ArgumentError`	Invalid date/time formats
`JSON::ParserError`	`JSON::JSONError` (a `StandardError`)	Malformed JSON data
`URI::InvalidURIError`	`URI::Error` (a `StandardError`)	Invalid URI formats
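
To see which exception a given conversion raises, a small helper can capture the class instead of letting the error propagate. This is a sketch for exploration only; failure_class is a hypothetical helper, and the invalid inputs are arbitrary examples. On older Ruby versions the date case may report ArgumentError rather than Date::Error.

```ruby
require "date"
require "json"
require "uri"

# Hypothetical helper: returns the exception class raised by the block, or nil.
def failure_class
  yield
  nil
rescue StandardError => e
  e.class
end

puts failure_class { Integer("abc") }                      # malformed string
puts failure_class { Integer(nil) }                        # unconvertible object
puts failure_class { JSON.parse("{oops") }                 # malformed JSON
puts failure_class { URI.parse("http://exa mple.com") }    # space is not allowed
puts failure_class { Date.iso8601("2024-13-45") }          # impossible date
puts failure_class { Integer("42") }.inspect               # nil, no error raised
```

Because Date::Error inherits from ArgumentError, a rescue clause for ArgumentError also catches invalid dates; list the more specific class first when the two need different handling.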

Type Checking Methods

Method	Behavior	Error Handling
`obj.is_a?(Class)`	Checks class, superclasses, and included modules	Returns boolean
`obj.kind_of?(Class)`	Alias for is_a?	Returns boolean
`obj.instance_of?(Class)`	Checks exact class only	Returns boolean
`obj.respond_to?(method)`	Checks method availability	Returns boolean
`obj.class`	Returns object's class	Never fails
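
The following short sketch contrasts the checks in the table. readable_source? is a hypothetical helper introduced to show a duck-typed check; the other calls are core Ruby methods.

```ruby
require "stringio"

value = 42

puts value.is_a?(Integer)          # true: exact class
puts value.is_a?(Numeric)          # true: superclass
puts value.is_a?(Comparable)       # true: included module
puts value.instance_of?(Numeric)   # false: not the exact class
puts value.instance_of?(Integer)   # true

# Duck-typed check: accept anything that can be read from.
def readable_source?(obj)
  obj.respond_to?(:read)
end

puts readable_source?(StringIO.new("data"))  # true
puts readable_source?("data")                # false: String has no #read
```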

String Validation Patterns

Pattern	Regular Expression	Use Case
Email	`/\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i`	Email address format
Username	`/\A[a-zA-Z0-9_]{3,20}\z/`	Alphanumeric usernames
Phone	`/\A\+?[\d\s\-]{10,15}\z/`	Phone number format
URL	`/\Ahttps?:\/\/[\S]+\z/`	Basic URL validation
UUID	`/\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\z/i`	UUID format
IP Address	`/\A(?:[0-9]{1,3}\.){3}[0-9]{1,3}\z/`	IPv4 address format (does not limit octets to 0-255)
Hex Color	`/\A#?[0-9a-f]{6}\z/i`	Hexadecimal color
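
Format patterns like these check shape, not meaning. The IPv4 pattern, for example, accepts 999.999.999.999, so a semantic check is usually layered on top. A minimal sketch, assuming a hypothetical valid_ipv4? helper:

```ruby
IPV4_FORMAT = /\A(?:[0-9]{1,3}\.){3}[0-9]{1,3}\z/

# Hypothetical helper: format check first, then a range check on each octet.
def valid_ipv4?(input)
  return false unless input.is_a?(String) && input.match?(IPV4_FORMAT)

  input.split(".").all? { |octet| octet.to_i.between?(0, 255) }
end

puts valid_ipv4?("192.168.1.1")      # true
puts valid_ipv4?("999.999.999.999")  # false: passes the regex, fails the range check
puts valid_ipv4?("192.168.1")        # false: wrong shape
```

This sketch still accepts leading zeros such as 010.0.0.1; whether to reject them depends on how the address is consumed downstream.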

Numeric Validation Ranges

Data Type	Minimum Value	Maximum Value	Ruby Method
Integer	None (arbitrary precision)	None (arbitrary precision)	`Integer()`
Float	`-Float::INFINITY`	`Float::INFINITY`	`Float()`
Age	`0`	`150`	Custom validation
Percentage	`0.0`	`100.0`	`Range#cover?`
Port Number	`1`	`65535`	`Integer()` + range
HTTP Status	`100`	`599`	Custom validation
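
The "`Integer()` + range" approach for port numbers might look like the sketch below. parse_port and PORT_RANGE are hypothetical names. Passing base 10 is a deliberate choice here: without it, Integer() also accepts prefixed forms such as "0x1F".

```ruby
PORT_RANGE = (1..65_535).freeze

# Hypothetical helper: returns the port as an Integer, or nil when invalid.
def parse_port(input)
  return nil unless input.is_a?(String)

  port = Integer(input, 10)  # base 10 rejects hex and other prefixed forms
  PORT_RANGE.cover?(port) ? port : nil
rescue ArgumentError, TypeError
  nil
end

puts parse_port("8080").inspect   # 8080
puts parse_port("0").inspect      # nil: below the range
puts parse_port("65536").inspect  # nil: above the range
puts parse_port("0x1F").inspect   # nil: not a decimal number
puts parse_port("http").inspect   # nil: not a number
```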

Performance Characteristics

Operation	Complexity	Performance Notes
String length check	O(1) to O(n)	`bytesize` is constant time; `length` is constant time for ASCII-only strings but may scan multibyte UTF-8 strings
Regex matching	O(n) to O(2^n)	Depends on pattern complexity
Integer conversion	O(n)	Linear with string length
Hash key lookup	O(1) average	For validation rule caching
Array inclusion	O(n)	Linear search through elements
Set inclusion	O(1) average	Use for large validation lists

Common Validation Patterns

# Email validation with comprehensive checks
def validate_email(email)
  return false unless email.is_a?(String)
  return false if email.length > 254  # RFC limit
  return false unless email.match?(/\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i)
  
  local, domain = email.split('@')
  return false if local.length > 64   # RFC limit
  return false if domain.length > 253 # RFC limit
  
  true
end

# Numeric range validation
def validate_in_range(value, min, max)
  num = Float(value)
  (min..max).cover?(num)
rescue ArgumentError, TypeError  # TypeError covers nil and other unconvertible objects
  false
end

# Collection validation
def validate_array_of_strings(array, max_length: 100)
  return false unless array.is_a?(Array)
  return false if array.length > max_length
  
  array.all? { |item| item.is_a?(String) && !item.empty? }
end