CrackedRuby - Core Class Extensions

Overview

Ruby extends the String class with numerous methods that handle text processing, encoding transformations, and pattern operations. These extensions form the backbone of text manipulation in Ruby applications, providing methods for case conversion, substring extraction, pattern matching, and character encoding operations.

The String class includes methods for modifying content (#gsub, #tr, #squeeze), extracting information (#scan, #match, #include?), and transforming format (#upcase, #downcase, #capitalize). Ruby handles string encoding through methods like #encode, #force_encoding, and #valid_encoding?, supporting multiple character encodings including UTF-8, ASCII, and ISO-8859-1.

text = "Hello, World!"
text.upcase                    # => "HELLO, WORLD!"
text.gsub(/[aeiou]/, '*')      # => "H*ll*, W*rld!"
text.include?("World")         # => true

String interpolation works seamlessly with these methods:

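name = "ruby developer"
# Core Ruby has no #titleize (that is an ActiveSupport extension), so capitalize each word
puts "Welcome #{name.split.map(&:capitalize).join(' ')}!"  # prints "Welcome Ruby Developer!"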

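The encoding system converts between character sets on request. Characters with no equivalent in the target encoding raise Encoding::UndefinedConversionError unless the call says how to handle them:

utf8_string = "café".encode('UTF-8')
ascii_string = utf8_string.encode('ASCII', invalid: :replace, undef: :replace)
# => "caf?"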

Basic Usage

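String case conversion methods transform text between different capitalization formats. Ruby provides #upcase, #downcase, #capitalize, and #swapcase. Since Ruby 2.4 these methods apply full Unicode case mapping and accept options such as :ascii, :turkic, and :lithuanian (plus :fold for #downcase) rather than separate locale-aware variants.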

text = "Mixed Case String"
text.upcase        # => "MIXED CASE STRING"
text.downcase      # => "mixed case string"  
text.capitalize    # => "Mixed case string"
text.swapcase      # => "mIXED cASE sTRING"
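
The case methods also take mapping options in Ruby 2.4 and later. The following is a minimal sketch of how the default Unicode mapping differs from the :ascii and :fold options; the variable names are illustrative only.

```ruby
# Full Unicode case mapping is the default (Ruby 2.4 and later).
accented_upper = "àé".upcase            # => "ÀÉ"
sharp_s_upper  = "Straße".upcase        # => "STRASSE"

# :ascii restricts mapping to a-z / A-Z and leaves other characters alone.
ascii_only_upper = "àé".upcase(:ascii)  # => "àé"

# :fold (downcase only) applies Unicode case folding, useful for comparisons.
folded = "Straße".downcase(:fold)       # => "strasse"
```

Note that full mapping can change the string length, as the "ß" to "SS" conversion shows.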

The #gsub method performs pattern-based substitutions using regular expressions or strings. The method accepts blocks for complex replacement logic.

text = "The quick brown fox jumps"
text.gsub(/\b\w{5}\b/, '[WORD]')           # => "The [WORD] [WORD] fox [WORD]"
text.gsub(/(\w+)/) { |word| word.reverse }  # => "ehT kciuq nworb xof spmuj"
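
Besides strings and blocks, #gsub accepts a Hash as the replacement, and replacement strings may contain backreferences. A short sketch, with illustrative variable names, that also contrasts #gsub with #sub:

```ruby
# Hash replacement: each matched text is looked up as a key.
escapes = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }
escaped = "a < b && b > c".gsub(/[&<>]/, escapes)
# => "a &lt; b &amp;&amp; b &gt; c"

# Backreferences: \1 and \2 refer to the capture groups.
# Single quotes keep the backslashes literal.
swapped = "Doe, Jane".gsub(/(\w+), (\w+)/, '\2 \1')
# => "Jane Doe"

# #sub replaces only the first match; #gsub replaces all of them.
first_only = "a-b-c".sub('-', '+')   # => "a+b-c"
all_of_them = "a-b-c".gsub('-', '+') # => "a+b+c"
```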

Character translation occurs through #tr and #tr_s methods. These methods map character sets to replacement characters.

"hello world".tr('l', 'x')      # => "hexxo worxd"
"hello world".tr('a-z', 'A-Z')  # => "HELLO WORLD"
"bookkeeper".tr_s('k', 'c')     # => "booceeper"

String scanning with #scan extracts matching patterns into arrays. The method works with regular expressions and string patterns.

text = "Phone: 555-1234, Fax: 555-5678"
text.scan(/\d{3}-\d{4}/)        # => ["555-1234", "555-5678"]
text.scan(/(\w+):\s*([\d-]+)/)  # => [["Phone", "555-1234"], ["Fax", "555-5678"]]

Advanced Usage

String extensions support complex text processing through method chaining and block-based transformations. The #gsub method accepts advanced regular expression patterns with named captures and lookarounds.

html_text = "<p>Hello <strong>world</strong>!</p>"
clean_text = html_text
  .gsub(/<[^>]+>/, '')           # Remove HTML tags
  .squeeze(' ')                   # Collapse multiple spaces
  .strip                         # Remove leading/trailing whitespace
# => "Hello world!"
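
Named captures and lookarounds can be used directly in #gsub. The sketch below assumes simple, well-formed inputs; the patterns and variable names are illustrative rather than production-grade.

```ruby
# Named captures are referenced in the replacement string with \k<name>.
iso_date = "2024-01-15"
us_date = iso_date.gsub(/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/, '\k<m>/\k<d>/\k<y>')
# => "01/15/2024"

# Lookbehind plus lookahead match a zero-width position:
# after a digit, before a run of digits whose length is a multiple of three.
grouped = "1234567".gsub(/(?<=\d)(?=(\d{3})+\z)/, ',')
# => "1,234,567"

# In block form, $~ holds the MatchData for the current match.
masked = "card 1234-5678".gsub(/(?<head>\d{4})-(?<tail>\d{4})/) { "****-#{$~[:tail]}" }
# => "card ****-5678"
```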

Pattern extraction becomes sophisticated with named captures and complex regular expressions:

log_entry = "2024-01-15 14:30:25 ERROR DatabaseConnection timeout after 30s"
pattern = /(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<level>\w+)\s+(?<message>.*)/

match = log_entry.match(pattern)
{
  timestamp: "#{match[:date]} #{match[:time]}",
  severity: match[:level],
  details: match[:message]
}
# => {:timestamp=>"2024-01-15 14:30:25", :severity=>"ERROR", :details=>"DatabaseConnection timeout after 30s"}
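
#match returns nil when the pattern does not match, so indexing the result without a check raises NoMethodError. A minimal sketch of guarding against that, using a hypothetical level_pattern and sample inputs:

```ruby
level_pattern = /(?<level>[A-Z]+): (?<message>.*)/

hit  = "ERROR: disk full".match(level_pattern)
miss = "no level here".match(level_pattern)   # => nil

# #named_captures returns a Hash keyed by capture name (as Strings).
captures = hit.named_captures
# => {"level"=>"ERROR", "message"=>"disk full"}

# Guard against nil before indexing into the match.
level = miss ? miss[:level] : "UNKNOWN"       # => "UNKNOWN"

# #match? answers yes/no without building a MatchData object.
matched = "ERROR: disk full".match?(level_pattern)  # => true
```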

The #partition and #rpartition methods split a string around the first (or, for #rpartition, the last) occurrence of a delimiter. Both return a three-element array: the part before the delimiter, the delimiter itself, and the part after it.

email = "user@example.com"
username, at_sign, domain = email.partition('@')
# username => "user", at_sign => "@", domain => "example.com"

filepath = "/home/user/documents/file.txt"
directory, separator, filename = filepath.rpartition('/')
# directory => "/home/user/documents", separator => "/", filename => "file.txt"
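
The two methods differ in which occurrence they split on and in what they return when the delimiter is absent. A short sketch with illustrative inputs:

```ruby
# First versus last occurrence of the separator.
first_dot = "archive.tar.gz".partition('.')   # => ["archive", ".", "tar.gz"]
last_dot  = "archive.tar.gz".rpartition('.')  # => ["archive.tar", ".", "gz"]

# Separator not found: the whole string lands at opposite ends.
no_sep_left  = "no-at-sign".partition('@')    # => ["no-at-sign", "", ""]
no_sep_right = "no-at-sign".rpartition('@')   # => ["", "", "no-at-sign"]

# A Regexp separator returns the text it actually matched.
key_value = "key = value".partition(/\s*=\s*/)  # => ["key", " = ", "value"]

# For comparison, #split with a limit drops the separator.
two_parts = "a,b,c".split(',', 2)             # => ["a", "b,c"]
```

Checking whether the middle element is empty is a simple way to detect a missing delimiter.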

String encoding conversion handles character set transformations with error handling strategies. The #encode method accepts replacement characters and invalid byte handling.

mixed_encoding = "Café naïve résumé".encode('UTF-8')

# Convert with replacement characters
ascii_version = mixed_encoding.encode('ASCII', 
  invalid: :replace, 
  undef: :replace, 
  replace: '?')
# => "Caf? na?ve r?sum?"

# Convert with XML numeric character references
# (:replace only accepts a String; a Proc must be passed as :fallback)
xml_safe = mixed_encoding.encode('ASCII',
  fallback: proc { |char| "&##{char.ord};" })
# => "Caf&#233; na&#239;ve r&#233;sum&#233;"

Common Pitfalls

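String mutating methods create confusion between destructive and non-destructive operations. By convention, methods ending with an exclamation mark modify the receiver in place and return nil when nothing changed, while their non-bang counterparts return new strings. Some mutating methods, such as #<< and #replace, carry no exclamation mark.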

original = "hello world"
result = original.upcase    # original unchanged, result = "HELLO WORLD"  
original.upcase!           # original modified to "HELLO WORLD"

# Common mistake: expecting mutation
text = "sample text"
text.gsub(/\s+/, '_')      # text still equals "sample text"
text = text.gsub(/\s+/, '_')  # Correct: reassign result
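
A related trap is that bang methods return nil when they make no change, which breaks method chains. The sketch below is illustrative; the unary plus (+"...") only ensures the literals are mutable if frozen string literals are enabled.

```ruby
already_upper = +"HELLO"
outcome = already_upper.upcase!     # => nil (nothing changed)

# Chaining bang methods is fragile: #capitalize! returns nil here,
# so the next call is sent to nil.
greeting = +"Hello"
chained = begin
  greeting.capitalize!.reverse!
rescue NoMethodError
  :no_method_error
end
# chained => :no_method_error

# Mutating methods without "!" also affect every reference to the string.
buffer = +"abc"
alias_ref = buffer
buffer << "def"
# alias_ref => "abcdef"

# Chaining the non-bang versions is safe.
safe = "Hello".capitalize.reverse   # => "olleH"
```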

Interpolating user input into a regular expression causes problems when the input contains metacharacters. The Regexp.escape method escapes them so the input matches literally.

text = "Q: What is $5.00 + $3.50?"
user_input = "What is $5.00 + $3.50?"

# Wrong: $, ., + and ? are interpreted as regex metacharacters
text.gsub(/#{user_input}/, 'REDACTED')  # => "Q: What is $5.00 + $3.50?" (no match, nothing replaced)

# Correct: escape special characters
text.gsub(/#{Regexp.escape(user_input)}/, 'REDACTED')  # => "Q: REDACTED"
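
When no regex features are needed, two alternatives avoid manual escaping altogether. This is a small sketch with hypothetical inputs:

```ruby
invoice = "Total: $5.00 + $3.50 (approx.)"
needle = "$5.00 + $3.50"

# A String pattern is always matched literally, so no escaping is required.
literal = invoice.gsub(needle, 'X')
# => "Total: X (approx.)"

# Regexp.union escapes each String it is given and joins them with "|".
either = Regexp.union(needle, "(approx.)")
both = invoice.gsub(either, 'X')
# => "Total: X X"
```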

Encoding issues arise when mixing strings with different encodings or reading files without specifying an encoding. Ruby raises Encoding::CompatibilityError when two strings in different encodings are combined and both contain non-ASCII characters. ASCII-only strings combine with any ASCII-compatible encoding without error.

utf8_string = "résumé".encode('UTF-8')
latin1_string = "café".encode('ISO-8859-1')

begin
  # Raises Encoding::CompatibilityError: both operands contain non-ASCII characters
  result = utf8_string + latin1_string
rescue Encoding::CompatibilityError
  # Handle the mismatch by converting to a common encoding first
  result = utf8_string + latin1_string.encode('UTF-8')
end
# result => "résumécafé"
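
The difference between relabeling bytes with #force_encoding and converting them with #encode is a frequent source of such errors. The following sketch assumes the raw bytes arrived from a file or socket encoded as ISO-8859-1; the variable names are illustrative.

```ruby
# "é" is the single byte 0xE9 in ISO-8859-1. #b returns a binary copy.
raw = "caf\xE9".b

# Labeling the bytes as UTF-8 does not convert them, so the result is invalid.
mislabeled = raw.dup.force_encoding('UTF-8')
valid_before = mislabeled.valid_encoding?   # => false

# Label the bytes with their real encoding, then convert.
relabeled = raw.dup.force_encoding('ISO-8859-1')
converted = relabeled.encode('UTF-8')       # => "café"
valid_after = converted.valid_encoding?     # => true

# When the real encoding is unknown, #scrub replaces the invalid bytes.
scrubbed = mislabeled.scrub('?')            # => "caf?"
```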

Case conversion for some languages requires language-specific rules. Ruby's case methods apply the default Unicode mapping unless given an option such as :turkic.

turkish_text = "İstanbul"
turkish_text.downcase           # => "i̇stanbul" ("i" plus a combining dot; incorrect for Turkish)

# Pass :turkic (Ruby 2.4 and later) to apply Turkish and Azerbaijani rules
turkish_text.downcase(:turkic)  # => "istanbul"

The #tr method performs character-by-character replacement, not substring replacement. This creates unexpected results when replacing multi-character sequences.

text = "hello"
text.tr('ll', 'x')     # => "hexxo" (each 'l' becomes 'x')
text.gsub('ll', 'x')   # => "hexo" (substring 'll' becomes 'x')

# For multi-character replacement, use gsub
"bookkeeper".tr('kk', 'c')    # => "boocceeper" (each 'k' becomes 'c')
"bookkeeper".gsub('kk', 'c')  # => "booceeper" (substring 'kk' becomes 'c')

Reference

Case Conversion Methods

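Method	Parameters	Returns	Description
`#upcase(*options)`	`options` (Symbols, optional)	`String`	Returns uppercase copy
`#upcase!(*options)`	`options` (Symbols, optional)	`String/nil`	Uppercases in place; returns nil if unchanged
`#downcase(*options)`	`options` (Symbols, optional)	`String`	Returns lowercase copy
`#downcase!(*options)`	`options` (Symbols, optional)	`String/nil`	Lowercases in place; returns nil if unchanged
`#capitalize(*options)`	`options` (Symbols, optional)	`String`	Returns copy with first character uppercase and the rest lowercase
`#capitalize!(*options)`	`options` (Symbols, optional)	`String/nil`	Capitalizes in place; returns nil if unchanged
`#swapcase(*options)`	`options` (Symbols, optional)	`String`	Returns copy with case swapped
`#swapcase!(*options)`	`options` (Symbols, optional)	`String/nil`	Swaps case in place; returns nil if unchanged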

Pattern Matching and Substitution

Method	Parameters	Returns	Description
`#gsub(pattern, replacement)`	`pattern` (Regexp/String), `replacement` (String/Hash)	`String`	Returns copy with pattern replaced
`#gsub!(pattern, replacement)`	`pattern` (Regexp/String), `replacement` (String/Hash)	`String/nil`	Modifies string replacing pattern
`#sub(pattern, replacement)`	`pattern` (Regexp/String), `replacement` (String/Hash)	`String`	Returns copy with first pattern replaced
`#sub!(pattern, replacement)`	`pattern` (Regexp/String), `replacement` (String/Hash)	`String/nil`	Modifies string replacing first pattern
`#scan(pattern)`	`pattern` (Regexp/String)	`Array`	Returns array of pattern matches
`#match(pattern, pos=0)`	`pattern` (Regexp), `pos` (Integer)	`MatchData/nil`	Returns match data or nil

Character Translation

Method	Parameters	Returns	Description
`#tr(from_str, to_str)`	`from_str` (String), `to_str` (String)	`String`	Returns copy with characters translated
`#tr!(from_str, to_str)`	`from_str` (String), `to_str` (String)	`String/nil`	Modifies string translating characters
`#tr_s(from_str, to_str)`	`from_str` (String), `to_str` (String)	`String`	Returns copy with characters translated and squeezed
`#tr_s!(from_str, to_str)`	`from_str` (String), `to_str` (String)	`String/nil`	Modifies string translating and squeezing
`#delete(other_str)`	`other_str` (String)	`String`	Returns copy with characters removed
`#delete!(other_str)`	`other_str` (String)	`String/nil`	Modifies string removing characters
`#squeeze(other_str=nil)`	`other_str` (String)	`String`	Returns copy with consecutive characters squeezed
`#squeeze!(other_str=nil)`	`other_str` (String)	`String/nil`	Modifies string squeezing consecutive characters

String Splitting and Partitioning

Method	Parameters	Returns	Description
`#split(pattern=nil, limit=0)`	`pattern` (Regexp/String/nil), `limit` (Integer)	`Array`	Splits string into array
`#partition(sep)`	`sep` (String/Regexp)	`Array`	Returns [before, separator, after]
`#rpartition(sep)`	`sep` (String/Regexp)	`Array`	Returns [before, separator, after] from right
`#lines(separator=$/)`	`separator` (String)	`Array`	Returns array of lines
`#chars`	None	`Array`	Returns array of characters
`#bytes`	None	`Array`	Returns array of byte values

Encoding Operations

Method	Parameters	Returns	Description
`#encode(encoding, **opts)`	`encoding` (String/Encoding), options (Hash)	`String`	Returns string in specified encoding
`#encode!(encoding, **opts)`	`encoding` (String/Encoding), options (Hash)	`String`	Modifies string encoding
`#force_encoding(encoding)`	`encoding` (String/Encoding)	`String`	Changes encoding without conversion
`#encoding`	None	`Encoding`	Returns current encoding
`#valid_encoding?`	None	`Boolean`	Checks if string has valid encoding
`#ascii_only?`	None	`Boolean`	Checks if string contains only ASCII

Encoding Options

Option	Values	Description
`:invalid`	`:replace` (default `nil` raises)	How to handle invalid bytes
`:undef`	`:replace` (default `nil` raises)	How to handle undefined conversions
`:replace`	`String`	Replacement string for invalid/undefined
`:fallback`	`Hash/Proc`	Fallback for undefined characters
`:xml`	`:text`, `:attr`	XML entity conversion mode
`:cr_newline`	`Boolean`	Convert LF to CR
`:crlf_newline`	`Boolean`	Convert LF to CRLF
`:universal_newline`	`Boolean`	Convert CRLF and CR to LF
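
A few of the options above in use. This is a sketch with illustrative inputs; the target encodings are passed explicitly so the result does not depend on Encoding.default_internal.

```ruby
# Newline conversion
unix_text = "line1\nline2"
windows_text = unix_text.encode('UTF-8', crlf_newline: true)
# => "line1\r\nline2"

mixed = "a\r\nb\rc\nd"
normalized = mixed.encode('UTF-8', universal_newline: true)
# => "a\nb\nc\nd"

# XML escaping: :text escapes &, < and >; :attr also escapes and adds double quotes.
xml_text = "a < b & c".encode('UTF-8', xml: :text)
# => "a &lt; b &amp; c"
xml_attr = 'say "hi"'.encode('UTF-8', xml: :attr)
# => "\"say &quot;hi&quot;\""

# With :xml, characters undefined in the target become hexadecimal references.
ascii_refs = "café".encode('US-ASCII', xml: :text)
# => "caf&#xE9;"

# :fallback supplies per-character substitutes for undefined conversions.
ascii_plain = "café".encode('US-ASCII', fallback: { "é" => "e" })
# => "cafe"
```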
