Web forms produce inconsistent data, and importing customer records from external systems requires normalization. This drill teaches you to clean messy CSV data by stripping whitespace, normalizing phone formats, standardizing emails, validating required fields, removing duplicates, and handling malformed rows. You'll learn data validation patterns essential for preventing SQL errors and data corruption. Note: This drill parses CSV manually using string methods rather than Ruby's CSV library.
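As a rough illustration of the manual parsing mentioned above, the sketch below splits one line on commas and rejects rows with the wrong number of fields. The name `parse_row` and the `expected_fields` parameter are hypothetical, and the sketch assumes fields never contain quoted commas; the drill itself does not specify the column layout.

```ruby
# Minimal sketch of manual CSV row parsing (hypothetical helper, not the drill's reference solution).
# Assumes fields contain no quoted commas.
def parse_row(line, expected_fields)
  # The -1 limit keeps trailing empty fields, so "a,b," yields three fields.
  fields = line.chomp.split(',', -1).map(&:strip)
  # A row with the wrong number of fields is treated as malformed.
  return nil unless fields.length == expected_fields

  fields
end
```

A caller could count every `nil` result as an error row and pass the remaining rows on to field-level cleaning.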
String#strip removes leading/trailing whitespace
String#downcase converts to lowercase
String#gsub(/\D/, '') removes all non-digit characters
Use a hash to track seen emails: seen_emails[email] = true
Validate email with email.include?('@') && email.include?('.')
Split company name, capitalize each word: .split.map(&:capitalize).join(' ')
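The hints above can be combined into small helpers, sketched here under the assumption that each field arrives as a raw string. All method names are hypothetical and chosen only for illustration.

```ruby
# Hypothetical helpers built from the hints; names are illustrative only.

# Keep only the digits of a phone number.
def normalize_phone(raw)
  raw.to_s.gsub(/\D/, '')
end

# Trim surrounding whitespace and lowercase the address.
def normalize_email(raw)
  raw.to_s.strip.downcase
end

# Very loose check taken from the hint: requires an '@' and a '.'.
def valid_email?(email)
  email.include?('@') && email.include?('.')
end

# Collapse whitespace and capitalize each word of the company name.
def normalize_company(raw)
  raw.to_s.split.map(&:capitalize).join(' ')
end

# Returns true the first time an email is seen, false for duplicates.
def first_occurrence?(seen_emails, email)
  return false if seen_emails[email]

  seen_emails[email] = true
  true
end
```

Note that `capitalize` lowercases the rest of each word, so an all-caps name such as `IBM` becomes `Ibm`. The email check is also deliberately minimal and accepts many invalid addresses.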
clean_csv('dirty_data.csv')
Processed 5 rows: 2 cleaned, 2 errors
1 duplicate removed
Errors found
clean_csv('dirty_data.csv')
puts 'Validation OK'
Processed 5 rows: 2 cleaned, 2 errors
1 duplicate removed
Errors found
Validation OK
clean_csv('dirty_data.csv')
puts 'Email check OK'
Processed 5 rows: 2 cleaned, 2 errors
1 duplicate removed
Errors found
Email check OK