Marketing teams extract contact information from customer inquiries, privacy compliance requires finding personal data before documents are shared, and sales teams build contact lists from exported records. This drill teaches you to use regular expressions to extract email addresses, URLs, and phone numbers from unstructured text files.
String#scan with a regex returns all matches as an array
Email regex: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/i (case insensitive)
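URL regex: /https?:\/\/[^\s]+/ matches text starting with http:// or https:// and continuing up to the next whitespace
Phone regex: /\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}/ handles formats such as (555) 123-4567, 555-987-6543, and 555.111.2222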
Use .uniq to remove duplicates and .sort to alphabetize
\b in regex means word boundary
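The notes above can be combined into a minimal sketch like the one below. It is one possible approach, not the drill's reference solution: the helper name extract_from_text is hypothetical, and the exact output layout (a count line followed by one match per line) is an assumption based on the labels in the expected output.

```ruby
# Patterns copied from the notes above.
EMAIL_RE = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/i
URL_RE   = /https?:\/\/[^\s]+/
PHONE_RE = /\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}/

# Hypothetical helper: returns the unique, sorted matches of each kind.
# None of the patterns has capture groups, so scan returns plain strings.
def extract_from_text(text)
  {
    emails: text.scan(EMAIL_RE).uniq.sort,
    urls:   text.scan(URL_RE).uniq.sort,
    phones: text.scan(PHONE_RE).uniq.sort
  }
end

# Assumed shape of the drill's method: read the file, then print each
# category as a count line followed by one match per line.
def extract_contact_info(path)
  found = extract_from_text(File.read(path))

  puts "Emails found: #{found[:emails].size}"
  found[:emails].each { |email| puts email }

  puts "URLs found: #{found[:urls].size}"
  found[:urls].each { |url| puts url }

  puts "Phone numbers found: #{found[:phones].size}"
  found[:phones].each { |phone| puts phone }
end
```

Keeping the matching in a separate helper that takes a string makes the patterns easy to check without a file on disk. Note that the URL pattern stops only at whitespace, so punctuation directly after a URL (for example a trailing period) would be included in the match.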
extract_contact_info('contacts.txt')
Emails found: 3
john.doe@company.org
sales@example.com
support@example.com
URLs found: 2
https://company.org
https://example.com
Phone numbers found: 3
(555) 123-4567
555-987-6543
555.111.2222
extract_contact_info('contacts.txt')
puts '---'
puts 'verified'
Emails found: 3
john.doe@company.org
sales@example.com
support@example.com
URLs found: 2
https://company.org
https://example.com
Phone numbers found: 3
(555) 123-4567
555-987-6543
555.111.2222
---
verified
extract_contact_info('contacts.txt')
puts 'Duplicates handled correctly'
Emails found: 3
john.doe@company.org
sales@example.com
support@example.com
URLs found: 2
https://company.org
https://example.com
Phone numbers found: 3
(555) 123-4567
555-987-6543
555.111.2222
Duplicates handled correctly