Challenge

Problem

Analytics teams track engagement metrics, product managers need DAU/MAU reports, and SaaS companies monitor usage for billing. This drill teaches you to parse Apache access logs to extract IP addresses and usernames, group activity by date, count unique users per day, and calculate 30-day active users. You'll work with complex nested hash structures and date arithmetic.

Difficulty: Intermediate

Instructions

  1. Read the Apache access log file line by line
  2. Parse each line to extract the IP address, the username ('-' for anonymous requests), and the date
  3. Skip anonymous requests (username == '-')
  4. Build nested structure: { date => { username => request_count } }
  5. Calculate daily unique users for each date
  6. Calculate 30-day active users (unique users across all dates)
  7. Output format:
    'Date: 2024-10-01, Active users: 5'
    'Date: 2024-10-02, Active users: 7'
    '30-day active users: 12'
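
The steps above can be sketched roughly as follows. This is one possible approach, not the reference solution: the constant name LOG_LINE and the regex are assumptions made for this sketch, lines that do not match the pattern are silently skipped, and the blank line before the total follows the expected output shown in the sample tests.

```ruby
require 'date'

# LOG_LINE is a name chosen for this sketch. It captures the IP, the
# username, and the dd/Mon/yyyy part of the bracketed timestamp.
LOG_LINE = %r{\A(\S+) \S+ (\S+) \[(\d{2}/[A-Za-z]{3}/\d{4})}

def track_active_users(path)
  # Shape: { "2024-10-01" => { "alice" => 3 } }
  activity = Hash.new { |h, k| h[k] = Hash.new(0) }

  File.foreach(path) do |line|
    match = LOG_LINE.match(line)
    next unless match          # assumption: ignore lines that do not parse

    _ip, username, date_str = match.captures
    next if username == '-'    # anonymous request

    date = Date.strptime(date_str, '%d/%b/%Y').strftime('%Y-%m-%d')
    activity[date][username] += 1
  end

  # ISO dates sort chronologically when sorted as strings.
  activity.keys.sort.each do |date|
    puts "Date: #{date}, Active users: #{activity[date].keys.length}"
  end
  puts
  puts "30-day active users: #{activity.values.flat_map(&:keys).uniq.length}"
end
```

The sketch treats every date in the file as part of the 30-day window, as step 6 describes. A log spanning more than 30 days would need an explicit date filter.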

Hints

Hint 1

Apache log format: IP - username [date] "request" status size

Hint 2

Use a regex to capture the three fields you need: the IP address, the username, and the date inside the square brackets
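
As an illustration of this hint, here is a minimal sketch using named captures. The names LINE_PATTERN and parse_log_line are hypothetical, and the pattern assumes fields are separated by single spaces as in Hint 1.

```ruby
# LINE_PATTERN and parse_log_line are hypothetical names for this sketch.
# The date capture stops at the first ':' so the time of day is left out.
LINE_PATTERN = %r{\A(?<ip>\S+) \S+ (?<user>\S+) \[(?<date>[^:\]]+)}

def parse_log_line(line)
  m = LINE_PATTERN.match(line)
  return nil unless m # line does not look like a log entry

  { ip: m[:ip], user: m[:user], date: m[:date] }
end
```

Returning nil for unparseable lines lets the caller decide whether to skip them or raise.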

Hint 3

Hash.new { |h, k| h[k] = Hash.new(0) } creates a nested hash whose inner values default to 0, so they can be used as counters
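
To show how this default block behaves, here is a small sketch. The helper name count_requests and its input shape (an array of [date, username] pairs) are assumptions for illustration.

```ruby
# count_requests is a hypothetical helper; pairs is an array of
# [date, username] arrays.
def count_requests(pairs)
  # Outer hash: a missing date gets a fresh inner hash.
  # Inner hash: a missing username starts at 0, so += 1 just works.
  activity = Hash.new { |h, k| h[k] = Hash.new(0) }
  pairs.each { |date, user| activity[date][user] += 1 }
  activity
end
```

One thing to watch: because the block assigns h[k], merely reading activity['some-date'] creates an empty entry for that date. Use key? when you only want to check whether a date exists.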

Hint 4

Date.strptime(date_str, '%d/%b/%Y') parses the date portion of an Apache timestamp (for example '01/Oct/2024')

Hint 5

Use .strftime('%Y-%m-%d') to convert the parsed date to ISO 8601 format (for example '2024-10-01')
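
Hints 4 and 5 can be combined into one conversion step. In this sketch, to_iso_date is a hypothetical helper name, and the time of day is cut off before parsing so that only the dd/Mon/yyyy part reaches Date.strptime.

```ruby
require 'date'

# to_iso_date is a hypothetical helper name for this sketch.
def to_iso_date(apache_timestamp)
  # "01/Oct/2024:10:15:32 +0000" -> "01/Oct/2024"
  day_part = apache_timestamp.split(':', 2).first
  Date.strptime(day_part, '%d/%b/%Y').strftime('%Y-%m-%d')
end
```

Using the ISO string as the hash key has a practical benefit: sorting the keys as plain strings also sorts them chronologically.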

Hint 6

Count the unique users for one day with hash[date].keys.length

Hint 7

Get all unique users across days with hash.values.flat_map(&:keys).uniq
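
The last two hints can be sketched as a pair of small helpers. The names daily_unique_counts and total_unique_users are hypothetical, and activity is assumed to have the { date => { username => request_count } } shape from the instructions.

```ruby
# Both helper names are hypothetical; activity has the shape
# { date => { username => request_count } }.
def daily_unique_counts(activity)
  # Each inner hash has one key per user seen that day.
  activity.transform_values { |users| users.keys.length }
end

def total_unique_users(activity)
  # Collect usernames from every day, then drop repeats.
  activity.values.flat_map(&:keys).uniq.length
end
```

The total is not the sum of the daily counts: a user who is active on two days adds one to each day's count but only one to the total.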

Provided Files (Read-only)

1. Basic tracking - 2 days

Input:
track_active_users('apache.log')
Expected Output:
Date: 2024-10-01, Active users: 2
Date: 2024-10-02, Active users: 1

30-day active users: 3

2. Skip anonymous users

Input:
track_active_users('apache.log')
Expected Output:
Date: 2024-10-01, Active users: 2

30-day active users: 2

3. Same user multiple requests per day

Input:
track_active_users('apache.log')
Expected Output:
Date: 2024-10-01, Active users: 1

30-day active users: 1
+ 2 hidden test cases