CrackedRuby

Overview

Git is a distributed version control system that tracks changes in source code during software development. Created by Linus Torvalds in 2005 for Linux kernel development, Git addresses the need for fast, distributed collaboration on large codebases. Unlike centralized version control systems, Git stores the complete repository history on each developer's machine, enabling offline work and faster operations.

Git operates on a snapshot-based model rather than tracking file deltas. Each commit represents a complete snapshot of the repository at a specific point in time. This design enables efficient branching, merging, and history traversal. The system uses content-addressable storage where each object (commit, tree, blob) is identified by a SHA-1 hash of its contents.

The three-tree architecture forms Git's core structure: the working directory contains the current file versions, the staging area (index) holds changes prepared for the next commit, and the repository stores the complete commit history. This staging area provides fine-grained control over what changes enter each commit, separating file modification from version recording.

Git's distributed nature means every clone contains the full repository history. Developers can commit, branch, merge, and examine history entirely offline. Network operations occur only when synchronizing with remote repositories through push, pull, and fetch operations. This architecture provides redundancy—every clone serves as a full backup of the project history.

# Initialize a new Git repository (-b sets the initial branch name; Git 2.28+)
git init -b main my-project
cd my-project

# Create a file and track it
echo "Hello Git" > README.md
git add README.md
git commit -m "Initial commit"

# View repository status
git status
# => On branch main
# => nothing to commit, working tree clean

Key Principles

Git's object model consists of four fundamental types: blobs store file contents, trees represent directory structures, commits capture snapshots with metadata, and tags mark specific commits. Each object is immutable and identified by a SHA-1 hash computed from a short type-and-size header followed by its content. This content-addressing protects data integrity: if an object is corrupted, its recomputed hash no longer matches its name, which checks such as git fsck can detect.
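
The hashing rule can be reproduced in a few lines of Ruby. This is a minimal sketch for blobs only, assuming the default SHA-1 object format; the method name git_blob_id is made up for this example.

```ruby
require 'digest'

# Compute the object ID Git assigns to a blob with the given content.
# Git hashes a header ("blob <byte size>" plus a NUL byte) followed by
# the raw bytes of the file.
def git_blob_id(content)
  data = content.b
  Digest::SHA1.hexdigest("blob #{data.bytesize}\0".b + data)
end

puts git_blob_id("")                 # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
puts git_blob_id("test content\n")   # d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

The same IDs should come back from `git hash-object --stdin` when it is fed identical bytes, which is a quick way to check the sketch against a real installation.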

The commit object contains a pointer to a tree object (the root directory snapshot), pointers to parent commit(s), author information, committer information, and the commit message. Commits form a directed acyclic graph (DAG) where each commit points to its parent(s). This structure enables Git to efficiently determine relationships between commits and compute differences.

# View commit contents
git cat-file -p HEAD
# => tree 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
# => author Name <email@example.com> 1638360000 -0500
# => committer Name <email@example.com> 1638360000 -0500
# =>
# => Initial commit

# Examine tree object
git cat-file -p 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
# => 100644 blob 9f4d96d5b00d98959ea9960f069585ce42b1349a    README.md
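
The commit layout shown above is plain text, so it can be parsed directly. The following is a rough sketch, not a complete parser: parse_commit is a hypothetical helper, and it ignores multi-line headers such as signatures.

```ruby
# Parse the text of a commit object (as printed by `git cat-file -p`)
# into its header fields and message. Header lines come first, then a
# blank line, then the free-form message.
def parse_commit(raw)
  headers, message = raw.split("\n\n", 2)
  commit = { parents: [], message: message.to_s }
  headers.each_line(chomp: true) do |line|
    key, value = line.split(' ', 2)
    case key
    when 'tree'      then commit[:tree] = value
    when 'parent'    then commit[:parents] << value   # merge commits have several
    when 'author'    then commit[:author] = value
    when 'committer' then commit[:committer] = value
    end
  end
  commit
end

raw = <<~COMMIT
  tree 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
  author Name <email@example.com> 1638360000 -0500
  committer Name <email@example.com> 1638360000 -0500

  Initial commit
COMMIT

commit = parse_commit(raw)
puts commit[:tree]
puts commit[:parents].empty? ? 'root commit' : commit[:parents].join(', ')
puts commit[:message]
```

A root commit has no parent lines, an ordinary commit has one, and a merge commit has two or more, which is what makes the history a graph rather than a list.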

Branches in Git are simply pointers to commits. Creating a branch creates a new pointer; switching branches changes which pointer HEAD references. This design makes branching and merging operations fast and lightweight—creating a branch writes a 41-byte file (40-character SHA-1 hash plus newline). The HEAD reference points to the current branch, and the current branch pointer advances with each new commit.
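
To make the "branch is a pointer" idea concrete, the sketch below imitates the on-disk layout in a temporary directory. It is a toy, not a real repository: the helper names are invented, the commit ID is a placeholder, and real repositories may also keep refs in a packed-refs file.

```ruby
require 'tmpdir'
require 'fileutils'

# Mimic how Git stores a loose branch ref: a file under refs/heads/
# whose entire content is the commit ID plus a newline.
def write_branch(git_dir, name, commit_id)
  path = File.join(git_dir, 'refs', 'heads', name)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, "#{commit_id}\n")
  path
end

# Resolve HEAD: it normally names a branch, and the branch file holds
# the commit ID. A detached HEAD holds the commit ID directly.
def resolve_head(git_dir)
  head = File.read(File.join(git_dir, 'HEAD')).strip
  return head unless head.start_with?('ref: ')
  File.read(File.join(git_dir, head.delete_prefix('ref: '))).strip
end

Dir.mktmpdir do |git_dir|
  commit_id = 'a' * 40                       # placeholder, not a real commit
  ref_path = write_branch(git_dir, 'main', commit_id)
  File.write(File.join(git_dir, 'HEAD'), "ref: refs/heads/main\n")

  puts File.size(ref_path)                   # 41
  puts resolve_head(git_dir) == commit_id    # true
end
```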

The staging area (index) acts as an intermediate layer between the working directory and repository. Files move through three states: modified (changed in working directory), staged (marked for inclusion in next commit), and committed (stored in repository). This three-state model provides precise control over commit contents. Developers can stage portions of files using interactive staging, creating commits that represent logical units of change rather than all modifications to a file.
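
These states are visible in the two status columns of `git status --porcelain`. As a sketch, the hypothetical classify_status method below sorts such output into staged, modified, and untracked paths; it assumes the version 1 porcelain format and does not handle renames specially.

```ruby
# Classify lines of `git status --porcelain` (v1 format).
# Each line is "XY path": X is the index (staged) status,
# Y is the working-tree status.
def classify_status(porcelain)
  result = { staged: [], modified: [], untracked: [] }
  porcelain.each_line(chomp: true) do |line|
    next if line.empty?
    x = line[0]
    y = line[1]
    path = line[3..-1]
    if x == '?' && y == '?'
      result[:untracked] << path
    else
      result[:staged]   << path unless x == ' '
      result[:modified] << path unless y == ' '
    end
  end
  result
end

sample = [
  'M  app/models/user.rb',   # staged only
  ' M README.md',            # modified, not staged
  'MM config/routes.rb',     # partly staged, e.g. after git add -p
  '?? notes.txt'             # untracked
].join("\n")

p classify_status(sample)
```

A file that appears under both staged and modified is the partially staged case: some of its changes are in the index and others are still only in the working directory.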

Remote repositories are versions of the project hosted elsewhere. The origin remote typically refers to the repository from which the local repository was cloned. Remote-tracking branches (refs/remotes/origin/main) represent the state of branches in remote repositories as of the last fetch or pull operation. Local branches can track remote branches, establishing a relationship for push and pull operations.

Git's merge strategies determine how to combine divergent branch histories. Fast-forward merges occur when the target branch is an ancestor of the source branch—Git simply moves the branch pointer forward. Three-way merges occur when branches have diverged—Git creates a new merge commit with two parents. Rebase operations replay commits from one branch onto another, creating a linear history without merge commits. Each approach has different implications for history structure and collaboration workflows.
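
The choice between a fast-forward and a three-way merge comes down to an ancestry check on the commit graph. The sketch below uses an invented five-commit graph and hypothetical helper names to show that decision; it illustrates the rule only, not Git's actual implementation.

```ruby
require 'set'

# Toy commit graph: each commit ID maps to the IDs of its parents.
GRAPH = {
  'A' => [],
  'B' => ['A'],
  'C' => ['B'],   # main continues from B
  'D' => ['B'],   # feature branches off at B
  'E' => ['D']
}.freeze

# All commits reachable from `start`, including `start` itself.
def ancestors(graph, start)
  seen = Set.new
  stack = [start]
  until stack.empty?
    id = stack.pop
    next unless seen.add?(id)
    stack.concat(graph.fetch(id))
  end
  seen
end

# Decide what merging `source` into `target` would require.
def merge_kind(graph, target, source)
  return :already_up_to_date if ancestors(graph, target).include?(source)
  return :fast_forward       if ancestors(graph, source).include?(target)
  :three_way
end

puts merge_kind(GRAPH, 'B', 'E')   # fast_forward
puts merge_kind(GRAPH, 'C', 'E')   # three_way
puts merge_kind(GRAPH, 'E', 'D')   # already_up_to_date
```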

The reflog tracks HEAD movement and branch updates, maintaining a local history of repository state changes. This reference log enables recovery from mistakes: commits removed from branch histories remain accessible through the reflog until their entries expire (by default 90 days, or 30 days for entries no longer reachable from the current tip of the ref). The reflog is local to each repository and not shared through push or fetch operations.
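
Reflog entries are stored as text lines in files such as .git/logs/HEAD. The sketch below parses one such line; the regular expression and the parse_reflog_line name are illustrative, and the sample line is made up.

```ruby
# One reflog line holds the old and new commit IDs, who made the change
# and when, then a tab and a message describing the operation.
REFLOG_LINE = /\A(\h{40}) (\h{40}) (.*) <(.*)> (\d+) ([+-]\d{4})\t(.*)\z/

def parse_reflog_line(line)
  m = REFLOG_LINE.match(line.chomp)
  return nil unless m
  {
    old_id: m[1],
    new_id: m[2],
    name: m[3],
    email: m[4],
    time: Time.at(m[5].to_i).utc,
    message: m[7]
  }
end

# Made-up entry for a first commit: the old ID is all zeros.
line = "#{'0' * 40} #{'a' * 40} Name <email@example.com> 1638360000 -0500\tcommit (initial): Initial commit"
entry = parse_reflog_line(line)
puts entry[:message]
puts entry[:new_id][0, 7]
```

In everyday use `git reflog` presents the same information, so parsing the files by hand is mainly useful for understanding what is recorded.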

Ruby Implementation

Ruby provides several libraries for Git integration, with Rugged and the git gem being the primary options. Rugged is a Ruby binding to libgit2, offering low-level access to Git functionality. The git gem wraps the git command-line tool and provides a higher-level interface suitable for most applications. Ruby applications use these libraries for tasks such as deployment automation, code analysis, and repository management.

# Using the git gem for high-level operations
require 'git'

# Clone a repository
repo = Git.clone('https://github.com/user/project.git', 'project-dir')

# Open existing repository
repo = Git.open('/path/to/repo')

# Check status
repo.status.each do |file|
  puts "#{file.path}: #{file.type}"
end

# Create and checkout a branch
repo.branch('feature-branch').checkout

# Stage and commit changes
repo.add(all: true)
repo.commit('Add new feature')

Rugged offers more control and better performance for operations requiring direct object database access. The library exposes Git's internal object model, allowing applications to read and manipulate repository objects directly.

require 'rugged'

# Open repository
repo = Rugged::Repository.new('/path/to/repo')

# Access commits
commit = repo.head.target
puts commit.message
puts commit.author[:name]
puts commit.author[:time]

# Walk commit history (the block variable has its own name so it does
# not shadow the commit looked up above)
walker = Rugged::Walker.new(repo)
walker.push(commit.oid)
walker.each do |ancestor|
  puts "#{ancestor.oid[0..6]} #{ancestor.message.lines.first}"
end

# Read file contents from specific commit
entry = commit.tree['README.md']
blob = repo.lookup(entry[:oid])
puts blob.content

Git hooks are executable scripts in the .git/hooks directory, so a hook can be written in Ruby by adding a Ruby shebang line and making the file executable. Pre-commit hooks validate changes before committing, pre-push hooks can block pushes to protected branches, and post-receive hooks (which run on the server) trigger deployment or notification workflows.

#!/usr/bin/env ruby
# .git/hooks/pre-commit

# Prevent commits to main branch
current_branch = `git symbolic-ref --short HEAD`.strip

if current_branch == 'main'
  puts "Direct commits to main branch are not allowed"
  exit 1
end

# Run tests before committing
system('bundle exec rspec')
exit_status = $?.exitstatus

if exit_status != 0
  puts "Tests must pass before committing"
  exit exit_status
end

exit 0

Ruby scripts can automate repository management tasks such as bulk operations across multiple repositories, custom merge strategies, or automated code review workflows.

require 'git'

class RepositoryManager
  def initialize(repo_path)
    @repo = Git.open(repo_path)
  end

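  def cleanup_merged_branches
    merged_branches = @repo.branches.local.select do |branch|
      # Never delete main itself or the branch that is checked out
      next if branch.name == 'main' || branch.name == @repo.current_branch

      # A branch is merged when its tip is the merge base with main.
      # merge_base returns an array of commit objects, while revparse
      # returns a SHA string, so compare SHAs rather than the raw values.
      merge_base = @repo.merge_base('main', branch.name).first
      branch_commit = @repo.revparse(branch.name)

      !merge_base.nil? && merge_base.sha == branch_commit
    end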

    merged_branches.each do |branch|
      puts "Deleting merged branch: #{branch.name}"
      @repo.branch(branch.name).delete
    end
  end

  def find_commits_by_author(author_pattern)
    commits = []
    
    @repo.log(1000).each do |commit|
      if commit.author.name =~ /#{author_pattern}/i
        commits << {
          sha: commit.sha,
          message: commit.message,
          date: commit.date,
          author: commit.author.name
        }
      end
    end
    
    commits
  end

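  def cherry_pick_range(start_sha, end_sha, target_branch)
    @repo.checkout(target_branch)

    # Git::Log is enumerable but not an Array, so convert before reversing
    # to apply the oldest commit first. Note: log returns 30 commits by default.
    commits = @repo.log.between(start_sha, end_sha).to_a.reverse

    commits.each do |commit|
      # Assumption: the git gem has no cherry-pick wrapper, so this calls
      # the git command-line tool directly.
      if system('git', '-C', @repo.dir.path, 'cherry-pick', commit.sha)
        puts "Cherry-picked: #{commit.sha[0..7]} - #{commit.message.lines.first}"
      else
        puts "Failed to cherry-pick #{commit.sha}"
        @repo.reset_hard
        break
      end
    end
  end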
end

Practical Examples

Basic workflow operations form the foundation of Git usage. The cycle of modifying files, staging changes, creating commits, and synchronizing with remote repositories repeats throughout development.

# Create a feature branch from main
git checkout main
git pull origin main
git checkout -b feature/user-authentication

# Make changes to files
echo "class User; end" > app/models/user.rb

# Stage changes
git add app/models/user.rb

# View staged changes (plain `git diff` does not show untracked files)
git diff --staged
# => diff --git a/app/models/user.rb b/app/models/user.rb
# => new file mode 100644
# => index 0000000..abcd123
# => --- /dev/null
# => +++ b/app/models/user.rb
# => @@ -0,0 +1 @@
# => +class User; end

# Create commit
git commit -m "Add User model"

# Push to remote
git push origin feature/user-authentication

Interactive staging enables committing portions of changes rather than entire files. This granularity produces commits that represent coherent logical changes.

# Start interactive staging
git add -p app/models/user.rb

# Git presents each hunk with options:
# y - stage this hunk
# n - do not stage this hunk
# s - split into smaller hunks
# e - manually edit the hunk
# q - quit; do not stage this hunk or any remaining ones

# View staged changes
git diff --staged

# Unstage specific files (Git 2.23+ also offers: git restore --staged <path>)
git reset HEAD app/models/user.rb

Branching strategies determine how teams organize parallel development efforts. Feature branches isolate work on specific features or fixes. Long-running branches like develop and main serve as integration points for completed work.

# Create feature branch
git checkout -b feature/payment-processing

# Work on feature with multiple commits
git commit -m "Add payment gateway integration"
git commit -m "Implement payment validation"
git commit -m "Add payment processing tests"

# Update feature branch with main changes
git checkout main
git pull origin main
git checkout feature/payment-processing
git rebase main

# Resolve conflicts if any, then continue
git add resolved-file.rb
git rebase --continue

# Push rebased branch (force required after rebase)
git push --force-with-lease origin feature/payment-processing

Merge operations combine branch histories. Fast-forward merges move the branch pointer when no divergence exists. Three-way merges create merge commits when branches have diverged.

# Merge feature branch into main
git checkout main
git merge feature/payment-processing

# If merge creates conflicts
git status
# => both modified: app/models/payment.rb

# Edit conflicted files, choosing which changes to keep
# Conflict markers show both versions:
# <<<<<<< HEAD
# main branch version
# =======
# feature branch version
# >>>>>>> feature/payment-processing

# Stage resolved files
git add app/models/payment.rb

# Complete merge
git commit

Stashing saves uncommitted changes temporarily, clearing the working directory without creating commits. This operation enables switching branches with uncommitted work or pulling remote changes into a dirty working directory.

# Save current changes (git stash save is deprecated in favor of push)
git stash push -m "Work in progress on authentication"

# List stashes
git stash list
# => stash@{0}: On feature-branch: Work in progress on authentication
# => stash@{1}: WIP on main: Previous stashed changes

# Apply most recent stash
git stash apply

# Apply and remove specific stash
git stash pop stash@{1}

# View stash contents
git stash show -p stash@{0}

# Drop stash without applying
git stash drop stash@{0}

History manipulation through rebase, reset, and commit amending enables cleaning commit history before sharing. Interactive rebase provides control over multiple commits.

# Interactive rebase last 5 commits
git rebase -i HEAD~5

# Editor opens with commit list:
# pick abc1234 Add feature X
# pick def5678 Fix typo
# pick ghi9012 Add feature Y
# pick jkl3456 Fix bug in feature Y
# pick mno7890 Add tests

# Change to:
# pick abc1234 Add feature X
# squash def5678 Fix typo
# pick ghi9012 Add feature Y
# fixup jkl3456 Fix bug in feature Y
# reword mno7890 Add tests

# Save and close editor
# Git combines commits as specified

Common Patterns

Feature Branch Workflow isolates development work on separate branches. Each feature, bug fix, or experiment receives its own branch created from main. Developers complete work on feature branches, then merge back to main through pull requests or direct merges. This pattern keeps main stable and deployable while enabling parallel development.

# Feature branch lifecycle
git checkout main
git pull origin main
git checkout -b feature/new-dashboard

# Develop feature
git add .
git commit -m "Implement dashboard layout"
git commit -m "Add dashboard data loading"
git commit -m "Style dashboard components"

# Prepare for merge
git checkout main
git pull origin main
git checkout feature/new-dashboard
git rebase main

# Merge to main
git checkout main
git merge --no-ff feature/new-dashboard
git push origin main

# Clean up
git branch -d feature/new-dashboard
git push origin --delete feature/new-dashboard

Git Flow extends feature branches with additional branch types for releases and hotfixes. The main branches are main (production-ready code) and develop (integration branch for features). Supporting branches include feature branches (new features), release branches (release preparation), and hotfix branches (production fixes).

# Start new feature
git checkout develop
git checkout -b feature/user-notifications

# Complete feature
git checkout develop
git merge --no-ff feature/user-notifications
git branch -d feature/user-notifications

# Create release branch
git checkout develop
git checkout -b release/1.2.0
# Perform release preparation (version bumps, changelog updates)
git commit -am "Bump version to 1.2.0"

# Finish release
git checkout main
git merge --no-ff release/1.2.0
git tag -a 1.2.0 -m "Release version 1.2.0"
git checkout develop
git merge --no-ff release/1.2.0
git branch -d release/1.2.0

# Create hotfix
git checkout main
git checkout -b hotfix/security-patch
git commit -am "Fix security vulnerability"
git checkout main
git merge --no-ff hotfix/security-patch
git tag -a 1.2.1 -m "Hotfix version 1.2.1"
git checkout develop
git merge --no-ff hotfix/security-patch
git branch -d hotfix/security-patch

Trunk-Based Development maintains a single main branch where developers commit frequently. Short-lived feature branches (less than one day) may exist, but the focus is on continuous integration to main. Feature flags control incomplete feature visibility in production. This pattern requires strong automated testing and continuous integration.

Forking Workflow is common in open-source projects. Contributors fork the main repository, make changes in their fork, then submit pull requests to the upstream repository. This pattern maintains clear separation between the authoritative repository and contributor copies.

# Fork and clone
git clone https://github.com/contributor/project.git
cd project
git remote add upstream https://github.com/original/project.git

# Create feature branch
git checkout -b feature/contribution

# Make changes and commit
git commit -am "Add new feature"

# Update with upstream changes
git fetch upstream
git rebase upstream/main

# Push to fork
git push origin feature/contribution

# Create pull request through GitHub/GitLab interface

Rebase vs Merge strategies affect history structure. Rebasing creates linear history by replaying commits, while merging preserves the actual development timeline with merge commits. Teams choose based on preference for history clarity versus accuracy.

Common Pitfalls

Detached HEAD state occurs when HEAD points directly to a commit instead of a branch. This happens when checking out a specific commit by SHA, tag, or relative reference. Commits made in detached HEAD state are not associated with any branch; after switching away they are reachable only through the reflog and are eventually garbage-collected unless a branch or tag is created for them.

# Enter detached HEAD state
git checkout abc1234

# Git warns:
# You are in 'detached HEAD' state...

# Make commits (dangerous)
git commit -m "Some changes"

# Save work before switching branches
git checkout -b temp-branch
# Or note the commit SHA and cherry-pick later

# Alternative: create branch first
git checkout -b exploration abc1234

Merge conflicts require manual resolution when Git cannot automatically combine changes. Conflicts occur when the same lines are modified differently in both branches, or when one branch modifies a file deleted in the other branch.

# After merge conflict
git status
# => both modified: app/models/user.rb

# File contains conflict markers:
# <<<<<<< HEAD
# def full_name
#   "#{first_name} #{last_name}"
# end
# =======
# def full_name
#   [first_name, last_name].compact.join(' ')
# end
# >>>>>>> feature-branch

# Common mistakes:
# - Committing with conflict markers still present
# - Choosing wrong version of conflicting changes
# - Not testing after resolving conflicts

# Proper resolution:
# 1. Examine both versions
# 2. Decide on correct implementation
# 3. Remove conflict markers
# 4. Test the code
# 5. Stage and commit
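
The first mistake in the list, committing leftover conflict markers, can be caught mechanically. Below is a minimal sketch of such a check; conflict_marker_lines is a hypothetical helper, and it can report false positives in files that legitimately contain a line of seven equals signs. Git's own `git diff --check` performs a similar test.

```ruby
# A conflict marker is seven identical characters at the start of a line.
CONFLICT_MARKER = /\A(?:<{7} |={7}\z|>{7} )/

# Return the 1-based line numbers that still contain conflict markers.
def conflict_marker_lines(text)
  text.each_line
      .with_index(1)
      .select { |line, _number| line.chomp.match?(CONFLICT_MARKER) }
      .map { |_line, number| number }
end

unresolved = <<~'TEXT'
  <<<<<<< HEAD
  def full_name
    "#{first_name} #{last_name}"
  end
  =======
  def full_name
    [first_name, last_name].compact.join(' ')
  end
  >>>>>>> feature-branch
TEXT

p conflict_marker_lines(unresolved)   # [1, 5, 9]
```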

Force pushing overwrites remote branch history, potentially destroying other developers' work. This operation is necessary after rebasing or amending commits that were already pushed, but requires coordination with team members.

# Dangerous: unconditional force push
git push --force origin feature-branch

# Safer: force push with lease
git push --force-with-lease origin feature-branch
# Fails if the remote branch has moved since your last fetch

# Communicate with team before force pushing
# Ensure no one else is working on the branch

Lost commits can occur through various operations: hard reset, branch deletion, rebasing, or amending. The reflog provides recovery options for recent operations.

# Accidentally reset branch
git reset --hard HEAD~5

# Recover using reflog
git reflog
# => abc1234 HEAD@{0}: reset: moving to HEAD~5
# => def5678 HEAD@{1}: commit: Important changes
# => ghi9012 HEAD@{2}: commit: More work

# Restore to previous state
git reset --hard HEAD@{1}

# Recovery window is limited (by default, reflog entries expire after 30 to 90 days)

Large file commits degrade repository performance. Binary files and generated artifacts do not compress well and increase clone times. Once committed, large files remain in history even if removed from current version.

# Prevent large file commits with pre-commit hook
# Check file sizes before allowing commit
# Add large files to .gitignore

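# Remove large file from history (destructive; rewrites every affected commit)
git filter-branch --tree-filter 'rm -f path/to/large-file' HEAD
# filter-branch is slow, and its own documentation recommends git filter-repo instead;
# BFG Repo-Cleaner is another faster alternative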
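
The size check mentioned above could look roughly like this inside a Ruby pre-commit hook. It is a sketch under assumptions: the 5 MB limit and the oversized_files name are arbitrary, and temporary files stand in for the staged file list.

```ruby
require 'tmpdir'

# Hypothetical limit for this sketch; choose a value that suits the project.
MAX_BYTES = 5 * 1024 * 1024

# Return the paths whose size on disk exceeds the limit.
def oversized_files(paths, limit = MAX_BYTES)
  paths.select { |path| File.file?(path) && File.size(path) > limit }
end

# In a real hook the candidate list could come from:
#   git diff --cached --name-only --diff-filter=AM
# Here two temporary files stand in for staged files.
Dir.mktmpdir do |dir|
  small = File.join(dir, 'small.txt')
  large = File.join(dir, 'large.bin')
  File.write(small, 'x' * 10)
  File.write(large, 'x' * 2048)

  offenders = oversized_files([small, large], 1024)
  offenders.each { |path| warn "#{File.basename(path)} exceeds the size limit" }
  p offenders.map { |path| File.basename(path) }   # ["large.bin"]
end
```

A hook built on this would exit with a non-zero status when the offender list is not empty, which makes Git abort the commit.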

Committing sensitive data (passwords, API keys, private keys) exposes secrets in repository history. Removing sensitive data requires rewriting history and rotating compromised credentials.

# Prevent sensitive data commits
# Use .gitignore for config files with secrets
# Use environment variables or secret management
# Scan commits before pushing

# Example .gitignore entries:
# .env
# config/secrets.yml
# *.pem
# *.key
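
Scanning before pushing can start with a few regular expressions. The patterns below are illustrative assumptions only (dedicated scanners ship far larger rule sets), and find_secrets is a made-up helper name.

```ruby
# Illustrative patterns; not an exhaustive or authoritative rule set.
SECRET_PATTERNS = {
  private_key:         /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/,
  aws_access_key:      /\bAKIA[0-9A-Z]{16}\b/,
  password_assignment: /\bpassword\s*[:=]\s*['"][^'"]+['"]/i
}.freeze

# Return one finding per (line, pattern) match, with 1-based line numbers.
def find_secrets(text)
  findings = []
  text.each_line.with_index(1) do |line, number|
    SECRET_PATTERNS.each do |kind, pattern|
      findings << { line: number, kind: kind } if line.match?(pattern)
    end
  end
  findings
end

sample = <<~CONFIG
  region = "us-east-1"
  access_key = "AKIAIOSFODNN7EXAMPLE"
  password: "hunter2"
CONFIG

find_secrets(sample).each do |finding|
  puts "line #{finding[:line]}: possible #{finding[:kind]}"
end
```

A match is only a hint that a human should look. Any credential that did reach a shared repository still needs to be rotated, as noted above.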

Incorrect remote URLs cause push and fetch operations to fail or target wrong repositories. SSH vs HTTPS URL differences affect authentication methods.

# View remote URLs
git remote -v
# => origin  https://github.com/user/repo.git (fetch)
# => origin  https://github.com/user/repo.git (push)

# Change remote URL
git remote set-url origin git@github.com:user/repo.git

# Add additional remote
git remote add upstream https://github.com/original/repo.git
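
The two URL styles carry the same host and path, so converting one to the other is mechanical. The sketch below assumes the host accepts SSH connections under the conventional git user; https_to_ssh is a hypothetical helper.

```ruby
require 'uri'

# Convert an HTTPS remote URL to the scp-like SSH form.
def https_to_ssh(url)
  uri = URI.parse(url)
  raise ArgumentError, "not an HTTPS URL: #{url}" unless uri.is_a?(URI::HTTPS)
  "git@#{uri.host}:#{uri.path.delete_prefix('/')}"
end

puts https_to_ssh('https://github.com/user/repo.git')
# git@github.com:user/repo.git
```

The result can be passed to `git remote set-url origin` to switch an existing clone from password or token authentication to SSH keys.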

Submodule pitfalls include forgetting to update submodules after cloning, committing wrong submodule pointers, and merge conflicts in submodule references.

# Clone repository with submodules (--recursive is an older alias for this flag)
git clone --recurse-submodules https://github.com/user/project.git

# Update submodules in existing clone
git submodule update --init --recursive

# Common mistake: working inside submodule
# Changes in submodule directory are not tracked by parent repository
# Must commit in submodule, then commit submodule pointer in parent

Tools & Ecosystem

GitHub, GitLab, and Bitbucket provide Git repository hosting with additional collaboration features: pull requests, issue tracking, continuous integration, and code review workflows. These platforms extend Git with web interfaces, permission management, and team coordination tools.

Git hooks automate repository workflows by executing scripts at specific points in Git operations. Client-side hooks run on developer machines (pre-commit, pre-push), while server-side hooks run on the repository server (pre-receive, post-receive, update).

#!/bin/bash
# .git/hooks/pre-push
# Example pre-push hook: ask for confirmation before pushing from main

protected_branch='main'
current_branch=$(git symbolic-ref --short HEAD)

if [ "$current_branch" = "$protected_branch" ]; then
    read -p "Pushing to main. Are you sure? [y/N] " -n 1 -r < /dev/tty
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi

exit 0

Git Large File Storage (LFS) handles large binary files by storing pointers in the repository while maintaining actual file contents on a separate server. This approach keeps repository size manageable for projects with large assets.

# Install Git LFS
git lfs install

# Track file types
git lfs track "*.psd"
git lfs track "*.mp4"

# Add .gitattributes to repository
git add .gitattributes

# Large files are now handled by LFS
git add design.psd
git commit -m "Add design file"
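
The pointer that LFS commits in place of a large file is a small text file. As a sketch, the hypothetical lfs_pointer method below builds one from file content, following the version 1 pointer layout (a version line, a SHA-256 object ID, and the size in bytes).

```ruby
require 'digest'

# Build the text that Git LFS stores in the repository instead of the
# large file itself.
def lfs_pointer(content)
  data = content.b
  <<~POINTER
    version https://git-lfs.github.com/spec/v1
    oid sha256:#{Digest::SHA256.hexdigest(data)}
    size #{data.bytesize}
  POINTER
end

puts lfs_pointer('pretend this is a large design file')
```

Because the pointer stays this small no matter how large the asset is, the repository history remains compact while the real bytes live on the LFS server.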

Git GUI clients provide graphical interfaces for Git operations: GitKraken, Sourcetree, GitHub Desktop, Tower, and IDE integrations. These tools visualize repository history, simplify complex operations, and provide merge conflict resolution interfaces.

Continuous Integration systems integrate with Git repositories to run automated tests, builds, and deployments on commits and pull requests. Services like GitHub Actions, GitLab CI, Jenkins, and CircleCI monitor repositories and execute configured workflows.

Git aliases create shortcuts for frequently used commands, improving workflow efficiency.

# Configure aliases
git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.st status
git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'
git config --global alias.visual 'log --oneline --graph --decorate --all'

# Use aliases
git co main
git br feature/new-feature
git visual

Reference

Core Commands

Command | Description | Common Usage
git init | Initialize new repository | Create new project under version control
git clone | Copy remote repository | Start working on existing project
git add | Stage changes | Prepare files for commit
git commit | Record changes | Save snapshot to repository
git status | Show working tree status | Check current state
git log | Display commit history | Review project timeline
git diff | Show changes between commits | Review modifications
git branch | List, create, delete branches | Manage development lines
git checkout | Switch branches or restore files | Change working context
git merge | Combine branch histories | Integrate completed work
git rebase | Reapply commits on new base | Linearize history
git push | Upload to remote repository | Share commits
git pull | Fetch and merge from remote | Update local repository
git fetch | Download from remote | Update remote-tracking branches
git reset | Reset current HEAD | Undo commits
git revert | Create commit that undoes changes | Safely undo published commits
git stash | Save uncommitted changes | Temporarily store work
git tag | Mark specific commits | Label releases

Configuration Scopes

Scope | File Location | Command Flag | Use Case
System | /etc/gitconfig | --system | All users on machine
Global | ~/.gitconfig | --global | All repositories for user
Local | .git/config | --local | Specific repository
Worktree | .git/config.worktree | --worktree | Specific working tree

Object Types

Type | Purpose | Content
blob | File contents | Raw file data
tree | Directory structure | File and directory references
commit | Snapshot | Tree pointer, parent commits, metadata
tag | Named reference | Commit pointer, annotation

Reset Modes

Mode | Working Directory | Staging Area | Repository
--soft | Unchanged | Unchanged | Changed
--mixed | Unchanged | Changed | Changed
--hard | Changed | Changed | Changed
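
The table can be read as a small state machine. The toy model below uses made-up snapshot names rather than real Git objects, and treats each area as holding the name of a snapshot, to show which areas each mode overwrites.

```ruby
# Toy model of the three areas affected by `git reset`.
Repo = Struct.new(:head, :index, :worktree) do
  def reset(mode, target)
    self.head = target                                   # every mode moves HEAD
    self.index = target if %i[mixed hard].include?(mode) # mixed and hard reset the index
    self.worktree = target if mode == :hard              # only hard touches files
    self
  end
end

p Repo.new('C2', 'C2', 'C2 + edits').reset(:soft, 'C1').to_a
# ["C1", "C2", "C2 + edits"]
p Repo.new('C2', 'C2', 'C2 + edits').reset(:mixed, 'C1').to_a
# ["C1", "C1", "C2 + edits"]
p Repo.new('C2', 'C2', 'C2 + edits').reset(:hard, 'C1').to_a
# ["C1", "C1", "C1"]
```

Only the hard mode discards uncommitted edits in the working directory, which is why it is the one that calls for caution.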

Merge Strategies

Strategy | Behavior | Use Case
fast-forward | Move branch pointer forward | No divergence exists
ort (recursive before Git 2.34) | Three-way merge with single merge commit | Standard merge for two branches
octopus | Merge multiple branches | Integrating several features
ours | Keep current branch version | Discard incoming changes
subtree | Merge project subdirectory | Managing dependencies

Branch Naming Conventions

Pattern | Purpose | Example
feature/ | New functionality | feature/user-authentication
bugfix/ | Bug corrections | bugfix/login-validation
hotfix/ | Production fixes | hotfix/security-patch
release/ | Release preparation | release/1.2.0
experiment/ | Experimental work | experiment/new-algorithm

Common Reflog References

Reference | Meaning
HEAD@{0} | Current position
HEAD@{1} | Previous position
HEAD@{2.hours.ago} | Position two hours ago
HEAD@{yesterday} | Position yesterday
main@{one.week.ago} | Branch position one week ago

Configuration Options

Setting | Purpose | Example Value
user.name | Author name | John Doe
user.email | Author email | john@example.com
core.editor | Default text editor | vim
core.autocrlf | Line ending handling | true, false, input
push.default | Default push behavior | simple, matching, current
pull.rebase | Default pull behavior | true, false
merge.conflictstyle | Conflict marker style | merge, diff3
branch.autosetupmerge | Auto-track on checkout | always, true, false