Overview
Git is a distributed version control system that tracks changes in source code during software development. Created by Linus Torvalds in 2005 for Linux kernel development, Git addresses the need for fast, distributed collaboration on large codebases. Unlike centralized version control systems, Git stores the complete repository history on each developer's machine, enabling offline work and faster operations.
Git operates on a snapshot-based model rather than tracking file deltas. Each commit represents a complete snapshot of the repository at a specific point in time. This design enables efficient branching, merging, and history traversal. The system uses content-addressable storage where each object (commit, tree, blob) is identified by a SHA-1 hash of its contents.
The three-tree architecture forms Git's core structure: the working directory contains the current file versions, the staging area (index) holds changes prepared for the next commit, and the repository stores the complete commit history. This staging area provides fine-grained control over what changes enter each commit, separating file modification from version recording.
Git's distributed nature means every clone contains the full repository history. Developers can commit, branch, merge, and examine history entirely offline. Network operations occur only when synchronizing with remote repositories through push, pull, and fetch operations. This architecture provides redundancy—every clone serves as a full backup of the project history.
# Initialize a new Git repository (-b sets the initial branch name; requires Git 2.28+)
git init -b main my-project
cd my-project
# Create a file and track it
echo "Hello Git" > README.md
git add README.md
git commit -m "Initial commit"
# View repository status
git status
# => On branch main
# => nothing to commit, working tree clean
Key Principles
Git's object model consists of four fundamental types: blobs store file contents, trees represent directory structures, commits capture snapshots with metadata, and annotated tags attach a name and message to a specific object, usually a commit. Each object is immutable and identified by a SHA-1 hash computed from a short type-and-size header followed by its content. This content-addressing protects data integrity: a corrupted object no longer matches its hash, so the damage can be detected by recomputing the hash, as git fsck does.
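To make the content-addressing concrete, the following Ruby sketch recomputes a blob's object ID with the standard library alone. It assumes a SHA-1 repository and covers only blobs; the helper name git_blob_id is made up for this illustration.

```ruby
require 'digest/sha1'

# Compute the object ID Git assigns to a blob holding `content`.
# Git hashes a header ("blob <size in bytes>" plus a NUL byte)
# followed by the raw file bytes.
def git_blob_id(content)
  data = content.b
  header = "blob #{data.bytesize}\0".b
  Digest::SHA1.hexdigest(header + data)
end

puts git_blob_id("")               # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
puts git_blob_id("test content\n") # d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

The same IDs come back from `git hash-object --stdin` for the same input, which is one way to check the sketch against a real installation.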
The commit object contains a pointer to a tree object (the root directory snapshot), pointers to parent commit(s), author information, committer information, and the commit message. Commits form a directed acyclic graph (DAG) where each commit points to its parent(s). This structure enables Git to efficiently determine relationships between commits and compute differences.
# View commit contents
git cat-file -p HEAD
# => tree 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
# => author Name <email@example.com> 1638360000 -0500
# => committer Name <email@example.com> 1638360000 -0500
# =>
# => Initial commit
# Examine tree object
git cat-file -p 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
# => 100644 blob 9f4d96d5b00d98959ea9960f069585ce42b1349a README.md
Branches in Git are simply pointers to commits. Creating a branch creates a new pointer; switching branches changes which pointer HEAD references. This design makes branching and merging operations fast and lightweight—creating a branch writes a 41-byte file (40-character SHA-1 hash plus newline). The HEAD reference points to the current branch, and the current branch pointer advances with each new commit.
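The 41-byte figure is easy to verify. This sketch writes a loose branch ref into a scratch directory rather than a real repository; write_branch_ref is a hypothetical helper and the object ID is a placeholder, not a real commit.

```ruby
require 'tmpdir'
require 'fileutils'

# Write a loose branch ref as described above: a file under refs/heads/
# containing a 40-character hex object ID followed by a newline.
# `git_dir` is a scratch directory here, not a real .git directory.
def write_branch_ref(git_dir, branch, commit_id)
  path = File.join(git_dir, 'refs', 'heads', branch)
  FileUtils.mkdir_p(File.dirname(path))
  File.binwrite(path, "#{commit_id}\n")
  path
end

Dir.mktmpdir do |git_dir|
  commit_id = 'a' * 40 # placeholder ID
  ref_path = write_branch_ref(git_dir, 'feature-branch', commit_id)
  puts File.size(ref_path) # 41
end
```

Real repositories may also keep refs in a packed-refs file, so a branch does not always have its own file on disk.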
The staging area (index) acts as an intermediate layer between the working directory and repository. Files move through three states: modified (changed in working directory), staged (marked for inclusion in next commit), and committed (stored in repository). This three-state model provides precise control over commit contents. Developers can stage portions of files using interactive staging, creating commits that represent logical units of change rather than all modifications to a file.
Remote repositories are versions of the project hosted elsewhere. The origin remote typically refers to the repository from which the local repository was cloned. Remote-tracking branches (refs/remotes/origin/main) represent the state of branches in remote repositories as of the last fetch or pull operation. Local branches can track remote branches, establishing a relationship for push and pull operations.
Git's merge strategies determine how to combine divergent branch histories. Fast-forward merges occur when the target branch is an ancestor of the source branch—Git simply moves the branch pointer forward. Three-way merges occur when branches have diverged—Git creates a new merge commit with two parents. Rebase operations replay commits from one branch onto another, creating a linear history without merge commits. Each approach has different implications for history structure and collaboration workflows.
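The choice between a fast-forward and a three-way merge comes down to an ancestry check on the commit graph. The sketch below models that check on a toy graph; the single-letter commit IDs and the helper names are invented for illustration, and this is not how Git itself stores or walks history.

```ruby
require 'set'

# True if `ancestor` is reachable from `descendant` by following parent links.
# `history` maps each commit ID to an array of its parent IDs.
def ancestor?(history, ancestor, descendant)
  seen = Set.new
  queue = [descendant]
  until queue.empty?
    id = queue.shift
    return true if id == ancestor
    next unless seen.add?(id) # skip commits already visited

    queue.concat(history.fetch(id))
  end
  false
end

# Classify how merging `source` into `target` would proceed.
def merge_kind(history, target, source)
  return :already_up_to_date if ancestor?(history, source, target)
  return :fast_forward if ancestor?(history, target, source)

  :three_way
end

# Toy history: A <- B, then C and D both build on B.
history = { 'A' => [], 'B' => ['A'], 'C' => ['B'], 'D' => ['B'] }

puts merge_kind(history, 'B', 'C') # fast_forward
puts merge_kind(history, 'D', 'C') # three_way
puts merge_kind(history, 'C', 'B') # already_up_to_date
```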
The reflog tracks HEAD movement and branch updates, maintaining a local history of repository state changes. This reference log enables recovery from mistakes: commits removed from branch histories remain accessible through the reflog until their entries expire (by default 90 days, or 30 days for entries no longer reachable from the current branch tip). The reflog is local to each repository and not shared through push or fetch operations.
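On disk, the reflog for HEAD is a plain text file (.git/logs/HEAD) with one line per update. As a rough sketch, a line can be parsed as follows; it assumes 40-character SHA-1 object IDs, and parse_reflog_line and the sample line are illustrative only.

```ruby
# Parse one line of a reflog file. Each line has the form:
#   <old id> <new id> <name> <email in angle brackets> <unix time> <tz offset> TAB <message>
# Returns nil when the line does not match that layout.
def parse_reflog_line(line)
  pattern = /\A(\h{40}) (\h{40}) (.*) <([^>]*)> (\d+) ([+-]\d{4})\t(.*)\z/
  match = pattern.match(line.chomp)
  return nil unless match

  {
    old_id: match[1],
    new_id: match[2],
    name: match[3],
    email: match[4],
    time: Time.at(match[5].to_i).utc,
    tz: match[6],
    message: match[7]
  }
end

# Sample line with placeholder object IDs.
line = "#{'0' * 40} #{'a' * 40} Name <email@example.com> 1638360000 -0500\tcommit (initial): Initial commit\n"
entry = parse_reflog_line(line)
puts entry[:message] # commit (initial): Initial commit
puts entry[:email]   # email@example.com
```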
Ruby Implementation
Ruby provides several libraries for Git integration, with Rugged and the git gem being the primary options. Rugged is a Ruby binding to libgit2, offering low-level access to Git functionality. The git gem wraps the git command-line tool and provides a higher-level interface suitable for most applications. Ruby on Rails applications commonly use these libraries for deployment automation, code analysis, and repository management.
# Using the git gem for high-level operations
require 'git'
# Clone a repository
repo = Git.clone('https://github.com/user/project.git', 'project-dir')
# Open existing repository
repo = Git.open('/path/to/repo')
# Check status
repo.status.each do |file|
  puts "#{file.path}: #{file.type}"
end
# Create and checkout a branch
repo.branch('feature-branch').checkout
# Stage and commit changes
repo.add(all: true)
repo.commit('Add new feature')
Rugged offers more control and better performance for operations requiring direct object database access. The library exposes Git's internal object model, allowing applications to read and manipulate repository objects directly.
require 'rugged'
# Open repository
repo = Rugged::Repository.new('/path/to/repo')
# Access commits
commit = repo.head.target
puts commit.message
puts commit.author[:name]
puts commit.author[:time]
# Walk commit history
walker = Rugged::Walker.new(repo)
walker.push(commit.oid)
walker.each do |walked|
  # Use a distinct block variable so the head commit above is not shadowed
  puts "#{walked.oid[0..6]} #{walked.message.lines.first}"
end
# Read file contents from specific commit
entry = commit.tree['README.md']
blob = repo.lookup(entry[:oid])
puts blob.content
Git hooks are executable scripts in the .git/hooks directory, so they can be written in Ruby; a hook script can shell out to git or load the git gem. Pre-commit hooks validate changes before committing, pre-push hooks can block pushes to protected branches, and post-receive hooks, which run on the server, trigger deployment or notification workflows.
#!/usr/bin/env ruby
# .git/hooks/pre-commit
# Prevent commits to main branch
current_branch = `git symbolic-ref --short HEAD`.strip
if current_branch == 'main'
  puts "Direct commits to main branch are not allowed"
  exit 1
end
# Run tests before committing
system('bundle exec rspec')
exit_status = $?.exitstatus
if exit_status != 0
  puts "Tests must pass before committing"
  exit exit_status
end
exit 0
Ruby scripts can automate repository management tasks such as bulk operations across multiple repositories, custom merge strategies, or automated code review workflows.
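require 'git'
class RepositoryManager
  def initialize(repo_path)
    @repo = Git.open(repo_path)
  end
  # Delete local branches whose tip is already contained in main.
  def cleanup_merged_branches
    protected_names = ['main', @repo.current_branch]
    merged_branches = @repo.branches.local.select do |branch|
      # Never delete main itself or the branch that is checked out
      next false if protected_names.include?(branch.name)
      # A branch is merged when its tip equals its merge base with main.
      # merge_base returns a list of commit objects, so compare SHAs.
      merge_base = @repo.merge_base('main', branch.name).first
      branch_commit = @repo.revparse(branch.name)
      !merge_base.nil? && merge_base.sha == branch_commit
    end
    merged_branches.each do |branch|
      puts "Deleting merged branch: #{branch.name}"
      @repo.branch(branch.name).delete
    end
  end
  # Search the most recent 1000 commits for authors matching a pattern.
  def find_commits_by_author(author_pattern)
    pattern = Regexp.new(author_pattern, Regexp::IGNORECASE)
    commits = []
    @repo.log(1000).each do |commit|
      next unless commit.author.name =~ pattern
      commits << {
        sha: commit.sha,
        message: commit.message,
        date: commit.date,
        author: commit.author.name
      }
    end
    commits
  end
  # Replay the commits in start_sha..end_sha onto target_branch, oldest first.
  def cherry_pick_range(start_sha, end_sha, target_branch)
    @repo.checkout(target_branch)
    # The log yields newest first and is limited to 30 commits by default.
    commits = []
    @repo.log.between(start_sha, end_sha).each { |commit| commits << commit }
    commits.reverse.each do |commit|
      # Shell out to git: this sketch does not assume the gem wraps cherry-pick.
      if system('git', '-C', @repo.dir.path, 'cherry-pick', commit.sha)
        puts "Cherry-picked: #{commit.sha[0..7]} - #{commit.message.lines.first}"
      else
        puts "Failed to cherry-pick #{commit.sha}"
        # Abort the in-progress cherry-pick to restore a clean state
        system('git', '-C', @repo.dir.path, 'cherry-pick', '--abort')
        break
      end
    end
  end
end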
Practical Examples
Basic workflow operations form the foundation of Git usage. The cycle of modifying files, staging changes, creating commits, and synchronizing with remote repositories repeats throughout development.
# Create a feature branch from main
git checkout main
git pull origin main
git checkout -b feature/user-authentication
# Make changes to files
echo "class User; end" > app/models/user.rb
# Stage changes (a new, untracked file does not show up in plain git diff)
git add app/models/user.rb
# View staged changes
git diff --staged
# => diff --git a/app/models/user.rb b/app/models/user.rb
# => new file mode 100644
# => index 0000000..abcd123
# => --- /dev/null
# => +++ b/app/models/user.rb
# => @@ -0,0 +1 @@
# => +class User; end
# Create commit
git commit -m "Add User model"
# Push to remote
git push origin feature/user-authentication
Interactive staging enables committing portions of changes rather than entire files. This granularity produces commits that represent coherent logical changes.
# Start interactive staging
git add -p app/models/user.rb
# Git presents each hunk with options:
# y - stage this hunk
# n - do not stage this hunk
# s - split into smaller hunks
# e - manually edit the hunk
# q - quit; do not stage this hunk or any remaining ones
# View staged changes
git diff --staged
# Unstage specific files
git reset HEAD app/models/user.rb
Branching strategies determine how teams organize parallel development efforts. Feature branches isolate work on specific features or fixes. Long-running branches like develop and main serve as integration points for completed work.
# Create feature branch
git checkout -b feature/payment-processing
# Work on feature with multiple commits
git commit -m "Add payment gateway integration"
git commit -m "Implement payment validation"
git commit -m "Add payment processing tests"
# Update feature branch with main changes
git checkout main
git pull origin main
git checkout feature/payment-processing
git rebase main
# Resolve conflicts if any, then continue
git add resolved-file.rb
git rebase --continue
# Push rebased branch (force required after rebase)
git push --force-with-lease origin feature/payment-processing
Merge operations combine branch histories. Fast-forward merges move the branch pointer when no divergence exists. Three-way merges create merge commits when branches have diverged.
# Merge feature branch into main
git checkout main
git merge feature/payment-processing
# If merge creates conflicts
git status
# => both modified: app/models/payment.rb
# Edit conflicted files, choosing which changes to keep
# Conflict markers show both versions:
# <<<<<<< HEAD
# main branch version
# =======
# feature branch version
# >>>>>>> feature/payment-processing
# Stage resolved files
git add app/models/payment.rb
# Complete merge
git commit
Stashing saves uncommitted changes temporarily, clearing the working directory without creating commits. This operation enables switching branches with uncommitted work or pulling remote changes into a dirty working directory.
# Save current changes (git stash save is deprecated in favor of push -m)
git stash push -m "Work in progress on authentication"
# List stashes
git stash list
# => stash@{0}: On feature-branch: Work in progress on authentication
# => stash@{1}: WIP on main: Previous stashed changes
# Apply most recent stash
git stash apply
# Apply and remove specific stash
git stash pop stash@{1}
# View stash contents
git stash show -p stash@{0}
# Drop stash without applying
git stash drop stash@{0}
History manipulation through rebase, reset, and commit amending enables cleaning commit history before sharing. Interactive rebase provides control over multiple commits.
# Interactive rebase last 5 commits
git rebase -i HEAD~5
# Editor opens with commit list:
# pick abc1234 Add feature X
# pick def5678 Fix typo
# pick ghi9012 Add feature Y
# pick jkl3456 Fix bug in feature Y
# pick mno7890 Add tests
# Change to:
# pick abc1234 Add feature X
# squash def5678 Fix typo
# pick ghi9012 Add feature Y
# fixup jkl3456 Fix bug in feature Y
# reword mno7890 Add tests
# Save and close editor
# Git combines commits as specified
Common Patterns
Feature Branch Workflow isolates development work on separate branches. Each feature, bug fix, or experiment receives its own branch created from main. Developers complete work on feature branches, then merge back to main through pull requests or direct merges. This pattern keeps main stable and deployable while enabling parallel development.
# Feature branch lifecycle
git checkout main
git pull origin main
git checkout -b feature/new-dashboard
# Develop feature
git add .
git commit -m "Implement dashboard layout"
git commit -m "Add dashboard data loading"
git commit -m "Style dashboard components"
# Prepare for merge
git checkout main
git pull origin main
git checkout feature/new-dashboard
git rebase main
# Merge to main
git checkout main
git merge --no-ff feature/new-dashboard
git push origin main
# Clean up
git branch -d feature/new-dashboard
git push origin --delete feature/new-dashboard
Git Flow extends feature branches with additional branch types for releases and hotfixes. The main branches are main (production-ready code) and develop (integration branch for features). Supporting branches include feature branches (new features), release branches (release preparation), and hotfix branches (production fixes).
# Start new feature
git checkout develop
git checkout -b feature/user-notifications
# Complete feature
git checkout develop
git merge --no-ff feature/user-notifications
git branch -d feature/user-notifications
# Create release branch
git checkout develop
git checkout -b release/1.2.0
# Perform release preparation (version bumps, changelog updates)
git commit -am "Bump version to 1.2.0"
# Finish release
git checkout main
git merge --no-ff release/1.2.0
git tag -a 1.2.0 -m "Release version 1.2.0"
git checkout develop
git merge --no-ff release/1.2.0
git branch -d release/1.2.0
# Create hotfix
git checkout main
git checkout -b hotfix/security-patch
git commit -am "Fix security vulnerability"
git checkout main
git merge --no-ff hotfix/security-patch
git tag -a 1.2.1 -m "Hotfix version 1.2.1"
git checkout develop
git merge --no-ff hotfix/security-patch
git branch -d hotfix/security-patch
Trunk-Based Development maintains a single main branch where developers commit frequently. Short-lived feature branches (less than one day) may exist, but the focus is on continuous integration to main. Feature flags control incomplete feature visibility in production. This pattern requires strong automated testing and continuous integration.
Forking Workflow is common in open-source projects. Contributors fork the main repository, make changes in their fork, then submit pull requests to the upstream repository. This pattern maintains clear separation between the authoritative repository and contributor copies.
# Fork and clone
git clone https://github.com/contributor/project.git
cd project
git remote add upstream https://github.com/original/project.git
# Create feature branch
git checkout -b feature/contribution
# Make changes and commit
git commit -am "Add new feature"
# Update with upstream changes
git fetch upstream
git rebase upstream/main
# Push to fork
git push origin feature/contribution
# Create pull request through GitHub/GitLab interface
Rebase vs Merge strategies affect history structure. Rebasing creates linear history by replaying commits, while merging preserves the actual development timeline with merge commits. Teams choose based on preference for history clarity versus accuracy.
Common Pitfalls
Detached HEAD state occurs when HEAD points directly to a commit instead of a branch. This happens when checking out a specific commit by SHA, tag, or relative reference. Commits made in detached HEAD state belong to no branch; after switching to another branch they are reachable only through the reflog and are eventually garbage-collected unless a branch or tag is created for them.
# Enter detached HEAD state
git checkout abc1234
# Git warns:
# You are in 'detached HEAD' state...
# Make commits (dangerous)
git commit -m "Some changes"
# Save work before switching branches
git checkout -b temp-branch
# Or note the commit SHA and cherry-pick later
# Alternative: create branch first
git checkout -b exploration abc1234
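Whether HEAD is detached can be read straight from the .git/HEAD file: it holds either a symbolic reference to a branch or a bare object ID. The following is a minimal sketch of that distinction, assuming SHA-1 object IDs; describe_head is a hypothetical helper that takes the file's contents as a string.

```ruby
# Interpret the contents of a .git/HEAD file.
# A symbolic HEAD looks like "ref: refs/heads/main";
# a detached HEAD is a bare 40-character hex object ID.
def describe_head(head_contents)
  text = head_contents.strip
  if (match = text.match(%r{\Aref: refs/heads/(.+)\z}))
    { detached: false, branch: match[1] }
  elsif text.match?(/\A\h{40}\z/)
    { detached: true, commit: text }
  else
    raise ArgumentError, "unrecognized HEAD contents: #{text.inspect}"
  end
end

puts describe_head("ref: refs/heads/main\n")[:branch] # main
puts describe_head("#{'c' * 40}\n")[:detached]        # true
```

A script could call this with File.read('.git/HEAD') before committing and warn when the result is detached.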
Merge conflicts require manual resolution when Git cannot automatically combine changes. Conflicts occur when the same lines are modified differently in both branches, or when one branch modifies a file deleted in the other branch.
# After merge conflict
git status
# => both modified: app/models/user.rb
# File contains conflict markers:
# <<<<<<< HEAD
# def full_name
# "#{first_name} #{last_name}"
# end
# =======
# def full_name
# [first_name, last_name].compact.join(' ')
# end
# >>>>>>> feature-branch
# Common mistakes:
# - Committing with conflict markers still present
# - Choosing wrong version of conflicting changes
# - Not testing after resolving conflicts
# Proper resolution:
# 1. Examine both versions
# 2. Decide on correct implementation
# 3. Remove conflict markers
# 4. Test the code
# 5. Stage and commit
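The first mistake in the list, committing with markers still present, is straightforward to catch mechanically. This sketch scans text for lines that look like conflict markers; it is a heuristic (a line of seven equals signs can appear legitimately in some files), and conflict_marker_lines is a made-up helper name.

```ruby
# Return [line_number, line] pairs for lines that look like leftover
# conflict markers: seven '<' or '>' followed by a space, or exactly
# seven '=' on their own line.
def conflict_marker_lines(text)
  pattern = /\A(?:<{7} |={7}$|>{7} )/
  markers = []
  text.each_line.with_index(1) do |line, number|
    markers << [number, line.chomp] if line.match?(pattern)
  end
  markers
end

sample = <<~'TEXT'
  def full_name
  <<<<<<< HEAD
    "#{first_name} #{last_name}"
  =======
    [first_name, last_name].compact.join(' ')
  >>>>>>> feature-branch
  end
TEXT

conflict_marker_lines(sample).each { |number, line| puts "#{number}: #{line}" }
# 2: <<<<<<< HEAD
# 4: =======
# 6: >>>>>>> feature-branch
```

Run from a pre-commit hook over the staged files, a non-empty result would be a reason to stop the commit.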
Force pushing overwrites remote branch history, potentially destroying other developers' work. This operation is necessary after rebasing or amending commits that were already pushed, but requires coordination with team members.
# Dangerous: unconditional force push
git push --force origin feature-branch
# Safer: force push with lease
git push --force-with-lease origin feature-branch
# Fails if the remote branch has moved since your last fetch,
# i.e. it no longer matches your remote-tracking ref
# Communicate with team before force pushing
# Ensure no one else is working on the branch
Lost commits can occur through various operations: hard reset, branch deletion, rebasing, or amending. The reflog provides recovery options for recent operations.
# Accidentally reset branch
git reset --hard HEAD~5
# Recover using reflog
git reflog
# => abc1234 HEAD@{0}: reset: moving to HEAD~5
# => def5678 HEAD@{1}: commit: Important changes
# => ghi9012 HEAD@{2}: commit: More work
# Restore to previous state
git reset --hard HEAD@{1}
# Recovery window is limited (by default 90 days; 30 days for unreachable entries)
Large file commits degrade repository performance. Binary files and generated artifacts do not compress well and increase clone times. Once committed, large files remain in history even if they are later removed from the current version.
# Prevent large file commits with pre-commit hook
# Check file sizes before allowing commit
# Add large files to .gitignore
# Remove large file from history (destructive; rewrites commit IDs)
git filter-branch --tree-filter 'rm -f path/to/large-file' HEAD
# The Git documentation recommends git filter-repo over filter-branch;
# BFG Repo Cleaner is another faster alternative
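The size check mentioned for the pre-commit hook could look roughly like this in Ruby. The sketch keeps the check itself separate from Git so it can be exercised on its own; oversized_files and the 1 KiB limit in the demo are illustrative choices, and it measures files in the working directory rather than the staged blobs.

```ruby
require 'tmpdir'

# Return the paths whose size on disk exceeds `limit_bytes`.
# Paths that do not exist as regular files (for example, staged
# deletions) are skipped.
def oversized_files(paths, limit_bytes)
  paths.select { |path| File.file?(path) && File.size(path) > limit_bytes }
end

# In a real hook, `paths` would come from the staged file list, e.g.
#   `git diff --cached --name-only --diff-filter=ACM`.split("\n")
# and the hook would exit non-zero when the result is non-empty.
Dir.mktmpdir do |dir|
  small = File.join(dir, 'notes.txt')
  large = File.join(dir, 'video.bin')
  File.binwrite(small, 'x' * 10)
  File.binwrite(large, 'x' * 2048)

  too_big = oversized_files([small, large], 1024)
  puts too_big.map { |path| File.basename(path) }.inspect # ["video.bin"]
end
```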
Committing sensitive data (passwords, API keys, private keys) exposes secrets in repository history. Removing sensitive data requires rewriting history and rotating compromised credentials.
# Prevent sensitive data commits
# Use .gitignore for config files with secrets
# Use environment variables or secret management
# Scan commits before pushing
# Example .gitignore entries:
# .env
# config/secrets.yml
# *.pem
# *.key
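As a rough illustration of scanning before a push, the sketch below checks text against a few regular expressions. The patterns are examples chosen for this sketch, not a complete or authoritative rule set, and find_secrets is a hypothetical helper; dedicated scanners cover far more cases.

```ruby
# Scan text line by line and report lines that match a known secret shape.
# The patterns below are illustrative assumptions only.
def find_secrets(text)
  patterns = {
    aws_access_key_id: /\bAKIA[0-9A-Z]{16}\b/,
    private_key_header: /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/,
    password_assignment: /\bpassword\s*[:=]\s*['"][^'"]+['"]/i
  }
  findings = []
  text.each_line.with_index(1) do |line, number|
    patterns.each do |kind, pattern|
      findings << { line: number, kind: kind } if line.match?(pattern)
    end
  end
  findings
end

sample = <<~'TEXT'
  aws_key = AKIAIOSFODNN7EXAMPLE
  timeout = 30
  password: "hunter2"
TEXT

find_secrets(sample).each { |finding| puts "line #{finding[:line]}: #{finding[:kind]}" }
# line 1: aws_access_key_id
# line 3: password_assignment
```

A match means the credential should be treated as exposed and rotated, since removing it from a later commit does not remove it from history.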
Incorrect remote URLs cause push and fetch operations to fail or target wrong repositories. SSH vs HTTPS URL differences affect authentication methods.
# View remote URLs
git remote -v
# => origin https://github.com/user/repo.git (fetch)
# => origin https://github.com/user/repo.git (push)
# Change remote URL
git remote set-url origin git@github.com:user/repo.git
# Add additional remote
git remote add upstream https://github.com/original/repo.git
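The two URL styles can be told apart, and converted, with a little string handling. This sketch assumes the host accepts SSH logins as the git user, which holds for the large hosting services but not for every server; both helper names are invented for the example.

```ruby
# Classify a remote URL by the transport it implies.
# The scp-like form "user@host:path" has no scheme, so it is matched separately.
def remote_transport(url)
  case url
  when %r{\Ahttps://} then :https
  when %r{\Assh://}, %r{\A[^/@\s]+@[^/:\s]+:} then :ssh
  else :other
  end
end

# Rewrite an HTTPS remote URL into the scp-like SSH form.
# Assumption: the server uses "git" as its SSH user.
def https_to_ssh(url)
  match = url.match(%r{\Ahttps://([^/]+)/(.+)\z})
  raise ArgumentError, "not an HTTPS URL: #{url}" unless match

  "git@#{match[1]}:#{match[2]}"
end

puts remote_transport('https://github.com/user/repo.git') # https
puts remote_transport('git@github.com:user/repo.git')     # ssh
puts https_to_ssh('https://github.com/user/repo.git')     # git@github.com:user/repo.git
```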
Submodule pitfalls include forgetting to update submodules after cloning, committing wrong submodule pointers, and merge conflicts in submodule references.
# Clone repository with submodules (--recursive is an older spelling of this flag)
git clone --recurse-submodules https://github.com/user/project.git
# Update submodules in existing clone
git submodule update --init --recursive
# Common mistake: working inside submodule
# Changes in submodule directory are not tracked by parent repository
# Must commit in submodule, then commit submodule pointer in parent
Tools & Ecosystem
GitHub, GitLab, and Bitbucket provide Git repository hosting with additional collaboration features: pull requests, issue tracking, continuous integration, and code review workflows. These platforms extend Git with web interfaces, permission management, and team coordination tools.
Git hooks automate repository workflows by executing scripts at specific points in Git operations. Client-side hooks run on developer machines (pre-commit, pre-push), while server-side hooks run on the repository server (pre-receive, post-receive, update).
#!/bin/bash
# .git/hooks/pre-push (example): ask for confirmation when pushing while on main
protected_branch='main'
# --short yields the full branch name, including any slashes
current_branch=$(git symbolic-ref --short HEAD 2>/dev/null)
if [ "$current_branch" = "$protected_branch" ]; then
    read -p "Pushing to main. Are you sure? [y/N] " -n 1 -r < /dev/tty
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi
exit 0
Git Large File Storage (LFS) handles large binary files by storing pointers in the repository while maintaining actual file contents on a separate server. This approach keeps repository size manageable for projects with large assets.
# Install Git LFS
git lfs install
# Track file types
git lfs track "*.psd"
git lfs track "*.mp4"
# Add .gitattributes to repository
git add .gitattributes
# Large files are now handled by LFS
git add design.psd
git commit -m "Add design file"
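The pointer that LFS commits in place of the real file is a small text file of key/value lines. As a sketch of what the repository actually stores, the following parses such a pointer; parse_lfs_pointer is a hypothetical helper, and the sample values are placeholders in the pointer layout.

```ruby
# Parse the text of a Git LFS pointer file into a hash.
# Returns nil when the text does not look like an LFS pointer.
def parse_lfs_pointer(text)
  pairs = text.each_line.map { |line| line.chomp.split(' ', 2) }
  fields = pairs.select { |pair| pair.size == 2 }.to_h
  version = fields['version']
  return nil unless version && version.start_with?('https://git-lfs.github.com/spec/')
  return nil unless fields['oid'] && fields['size']

  algorithm, digest = fields['oid'].split(':', 2)
  { version: version, algorithm: algorithm, digest: digest, size: Integer(fields['size']) }
end

pointer_text = <<~TEXT
  version https://git-lfs.github.com/spec/v1
  oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
  size 12345
TEXT

pointer = parse_lfs_pointer(pointer_text)
puts pointer[:algorithm] # sha256
puts pointer[:size]      # 12345
```

Seeing pointer text like this where a binary file was expected usually means the LFS content has not been downloaded in that clone.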
Git GUI clients provide graphical interfaces for Git operations: GitKraken, Sourcetree, GitHub Desktop, Tower, and IDE integrations. These tools visualize repository history, simplify complex operations, and provide merge conflict resolution interfaces.
Continuous Integration systems integrate with Git repositories to run automated tests, builds, and deployments on commits and pull requests. Services like GitHub Actions, GitLab CI, Jenkins, and CircleCI monitor repositories and execute configured workflows.
Git aliases create shortcuts for frequently used commands, improving workflow efficiency.
# Configure aliases
git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.st status
git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'
git config --global alias.visual 'log --oneline --graph --decorate --all'
# Use aliases
git co main
git br feature/new-feature
git visual
Reference
Core Commands
| Command | Description | Common Usage |
|---|---|---|
| git init | Initialize new repository | Create new project under version control |
| git clone | Copy remote repository | Start working on existing project |
| git add | Stage changes | Prepare files for commit |
| git commit | Record changes | Save snapshot to repository |
| git status | Show working tree status | Check current state |
| git log | Display commit history | Review project timeline |
| git diff | Show changes between commits | Review modifications |
| git branch | List, create, delete branches | Manage development lines |
| git checkout | Switch branches or restore files | Change working context |
| git merge | Combine branch histories | Integrate completed work |
| git rebase | Reapply commits on new base | Linearize history |
| git push | Upload to remote repository | Share commits |
| git pull | Fetch and merge from remote | Update local repository |
| git fetch | Download from remote | Update remote-tracking branches |
| git reset | Reset current HEAD | Undo commits |
| git revert | Create commit that undoes changes | Safely undo published commits |
| git stash | Save uncommitted changes | Temporarily store work |
| git tag | Mark specific commits | Label releases |
Configuration Scopes
| Scope | File Location | Command Flag | Use Case |
|---|---|---|---|
| System | /etc/gitconfig | --system | All users on machine |
| Global | ~/.gitconfig | --global | All repositories for user |
| Local | .git/config | --local | Specific repository |
| Worktree | .git/config.worktree | --worktree | Specific working tree |
Object Types
| Type | Purpose | Content |
|---|---|---|
| blob | File contents | Raw file data |
| tree | Directory structure | File and directory references |
| commit | Snapshot | Tree pointer, parent commits, metadata |
| tag | Named reference | Commit pointer, annotation |
Reset Modes
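| Mode | Working Directory | Staging Area | HEAD (branch tip) |
|---|---|---|---|
| --soft | Unchanged | Unchanged | Moved to target commit |
| --mixed (default) | Unchanged | Reset to target commit | Moved to target commit |
| --hard | Reset to target commit | Reset to target commit | Moved to target commit |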
Merge Strategies
| Strategy | Behavior | Use Case |
|---|---|---|
| fast-forward | Move branch pointer forward | No divergence exists |
| ort | Three-way merge with single merge commit; the default since Git 2.34, replacing recursive | Standard merge for two branches |
| octopus | Merge multiple branches | Integrating several features |
| ours | Keep current branch version | Discard incoming changes |
| subtree | Merge project subdirectory | Managing dependencies |
Branch Naming Conventions
| Pattern | Purpose | Example |
|---|---|---|
| feature/ | New functionality | feature/user-authentication |
| bugfix/ | Bug corrections | bugfix/login-validation |
| hotfix/ | Production fixes | hotfix/security-patch |
| release/ | Release preparation | release/1.2.0 |
| experiment/ | Experimental work | experiment/new-algorithm |
Common Reflog References
| Reference | Meaning |
|---|---|
| HEAD@{0} | Current position |
| HEAD@{1} | Previous position |
| HEAD@{2.hours.ago} | Position two hours ago |
| HEAD@{yesterday} | Position yesterday |
| main@{one.week.ago} | Branch position one week ago |
Configuration Options
| Setting | Purpose | Example Value |
|---|---|---|
| user.name | Author name | John Doe |
| user.email | Author email | john@example.com |
| core.editor | Default text editor | vim |
| core.autocrlf | Line ending handling | true, false, input |
| push.default | Default push behavior | simple, matching, current |
| pull.rebase | Default pull behavior | true, false |
| merge.conflictstyle | Conflict marker style | merge, diff3 |
| branch.autosetupmerge | Auto-track on checkout | always, true, false |