GC Sweep
The sweep phase is where the garbage collector finally reclaims memory—freeing objects that were identified as unused during the mark phase. Go's sweep implementation is remarkably efficient, using clever bit manipulation and lazy reclamation to minimize overhead.
This article explores the GC Sweep implementation in Go 1.23.12, primarily located in src/runtime/mgcsweep.go.
The Core Idea: Lazy Sweep
Unlike traditional garbage collectors that immediately free memory after marking, Go defers the actual freeing until memory is needed. This lazy sweep strategy provides several benefits:
- Amortized Cost: Sweep work is spread across allocation operations
- Cache Efficiency: Recently freed memory stays hot in CPU caches
- No Dedicated Phase: No separate "sweep phase" pause—the world keeps running
The sweep phase itself is surprisingly fast. Its main work is flipping bits and updating data structures, not actually zeroing memory or returning pages to the OS.
How Sweep Relies on Mark
The sweep phase depends entirely on the mark phase's output. During marking, the GC does not tag objects as "live" or "dead" one by one. Instead, it sets bits in each span's mark bitmap (gcmarkBits). When marking ends, the runtime advances the heap-wide sweep generation (mheap_.sweepgen) by two, and each span later adopts its mark bitmap as its new allocation bitmap (allocBits) when it is swept.
The generation bump is a single O(1) step: no explicit cleanup or span traversal occurs during mark termination. A span whose own sweepgen lags the heap's by two is thereby flagged to the allocator as ready for sweeping.
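The generation check and the handoff from mark bits to allocation bits can be pictured with a toy model. This is a minimal sketch, not the runtime's code: the field names allocBits, gcmarkBits, and sweepgen mirror the runtime's mspan, but toySpan, toyHeap, the single-word bitmaps, and the helper methods are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// toySpan is a simplified stand-in for the runtime's mspan.
// A single uint64 per bitmap is an assumption made for brevity.
type toySpan struct {
	allocBits  uint64 // bit i set: slot i is considered allocated
	gcmarkBits uint64 // bit i set: slot i was reached during marking
	sweepgen   uint32 // heap generation at which this span was last swept
}

// toyHeap holds the heap-wide sweep generation.
type toyHeap struct {
	sweepgen uint32
}

// finishMark models the end of marking: one increment makes every span
// whose sweepgen lags by two "need sweeping", with no per-span work.
func (h *toyHeap) finishMark() { h.sweepgen += 2 }

// needsSweep reports whether the span has not yet been swept this cycle.
func (h *toyHeap) needsSweep(s *toySpan) bool { return s.sweepgen == h.sweepgen-2 }

// sweep adopts the mark bits as the new allocation bits and returns
// the number of slots that were allocated but not marked.
func (h *toyHeap) sweep(s *toySpan) int {
	freed := bits.OnesCount64(s.allocBits &^ s.gcmarkBits)
	s.allocBits = s.gcmarkBits // survivors are exactly the marked slots
	s.gcmarkBits = 0           // fresh mark bits for the next cycle
	s.sweepgen = h.sweepgen
	return freed
}

func main() {
	h := &toyHeap{sweepgen: 2}
	s := &toySpan{allocBits: 0b1111_0000, sweepgen: 2}
	s.gcmarkBits = 0b1010_0000 // pretend marking reached two of four objects

	h.finishMark()
	pending := h.needsSweep(s)
	freed := h.sweep(s)
	fmt.Println("needed sweep:", pending, "freed:", freed, "still pending:", h.needsSweep(s))
}
```

Note that nothing in sweep touches the freed objects' memory; only the bitmaps and the generation counter change, which is why the per-span cost stays small.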
Span Classification After Sweep
After the mark phase completes, each memory span (mspan) falls into one of three states. The sweep operation categorizes spans based on their size class (small vs large objects) and liveness:
Small Objects (sizeclass != 0)
For spans containing small objects (up to 32 KB), sweep performs two operations:
- Statistics Update:
  - If any objects within the span were freed, sets s.needzero = 1 to defer zeroing until allocation time
  - Updates the runtime's memstats for monitoring and pacer calculations
- Span Placement:
  - All objects dead: Return span to global heap via mheap_.freeSpan(s)
  - All objects live: Push to mheap_.central[spc].mcentral.fullSwept(sweepgen).push(s)
  - Partial liveness: Push to mheap_.central[spc].mcentral.partialSwept(sweepgen).push(s)
Large Objects (sizeclass == 0, > 32KB)
Large objects occupy entire spans, simplifying sweep logic:
- Object dead: mheap_.freeSpan(s), which immediately returns the span to the heap
- Object live: mheap_.central[spc].mcentral.fullSwept(sweepgen).push(s), which marks the span as fully live
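The placement rules for both small and large spans reduce to a three-way decision on the number of surviving objects. The sketch below is illustrative only: placeSwept and the placement type are hypothetical names, and a single uint64 is assumed to hold the span's mark bits. A large-object span is simply the case with one slot, so it can never land in the partial list.

```go
package main

import (
	"fmt"
	"math/bits"
)

// placement names where a span goes after it has been swept.
type placement string

const (
	toHeap         placement = "freeSpan"     // no live objects: span returns to the heap
	toFullSwept    placement = "fullSwept"    // every slot is live: nothing to allocate from
	toPartialSwept placement = "partialSwept" // some free slots remain for the allocator
)

// placeSwept picks a destination from the span's mark bits.
// markBits has bit i set when slot i survived marking;
// nelems is the number of slots in the span (1 to 64 in this sketch).
func placeSwept(markBits uint64, nelems int) placement {
	live := bits.OnesCount64(markBits)
	switch {
	case live == 0:
		return toHeap
	case live == nelems:
		return toFullSwept
	default:
		return toPartialSwept
	}
}

func main() {
	fmt.Println(placeSwept(0b0000, 4)) // small-object span, all dead
	fmt.Println(placeSwept(0b1111, 4)) // small-object span, all live
	fmt.Println(placeSwept(0b0101, 4)) // small-object span, mixed
	fmt.Println(placeSwept(0b1, 1))    // large-object span, object still live
	fmt.Println(placeSwept(0b0, 1))    // large-object span, object dead
}
```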
The Lazy Sweep Mechanism
The key insight is that sweep doesn't actually free memory—it merely changes bookkeeping. The actual freeing happens incrementally during allocation:
- Allocator requests an mspan from a central cache (mcentral)
- It first looks for a span in the already-swept partial list
- If none is available, it takes a span from the unswept partial or full lists and sweeps it on demand
- The swept span is then used for allocations
This design means sweep work is naturally distributed across the application's allocation pattern.
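A rough model of that allocation path is sketched below. It is a simplification under stated assumptions: the central and span types, the pop helper, and the free/dead counters are invented for illustration, and details of the real allocator (such as limits on how many spans it tries before growing the heap) are left out. The list names follow the swept and unswept, partial and full sets that mcentral maintains.

```go
package main

import "fmt"

// span is a toy span: free counts known-free slots, dead counts slots
// that a sweep would reclaim but that are not yet usable.
type span struct {
	id   int
	free int
	dead int
}

// central is a toy stand-in for one mcentral with its four span sets.
type central struct {
	partialSwept, partialUnswept, fullUnswept, fullSwept []*span
	sweeps int // number of on-demand sweeps performed
}

// pop removes and returns the last span of a list, or nil if it is empty.
func pop(list *[]*span) *span {
	l := *list
	if len(l) == 0 {
		return nil
	}
	s := l[len(l)-1]
	*list = l[:len(l)-1]
	return s
}

// sweep makes a span's dead slots available for allocation.
func (c *central) sweep(s *span) {
	s.free += s.dead
	s.dead = 0
	c.sweeps++
}

// cacheSpan returns a span with free space, sweeping only when it must.
// A nil result means the caller would have to grow the heap.
func (c *central) cacheSpan() *span {
	if s := pop(&c.partialSwept); s != nil {
		return s // already swept, no extra work
	}
	if s := pop(&c.partialUnswept); s != nil {
		c.sweep(s)
		return s
	}
	for s := pop(&c.fullUnswept); s != nil; s = pop(&c.fullUnswept) {
		c.sweep(s)
		if s.free > 0 {
			return s
		}
		c.fullSwept = append(c.fullSwept, s) // still full after sweeping
	}
	return nil
}

func main() {
	c := &central{
		partialSwept: []*span{{id: 1, free: 2}},
		fullUnswept:  []*span{{id: 2, dead: 3}},
	}
	first := c.cacheSpan()  // served from the swept list
	second := c.cacheSpan() // forces an on-demand sweep of span 2
	fmt.Println(first.id, second.id, c.sweeps)
}
```

The second request pays for a sweep while the first does not, which is the source of the allocation-time variability that lazy sweeping introduces.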
Complexity from Practical Requirements
While the core sweep logic is simple, the implementation includes substantial additional complexity to handle real-world scenarios:
- Arena Management: Interaction with Go's memory arenas for specialized allocation patterns
- Special Objects: Handling of profiler objects and finalizers (objects with cleanup code)
- Tracing Integration: Hooks for go tool trace to visualize sweep behavior
- Debugging Infrastructure: Internal consistency checks and zombie detection
- Bit Manipulation: Highly optimized operations for managing allocation bitmasks
Each of these adds correctness and observability at the cost of code complexity.
Ordering: Mark vs Sweep
A critical invariant in Go's GC is the strict ordering between cycles:
Two key ordering rules:
- Mark N precedes Sweep N: Obvious, since sweep consumes mark's bitflip output
- Sweep N strictly precedes Mark N+1: Enforced by finishsweep_m() at the start of each new GC cycle
The second rule exists because mark operations would overwrite sweep state if they overlapped. In practice:
- Most cycles: sweep finishes long before the next GC trigger (no impact)
- Allocation bursts: the next GC might wait for sweep, potentially extending the first STW pause
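The second ordering rule can be illustrated with a toy heap that drains any leftover sweep work before a new cycle begins. This is a sketch under assumptions: toySpan, toyHeap, sweepOne, and startCycle are hypothetical names, and the drain loop only stands in for the role that finishsweep_m() plays in the runtime.

```go
package main

import "fmt"

// toySpan records the generation at which it was last swept.
type toySpan struct{ sweepgen uint32 }

// toyHeap tracks the heap-wide sweep generation and all spans.
type toyHeap struct {
	sweepgen uint32
	spans    []*toySpan
}

// sweepOne sweeps a single unswept span and reports whether it found one.
func (h *toyHeap) sweepOne() bool {
	for _, s := range h.spans {
		if s.sweepgen == h.sweepgen-2 {
			s.sweepgen = h.sweepgen
			return true
		}
	}
	return false
}

// startCycle models the start of cycle N+1: finish sweep N first,
// then mark, then advance the generation so sweep N+1 can begin.
// It returns how many spans had to be swept before marking could start.
func (h *toyHeap) startCycle() int {
	drained := 0
	for h.sweepOne() {
		drained++
	}
	// ... marking for the new cycle would run here ...
	h.sweepgen += 2
	return drained
}

func main() {
	h := &toyHeap{
		sweepgen: 4,
		spans:    []*toySpan{{sweepgen: 4}, {sweepgen: 2}, {sweepgen: 2}},
	}
	fmt.Println("spans swept before marking:", h.startCycle())
	fmt.Println("heap generation now:", h.sweepgen)
}
```

The larger the return value of startCycle, the more leftover work the new cycle has to absorb, which corresponds to the allocation-burst case described above.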
Advanced Topics
The sweep phase has several nuanced behaviors that affect application performance:
Sweep Termination STW
The transition between sweep completion and next GC mark requires a brief stop-the-world pause. In most cases this is negligible, but in extreme scenarios with massive heap growth, it can contribute to latency spikes.
Lazy Sweep and Allocation Latency
Because sweep occurs during allocation, a request might encounter a span that needs sweeping first. This adds small, unpredictable delays to allocation operations.
Sweep Assist
Similar to mark assist, the GC may force allocating goroutines to perform sweep work themselves if the background sweeper falls behind allocation pressure.
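One way to picture sweep assist is as a debt that grows in proportion to allocation. The model below is an assumption-laden sketch, not the runtime's accounting: sweepPacer, its fields, and the linear pages-per-byte rule are hypothetical, chosen only to show how an allocator ends up sweeping when background progress is insufficient and sweeping nothing when it is ahead.

```go
package main

import "fmt"

// sweepPacer is a toy model of proportional sweeping.
// heapBudget must be greater than zero.
type sweepPacer struct {
	totalPages uint64 // pages that needed sweeping when sweep began
	heapBudget uint64 // bytes that may be allocated before the next GC starts
	allocated  uint64 // bytes allocated since sweep began
	swept      uint64 // pages swept so far, by anyone
}

// background records pages swept by the background sweeper.
func (p *sweepPacer) background(pages uint64) {
	p.swept += pages
	if p.swept > p.totalPages {
		p.swept = p.totalPages
	}
}

// assist is called before allocating allocBytes. It returns the number of
// pages the allocating goroutine must sweep so that sweeping stays on pace
// to finish by the time heapBudget bytes have been allocated.
func (p *sweepPacer) assist(allocBytes uint64) uint64 {
	p.allocated += allocBytes
	// Pages that should be swept by now, rounded up.
	target := (p.allocated*p.totalPages + p.heapBudget - 1) / p.heapBudget
	if target > p.totalPages {
		target = p.totalPages
	}
	if target <= p.swept {
		return 0 // background sweeping is ahead; no assist needed
	}
	owed := target - p.swept
	p.swept = target
	return owed
}

func main() {
	p := &sweepPacer{totalPages: 100, heapBudget: 1000}
	fmt.Println("assist pages:", p.assist(100)) // sweeper idle, allocator pays
	p.background(50)                            // background sweeper catches up
	fmt.Println("assist pages:", p.assist(100)) // ahead of pace, nothing owed
	fmt.Println("assist pages:", p.assist(800)) // burst uses the rest of the budget
}
```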
Manual GC Interaction
When runtime.GC() is called manually, it blocks the caller until a full collection, including the sweep, has completed. This can cause longer stalls than a normal background cycle.
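The blocking behavior is easy to observe from user code. The sketch below uses runtime.GC, runtime.ReadMemStats, and the MemStats fields NumGC and PauseTotalNs from the standard runtime package; the helper names forceGC and makeGarbage and the allocation sizes are arbitrary choices for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sink keeps the compiler from optimizing the allocations away.
var sink []byte

// makeGarbage allocates buffers that become unreachable right away.
func makeGarbage() {
	for i := 0; i < 1024; i++ {
		sink = make([]byte, 64<<10)
	}
	sink = nil
}

// forceGC runs a collection and reports how many GC cycles completed
// during the call and how long the caller was blocked.
func forceGC() (cycles uint32, blocked time.Duration) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	start := time.Now()
	runtime.GC() // returns only after the collection is complete
	blocked = time.Since(start)

	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC, blocked
}

func main() {
	makeGarbage()
	cycles, blocked := forceGC()
	fmt.Println("cycles completed during call:", cycles)
	fmt.Println("caller blocked for:", blocked)
}
```

The time the caller spends blocked is generally much longer than the stop-the-world pauses reported in MemStats.PauseTotalNs, because it covers the whole cycle rather than only the pauses.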
Summary
Go's sweep phase demonstrates the power of lazy reclamation:
- Fast: Bit manipulation and deferred zeroing minimize immediate work
- Concurrent: Swept spans are available to allocators immediately
- Incremental: Sweep cost is spread across allocation operations
- Adaptive: Allocating goroutines take on sweep work when the background sweeper falls behind allocation pressure
Understanding sweep is essential for diagnosing memory-related performance issues, particularly when tuning GOGC or investigating allocation latency in latency-sensitive applications.