Allocation and GC¶
Memory allocation is the raison d'être of garbage collection—without allocation, there's nothing to collect. But allocation and GC are deeply intertwined in Go's runtime. Every allocation potentially triggers GC, updates GC statistics, and may force the allocating goroutine to assist with marking.
This article explores the tight coupling between allocation and GC in Go's runtime.
The Allocator-GC Feedback Loop¶
Go's allocator is not just a passive memory dispenser—it actively participates in GC:
- Trigger detection: Checks if heap size warrants starting a new GC cycle
- Pacer updates: Reports allocation rate and heap usage to GC controller
- Mark assist: Forces allocating goroutines to perform marking work when GC falls behind
This creates a feedback loop where allocation influences GC, and GC influences allocation cost.
Triggering GC: When Allocation Starts a Cycle¶
Trigger Conditions¶
The allocator checks whether to start a new GC cycle at strategic points:
When does shouldhelpgc() return true?
| Allocation Type | Check Frequency |
|---|---|
| Tiny alloc (≤ 16 bytes) | Only when cache miss → refill from mcentral |
| Small alloc (17B - 32KB) | Only when cache miss → refill from mcentral |
| Large alloc (> 32KB) | Every allocation |
Small objects cached in per-P cache don't check because the overhead would dominate the allocation cost. Large objects always check because allocating multiple megabytes without checking could cause runaway heap growth.
Trigger Point Calculation¶
The gcTrigger.test() call checks if current heap usage exceeds the trigger point calculated by the GC Pacer:
When heap usage ≥ trigger point, a new GC cycle begins immediately.
Mark Assist: Forcing Allocators to Mark¶
The Credit System¶
Go's GC implements a credit-based system for mark assist:
// Each goroutine maintains assist credit in g.gcAssistBytes:
// positive = surplus, negative = debt.
func mallocgc(size uintptr, ...) unsafe.Pointer {
    // Deduct the allocation size from this goroutine's credit
    assistG := deductAssistCredit(size)
    // If in debt, the goroutine must work it off by marking
    if assistG != nil && assistG.gcAssistBytes < 0 {
        gcAssistAlloc(assistG)
    }
    // Then actually allocate
    return allocate(size)
}
How it works:
- Initial credit: Each goroutine starts with zero credit
- Earn credit: Assist marking adds positive credit (mark more than allocated)
- Spend credit: Allocations deduct from credit (allocate more than marked)
- Go into debt: If credit goes negative, goroutine must assist before returning
Assist Execution¶
When a goroutine has negative credit, gcAssistAlloc forces it to perform marking work:
Key characteristic: Assist happens before allocation returns, creating strict backpressure:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Allocate │───▶│ Assist │───▶│ Return │
│ Requested │ │ Mark │ │ Pointer │
└─────────────┘ └─────────────┘ └─────────────┘
Heap usage Mark work Heap grows
doesn't grow pays debt after payment
This ensures heap growth is throttled when GC can't keep up.
Background Worker vs Assist¶
The GC Pacer reserves at least 25% CPU utilization for background GC workers:
| Condition | Who does marking? |
|---|---|
| Allocation rate ≤ background GC capacity | Only background workers |
| Allocation rate > background GC capacity | Background workers + assists |
Credit stealing: When background workers accumulate surplus mark credit, they deposit it into a global pool; allocating goroutines first try to "steal" from this pool to pay their debt without marking. This reduces assist frequency and allocation tail latency.
Publication Barrier: Safe Object Visibility¶
The Problem: Instruction Reordering¶
When an object is allocated and initialized, CPU or compiler reordering could expose the object to GC before initialization completes:
func makeNode(data *Data) *Node {
    node := new(Node) // Step 1: Allocate
    // publicationBarrier() would go here
    node.data = data  // Step 2: Initialize
    node.next = nil   // Step 3: Initialize
    return node       // Step 4: Publish
}

// CPU might reorder: 1 → 4 → 2 → 3
// GC sees node at step 4, but fields uninitialized at step 2!
The Solution: Publication Barrier¶
Go's allocator inserts a publication barrier to ensure memory initialization happens-before GC visibility:
Effect on GC:
- Objects allocated during the mark phase are allocated black (conservatively assumed live)
- Publication barrier ensures GC never sees partially initialized objects
- If GC scans before initialization completes, it simply marks object live (safe)
- Initialization completes asynchronously with no race condition
Large Objects: Delayed Zeroing¶
Why Large Objects Are Special¶
When allocating large objects (>32KB), the zeroing operation becomes significant:
- Small object: Zeroing cost is negligible relative to allocation overhead
- Large object: Zeroing 2MB might take 500μs, during which the goroutine is unpreemptible
Go solves this with delayed zeroing—zeroing large objects incrementally.
Chunked Zeroing Strategy¶
func mallocgc(size uintptr, ...) unsafe.Pointer {
    largeObject := size > maxSmallSize // > 32 KB
    delayedZeroing := false
    if largeObject {
        // Set flag for delayed zeroing
        delayedZeroing = true
    }
    // Allocate (returns uninitialized memory for large objects)
    result := allocateSpan(size)
    if delayedZeroing {
        // Zero in 256 KB chunks, allowing preemption between chunks
        const chunkSize = 256 * 1024 // 256 KB, from benchmarking
        remaining := size
        ptr := result
        for remaining > 0 {
            n := min(remaining, chunkSize)
            // Zero this chunk (must not preempt mid-chunk)
            memclrNoHeapPointers(ptr, n)
            // Preemption point between chunks
            remaining -= n
            ptr = add(ptr, n)
        }
    }
    return result
}
Why 256KB?
This value was chosen through Go benchmarking as the sweet spot balancing:
- Too small (e.g., 4KB): frequent preemption checks → high overhead
- Too large (e.g., 2MB): long unpreemptible period → STW delay
At 256KB, each chunk zeroes quickly enough to avoid long pauses, yet is large enough to amortize the preemption-check overhead.
STW Impact and Prevention¶
The chunked zeroing strategy prevents a critical STW issue:
// Bad: zero 2 MB atomically
func zeroLargeObject(ptr unsafe.Pointer, size uintptr) {
    memclrNoHeapPointers(ptr, size) // takes ~500μs for 2 MB
    // During those 500μs, the goroutine CANNOT be preempted
}

// If STW occurs during zeroing:
// 1. STW signals all P's to stop
// 2. Most P's stop quickly
// 3. The P zeroing 2 MB continues for up to 500μs (can't stop mid-memclr)
// 4. All other P's wait for this one P
// 5. Result: ~500μs STW pause instead of <10μs

// Good: zero 2 MB in 8 chunks of 256 KB
func zeroLargeObjectChunked(ptr unsafe.Pointer, size uintptr) {
    const chunkSize = 256 << 10 // 256 KB
    for offset := uintptr(0); offset < size; offset += chunkSize {
        memclrNoHeapPointers(add(ptr, offset), chunkSize) // ~64μs per chunk
        // Check for preemption between chunks
        checkPreemption()
    }
}

// If STW occurs:
// 1. STW signals all P's to stop
// 2. The currently executing chunk finishes (≤64μs)
// 3. The goroutine is preempted
// 4. STW completes with minimal delay
// If STW occurs:
// 1. STW signals all P's to stop
// 2. Currently executing chunk finishes (≤64μs)
// 3. Goroutine preempted
// 4. STW completes with minimal delay
The principle: Move long operations out of critical sections by making them interruptible.
Summary¶
Allocation and GC are tightly coupled in Go:
- Trigger detection: Allocators monitor heap size and start GC when needed
- Mark assist: Allocators perform marking work when GC falls behind
- Publication barriers: Ensure GC sees correctly initialized objects
- Delayed zeroing: Prevent large allocations from causing STW delays
Understanding these interactions is crucial for:
- Diagnosing allocation-related performance issues
- Tuning GOGC for allocation-heavy workloads
- Designing GC-friendly data structures (minimize pointer churn)