Go Runtime Metrics Total CPU Explanation: go_cpu_classes_total_cpu_seconds_total
This post explains how the Go runtime metric /cpu/classes/total:cpu-seconds is calculated. Go 1.22.7 is used as the source code reference.
runtime/metrics.go:157 exposes a metric named /cpu/classes/total:cpu-seconds, which is mapped to the Prometheus metric name go_cpu_classes_total_cpu_seconds_total.
"/cpu/classes/total:cpu-seconds": {
	deps: makeStatDepSet(cpuStatsDep),
	compute: func(in *statAggregate, out *metricValue) {
		out.kind = metricKindFloat64
		out.scalar = float64bits(nsToSec(in.cpuStats.totalTime))
	},
},
The value of /cpu/classes/total:cpu-seconds is calculated from cpuStats.totalTime (converted from nanoseconds to seconds), and cpuStats is supplied by the caller.
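From user code, the same value can be read through the public runtime/metrics package. The following is a minimal sketch; the helper name readTotalCPUSeconds is an assumption made for this example and is not part of the runtime.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readTotalCPUSeconds (hypothetical helper) returns the current value of
// /cpu/classes/total:cpu-seconds and whether the running Go version
// supports the metric.
func readTotalCPUSeconds() (float64, bool) {
	samples := []metrics.Sample{{Name: "/cpu/classes/total:cpu-seconds"}}
	metrics.Read(samples)
	// Unsupported metrics are reported with KindBad instead of KindFloat64.
	if samples[0].Value.Kind() != metrics.KindFloat64 {
		return 0, false
	}
	return samples[0].Value.Float64(), true
}

func main() {
	if v, ok := readTotalCPUSeconds(); ok {
		fmt.Printf("total CPU seconds: %.3f\n", v)
	} else {
		fmt.Println("metric not supported by this Go version")
	}
}
```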
runtime/metrics.go:L848
func readMetricsLocked(samplesp unsafe.Pointer, len int, cap int) {
	// Construct a slice from the args.
	sl := slice{samplesp, len, cap}
	samples := *(*[]metricSample)(unsafe.Pointer(&sl))

	// Clear agg defensively.
	agg = statAggregate{}

	// Sample.
	for i := range samples {
		sample := &samples[i]
		data, ok := metrics[sample.name]
		if !ok {
			sample.value.kind = metricKindBad
			continue
		}
		// Ensure we have all the stats we need.
		// agg is populated lazily.
		agg.ensure(&data.deps)

		// Compute the value based on the stats we have.
		data.compute(&agg, &sample.value)
	}
}
Focusing on the CPU: the CPU stats are populated by agg.ensure(&data.deps) (runtime/metrics.go:L720). The actual work is done by a.cpuStats.compute() (runtime/metrics.go:L735), which simply copies the CPU stats from the global variable work.
runtime/metrics.go
var work workType

// compute populates the cpuStatsAggregate with values from the runtime.
func (a *cpuStatsAggregate) compute() {
	a.cpuStats = work.cpuStats
	// TODO(mknyszek): Update the CPU stats again so that we're not
	// just relying on the STW snapshot. The issue here is that currently
	// this will cause non-monotonicity in the "user" CPU time metric.
	//
	// a.cpuStats.accumulate(nanotime(), gcphase == _GCmark)
}
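The lazy "ensure, then compute" flow can be modelled outside the runtime. The sketch below is a simplified illustration, not the runtime's code: depSet, aggregate, and snapshotCPUTotal are hypothetical names, and the snapshot value is made up.

```go
package main

import "fmt"

// depSet is a hypothetical bitset of stat dependencies.
type depSet uint8

const (
	heapDep depSet = 1 << iota
	cpuDep
)

// aggregate caches stat groups and records which ones are already populated.
type aggregate struct {
	ensured  depSet
	cpuTotal int64 // stand-in for cpuStats.totalTime, in nanoseconds
	computes int   // how many times the CPU stats were copied
}

// snapshotCPUTotal is a hypothetical stand-in for copying work.cpuStats.
func snapshotCPUTotal() int64 { return 2_500_000_000 }

// ensure populates only the dependencies that are still missing.
func (a *aggregate) ensure(deps depSet) {
	missing := deps &^ a.ensured
	if missing&cpuDep != 0 {
		a.cpuTotal = snapshotCPUTotal()
		a.computes++
	}
	a.ensured |= missing
}

func main() {
	var agg aggregate
	agg.ensure(cpuDep)
	agg.ensure(cpuDep) // already populated, so nothing is recomputed
	fmt.Println(float64(agg.cpuTotal)/1e9, agg.computes) // prints "2.5 1"
}
```

Because the aggregate is reset at the start of each read, every metric sampled in the same read sees the same snapshot of the stats.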
So who updates the work variable? In short, work.cpuStats is updated by func (s *cpuStats) accumulate (runtime/mstats.go:L929), which is called from gcMarkTermination (runtime/mgc.go:L932).
runtime/mgc.go
var (
	sched schedt
)

type schedt struct {
	// ... ignore some code
	totaltime      int64 // ∫gomaxprocs dt up to procresizetime
	procresizetime int64 // nanotime() of last change to gomaxprocs
	// ... ignore some code
}

func (s *cpuStats) accumulate(now int64, gcMarkPhase bool) {
	// ... ignore some code
	// Update total CPU.
	s.totalTime = sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)
	// ... ignore some code
}
s.totalTime is calculated from sched.totaltime and sched.procresizetime, two fields of the global variable sched (runtime/runtime2.go:L1192). Both fields are updated in procresize (runtime/proc.go:L5591).
runtime/proc.go
The procresize function is called in two places:
- schedinit (runtime/proc.go:L750)
- startTheWorldWithSema (runtime/proc.go:L1154)
startTheWorldWithSema runs every time the world is restarted after being stopped, for example during a GC cycle, regardless of whether GOMAXPROCS has changed. On each call, procresize takes the time elapsed since the previous call, multiplies it by the old GOMAXPROCS value, and adds the result to sched.totaltime.
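The bookkeeping described above can be reproduced with a small model. This is a sketch under stated assumptions: schedModel and its methods are hypothetical names that only mirror the arithmetic on sched.totaltime and sched.procresizetime, and the timestamps are arbitrary nanosecond values.

```go
package main

import "fmt"

// schedModel is a hypothetical stand-in for the scheduler fields involved.
type schedModel struct {
	totaltime      int64 // accumulated gomaxprocs * elapsed time
	procresizetime int64 // timestamp of the last procresize call
	gomaxprocs     int32
}

// procresize mirrors the statistics update: the window since the last call
// is charged at the old gomaxprocs value before the new value takes effect.
func (s *schedModel) procresize(now int64, nprocs int32) {
	if s.procresizetime != 0 {
		s.totaltime += int64(s.gomaxprocs) * (now - s.procresizetime)
	}
	s.procresizetime = now
	s.gomaxprocs = nprocs
}

// totalCPU mirrors the "Update total CPU" line in accumulate.
func (s *schedModel) totalCPU(now int64) int64 {
	return s.totaltime + (now-s.procresizetime)*int64(s.gomaxprocs)
}

func main() {
	s := &schedModel{}
	s.procresize(100, 4)  // startup with 4 Ps
	s.procresize(1100, 4) // world restarted, GOMAXPROCS unchanged: +4*1000
	s.procresize(2100, 2) // GOMAXPROCS lowered to 2: +4*1000
	// 8000 accumulated + 1000 ns * 2 Ps since the last resize.
	fmt.Println(s.totalCPU(3100)) // prints 10000
}
```

Nothing in this calculation depends on how busy the Ps were; only wall-clock time and the GOMAXPROCS value enter the sum.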
The implementation makes it clear that the total CPU metric is the total CPU time available to the process (wall-clock time multiplied by GOMAXPROCS), not the CPU time actually consumed.
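One way to observe this is to let a program sit idle and watch the metric grow anyway. The sketch below assumes, as described above, that the snapshot is refreshed during GC mark termination, so it forces a cycle with runtime.GC() before each read. The helper names totalCPUAfterGC and idleDelta are made up for this example, and the exact growth depends on timing, so no output is shown.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/metrics"
	"time"
)

// totalCPUAfterGC (hypothetical helper) forces a GC cycle so the runtime
// refreshes its CPU stats snapshot, then reads the metric.
func totalCPUAfterGC() float64 {
	runtime.GC()
	s := []metrics.Sample{{Name: "/cpu/classes/total:cpu-seconds"}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindFloat64 {
		return 0 // metric not supported by this Go version
	}
	return s[0].Value.Float64()
}

// idleDelta sleeps for d and returns how much the metric grew meanwhile.
func idleDelta(d time.Duration) float64 {
	before := totalCPUAfterGC()
	time.Sleep(d) // the program does no useful work here
	return totalCPUAfterGC() - before
}

func main() {
	const idle = 200 * time.Millisecond
	delta := idleDelta(idle)
	fmt.Printf("GOMAXPROCS=%d idle=%v metric grew by %.3fs\n",
		runtime.GOMAXPROCS(0), idle, delta)
}
```

If the metric counted consumed CPU time, the growth over an idle sleep would be close to zero; under the reading above it should instead be in the region of GOMAXPROCS multiplied by the sleep duration.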
Finally, the doc is: