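Let me walk through the complete logic Grafana's source uses to compute step. It is considerably more involved than a plain intervalMs/1000.
Complete step calculation flow
1. Entry point: the [models.Parse](file://F:\JavaProject\grafana-release-11.2.0\pkg\promlib\models\query.go#L190-L274) function
In pkg/promlib/models/query.go: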
func Parse(span trace.Span, query backend.DataQuery, dsScrapeInterval string, intervalCalculator intervalv2.Calculator, fromAlert bool, enableScope bool) (*Query, error) {
    model := &internalQueryModel{}
    if err := json.Unmarshal(query.JSON, model); err != nil {
        return nil, err
    }

    // Final step value for prometheus
    calculatedStep, err := calculatePrometheusInterval(model.Interval, dsScrapeInterval, int64(model.IntervalMS), model.IntervalFactor, query, intervalCalculator)
    if err != nil {
        return nil, err
    }

    // ...

    return &Query{
        Expr: expr,
        Step: calculatedStep, // this is the final step value
        // ...
    }, nil
}
2. Core calculation function: calculatePrometheusInterval
func calculatePrometheusInterval(
    queryInterval, dsScrapeInterval string,
    intervalMs, intervalFactor int64,
    query backend.DataQuery,
    intervalCalculator intervalv2.Calculator,
) (time.Duration, error) {
    // Keep the original queryInterval for the comparison further down
    originalQueryInterval := queryInterval

    // If a variable such as $__interval is used, clear queryInterval
    if isVariableInterval(queryInterval) {
        queryInterval = ""
    }

    // 1. Get the minimum interval
    minInterval, err := gtime.GetIntervalFrom(dsScrapeInterval, queryInterval, intervalMs, 15*time.Second)
    if err != nil {
        return time.Duration(0), err
    }

    // 2. Compute the interval with intervalCalculator
    calculatedInterval := intervalCalculator.Calculate(query.TimeRange, minInterval, query.MaxDataPoints)

    // 3. Compute the safe interval
    safeInterval := intervalCalculator.CalculateSafeInterval(query.TimeRange, int64(safeResolution))

    // 4. Take the larger of the two
    adjustedInterval := safeInterval.Value
    if calculatedInterval.Value > safeInterval.Value {
        adjustedInterval = calculatedInterval.Value
    }

    // 5. Special handling for $__rate_interval
    if originalQueryInterval == varRateInterval || originalQueryInterval == varRateIntervalAlt {
        // The rate interval has its own calculation
        return calculateRateInterval(adjustedInterval, dsScrapeInterval), nil
    } else {
        // 6. Apply intervalFactor (0 is treated as 1)
        queryIntervalFactor := intervalFactor
        if queryIntervalFactor == 0 {
            queryIntervalFactor = 1
        }
        return time.Duration(int64(adjustedInterval) * queryIntervalFactor), nil
    }
}
3. The intervalCalculator implementation
In pkg/promlib/intervalv2/intervalv2.go:
func (ic *intervalCalculator) Calculate(timerange backend.TimeRange, minInterval time.Duration, maxDataPoints int64) Interval {
    to := timerange.To.UnixNano()
    from := timerange.From.UnixNano()
    resolution := maxDataPoints
    if resolution == 0 {
        resolution = DefaultRes // defaults to 1500
    }

    // Core calculation: (time range) / (max data points)
    calculatedInterval := time.Duration((to - from) / resolution)

    // If the calculated interval is below the minimum, return the minimum as-is (no rounding)
    if calculatedInterval < minInterval {
        return Interval{Text: gtime.FormatInterval(minInterval), Value: minInterval}
    }

    // Otherwise snap the calculated interval to a standard value
    rounded := gtime.RoundInterval(calculatedInterval)
    return Interval{Text: gtime.FormatInterval(rounded), Value: rounded}
}
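For a Java port, the Calculate logic above can be sketched roughly as follows. This is a minimal sketch, not Grafana code: the class name, the millisecond units, and the `round` parameter (a stand-in for gtime.RoundInterval) are assumptions made for illustration. The Go code works in nanoseconds, so results can differ slightly from this millisecond version because of integer truncation.

```java
import java.util.function.LongUnaryOperator;

public class IntervalCalculateSketch {
    // Mirrors DefaultRes from the Go excerpt.
    public static final long DEFAULT_RES = 1500;

    // All durations are in milliseconds (an assumption of this sketch).
    // `round` is a placeholder for the SDK's rounding function.
    public static long calculate(long fromMs, long toMs, long minIntervalMs,
                                 long maxDataPoints, LongUnaryOperator round) {
        long resolution = (maxDataPoints == 0) ? DEFAULT_RES : maxDataPoints;
        long calculated = (toMs - fromMs) / resolution;
        // Below the minimum: the minimum is returned without rounding.
        if (calculated < minIntervalMs) {
            return minIntervalMs;
        }
        return round.applyAsLong(calculated);
    }

    public static void main(String[] args) {
        LongUnaryOperator noRounding = v -> v; // identity, to isolate the branch logic
        // 1 h over 1147 points is about 3138 ms, below a 15 s minimum.
        System.out.println(calculate(0, 3_600_000, 15_000, 1147, noRounding)); // 15000
        // maxDataPoints == 0 falls back to 1500 points: 3,600,000 / 1500.
        System.out.println(calculate(0, 3_600_000, 1_000, 0, noRounding));     // 2400
    }
}
```

Passing the rounding function in as a parameter keeps the branch logic testable on its own, independent of whichever rounding table you end up porting.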
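4. How gtime.RoundInterval works
This function lives in the SDK and snaps the calculated interval to a standard value. Note that it can round down as well as up:
// Abridged: the real switch has many more cases below and above this range,
// so check the SDK version you are using for the full table.
func RoundInterval(interval time.Duration) time.Duration {
    switch {
    // ... smaller cases elided
    case interval <= 1500*time.Millisecond:
        return 1 * time.Second
    case interval <= 3500*time.Millisecond:
        return 2 * time.Second
    case interval <= 7500*time.Millisecond:
        return 5 * time.Second
    case interval <= 12500*time.Millisecond:
        return 10 * time.Second
    case interval <= 17500*time.Millisecond:
        return 15 * time.Second
    // ... larger cases elided
    default:
        return interval // stand-in for the elided larger cases
    }
}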
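In Java, a rounding function of this kind is easier to maintain as a lookup table than as a long switch. The sketch below is an illustration under assumptions: the class name is made up, and the threshold values are recalled from the SDK's rounding table for the sub-minute range only, so verify them against the SDK version you target before relying on them.

```java
public class RoundIntervalSketch {
    // Each row: {inclusive upper bound in ms, rounded value in ms}.
    // Assumed values covering intervals up to 90 s; the real table continues further.
    private static final long[][] TABLE = {
        {10, 1}, {15, 10}, {35, 20}, {75, 50}, {150, 100}, {350, 200}, {750, 500},
        {1_500, 1_000}, {3_500, 2_000}, {7_500, 5_000}, {12_500, 10_000},
        {17_500, 15_000}, {25_000, 20_000}, {45_000, 30_000}, {90_000, 60_000},
    };

    public static long roundInterval(long intervalMs) {
        for (long[] row : TABLE) {
            if (intervalMs <= row[0]) {
                return row[1];
            }
        }
        // Beyond this partial table the value is returned unchanged (sketch behavior only).
        return intervalMs;
    }

    public static void main(String[] args) {
        System.out.println(roundInterval(3_139));  // 2000
        System.out.println(roundInterval(327));    // 200
        System.out.println(roundInterval(20_000)); // 20000
    }
}
```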
5. Special case: rate interval calculation
func calculateRateInterval(
    queryInterval time.Duration,
    requestedMinStep string,
) time.Duration {
    scrape := requestedMinStep
    if scrape == "" {
        scrape = "15s"
    }

    scrapeIntervalDuration, err := gtime.ParseIntervalStringToTimeDuration(scrape)
    if err != nil {
        return time.Duration(0)
    }

    // Rate interval = max(queryInterval + scrapeInterval, 4 * scrapeInterval)
    rateInterval := time.Duration(int64(math.Max(float64(queryInterval+scrapeIntervalDuration), float64(4)*float64(scrapeIntervalDuration))))
    return rateInterval
}
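The rate interval formula above translates to Java almost directly. A minimal sketch, assuming the caller has already parsed the scrape interval string into milliseconds and substituted 15 s when it was empty (the class name and the millisecond units are illustrative, not part of Grafana):

```java
public class RateIntervalSketch {
    // rateInterval = max(queryInterval + scrapeInterval, 4 * scrapeInterval), all in ms.
    public static long calculateRateInterval(long queryIntervalMs, long scrapeIntervalMs) {
        return Math.max(queryIntervalMs + scrapeIntervalMs, 4 * scrapeIntervalMs);
    }

    public static void main(String[] args) {
        // Short query interval: the 4x scrape interval floor wins.
        System.out.println(calculateRateInterval(15_000, 15_000));  // 60000
        // Long query interval: the sum wins.
        System.out.println(calculateRateInterval(120_000, 15_000)); // 135000
    }
}
```

Using long arithmetic avoids the float64 round trip in the Go version, which is only there because Go's math.Max takes floats.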
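Worked example
Assume:
- Time range: 1 hour (3,600,000 ms)
- maxDataPoints: 1147
- intervalMs: derived by the frontend from the panel width
- dsScrapeInterval: "15s"
- minInterval: assumed here to be no larger than 3139 ms, so that the rounding branch of Calculate runs
Calculation:
calculatedInterval = 3,600,000 ms / 1147 ≈ 3139 ms
RoundInterval(3139 ms) → 2000 ms (values between 1.5 s and 3.5 s snap to 2 s)
safeInterval = 3,600,000 ms / 11000 ≈ 327 ms (the safe interval)
adjustedInterval = max(2000 ms, 327 ms) = 2000 ms
- With intervalFactor unset (treated as 1), the final step = 2000 ms = 2 s
- If minInterval is larger than 3139 ms, for example 15 s, Calculate returns minInterval without rounding and the final step is 15 s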
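To check numbers like these by hand, the selection steps (minimum interval branch, safe interval, intervalFactor) can be sketched in Java. This is an illustrative sketch, not Grafana code: the class and parameter names are made up, everything is in milliseconds, the divisor 11000 is the one used for safeInterval in the example, and `roundedMs` is whatever value the rounding table yields for your calculated interval, passed in so that the sketch does not depend on a particular table.

```java
public class StepWalkthroughSketch {
    // Divisor used for the safe interval in the worked example.
    public static final long SAFE_RESOLUTION = 11_000;

    public static long step(long rangeMs, long maxDataPoints, long minIntervalMs,
                            long roundedMs, long intervalFactor) {
        long calculated = rangeMs / maxDataPoints;
        // Below the minimum: use the minimum; otherwise use the rounded value.
        long base = (calculated < minIntervalMs) ? minIntervalMs : roundedMs;
        long safe = rangeMs / SAFE_RESOLUTION;
        long adjusted = Math.max(base, safe);
        long factor = (intervalFactor == 0) ? 1 : intervalFactor;
        return adjusted * factor;
    }

    public static void main(String[] args) {
        System.out.println(3_600_000L / 1147);            // 3138 (integer division)
        System.out.println(3_600_000L / SAFE_RESOLUTION); // 327
        // 1 h, 1147 points, 15 s minimum: the minimum wins regardless of rounding.
        System.out.println(step(3_600_000, 1147, 15_000, 0, 0));       // 15000
        // 6 h, 1000 points: 21600 ms calculated, 20000 ms supplied as rounded, factor 2.
        System.out.println(step(21_600_000, 1000, 15_000, 20_000, 2)); // 40000
    }
}
```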
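How this differs from a plain intervalMs/1000
- Rounding: the calculated interval is snapped to a standard interval value.
- Minimum interval: a calculated interval below the minimum (derived from the data source scrape interval, the query's interval field, and intervalMs) is replaced by that minimum.
- Safe interval: a floor on the step, time range / safeResolution (11000 in the example above), which keeps the number of points per series bounded.
- Special variables: $__rate_interval and similar variables get dedicated handling.
- intervalFactor: multiplies the resulting interval.
- Explicit step: the Go excerpts above do not show a separate step field, so an explicit step is best treated as a caller-side override that bypasses this calculation.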
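Suggested Java implementation
Given this logic, your Java implementation should cover roughly the following:
public class PrometheusStepCalculator {
    public static long calculateStep(
            Long explicitStep,        // explicitly set step in seconds, or null
            String interval,          // interval field from the query
            long intervalMs,          // intervalMs computed by the frontend
            long intervalFactor,      // intervalFactor
            String dsScrapeInterval,  // data source scrape interval (e.g. "15s")
            long timeRangeMs,         // query time range in milliseconds
            long maxDataPoints        // maxDataPoints
    ) {
        // 1. An explicit step supplied by the caller wins
        if (explicitStep != null) {
            return explicitStep;
        }
        // 2. Variable intervals (e.g. $__interval) are cleared
        if (isVariableInterval(interval)) {
            interval = "";
        }
        // 3. Minimum interval (ms)
        long minInterval = getMinInterval(dsScrapeInterval, interval, intervalMs);
        // 4. Base interval: time range / maxDataPoints
        long calculatedInterval = calculateBaseInterval(timeRangeMs, maxDataPoints);
        // 5. As in Calculate: below the minimum use the minimum, otherwise round
        long roundedInterval = calculatedInterval < minInterval
                ? minInterval
                : roundInterval(calculatedInterval);
        // 6. Safe interval
        long safeInterval = calculateSafeInterval(timeRangeMs);
        // 7. Take the larger value
        long adjustedInterval = Math.max(roundedInterval, safeInterval);
        // Note: the $__rate_interval branch is not handled in this skeleton
        // 8. Apply intervalFactor (0 or negative is treated as 1)
        if (intervalFactor <= 0) {
            intervalFactor = 1;
        }
        long finalInterval = adjustedInterval * intervalFactor;
        // 9. Convert to seconds (integer division truncates sub-second steps to 0)
        return finalInterval / 1000;
    }

    // Implement the helper methods...
}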
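This explains why the values you observed do not match a plain intervalMs/1000: Grafana applies rounding, the minimum and safe interval bounds, and intervalFactor on top of the raw division.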