Go Knowledge - Testing - Benchmark Tests

1. Definition
2. Example
3. testing.common: basic test data
4. The testing.TB interface
5. Key functions
- 5.1 testing.runBenchmarks
- 5.2 testing.B.runN
- 5.3 testing.B.StartTimer
- 5.4 testing.B.StopTimer
- 5.5 testing.B.ResetTimer
- 5.6 testing.B.Run
- 5.7 testing.B.run1
- 5.8 testing.B.run
- 5.9 processBench
- 5.10 testing.B.doBench
- 5.11 testing.B.launch
- 5.12 testing.B.SetBytes
6. Statistics

Recommended reading first: https://blog.csdn.net/a18792721831/article/details/140062769

Go Knowledge - Testing - How It Works

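1. Definition

A benchmark runs the code under test many times and reports the average time per iteration.
The test file name must end with _test.go.
The benchmark function name must start with Benchmark, in the form BenchmarkXxx.
The test file can live in the same directory as the source code or in a separate directory.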

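2. Example

When creating a slice you can specify a capacity or leave it out. If the length of the data is known in advance, the storage can be preallocated when the slice is created, which avoids repeated copying as the slice grows.
The functions are: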

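// MakeWithout appends n values to a nil slice and lets append grow it.
func MakeWithout(n int) []int {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// MakeWith allocates the slice with make before appending.
// Note: make([]int, n) sets the length (not only the capacity) to n,
// so the appended values start at index n.
func MakeWith(n int) []int {
	s := make([]int, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}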

Next, use benchmarks to see how large the performance gap between the two functions is:

func BenchmarkMakeWithout(b *testing.B) {
	for i := 0; i < b.N; i++ {
		MakeWithout(1000)
	}
}

func BenchmarkMakeWith(b *testing.B) {
	for i := 0; i < b.N; i++ {
		MakeWith(1000)
	}
}

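Start with a small size, n=1000.
Run the benchmarks with go test -v -bench=. The -v flag enables verbose output, -bench selects the benchmarks to run, and -bench=. uses . as the regular expression, which matches every benchmark.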

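The output shows that Without ran 403447 iterations at an average of 2596 ns each.
With ran 329895 iterations at an average of 3357 ns each.
In other words, at a size of 1000 the preallocating version is actually slower. My guess was that the explicit make call costs time; a more likely cause is that make([]int, n) already creates a slice of length n, so the n appended values land after those n zeros and the slice still has to grow beyond its initial allocation.
Now increase the size to n=1000_0000 (10,000,000):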

Here preallocation is faster, averaging 26.7 ms per iteration.
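Preallocation for append is usually written with a zero length and an explicit capacity. The following is a minimal sketch of that form; MakeWithCap is a hypothetical name and is not part of the original example, and no timings are claimed for it.

```go
package main

import "fmt"

// MakeWithCap (hypothetical name) reserves capacity n but keeps length 0,
// so the n appends below never need to reallocate.
func MakeWithCap(n int) []int {
	s := make([]int, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

func main() {
	s := MakeWithCap(1000)
	fmt.Println(len(s), cap(s)) // 1000 1000
}
```

Benchmarking it next to MakeWithout and MakeWith would show whether the extra growth step explains the numbers above.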

3. testing.common: basic test data

Every benchmark takes a parameter b *testing.B, whose structure is defined as follows:

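type B struct {
	common                           // testing.common, shared with testing.T; handles logging, state, etc.
	importPath       string          // import path of the package containing the benchmark
	context          *benchContext
	N                int             // number of iterations of the target code; adjusted automatically, users need not know the value
	previousN        int             // number of iterations in the previous run
	previousDuration time.Duration   // total duration of the previous run
	benchFunc        func(b *B)      // the benchmark function
	benchTime        benchTimeFlag   // minimum time the benchmark should run; defaults to 1s
	bytes            int64           // bytes processed per iteration
	missingBytes     bool            // one of the sub-benchmarks does not have bytes set
	timerOn          bool            // whether the timer is running
	showAllocResult  bool
	result           BenchmarkResult // benchmark result
	parallelism      int             // RunParallel creates parallelism*GOMAXPROCS goroutines
	// The initial states of memStats.Mallocs and memStats.TotalAlloc.
	startAllocs uint64 // total number of heap objects allocated when timing started
	startBytes  uint64 // total number of heap bytes allocated when timing started
	// The net totals for this test after being run.
	netAllocs uint64 // number of heap objects added while the timer was running
	netBytes  uint64 // number of heap bytes added while the timer was running
	// Extra metrics collected by ReportMetric.
	extra map[string]float64
}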

Like testing.T, B embeds the common type:

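// common holds the elements common between T and B and
// captures common methods such as Errorf.
type common struct {
	mu          sync.RWMutex         // guards this group of fields
	output      []byte               // output generated by the test or benchmark
	w           io.Writer            // for flushToParent
	ran         bool                 // the test or benchmark (or one of its subtests) was executed
	failed      bool                 // the test or benchmark has failed
	skipped     bool                 // the test or benchmark has been skipped
	done        bool                 // the test is finished and all subtests have completed
	helperPCs   map[uintptr]struct{} // functions to be skipped when writing file/line info
	helperNames map[string]struct{}  // helperPCs converted to function names
	cleanups    []func()             // optional functions to be called at the end of the test
	cleanupName string               // name of the cleanup function
	cleanupPc   []uintptr            // stack trace at the point where Cleanup was called
	finished    bool                 // the test function has completed

	chatty     *chattyPrinter // a copy of chattyPrinter, if the chatty flag is set
	bench      bool           // whether the current test is a benchmark
	hasSub     int32          // written atomically
	raceErrors int            // number of data races detected during the test
	runner     string         // function name of the tRunner running the test

	parent   *common
	level    int           // nesting depth of the test or benchmark
	creator  []uintptr     // if level > 0, the stack trace at the point where the parent called t.Run
	name     string        // name of the test or benchmark
	start    time.Time     // time the test or benchmark started
	duration time.Duration
	barrier  chan bool     // to signal parallel subtests that they may start
	signal   chan bool     // to signal that a test is done
	sub      []*T          // queue of subtests to be run in parallel

	tempDirMu  sync.Mutex
	tempDir    string
	tempDirErr error
	tempDirSeq int32
}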

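Every test has its own testing.common, which records basic information about the test function (such as its name) and also manages the execution of the test and its result.
testing.common is the foundation of unit tests, benchmarks, and fuzz tests.
Because they all embed the same structure (Go has no inheritance), the different kinds of tests behave consistently, which lowers the barrier to using them.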

4. The testing.TB interface

The interface implemented by testing.common is testing.TB. Unit tests and benchmarks get their basic capabilities through this interface.

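type TB interface {
	Cleanup(func())                            // register a cleanup function
	Error(args ...interface{})                 // mark the test as failed + log
	Errorf(format string, args ...interface{}) // mark the test as failed + formatted log
	Fail()                                     // mark the test as failed
	FailNow()                                  // mark the test as failed + stop the current test
	Failed() bool                              // report whether the test has failed
	Fatal(args ...interface{})                 // mark the test as failed + log + stop the current test
	Fatalf(format string, args ...interface{}) // mark the test as failed + formatted log + stop the current test
	Helper()                                   // mark the caller as a helper (its line number is not printed)
	Log(args ...interface{})                   // log
	Logf(format string, args ...interface{})   // formatted log
	Name() string                              // return the test name
	Setenv(key, value string)                  // set an environment variable
	Skip(args ...interface{})                  // log + skip the test
	SkipNow()                                  // skip the test
	Skipf(format string, args ...interface{})  // formatted log + skip the test
	Skipped() bool                             // report whether the test was skipped
	TempDir() string                           // return a temporary directory

	// A private method to prevent users implementing the
	// interface and so future additions to it will not
	// violate Go 1 compatibility.
	private()
}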
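One practical consequence is that a helper written against testing.TB works in both tests and benchmarks. The sketch below assumes a hypothetical helper mustPositive and a hypothetical benchmark BenchmarkHelper; it uses testing.Benchmark only so that it can run as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// mustPositive is a hypothetical helper. Because it takes testing.TB,
// it accepts both *testing.T and *testing.B.
func mustPositive(tb testing.TB, v int) int {
	tb.Helper() // attribute failures to the caller's line
	if v <= 0 {
		tb.Fatalf("want a positive value, got %d", v)
	}
	return v
}

func BenchmarkHelper(b *testing.B) {
	n := mustPositive(b, 1000)
	for i := 0; i < b.N; i++ {
		_ = make([]int, n)
	}
}

// run executes the benchmark outside "go test".
func run() testing.BenchmarkResult {
	testing.Init() // registers the testing flags; has no effect if already called
	return testing.Benchmark(BenchmarkHelper)
}

func main() {
	fmt.Println(run().N > 0)
}
```

In a _test.go file only mustPositive and BenchmarkHelper would be needed; go test -bench=. takes the place of run and main.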

5. Key functions

5.1 testing.runBenchmarks

runBenchmarks creates a benchmark named Main, which serves as the root case that starts everything else.

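Main is executed through testing.B.runN, and its benchmark function calls testing.B.Run for each benchmark.

5.2 testing.B.runN

runN starts the timer and then executes benchFunc.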

The number of iterations a benchmark executes (b.N) is also set in runN.
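The order of operations in runN can be sketched with a simplified model. miniB below is a hypothetical type that keeps only the timing fields; the real runN does more, and this is not its source code.

```go
package main

import (
	"fmt"
	"time"
)

// miniB is a hypothetical, heavily simplified model of testing.B.
type miniB struct {
	N         int
	start     time.Time
	duration  time.Duration
	timerOn   bool
	benchFunc func(b *miniB)
}

func (b *miniB) StartTimer() {
	if !b.timerOn {
		b.start = time.Now()
		b.timerOn = true
	}
}

func (b *miniB) StopTimer() {
	if b.timerOn {
		b.duration += time.Since(b.start)
		b.timerOn = false
	}
}

func (b *miniB) ResetTimer() {
	if b.timerOn {
		b.start = time.Now()
	}
	b.duration = 0
}

// runN sets the iteration count and times exactly one call of benchFunc.
func (b *miniB) runN(n int) {
	b.N = n
	b.ResetTimer()
	b.StartTimer()
	b.benchFunc(b)
	b.StopTimer()
}

func main() {
	calls := 0
	b := &miniB{benchFunc: func(b *miniB) {
		for i := 0; i < b.N; i++ {
			calls++
		}
	}}
	b.runN(1000)
	fmt.Println(b.N, calls, b.timerOn) // 1000 1000 false
}
```

The point of the model is that benchFunc is called once per runN, and the loop over b.N inside it is what repeats the code under test.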

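5.3 testing.B.StartTimer

StartTimer starts timing and initializes the memory counters. It is called automatically when a benchmark runs (starting from the testing.B named Main), so users normally do not need to call it themselves.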

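func (b *B) StartTimer() {
	if !b.timerOn {
		runtime.ReadMemStats(&memStats)    // read the current heap allocation statistics
		b.startAllocs = memStats.Mallocs   // record the number of heap objects allocated so far
		b.startBytes = memStats.TotalAlloc // record the number of heap bytes allocated so far
		b.start = time.Now()               // record the start time
		b.timerOn = true                   // mark the timer as running
	}
}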

5.4 testing.B.StopTimer

StopTimer stops timing and accumulates the corresponding statistics:

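func (b *B) StopTimer() {
	if b.timerOn {
		b.duration += time.Since(b.start)                // accumulate the elapsed time
		runtime.ReadMemStats(&memStats)                  // read the current heap allocation statistics
		b.netAllocs += memStats.Mallocs - b.startAllocs  // accumulate the number of heap objects allocated
		b.netBytes += memStats.TotalAlloc - b.startBytes // accumulate the number of heap bytes allocated
		b.timerOn = false                                // mark the timer as stopped
	}
}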
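A typical use of the StopTimer and StartTimer pair is to exclude per-iteration preparation from the measurement. The sketch below is an illustration under assumptions: reversed and BenchmarkSort are hypothetical names, and testing.Benchmark is used only so the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// reversed returns n, n-1, ..., 1 (hypothetical helper for test input).
func reversed(n int) []int {
	s := make([]int, n)
	for i := range s {
		s[i] = n - i
	}
	return s
}

// BenchmarkSort measures only sort.Ints. Restoring the unsorted input
// in each iteration happens while the timer is stopped.
func BenchmarkSort(b *testing.B) {
	src := reversed(10000)
	buf := make([]int, len(src))
	for i := 0; i < b.N; i++ {
		b.StopTimer()
		copy(buf, src)
		b.StartTimer()
		sort.Ints(buf)
	}
}

func main() {
	testing.Init() // registers the testing flags when run outside "go test"
	fmt.Println(testing.Benchmark(BenchmarkSort))
}
```

Since StartTimer and StopTimer each call runtime.ReadMemStats, toggling them around a very short loop body can make the whole run noticeably slower in wall-clock time, even though that overhead is not counted in the result.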

5.5 testing.B.ResetTimer

ResetTimer resets the timer and, along with it, the other statistics:

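func (b *B) ResetTimer() {
	if b.timerOn {
		runtime.ReadMemStats(&memStats)    // read the current heap allocation statistics
		b.startAllocs = memStats.Mallocs   // record the number of heap objects allocated so far
		b.startBytes = memStats.TotalAlloc // record the number of heap bytes allocated so far
		b.start = time.Now()               // record the start time
	}
	b.duration = 0  // clear the elapsed time
	b.netAllocs = 0 // clear the allocated object count
	b.netBytes = 0  // clear the allocated byte count
}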

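ResetTimer is used fairly often. For example, when the initialization part of a benchmark takes a long time, call it after initialization so that timing starts from there.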
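A minimal sketch of that pattern follows. buildTable is a hypothetical stand-in for an expensive setup step, and testing.Benchmark is used only so the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// buildTable stands in for an expensive, hypothetical setup step.
func buildTable(n int) map[int]int {
	m := make(map[int]int, n)
	for i := 0; i < n; i++ {
		m[i] = i * i
	}
	return m
}

func BenchmarkLookup(b *testing.B) {
	table := buildTable(100000) // setup: should not be measured
	b.ResetTimer()              // discard the time and allocations so far
	sum := 0
	for i := 0; i < b.N; i++ {
		sum += table[i%len(table)]
	}
	_ = sum
}

func main() {
	testing.Init() // registers the testing flags when run outside "go test"
	fmt.Println(testing.Benchmark(BenchmarkLookup))
}
```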

5.6 testing.B.Run

func (b *B) Run(name string, f func(b *B)) bool {
	// Record that this benchmark has sub-benchmarks.
	atomic.StoreInt32(&b.hasSub, 1)
	// The caller holds benchmarkLock; release it while the sub-benchmark runs
	benchmarkLock.Unlock()
	// and re-acquire it when Run returns.
	defer benchmarkLock.Lock()
	// Default name and match state.
	benchName, ok, partial := b.name, true, false
	// Match the name against the filter.
	if b.context != nil {
		benchName, ok, partial = b.context.match.fullName(&b.common, name)
	}
	// No match: nothing to run.
	if !ok {
		return true
	}
	var pc [maxStackLen]uintptr
	n := runtime.Callers(2, pc[:])
	// Build the data structure for the sub-benchmark.
	sub := &B{
		common: common{
			signal:  make(chan bool),
			name:    benchName,
			parent:  &b.common,
			level:   b.level + 1,
			creator: pc[:n],
			w:       b.w,
			chatty:  b.chatty,
			bench:   true,
		},
		importPath: b.importPath,
		benchFunc:  f,
		benchTime:  b.benchTime,
		context:    b.context,
	}
	// Partial name match: only the sub-benchmark's own sub-benchmarks are processed.
	if partial {
		atomic.StoreInt32(&sub.hasSub, 1)
	}
	// Print log information.
	if b.chatty != nil {
		labelsOnce.Do(func() {
			fmt.Printf("goos: %s\n", runtime.GOOS)
			fmt.Printf("goarch: %s\n", runtime.GOARCH)
			if b.importPath != "" {
				fmt.Printf("pkg: %s\n", b.importPath)
			}
			if cpu := sysinfo.CPU.Name(); cpu != "" {
				fmt.Printf("cpu: %s\n", cpu)
			}
		})
		fmt.Println(benchName)
	}
	// Run the sub-benchmark once; if it did not fail and has no
	// sub-benchmarks of its own, continue with run.
	if sub.run1() {
		// run decides how many times runN is executed.
		sub.run()
	}
	// Add the sub-benchmark's result to the parent.
	b.add(sub.result)
	return !sub.failed
}

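Every benchmark is first executed once through run1, which then decides whether to continue iterating.
The reported result is taken from the last round. That round usually has the largest b.N, so its measurement is comparatively more accurate.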
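From the user's side, b.Run is how one benchmark function registers several named sub-benchmarks. The following sketch assumes a hypothetical fill function and a hypothetical BenchmarkFill; testing.Benchmark is used only so it runs as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// fill is a hypothetical function under test.
func fill(n int) []int {
	s := make([]int, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// BenchmarkFill registers one sub-benchmark per input size through b.Run.
func BenchmarkFill(b *testing.B) {
	for _, n := range []int{10, 1000, 100000} {
		n := n // capture the loop variable for the closure
		b.Run(fmt.Sprintf("n=%d", n), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				fill(n)
			}
		})
	}
}

func main() {
	testing.Init() // registers the testing flags when run outside "go test"
	fmt.Println(testing.Benchmark(BenchmarkFill))
}
```

Under go test -bench=Fill each size is reported on its own line, with names of the form BenchmarkFill/n=10.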

5.7 testing.B.run1


run1 calls runN with the argument 1, which executes the BenchmarkXxx function with b.N = 1 and measures how long a single iteration takes.

5.8 testing.B.run

func (b *B) run() {
	// Print extra environment information (only once).
	labelsOnce.Do(func() {
		fmt.Fprintf(b.w, "goos: %s\n", runtime.GOOS)
		fmt.Fprintf(b.w, "goarch: %s\n", runtime.GOARCH)
		if b.importPath != "" {
			fmt.Fprintf(b.w, "pkg: %s\n", b.importPath)
		}
		if cpu := sysinfo.CPU.Name(); cpu != "" {
			fmt.Fprintf(b.w, "cpu: %s\n", cpu)
		}
	})
	// With a context (started by go test), processBench takes over: where needed
	// it creates a new B for the benchmark, executes run1 on it, and then calls doBench.
	if b.context != nil {
		// Running go test --test.bench
		b.context.processBench(b) // Must call doBench.
	} else {
		// Running func Benchmark.
		b.doBench()
	}
}

5.9 processBench


processBench executes doBench for the sub-benchmark.

5.10 testing.B.doBench

func (b *B) doBench() BenchmarkResult {
	go b.launch() // run launch in its own goroutine
	<-b.signal    // wait until launch signals that it has finished
	return b.result
}
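The hand-off in doBench is an ordinary goroutine-plus-channel pattern. The sketch below reproduces only that pattern; job, launch, and doJob are hypothetical names. One plausible reason for the separate goroutine is that calls such as FailNow end the current goroutine, and a deferred signal still gets sent in that case.

```go
package main

import "fmt"

// job is a hypothetical stand-in for testing.B, keeping only the
// fields needed to show the hand-off.
type job struct {
	signal chan bool
	result int
}

// launch does the work and signals completion. The signal is deferred,
// so it is sent even if the work returns early.
func (j *job) launch() {
	defer func() { j.signal <- true }()
	for i := 1; i <= 100; i++ {
		j.result += i
	}
}

// doJob mirrors doBench: start launch in a goroutine, then block
// until it signals that it is done.
func (j *job) doJob() int {
	go j.launch()
	<-j.signal
	return j.result
}

func main() {
	j := &job{signal: make(chan bool)}
	fmt.Println(j.doJob()) // 5050
}
```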

5.11 testing.B.launch

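func (b *B) launch() {
	// On exit, signal the waiting doBench that this benchmark has finished.
	defer func() {
		b.signal <- true
	}()
	// If the user specified an iteration count (e.g. -benchtime=100x), run exactly that many.
	if b.benchTime.n > 0 {
		b.runN(b.benchTime.n)
	} else {
		// The target duration; defaults to 1s.
		d := b.benchTime.d
		// Run for at least b.benchTime (1s by default) and at most 1e9 iterations.
		for n := int64(1); !b.failed && b.duration < d && n < 1e9; {
			last := n
			// The target duration in nanoseconds.
			goalns := d.Nanoseconds()
			// Iteration count of the previous round (1 on the first pass, set by run1).
			prevIters := int64(b.N)
			// Total duration of the previous round. run1 always runs before run,
			// so on the first pass this is the time of a single iteration.
			prevns := b.duration.Nanoseconds()
			if prevns <= 0 {
				// Round up, to avoid div by zero.
				prevns = 1
			}
			// prevns / prevIters is the average cost of one iteration, so
			// goalns * prevIters / prevns predicts how many iterations fill the target duration.
			// Multiply before dividing to keep precision.
			n = goalns * prevIters / prevns
			// Overshoot by 20%: n = 1.2n.
			n += n / 5
			// Do not grow too fast: at most 100x the previous count.
			n = min(n, 100*last)
			// Always run at least one more iteration than last time.
			n = max(n, last+1)
			// Never exceed 1e9.
			n = min(n, 1e9)
			// Run this round; the loop condition then decides whether another round is needed.
			b.runN(int(n))
		}
	}
	// Collect the result.
	b.result = BenchmarkResult{b.N, b.duration, b.bytes, b.netAllocs, b.netBytes, b.extra}
}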

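Leaving aside program errors and the user stopping the test, every benchmark runs for at least b.benchTime, which defaults to 1s.
It first runs once to see how long one execution of the user's code takes. If that time is short, b.N has to be large enough for the measurement to be accurate.
If that time is long, b.N has to be small enough, otherwise the benchmark itself takes too long.
With n = goalns * prevIters / prevns, a small prevns means the predicted n is large (although each round may grow by at most 100x).
A large prevns means n starts from a small value. The loop continues until a single round takes longer than the target duration (1s by default).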
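The prediction step can be isolated into a small function to see how n grows from round to round. This is a sketch under assumptions: nextN is a hypothetical name, it assumes the previous n equals the previous b.N (as it does in launch), and the simulation in main pretends that every iteration costs exactly 2596 ns, the figure from the example output earlier.

```go
package main

import "fmt"

// nextN (hypothetical name) predicts the iteration count for the next round.
//
//	goalns:    target duration in nanoseconds
//	prevIters: iterations in the previous round
//	prevns:    total duration of the previous round in nanoseconds
func nextN(goalns, prevIters, prevns int64) int64 {
	last := prevIters // assumption: previous n equals previous b.N
	if prevns <= 0 {
		prevns = 1 // avoid division by zero
	}
	n := goalns * prevIters / prevns // multiply first to keep precision
	n += n / 5                       // overshoot by 20%
	if n > 100*last {                // grow at most 100x per round
		n = 100 * last
	}
	if n < last+1 { // always run at least one more iteration
		n = last + 1
	}
	if n > 1e9 { // hard upper bound
		n = 1e9
	}
	return n
}

func main() {
	const nsPerOp = 2596 // assumed constant cost per iteration
	goal := int64(1e9)   // 1s target
	n := int64(1)        // run1 has already executed one iteration
	for n*nsPerOp < goal {
		n = nextN(goal, n, n*nsPerOp)
		fmt.Println(n)
	}
	// prints 100, 10000, 462249 (one per line)
}
```

Real runs will not land on exactly these values, because the measured time per iteration varies between rounds.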

5.12 testing.B.SetBytes

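This function sets the number of bytes processed in a single iteration. Once it is set, the output report will include a throughput figure in the form xx MB/s.
It expresses how fast the function under test processes bytes. Only the user knows how many bytes the function handles per iteration, so the user has to set it.
For example: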

func MakeWithout(n int) []int {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

func MakeWith(n int) []int {
	s := make([]int, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

func BenchmarkMakeWithout(b *testing.B) {
	b.SetBytes(1024)
	for i := 0; i < b.N; i++ {
		MakeWithout(1000)
	}
}

func BenchmarkMakeWith(b *testing.B) {
	b.SetBytes(1024)
	for i := 0; i < b.N; i++ {
		MakeWith(1000)
	}
}

Result: each benchmark line in the output now also reports throughput in MB/s.
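The MB/s figure is only meaningful when the value passed to SetBytes matches the work actually done. The sketch below is a hypothetical example, BenchmarkCopy, where the byte count is known exactly; testing.Benchmark is used only so it runs as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// BenchmarkCopy copies a 1 MiB buffer per iteration and reports that
// size through SetBytes, which enables the MB/s column.
func BenchmarkCopy(b *testing.B) {
	src := make([]byte, 1<<20)
	dst := make([]byte, len(src))
	b.SetBytes(int64(len(src))) // bytes actually processed per iteration
	b.ResetTimer()              // exclude the buffer allocation above
	for i := 0; i < b.N; i++ {
		copy(dst, src)
	}
}

// runCopy executes the benchmark outside "go test".
func runCopy() testing.BenchmarkResult {
	testing.Init() // registers the testing flags; has no effect if already called
	return testing.Benchmark(BenchmarkCopy)
}

func main() {
	r := runCopy()
	fmt.Println(r.Bytes) // 1048576
	fmt.Println(r)       // iterations, ns/op and MB/s
}
```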

6. Statistics

When timing starts, the current memory counters are recorded.

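They are stored in testing.B.startAllocs and testing.B.startBytes. When timing stops, the starting values are subtracted from the final values
to get the net increase, which is stored in testing.B.netAllocs and testing.B.netBytes.
After each benchmark finishes, the result is saved in a BenchmarkResult: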

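type BenchmarkResult struct {
	N         int           // number of times the user code was executed
	T         time.Duration // total time taken
	Bytes     int64         // bytes processed per iteration, the value set by SetBytes
	MemAllocs uint64        // net increase in allocated objects
	MemBytes  uint64        // net increase in allocated bytes
	// Extra information.
	Extra map[string]float64
}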

For the final statistics, dividing the net increase by N gives the memory added per iteration.
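testing.BenchmarkResult exposes these per-iteration figures through its NsPerOp, AllocsPerOp, and AllocedBytesPerOp methods. The sketch below feeds them made-up numbers purely for illustration; sample and perOp are hypothetical helpers.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// sample returns a result filled with made-up numbers for illustration.
func sample() testing.BenchmarkResult {
	return testing.BenchmarkResult{
		N:         1000,            // iterations
		T:         2 * time.Second, // total time
		MemAllocs: 5000,            // net allocated objects
		MemBytes:  64000,           // net allocated bytes
	}
}

// perOp returns ns/op, allocs/op and B/op for a result.
func perOp(r testing.BenchmarkResult) (ns, allocs, bytes int64) {
	return r.NsPerOp(), r.AllocsPerOp(), r.AllocedBytesPerOp()
}

func main() {
	fmt.Println(perOp(sample())) // 2000000 5 64
}
```

Each value is the corresponding total divided by N: 2 s / 1000, 5000 / 1000, and 64000 / 1000.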