Background
This article is based on StarRocks 3.3.5.
Each StarRocks BE is provisioned with 16 CUs and 32 GB of memory.
While one of our Flink YAML CDC jobs was writing data to StarRocks, we suddenly ran into a "primary key memory usage exceeds the limit" error. The details are as follows:
java.lang.RuntimeException: com.starrocks.data.load.stream.exception.StreamLoadFailException: Transaction prepare failed, db: xxx, table: xxxx, label: flink-960f94fc-6fb1-43e4-aaa1-4c3938241ffa,
responseBody: {"Status": "MEM_LIMIT_EXCEEDED","Message": "primary key memory usage exceeds the limit. tablet_id: 479203, consumption: 15928614825, limit: 15790082457. Memory stats of top five tablets: 4258582(314M)4258578(272M)4258340(230M)2957546(190M)2957590(190M): be:xxx.xxx.xxx.xxx"}
errorLog: null
    at com.starrocks.data.load.stream.v2.StreamLoadManagerV2.AssertNotException(StreamLoadManagerV2.java:427)
    at com.starrocks.data.load.stream.v2.StreamLoadManagerV2.write(StreamLoadManagerV2.java:252)
    at com.starrocks.connector.flink.table.sink.v2.StarRocksWriter.write(StarRocksWriter.java:143)
    at org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.processElement(SinkWriterOperator.java:182)
    at org.apache.flink.cdc.runtime.operators.sink.DataSinkWriterOperator.processElement(DataSinkWriterOperator.java:178)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:75)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
    at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:38)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:245)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:217)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:169)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:68)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:616)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:1071)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:1020)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:959)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:938)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:567)
    at java.lang.Thread.run(Thread.java:879)
Caused by: com.starrocks.data.load.stream.exception.StreamLoadFailException: Transaction prepare failed, db: xxx, table: xxx, label: flink-960f94fc-6fb1-43e4-aaa1-4c3938241ffa,
responseBody: {"Status": "MEM_LIMIT_EXCEEDED","Message": "primary key memory usage exceeds the limit. tablet_id: 479203, consumption: 15928614825, limit: 15790082457. Memory stats of top five tablets: 4258582(314M)4258578(272M)4258340(230M)2957546(190M)2957590(190M): be:xxx.xxx.xxx.xxx"}
errorLog: null
    at com.starrocks.data.load.stream.TransactionStreamLoader.prepare(TransactionStreamLoader.java:221)
    at com.starrocks.data.load.stream.v2.TransactionTableRegion.commit(TransactionTableRegion.java:247)
    at com.starrocks.data.load.stream.v2.StreamLoadManagerV2.lambda$init$0(StreamLoadManagerV2.java:210)
    ... 1 more
On top of that, our business scenario routinely updates historical data, and we run many jobs of this kind. The tables involved are Primary Key tables.
Analysis
The error above is actually raised by the BE. Every time data is updated, StarRocks loads the primary key index of each affected tablet into memory, which is why our BEs' memory footprint grew so large: in the message above, consumption is 15928614825 bytes (~14.8 GiB), just over the limit of 15790082457 bytes (~14.7 GiB). Digging in, I found that our partitions are monthly and the bucket count is only 2, so every write pulls the primary key index for a whole month of data into memory, and BE memory usage keeps climbing. The relevant clauses of the table DDL:
PARTITION BY date_trunc("month",created_time)
DISTRIBUTED BY HASH(xxx) BUCKETS 2
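For context, here is a minimal sketch of such a table definition. The table and column names (orders, id, amount) are hypothetical; only the partitioning and bucketing clauses mirror our real table:

CREATE TABLE orders (
    id BIGINT NOT NULL,
    created_time DATETIME NOT NULL,
    amount DECIMAL(18, 2)
)
PRIMARY KEY (id, created_time)
PARTITION BY date_trunc("month", created_time)
-- only 2 tablets per monthly partition, so each tablet's
-- primary key index covers half a month of rows
DISTRIBUTED BY HASH(id) BUCKETS 2;

With a layout like this, any update that touches a month of history has to materialize that month's entire primary key index in BE memory.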
So we increased the bucket count:
ALTER TABLE xxxx DISTRIBUTED BY HASH(xx) BUCKETS 50;
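To confirm the new bucket count actually took effect, something like the following can be used (SHOW PARTITIONS prints a Buckets column, and SHOW CREATE TABLE shows the current DISTRIBUTED BY clause):

-- each partition should now report 50 buckets
SHOW PARTITIONS FROM xxxx;
-- the generated DDL should now contain BUCKETS 50
SHOW CREATE TABLE xxxx;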
After the change, we compared the BE memory usage before and after: it dropped by about 5 GB.
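As a rough way to watch this from SQL rather than screenshots (assuming a recent 3.x release, where SHOW BACKENDS exposes per-BE memory usage; the exact columns vary by version):

-- check the MemUsedPct column for each BE
SHOW BACKENDS;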
Miscellaneous
You can check how much memory the primary key indexes occupy with the following commands against the BE HTTP port (8040 by default):

curl -XGET -s http://BE:8040/metrics | grep "update_mem_bytes"
curl -XGET -s http://BE:8040/metrics | grep 'update_primary_index_bytes_total'
For details on these and related metrics, see the StarRocks metrics documentation.