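Basic Concepts
With so many microservices in a cloud-native system, a diagnostic tool for tracking down problems is essential.
Arthas is Alibaba's open-source Java diagnostic tool, and it is popular with developers. It lets you troubleshoot a live application without restarting it, trace Java code dynamically, and monitor JVM state in real time. Arthas supports JDK 6+ on Linux, Mac, and Windows. It uses an interactive command line with extensive Tab completion, which makes locating and diagnosing problems more convenient.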

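Officially described as a Java application diagnostic tool, the project had collected 30.2K stars on GitHub at the time of writing.
It can show threads, memory, GC, and runtime state; inspect arguments, return values, and exceptions; locate an application's hot spots; and generate flame graphs, all of which help you resolve difficult problems faster. This article covers the commonly used commands.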
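Common Commands
Start arthas-demo (the sample program)
Run the following commands to download arthas-demo.jar and start the sample program with java -jar:
wget https://arthas.aliyun.com/arthas-demo.jar
java -jar arthas-demo.jar
Start arthas-boot (the diagnostic tool)
Run the following commands to download arthas-boot.jar and start it with java -jar:
wget https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar
arthas-boot is the Arthas launcher. Once started, it lists all Java processes so you can choose the target process to diagnose.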

Select the Java process to diagnose. Here I enter 1 and press Enter.
Once the attach succeeds, Arthas prints its logo.

Enter help to see help for the Arthas commands.
[arthas@1266]$ help
 NAME         DESCRIPTION
 help         Display Arthas Help
 auth         Authenticates the current session
 keymap       Display all the available keymap for the specified connection.
 sc           Search all the classes loaded by JVM
 sm           Search the method of classes loaded by JVM
 classloader  Show classloader info
 jad          Decompile class
 getstatic    Show the static field of a class
 monitor      Monitor method execution statistics, e.g. total/success/failure count, average rt, fail rate, etc.
 stack        Display the stack trace for the specified class and method
 thread       Display thread info, thread stack
 trace        Trace the execution time of specified method invocation.
 watch        Display the input/output parameter, return object, and thrown exception of specified method invocation
 tt           Time Tunnel
 jvm          Display the target JVM information
 memory       Display jvm memory info.
 perfcounter  Display the perf counter information.
 ognl         Execute ognl expression.
 mc           Memory compiler, compiles java files into bytecode and class files in memory.
 redefine     Redefine classes. @see Instrumentation#redefineClasses(ClassDefinition...)
 retransform  Retransform classes. @see Instrumentation#retransformClasses(Class...)
 dashboard    Overview of target jvm's thread, memory, gc, vm, tomcat info.
 dump         Dump class byte array from JVM
 heapdump     Heap dump
 options      View and change various Arthas options
 cls          Clear the screen
 reset        Reset all the enhanced classes
 version      Display Arthas version
 session      Display current session information
 sysprop      Display, and change the system properties.
 sysenv       Display the system env.
 vmoption     Display, and update the vm diagnostic options.
 logger       Print logger info, and update the logger level
 history      Display command history
 cat          Concatenate and print files
 base64       Encode and decode using Base64 representation
 echo         write arguments to the standard output
 pwd          Return working directory name
 mbean        Display the mbean information
 grep         grep command for pipes.
 tee          tee command for pipes.
 profiler     Async Profiler. https://github.com/jvm-profiling-tools/async-profiler
 vmtool       jvm tool
 stop         Stop/Shutdown Arthas server and exit the console.
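Commands that follow the same rules as their Linux counterparts, such as history, cat, echo, pwd, and grep, are not covered here.
Real-time system dashboard: the dashboard command

The dashboard command shows a real-time data panel for the current system, including CPU, memory, GC, and runtime environment information.
Enter q or press Ctrl+C to exit the dashboard command.
Print a thread's stack by ID: thread
thread 1 prints the stack of the thread whose ID is 1. Use thread 1 | grep 'main(' to find the main class.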
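To get a feel for the kind of data behind thread <id>, here is a minimal sketch that looks up one thread by ID through the JDK's standard ThreadMXBean and formats its stack. The class name ThreadStackSketch and the output layout are my own assumptions for illustration; this is not Arthas's implementation, and it inspects its own JVM rather than attaching to another process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of Arthas.
class ThreadStackSketch {

    /** Returns a printable stack for the given thread ID, or null if no live thread has that ID. */
    static String describe(long threadId) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Integer.MAX_VALUE requests the full stack depth.
        ThreadInfo info = mx.getThreadInfo(threadId, Integer.MAX_VALUE);
        if (info == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(info.getThreadName()).append('"')
          .append(" Id=").append(info.getThreadId())
          .append(' ').append(info.getThreadState()).append('\n');
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Look up the current (main) thread by its ID, then print its stack.
        System.out.print(describe(Thread.currentThread().getId()));
    }
}
```

The bottom frame of the main thread's stack is the main( method of the entry class, which is what the grep 'main(' trick above relies on.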

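Find classes loaded in the JVM: sc/sm
The sc command searches the classes loaded in the JVM. With the -d option it prints detailed class-loading information, which is handy for tracking down class-loading problems.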
[arthas@1266]$ sc -d *MathGame
 class-info        demo.MathGame
 code-source       /home/shell/arthas-demo.jar
 name              demo.MathGame
 isInterface       false
 isAnnotation      false
 isEnum            false
 isAnonymousClass  false
 isArray           false
 isLocalClass      false
 isMemberClass     false
 isPrimitive       false
 isSynthetic       false
 simple-name       MathGame
 modifier          public
 annotation
 interfaces
 super-class       +-java.lang.Object
 class-loader      +-sun.misc.Launcher$AppClassLoader@1b6d3586
                     +-sun.misc.Launcher$ExtClassLoader@107df6e5
 classLoaderHash   1b6d3586

Affect(row-cnt:1) cost in 50 ms.
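sc supports wildcards, for example to search for every StringUtils:
sc *StringUtils
Find the ClassLoader of UserController:
[arthas@1266]$ sc -d com.example.demo.arthas.user.UserController | grep classLoaderHash
 classLoaderHash   19469ea2
The sm command searches for the methods of a class. For example:
sm java.math.RoundingMode
The -d option prints the details of each method:
sm -d java.math.RoundingMode
Search for a specific method, for example the constructor:
sm java.math.RoundingMode <init>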
Decompile code: the jad command
jad demo.MathGame

The --source-only option prints only the decompiled source code:
jad --source-only com.example.demo.arthas.user.UserController
Execute code dynamically: the ognl command
Arthas has a dedicated ognl command that can execute code dynamically. Pretty slick 😯😯😯
Call a static method
ognl '@java.lang.System@out.println("hello ognl")'
Read a static field of a class
Get the logger field of the UserController class:
ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader @com.example.demo.arthas.user.UserController@logger
The -x option controls how many levels of the result are expanded. For example:
ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader -x 2 @com.example.demo.arthas.user.UserController@logger
Evaluate a multi-part expression: assign temporary variables and return a List
ognl '#value1=@System@getProperty("java.home"), #value2=@System@getProperty("java.runtime.name"), {#value1, #value2}'
- For special OGNL usage, see: https://github.com/alibaba/arthas/issues/71
- Official OGNL language guide: https://commons.apache.org/proper/commons-ognl/language-guide.html
View a method's arguments, return value, and exceptions: the watch command
watch demo.MathGame primeFactors returnObj

View JVM information: sysprop, sysenv, jvm, dashboard
sysprop
- sysprop prints all System Properties.
- Print a single key: sysprop user.dir
- Filter with grep: sysprop | grep user
- Set a new value: sysprop testKey testValue
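For comparison, the same four operations can be written in plain Java against the standard System properties API. This is only a sketch of what the commands read and write, run inside its own JVM; the class name SysPropSketch and the filter helper are assumptions of mine, not anything Arthas exposes.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Arthas.
class SysPropSketch {

    /** Returns the properties whose key or value contains the fragment, sorted by key. */
    static Map<String, String> filter(String fragment) {
        Map<String, String> out = new TreeMap<>();
        for (String key : System.getProperties().stringPropertyNames()) {
            String value = System.getProperty(key);
            if (key.contains(fragment) || (value != null && value.contains(fragment))) {
                out.put(key, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Roughly: sysprop user.dir
        System.out.println("user.dir=" + System.getProperty("user.dir"));
        // Roughly: sysprop | grep user
        filter("user").forEach((k, v) -> System.out.println(k + "=" + v));
        // Roughly: sysprop testKey testValue
        System.setProperty("testKey", "testValue");
        System.out.println("testKey=" + System.getProperty("testKey"));
    }
}
```

The practical difference is that sysprop does this inside the target process, so a value set there is seen by the running application.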
sysenv
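The sysenv command prints the environment variables. It works much like sysprop.
jvm
The jvm command prints detailed information about the JVM.
dashboard
The dashboard command shows a real-time data panel for the current system.
Reset enhanced classes: the reset command
When you run commands such as watch or trace, Arthas actually modifies the application's bytecode and inserts enhancement code. The reset command restores every class that Arthas has enhanced, removing that code. Arthas also resets all enhanced classes when its server shuts down.
Restore a specific class:
reset demo.MathGame
Restore all enhanced classes:
reset
View current session information: session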

The tee command
Like the traditional tee command, it reads data from standard input, writes it to standard output, and saves it to a file at the same time.
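The tee idea itself is small enough to sketch in Java: one stream that forwards every byte to two sinks. The names TeeSketch and TeeOutputStream are hypothetical, and both sinks are in-memory buffers here (standing in for the console and the file) so the example has no side effects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Arthas.
class TeeSketch {

    /** An OutputStream that forwards every byte to two underlying streams. */
    static final class TeeOutputStream extends OutputStream {
        private final OutputStream first;
        private final OutputStream second;

        TeeOutputStream(OutputStream first, OutputStream second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public void write(int b) throws IOException {
            first.write(b);
            second.write(b);
        }

        @Override
        public void flush() throws IOException {
            first.flush();
            second.flush();
        }
    }

    /** Writes text through a tee and returns what each sink received: [console, file]. */
    static String[] demo(String text) {
        ByteArrayOutputStream console = new ByteArrayOutputStream(); // stands in for standard output
        ByteArrayOutputStream file = new ByteArrayOutputStream();    // stands in for the saved file
        try (OutputStream tee = new TeeOutputStream(console, file)) {
            tee.write(text.getBytes(StandardCharsets.UTF_8));
            tee.flush();
            return new String[] { console.toString("UTF-8"), file.toString("UTF-8") };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] copies = demo("sysprop output\n");
        System.out.print("console copy: " + copies[0]);
        System.out.print("file copy:    " + copies[1]);
    }
}
```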
Check the current Arthas version: version
[arthas@1710]$ version
3.6.2
Exit Arthas
Enter exit or quit to leave the current Arthas session. Run stop to shut Arthas down completely.
📌 PS: every command accepts the -h option to show its help.
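Hands-on Examples
Troubleshooting a failing method call
A curl request to the endpoint only shows that an exception was returned; the actual request parameters and stack trace are not visible.
shell@Alicloud:~$ curl http://localhost:61000/user/0
{"timestamp":1655435063042,"status":500,"error":"Internal Server Error","exception":"java.lang.IllegalArgumentException","message":"id < 1","path":"/user/0"}
Inspect UserController's arguments and exceptions
Run in Arthas:
watch com.example.demo.arthas.user.UserController * '{params, throwExp}'
- The first argument is the class name; wildcards are supported.
- The second argument is the method name; wildcards are supported.
Call curl http://localhost:61000/user/0 again, and watch prints the arguments of the call and the exception, so the details are now visible in Arthas.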

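To expand the captured result, use the -x option:
watch com.example.demo.arthas.user.UserController * '{params, throwExp}' -x 2
Return-value expression
In the example above, the third argument is the return-value expression. It is actually an OGNL expression, and it supports these built-in objects: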
- loader
- clazz
- method
- target
- params
- returnObj
- throwExp
- isBefore
- isThrow
- isReturn
For example, return an array:
watch com.example.demo.arthas.user.UserController * '{params[0], target, returnObj}'
Condition expression
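watch accepts a condition expression as its fourth argument, for example:
watch com.example.demo.arthas.user.UserController * returnObj 'params[0] > 100'
- When you request user/1, watch prints nothing.
- When you request user/101, watch prints the result.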

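Capture only on exceptions
watch supports the -e option, which captures only invocations that throw an exception:
watch com.example.demo.arthas.user.UserController * "{params[0],throwExp}" -e
Filter by elapsed time
watch can also filter by how long a request took (#cost is in milliseconds), for example:
watch com.example.demo.arthas.user.UserController * '{params, returnObj}' '#cost>200'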
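The cost filter boils down to "time the call, then report only when it was slow". Below is a minimal sketch of that idea in plain Java. WatchCostSketch, observe, and slow are hypothetical names, and the wrapper is applied by hand here, whereas Arthas injects the equivalent timing code into the target method.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Arthas.
class WatchCostSketch {

    /**
     * Runs the call and measures its cost in milliseconds. Returns a report line
     * only when the cost exceeds thresholdMs; otherwise returns null (filtered out).
     */
    static <T> String observe(Supplier<T> call, double thresholdMs) {
        long start = System.nanoTime();
        T returnObj = call.get();
        double costMs = (System.nanoTime() - start) / 1_000_000.0;
        if (costMs > thresholdMs) {
            return String.format("cost=%.1fms returnObj=%s", costMs, returnObj);
        }
        return null;
    }

    /** A deliberately slow call (about 300 ms) used to trigger the filter. */
    static String slow() {
        try {
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "slow";
    }

    public static void main(String[] args) {
        // Fast call: below the 200 ms threshold, so nothing is reported.
        System.out.println(observe(() -> "fast", 200));
        // Slow call: above the threshold, so a report line is produced.
        System.out.println(observe(WatchCostSketch::slow, 200));
    }
}
```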
Hot-updating code
This one is seriously slick too.
A request to http://localhost:61000/user/0 returns a 500 error:
shell@Alicloud:~$ curl http://localhost:61000/user/0
{"timestamp":1655436218020,"status":500,"error":"Internal Server Error","exception":"java.lang.IllegalArgumentException","message":"id < 1","path":"/user/0"}
We can change this logic by hot-updating the code.
Decompile UserController with jad
jad --source-only com.example.demo.arthas.user.UserController > /tmp/UserController.java
The decompiled output is saved to /tmp/UserController.java.
Open another terminal window and edit /tmp/UserController.java with vim:
vim /tmp/UserController.java
For example, return normally instead of throwing when the user id is less than 1:
@GetMapping(value={"/user/{id}"})
public User findUserById(@PathVariable Integer id) {
    logger.info("id: {}", (Object)id);
    if (id != null && id < 1) {
        return new User(id, "name" + id);
        // throw new IllegalArgumentException("id < 1");
    }
    return new User(id.intValue(), "name" + id);
}
Find the ClassLoader that loaded UserController with sc
[arthas@1266]$ sc -d *UserController | grep classLoaderHash
 classLoaderHash   19469ea2
The classLoaderHash is 19469ea2; we need it later.
mc
After saving /tmp/UserController.java, compile it with the mc (Memory Compiler) command, specifying the ClassLoader with -c or --classLoaderClass:
mc --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader /tmp/UserController.java -d /tmp
[arthas@1266]$ mc --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader /tmp/UserController.java -d /tmp
Memory compiler output:
/tmp/com/example/demo/arthas/user/UserController.class
Affect(row-cnt:1) cost in 2879 ms.
Alternatively, pass the classLoaderHash with the -c option:
mc -c 19469ea2 /tmp/UserController.java -d /tmp
redefine
Then use the redefine command to reload the newly compiled UserController.class:
[arthas@1266]$ redefine /tmp/com/example/demo/arthas/user/UserController.class
redefine success, size: 1, classes:
com.example.demo.arthas.user.UserController
Result of the hot fix
After redefine succeeds, request user/0 again and the result is normal:
shell@Alicloud:~$ curl http://localhost:61000/user/0
{"id":0,"name":"name0"}
Dynamically update the application's logger level
Find the ClassLoader of UserController
[arthas@1266]$ sc -d *UserController | grep classLoaderHash
 classLoaderHash   19469ea2
Get the logger with ognl
ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader '@com.example.demo.arthas.user.UserController@logger'
[arthas@1266]$ ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader '@com.example.demo.arthas.user.UserController@logger'
@Logger[serialVersionUID=@Long[5454405123156820674],FQCN=@String[ch.qos.logback.classic.Logger],name=@String[com.example.demo.arthas.user.UserController],level=null,effectiveLevelInt=@Integer[20000],parent=@Logger[Logger[com.example.demo.arthas.user]],childrenList=null,aai=null,additive=@Boolean[true],loggerContext=@LoggerContext[ch.qos.logback.classic.LoggerContext[default]],
]
This shows that UserController@logger is actually backed by logback. Because level=null, the effective level is inherited from the root logger.
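The "level is null, so the effective level comes from an ancestor" rule can be tried out without logback. The sketch below uses java.util.logging from the JDK as a stand-in, purely because it needs no dependencies; logback's classes and API differ, and the logger names and the effectiveLevel helper are hypothetical. In the logback output above, effectiveLevelInt=20000 is logback's INFO.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration; uses java.util.logging, not logback.
class LoggerLevelSketch {

    /** Walks up the parent chain until a logger with an explicitly set level is found. */
    static Level effectiveLevel(Logger logger) {
        for (Logger l = logger; l != null; l = l.getParent()) {
            if (l.getLevel() != null) {
                return l.getLevel();
            }
        }
        return null;
    }

    /** Returns [child's own level, inherited effective level, effective level after an override]. */
    static Level[] demo() {
        Logger parent = Logger.getLogger("sketch.demo");
        Logger child = Logger.getLogger("sketch.demo.UserController");
        parent.setLevel(Level.WARNING);
        child.setLevel(null);                     // null means "inherit from the nearest ancestor"
        Level own = child.getLevel();             // still null
        Level inherited = effectiveLevel(child);  // taken from the parent
        child.setLevel(Level.FINE);               // explicit override on the child only
        Level overridden = effectiveLevel(child);
        return new Level[] { own, inherited, overridden };
    }

    public static void main(String[] args) {
        Level[] levels = demo();
        System.out.println("own=" + levels[0] + " inherited=" + levels[1] + " overridden=" + levels[2]);
    }
}
```

Setting a level on one logger, as in the next step, overrides the inherited value for that logger only.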
Set the logger level for UserController only
ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader '@com.example.demo.arthas.user.UserController@logger.setLevel(@ch.qos.logback.classic.Level@DEBUG)'
Fetch UserController@logger again and you will see that its level is now DEBUG.
Change logback's global logger level
By getting the root logger, you can change the global logger level:
ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader '@org.slf4j.LoggerFactory@getLogger("root").setLevel(@ch.qos.logback.classic.Level@DEBUG)'
Get the Spring context, then get a bean, then call a method
Use the tt command to get the Spring context
tt stands for TimeTunnel. It records the arguments and return information of every invocation of a specified method, and lets you inspect those invocations made at different times.
Official tt documentation: https://arthas.aliyun.com/doc/tt.html
tt -t org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter invokeHandlerMethod
Request user/1:
curl http://localhost:61000/user/1
The tt command captures one request:

Enter q or press Ctrl+C to exit the tt -t command above.
Use the tt command to get the Spring context from the recorded invocation
tt -i 1000 -w 'target.getApplicationContext()'
Get a Spring bean and call a method on it
tt -i 1000 -w 'target.getApplicationContext().getBean("helloWorldService").getHelloMessage()'
The result is as follows:
[arthas@1266]$ tt -i 1000 -w 'target.getApplicationContext().getBean("helloWorldService").getHelloMessage()'
@String[Hello World]
Affect(row-cnt:1) cost in 1 ms.
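The tt workflow above has two halves: record an invocation under an index, then evaluate an expression against the recorded target later. Here is a toy model of that in plain Java. Every name (TimeTunnelSketch, Invocation, capture, watch) is hypothetical, the starting index of 1000 simply mirrors the walkthrough, and real tt records calls by instrumenting the method rather than by being called explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical toy model for illustration; not how Arthas implements tt.
class TimeTunnelSketch {

    /** One recorded invocation: the object it was called on, its argument, and its result. */
    static final class Invocation {
        final Object target;
        final Object param;
        final Object returnObj;

        Invocation(Object target, Object param, Object returnObj) {
            this.target = target;
            this.param = param;
            this.returnObj = returnObj;
        }
    }

    private final Map<Integer, Invocation> records = new LinkedHashMap<>();
    private int nextIndex = 1000; // mirrors the first index used in the walkthrough

    /** Invokes fn with param, stores the call under a fresh index, and returns that index. */
    <A, R> int capture(Object target, Function<A, R> fn, A param) {
        R result = fn.apply(param);
        int index = nextIndex++;
        records.put(index, new Invocation(target, param, result));
        return index;
    }

    /** Evaluates an expression against a stored record, in the spirit of tt -i <index> -w <expr>. */
    <T> T watch(int index, Function<Invocation, T> expression) {
        Invocation inv = records.get(index);
        if (inv == null) {
            throw new IllegalArgumentException("no record with index " + index);
        }
        return expression.apply(inv);
    }

    public static void main(String[] args) {
        TimeTunnelSketch tt = new TimeTunnelSketch();
        int index = tt.capture("handlerAdapter", (String s) -> s.toUpperCase(), "hello");
        // Later, long after the call finished, its target and result can still be inspected.
        System.out.println(index + " target=" + tt.watch(index, inv -> inv.target)
                + " returnObj=" + tt.watch(index, inv -> inv.returnObj));
    }
}
```

Keeping a reference to target is what makes the trick work: once the RequestMappingHandlerAdapter instance is recorded, anything reachable from it, including the Spring context, can be queried.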
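Troubleshooting an HTTP request that returns 401
When a request lacks permission, the endpoint usually returns 401 Unauthorized.
A 401 usually means a permission-checking Filter intercepted the request. So which Filter actually handled the request and returned 401?
Trace all Filter methods
Start the trace:
trace javax.servlet.Filter *
At the deepest level of the call tree, you can see that AdminFilterConfig$AdminFilter returned the 401:
+---[3.806273ms] javax.servlet.FilterChain:doFilter()
|   `---[3.447472ms] com.example.demo.arthas.AdminFilterConfig$AdminFilter:doFilter()
|       `---[0.17259ms] javax.servlet.http.HttpServletResponse:sendError()
Get the call stack with stack
The trace result above tells us that HttpServletResponse:sendError() is what sends the 401, so tracking that method with stack also reveals which Filter returned it.
Run:
stack javax.servlet.http.HttpServletResponse sendError 'params[0]==401'
Send the request again and the call stack is printed: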
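What stack does with the 'params[0]==401' condition can be imitated in a few lines of Java: capture the caller chain only when the argument matches. This is a sketch under my own assumptions; sendError and adminFilter are stand-ins for the servlet method and the filter from the example, not the real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not the servlet API and not Arthas.
class StackOnConditionSketch {

    /** Frames captured so far, as "ClassName.methodName" strings. */
    static final List<String> captured = new ArrayList<>();

    /** Stand-in for HttpServletResponse.sendError: records the call stack only for status 401. */
    static void sendError(int status) {
        if (status == 401) { // same idea as the condition 'params[0]==401'
            for (StackTraceElement frame : new Throwable().getStackTrace()) {
                captured.add(frame.getClassName() + "." + frame.getMethodName());
            }
        }
    }

    /** Stand-in for the filter that rejects unauthorised requests. */
    static void adminFilter(boolean authorised) {
        if (!authorised) {
            sendError(401);
        }
    }

    public static void main(String[] args) {
        sendError(500);      // condition not met: nothing is captured
        adminFilter(false);  // condition met: the chain through adminFilter is captured
        captured.forEach(frame -> System.out.println("    @" + frame));
    }
}
```

Reading the captured frames from the top down leads straight to the caller responsible, which is how the stack output points at the Filter.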

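Find the top N threads
List all threads
thread
View the stack of a specific thread
View the stack of the thread with ID 2:
thread 2
View the stacks of the top N threads by CPU usage
thread -n 3
View the stacks of the top N threads by CPU usage over 5 seconds
thread -n 3 -i 5000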
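Ranking threads by CPU over an interval can be approximated with two samples of per-thread CPU time taken through the JDK's ThreadMXBean. The sketch below is an assumption about the general approach, not a description of Arthas internals; TopThreadsSketch and topByCpu are hypothetical names, and it only sees threads of its own JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Arthas.
class TopThreadsSketch {

    /**
     * Samples per-thread CPU time twice, intervalMs apart, and returns the IDs of up to
     * n threads that used the most CPU in between, busiest first. Returns an empty list
     * when the JVM does not support or has disabled thread CPU time measurement.
     */
    static List<Long> topByCpu(int n, long intervalMs) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return Collections.emptyList();
        }
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // nanoseconds, or -1 if the thread is gone
        }
        try {
            Thread.sleep(intervalMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Map<Long, Long> delta = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long now = mx.getThreadCpuTime(id);
            Long then = before.get(id);
            if (now >= 0 && then != null && then >= 0) {
                delta.put(id, now - then);
            }
        }
        List<Long> ids = new ArrayList<>(delta.keySet());
        ids.sort((a, b) -> Long.compare(delta.get(b), delta.get(a)));
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }

    public static void main(String[] args) {
        // A busy daemon thread so that something clearly stands out in the ranking.
        Thread spinner = new Thread(() -> { while (true) { Math.sqrt(Math.random()); } }, "spinner");
        spinner.setDaemon(true);
        spinner.start();
        System.out.println("spinner id=" + spinner.getId() + ", top 3 by CPU: " + topByCpu(3, 500));
    }
}
```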
Check whether any thread is blocking others
thread -b
For more usage, see:
- GitHub: https://github.com/alibaba/arthas
- Documentation: https://arthas.aliyun.com/doc/