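GCC's optimization options let you trade off three things against one another: compile time, object file size, and execution speed.
Common gcc compile options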
arm-linux-gnueabihf-g++ -O3 -march=armv7-a -mcpu=cortex-a9 -ftree-vectorize -mfpu=neon -mfpu=vfpv3-fp16 -mfloat-abi=hard -ffast-math
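-c Compile only and produce an object file; do not link.
-E Run only the C preprocessor.
-g Generate debugging information that the GNU debugger (gdb) can use.
-Os Optimize for size; informally described as "-O2.5".
-o FILE Write the output to FILE. Typically used when producing an executable.
-O0 Do not optimize.
-O or -O1 Optimize the generated code.
-O2 Optimize further.
-O3 Optimize further than -O2, including inlining of functions.
-shared Produce a shared object. Typically used when building a shared library.
-W Enable extra warnings beyond those of -Wall (this is the older name for -Wextra).
-w Suppress all warning messages.
-Wall Enable most commonly useful warnings (despite the name, not every warning gcc can emit).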
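As a rough illustration of how several of these options are used together, here is a minimal sketch. The file name flags_demo.c, the macro BUFFER_SIZE, and the function buffer_bytes are made up for this example, and -fPIC is not in the list above; it is added because shared libraries usually need position-independent code.

```c
/* flags_demo.c (hypothetical file name, used only for this illustration)
 *
 *   gcc -E flags_demo.c
 *       preprocess only; in the output, BUFFER_SIZE is replaced by 64
 *   gcc -c flags_demo.c
 *       compile to flags_demo.o without linking
 *   gcc -g -Wall -c flags_demo.c
 *       same, with debug information and the common warnings enabled
 *   gcc -shared -fPIC -o libflags_demo.so flags_demo.c
 *       build a shared library (-fPIC is an addition, see the note above)
 */
#define BUFFER_SIZE 64

/* Returns the number of bytes needed for `count` buffers. */
int buffer_bytes(int count)
{
    return count * BUFFER_SIZE;
}
```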
Optimization levels: -O0, -O, -O2, -O3
These options control various sorts of optimizations.
Without any optimization option, the compiler’s goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you would expect from the source code.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
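One way to see this tradeoff is to compile a small file at two levels and compare the assembly. The sketch below assumes a hypothetical file opt_demo.c; the function names are invented for illustration. At -O0 the locals total and i normally live in memory and each statement is translated on its own, which is what lets a debugger change them between statements. At higher levels the compiler may keep them in registers, or may even reduce sum_first_ten to a constant, though the exact result depends on the GCC version and target.

```c
/* opt_demo.c (hypothetical file name)
 *
 * Compare the generated assembly:
 *   gcc -O0 -S opt_demo.c -o opt_demo_O0.s
 *   gcc -O2 -S opt_demo.c -o opt_demo_O2.s
 */

/* Sums the integers 1..n with a plain loop. */
int sum_first_n(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* The argument is a compile-time constant, so an optimizing build
 * may fold the whole call; an unoptimized build performs the call. */
int sum_first_ten(void)
{
    return sum_first_n(10);
}
```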
The compiler performs optimization based on the knowledge it has of the program. Using the -funit-at-a-time flag will allow the compiler to consider information gained from later functions in the file when compiling a function. Compiling multiple files at once to a single output file (and using -funit-at-a-time) will allow the compiler to use information gained from all of the files when compiling each of them.
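The paragraph above can be pictured with a caller that appears before the function it calls. This is only a sketch with invented names (square, sum_of_squares). When the compiler looks at the whole file before generating code, it already knows the body of square while compiling sum_of_squares and may choose to inline it. As far as I know, recent GCC releases always work this way and accept -funit-at-a-time only for compatibility.

```c
/* Forward declaration: the body of square() appears further down. */
static int square(int x);

/* Compiled "first" in source order, yet it can still benefit from
 * what the compiler learns about square() later in the file. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

static int square(int x)
{
    return x * x;
}
```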
Not all optimizations are controlled directly by a flag. Only optimizations that have a flag are listed.
-O0
-O0: no optimization. This is the default.
-O1
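-O1: costs somewhat more compile time; it mainly optimizes branches, constants, and expressions.
-O and -O1: apply a subset of the available optimizations. For large functions, optimized compilation takes somewhat more time and considerably more memory. At this level the compiler tries to reduce code size and execution time, but skips optimizations that would take a great deal of compile time. For the flags that -O1 turns on, see the references at the end.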
Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function.
With -O, the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time.
-O turns on the following optimization flags:
-fdefer-pop -fmerge-constants -fthread-jumps -floop-optimize -fif-conversion -fif-conversion2 -fdelayed-branch -fguess-branch-probability -fcprop-registers
-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging.
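To make one of these flags concrete: -fif-conversion tries to turn conditional jumps into branch-less equivalents such as conditional moves. The function below is an invented example (max_of is not from the manual) of code that is a typical candidate. Whether a branch-free sequence is actually emitted depends on the target and the GCC version, so treat this as a sketch and inspect the output of gcc -O1 -S to check.

```c
/* Returns the larger of a and b.
 *
 * At -O0 this is usually compiled as a compare followed by a jump.
 * With -O1, if-conversion may replace the jump with a conditional
 * move or a max instruction where the target provides one. */
int max_of(int a, int b)
{
    if (a > b) {
        return a;
    }
    return b;
}
```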
-O2
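-O2: attempts more register-level and instruction-level optimizations, and uses more memory and compile time while doing so.
GCC performs nearly all optimizations that do not involve a space-speed tradeoff. In the GCC release documented here, the compiler does not perform loop unrolling or function inlining at -O2.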
Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. The compiler does not perform loop unrolling or function inlining when you specify -O2. As compared to -O, this option increases both compilation time and the performance of the generated code.
-O2 turns on all optimization flags specified by -O. It also turns on the following optimization flags:
-fforce-mem -foptimize-sibling-calls -fstrength-reduce -fcse-follow-jumps -fcse-skip-blocks -frerun-cse-after-loop -frerun-loop-opt -fgcse -fgcse-lm -fgcse-sm -fgcse-las -fdelete-null-pointer-checks -fexpensive-optimizations -fregmove -fschedule-insns -fschedule-insns2 -fsched-interblock -fsched-spec -fcaller-saves -fpeephole2 -freorder-blocks -freorder-functions -fstrict-aliasing -funit-at-a-time -falign-functions -falign-jumps -falign-loops -falign-labels -fcrossjumping
Please note the warning under -fgcse about invoking -O2 on programs that use computed gotos.
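Among the flags in the list above, -foptimize-sibling-calls covers sibling calls and tail-recursive calls. The following is a minimal sketch with an invented function, gcd, whose recursive call is the last thing it does. With -O2 the compiler may turn that call into a jump so the recursion runs in constant stack space; at -O0 each level normally gets its own stack frame.

```c
/* Greatest common divisor by Euclid's algorithm.
 *
 * The recursive call is in tail position: nothing remains to be done
 * after it returns, which is the pattern sibling-call optimization
 * looks for. Compare gcc -O0 -S and gcc -O2 -S to see the difference. */
unsigned gcd(unsigned a, unsigned b)
{
    if (b == 0) {
        return a;
    }
    return gcd(b, a % b);
}
```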
-O3
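-O3: optimizes further on top of -O2, for example by building webs of pseudo registers, inlining ordinary functions, and applying more loop optimizations. In addition to everything enabled by -O2, it turns on the following flags:
- -finline-functions: inline simple functions into their callers.
- -fweb: build webs of pseudo registers used to hold variables. Pseudo registers hold data as if they were real registers, but they can be optimized by other passes such as CSE and the loop optimizer. This can make debugging close to impossible, because variables no longer stay in a "home" register.
- -frename-registers: after register allocation, use the registers left over to avoid false dependencies in scheduled code. This makes debugging very difficult, because variables no longer stay in their original registers.
- -funswitch-loops: move branches with loop-invariant conditions out of the loop, placing a copy of the loop on each branch of the condition.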
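The effect of -funswitch-loops is easier to see when written out by hand. In this sketch (both function names are invented), copy_plain is what a programmer would write, and copy_unswitched shows roughly the shape the compiler may produce: the invariant test on negate is done once, and each branch gets its own copy of the loop. It illustrates the transformation and is not GCC's actual output.

```c
/* As written: `negate` never changes inside the loop,
 * but it is tested on every iteration. */
void copy_plain(int *dst, const int *src, int n, int negate)
{
    int i;

    for (i = 0; i < n; i++) {
        if (negate) {
            dst[i] = -src[i];
        } else {
            dst[i] = src[i];
        }
    }
}

/* Hand-unswitched equivalent: the test is hoisted out and the loop
 * is duplicated, one copy per branch of the condition. */
void copy_unswitched(int *dst, const int *src, int n, int negate)
{
    int i;

    if (negate) {
        for (i = 0; i < n; i++) {
            dst[i] = -src[i];
        }
    } else {
        for (i = 0; i < n; i++) {
            dst[i] = src[i];
        }
    }
}
```

The unswitched form trades code size for fewer branches inside the loop, which fits -O3 being the level that accepts larger code in exchange for speed.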
-Os
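-Os: informally "-O2.5". It enables all -O2 optimizations that do not typically increase code size, and adds further optimizations aimed at reducing code size.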
Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
-Os disables the following optimization flags:
-falign-functions -falign-jumps -falign-loops -falign-labels -freorder-blocks -fprefetch-loop-arrays
If you use multiple -O options, with or without level numbers, the last such option is the one that is effective.
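A quick way to check which level actually took effect is to look at the macros GCC predefines: __OPTIMIZE__ is defined in any optimizing compilation, and __OPTIMIZE_SIZE__ is defined when optimizing for size. The helper below is a sketch; the function name and the returned strings are invented for illustration.

```c
/* Reports the optimization mode this file was compiled with.
 *
 * __OPTIMIZE_SIZE__ is tested first because a size-optimizing build
 * defines __OPTIMIZE__ as well. Neither macro is defined at -O0. */
const char *optimization_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "optimizing for size";
#elif defined(__OPTIMIZE__)
    return "optimizing";
#else
    return "not optimizing";
#endif
}
```

For example, compiling with gcc -O3 -O0 should report "not optimizing", since the last -O option is the one in effect.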
Reference
- gcc Options That Control Optimization
- gcc compile optimization: -O0 -O1 -O2 -O3 -Os explained