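I've read a few blog posts, and they all say the same thing: the fuzzer talks to a fork server, and the fork server forks child processes that run the target program. What I couldn't figure out is how the two actually interact at the code level.
run_target contains the following code. Roughly, the fuzzer sends prev_timed_out to the fork server and then reads the child's PID, child_pid, back from it: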
    s32 res;

    /* In non-dumb mode, we have the fork server up and running, so simply
       tell it to have at it, and then read back PID. */

    if ((res = write(fsrv_ctl_fd, &prev_timed_out, 4)) != 4) {
      if (stop_soon) return 0;
      RPFATAL(res, "Unable to request new process from fork server (OOM?)");
    }

    if ((res = read(fsrv_st_fd, &child_pid, 4)) != 4) {
      if (stop_soon) return 0;
      RPFATAL(res, "Unable to request new process from fork server (OOM?)");
    }

    if (child_pid <= 0) FATAL("Fork server is misbehaving (OOM?)");
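My question is: why does the fork server hand back a PID as soon as the fuzzer sends it a single value? How do the two sides interact in between? What does the fork server do before it passes child_pid back?
The fork server process runs the following code (I removed some unimportant parts):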
    if (!forksrv_pid) {

      struct rlimit r;

      /* Isolate the process and configure standard descriptors. If out_file is
         specified, stdin is /dev/null; otherwise, out_fd is cloned instead. */

      setsid();

      dup2(dev_null_fd, 1);
      dup2(dev_null_fd, 2);

      if (out_file) {
        dup2(dev_null_fd, 0);
      } else {
        dup2(out_fd, 0);
        close(out_fd);
      }

      /* Set up control and status pipes, close the unneeded original fds. */

      if (dup2(ctl_pipe[0], FORKSRV_FD) < 0) PFATAL("dup2() failed");
      if (dup2(st_pipe[1], FORKSRV_FD + 1) < 0) PFATAL("dup2() failed");

      close(ctl_pipe[0]);
      close(ctl_pipe[1]);
      close(st_pipe[0]);
      close(st_pipe[1]);

      close(out_dir_fd);
      close(dev_null_fd);
      close(dev_urandom_fd);
      close(fileno(plot_file));

      execv(target_path, argv);

      /* Use a distinctive bitmap signature to tell the parent about execv()
         falling through. */

      *(u32*)trace_bits = EXEC_FAIL_SIG;
      exit(0);

    }
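The two dup2() calls are the part that matters for the interaction: they move the read end of the control pipe and the write end of the status pipe onto fixed descriptor numbers, so the program started by execv() can find them without being told anything. Here is a minimal sketch of just that step. The helper name remap_forksrv_fds and the constant DEMO_FORKSRV_FD are my own; 198 is the value I believe AFL's config.h uses for FORKSRV_FD, so treat it as an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Assumed to match AFL's FORKSRV_FD; any free descriptor number would do. */
#define DEMO_FORKSRV_FD 198

/* Hypothetical helper: place the control pipe's read end on base_fd and the
   status pipe's write end on base_fd + 1, then drop the original numbers.
   Returns 0 on success, -1 if dup2() fails. */
static int remap_forksrv_fds(int ctl_read, int st_write, int base_fd) {
  assert(base_fd >= 0);

  if (dup2(ctl_read, base_fd) < 0) return -1;
  if (dup2(st_write, base_fd + 1) < 0) return -1;

  /* Only close the originals if they are not already the target numbers. */
  if (ctl_read != base_fd) close(ctl_read);
  if (st_write != base_fd + 1) close(st_write);

  return 0;
}
```

After this call, whatever code runs after execv() can simply read() from descriptor 198 and write() to 199, because descriptors without the close-on-exec flag survive the exec.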
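Maybe I need to understand setsid() first?
A quick search showed that I would have to learn more about processes, so I asked Bing instead. Its answer: setsid() is a system call that creates a new session and makes the calling process the session leader. That doesn't seem related to what I'm trying to find out.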
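I asked Bing again and also consulted this blog post: https://blog.csdn.net/Little_Bro/article/details/122694054 . It turns out the fork server interaction is tied to the instrumentation as well.
I looked at the AFL whitepaper: https://github.com/mirrorer/afl/blob/master/docs/technical_details.txt . It only covers this briefly, so I still had to read the author's blog post: https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html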
Unfortunately, there is also a problem: especially for simple libraries, you may end up spending most of the time waiting for execve(), the linker, and all the library initialization routines to do their job. I’ve been thinking of ways to minimize this overhead in american fuzzy lop, but most of the ideas I had were annoyingly complicated. For example, it is possible to write a custom ELF loader and execute the program in-process while using mprotect() to temporarily lock down the memory used by the fuzzer itself - but things such as signal handling would be a mess. Another option would be to execute in a single child process, make a snapshot of the child’s process memory and then “rewind” to that image later on via /proc/pid/mem - but likewise, dealing with signals or file descriptors would require a ton of fragile hacks.
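Why not simply call execve() over and over? Because every execve() call pays a startup cost (execve() itself, the linker, and library initialization, as the quote above lists), and the author wanted to speed this up. (I don't know much about this startup work; I'll look into it later if I need to.)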
Luckily, Jann Horn figured a different, much simpler approach, and sent me a patch for afl out of the blue 😃 It boils down to injecting a small piece of code into the fuzzed binary - a feat that can be achieved via LD_PRELOAD, via PTRACE_POKETEXT, via compile-time instrumentation, or simply by rewriting the ELF binary ahead of the time. The purpose of the injected shim is to let execve() happen, get past the linker (ideally with LD_BIND_NOW=1, so that all the hard work is done beforehand), and then stop early on in the actual program, before it gets to processing any inputs generated by the fuzzer or doing anything else of interest. In fact, in the simplest variant, we can simply stop at main().
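The author describes a very neat solution: instrument the fuzzed program so that it pauses once the startup work is done (for example, at the first statement of main()), and call fork() at that point. The forked child skips the startup work entirely and goes straight to the actual processing.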
Once the designated point in the program is reached, our shim simply waits for commands from the fuzzer; when it receives a “go” message, it calls fork() to create an identical clone of the already-loaded program; thanks to the powers of copy-on-write, the clone is created very quickly yet enjoys a robust level of isolation from its older twin. Within the child process (my note: the child process created by the fork server), the injected code returns control to the original binary, letting it process the fuzzer-supplied input data (and suffer any consequences of doing so). Within the parent, the shim relays the PID of the newly-crated process to the fuzzer and goes back to the command-wait loop.
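The author calls the injected code a shim (a spacer, which is a fitting image). The shim waits for a command from the fuzzer (this presumably corresponds to write(fsrv_ctl_fd, &prev_timed_out, 4) in run_target). Once the command arrives, the fork server forks the process that actually runs the fuzzed binary and returns its PID to the fuzzer.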
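To convince myself, the whole exchange can be sketched in plain C. This is only a toy model under my own assumptions: AFL's real shim is injected assembly, it also sends a 4-byte hello at startup, and the names fork_server_loop, run_rounds and fake_target are hypothetical. As far as I understand AFL, the shim also writes the child's waitpid() status to the same status pipe after the PID, so the sketch does that too.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for "the rest of the target program". */
static int fake_target(void) { return 7; }

/* Shim side: wait for a 4-byte command, fork, report the PID, wait for the
   child, report its status, repeat. Ends when the control pipe hits EOF. */
static void fork_server_loop(int ctl_fd, int st_fd) {
  uint32_t cmd;

  while (read(ctl_fd, &cmd, 4) == 4) {
    pid_t pid = fork();
    if (pid < 0) _exit(1);

    if (pid == 0) {            /* child: drop the pipes, run the "target" */
      close(ctl_fd);
      close(st_fd);
      _exit(fake_target());
    }

    int32_t pid32 = (int32_t)pid;
    int status = 0;
    if (write(st_fd, &pid32, 4) != 4) _exit(1);
    if (waitpid(pid, &status, 0) < 0) _exit(1);

    int32_t st32 = (int32_t)status;
    if (write(st_fd, &st32, 4) != 4) _exit(1);
  }

  _exit(0);
}

/* Fuzzer side: start the server, request `rounds` executions, and return the
   exit code of the last child (or -1 on any failure). */
static int run_rounds(int rounds) {
  assert(rounds > 0);

  int ctl[2], st[2];
  if (pipe(ctl) != 0 || pipe(st) != 0) return -1;

  pid_t srv = fork();
  if (srv < 0) return -1;
  if (srv == 0) {
    close(ctl[1]);
    close(st[0]);
    fork_server_loop(ctl[0], st[1]);   /* never returns */
  }
  close(ctl[0]);
  close(st[1]);

  int last = -1;
  for (int i = 0; i < rounds; i++) {
    uint32_t prev_timed_out = 0;
    int32_t child_pid = 0, status = 0;

    if (write(ctl[1], &prev_timed_out, 4) != 4) return -1;   /* "go" */
    if (read(st[0], &child_pid, 4) != 4 || child_pid <= 0) return -1;
    if (read(st[0], &status, 4) != 4) return -1;

    last = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
  }

  close(ctl[1]);               /* EOF on the control pipe stops the server */
  close(st[0]);
  waitpid(srv, NULL, 0);
  return last;
}
```

Calling run_rounds(3) asks the server for three children in a row. Seen from the fuzzer side, the only visible steps are one write() and two read() calls per run, which is exactly why the run_target code looks so short.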
One open question: where does the input for the target get passed? write(fsrv_ctl_fd, &prev_timed_out, 4) doesn't seem to carry any input data.
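My current reading is that the 4 bytes are only a "go" signal, and the input travels separately: the setup code shown earlier either clones out_fd onto stdin or leaves the target to open out_file, and a forked child inherits open descriptors from its parent. The sketch below shows only that inheritance mechanism, not AFL's actual code; the function name child_sees_input and the temp file path are made up for the demo.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the parent writes `data` into a file and rewinds it; a
   forked child reads it through the inherited descriptor. Returns the number
   of bytes the child read if they match `data`, 255 on mismatch, -1 on error. */
static int child_sees_input(const char *data) {
  size_t len = strlen(data);
  assert(len > 0 && len < 64);

  char path[] = "/tmp/fsrv_input_XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return -1;
  unlink(path);                      /* the open descriptor keeps the file alive */

  if (write(fd, data, len) != (ssize_t)len) return -1;
  if (lseek(fd, 0, SEEK_SET) < 0) return -1;   /* rewind before the child reads */

  pid_t pid = fork();
  if (pid < 0) return -1;

  if (pid == 0) {
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    _exit(n == (ssize_t)len && memcmp(buf, data, len) == 0 ? (int)n : 255);
  }

  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  close(fd);
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because parent and child share the same file offset after fork(), the rewind has to happen before each run; otherwise the child would start reading at the end of the data.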
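After that, the author discusses problems a real implementation may run into, as well as the assembly code of the instrumentation.
https://blog.csdn.net/Little_Bro/article/details/12269405 : this blog post explains the instrumentation code. For now I don't need to understand the instrumentation in that much detail; I already understand the logic of the interaction between the fork server and the fuzzer.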