Linux System Programming, Day 04: File and Directory Operations

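1. File I/O
- 1.1 The open function
- 1.2 The close function
- 1.3 The read function
- 1.4 The write function
- 1.5 The lseek function
- 1.6 The errno variable
- 1.7 File example 1: reading and writing a file
- 1.8 File example 2: computing a file's size
- 1.9 File example 3: extending a file
- 1.10 File example 4: using perror
- 1.11 Testing blocking and non-blocking reads
2. Files and directories
- 2.1 File-related functions
- 2.2 Directory-related functions
- 2.3 The dup/dup2/fcntl functions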

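1. File I/O

When learning C we met a family of standard library functions for file operations, such as fopen, fclose, fread, fwrite, fscanf and fprintf, all of which start with the letter f. The file I/O functions in this section are Linux system calls instead. On Linux, fopen calls the open system call internally, and fclose calls the close system call.

C standard library functions and system calls are different things. System calls are programming interfaces that the operating system implements and exposes to applications, so a program that calls Linux system calls directly may no longer compile or run once it leaves Linux. In other words, it becomes less portable. A program that uses only C standard library functions is not tied to one operating system and can be built on other platforms.

fopen returns a pointer of type FILE *. The structure behind this pointer maintains three important things: a file descriptor, the current file position, and a file buffer. The default buffer size of a FILE stream is commonly quoted as 8192 bytes (the value of BUFSIZ in glibc), although the actual size can depend on the implementation. The Linux I/O system calls, by contrast, have no user-space buffer by default. File descriptors were mentioned in the previous section; a file descriptor is simply an integer of type int.

When a process starts, three file descriptors are open by default: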

#define STDIN_FILENO 0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2

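A newly opened file gets the smallest unused descriptor in the process's file descriptor table. By default a process can usually hold at most 1024 open descriptors; this is the soft RLIMIT_NOFILE limit, which ulimit -n shows and can change. Calling open opens or creates a file and returns a file descriptor.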
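The "smallest unused descriptor" rule can be observed directly. The following is a minimal sketch, not part of the original lesson: the function name lowest_fd_is_reused is made up for illustration, and /dev/null is used only because it is a file that always exists on Linux.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a descriptor number freed by close() is handed out again
 * by the next open(), 0 if not, and -1 if any open() fails.
 * "/dev/null" is only a convenient file that always exists on Linux. */
int lowest_fd_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    int second = open("/dev/null", O_RDONLY);
    if (second < 0) {
        close(first);
        return -1;
    }

    close(first);                               /* 'first' is now the lowest free slot */
    int third = open("/dev/null", O_RDONLY);    /* should land in that slot */
    if (third < 0) {
        close(second);
        return -1;
    }

    int reused = (third == first);
    close(second);
    close(third);
    return reused;
}
```

In a single-threaded program the function should return 1, because nothing else can claim the freed slot between close and the next open.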

1.1 The open function

The key parts of the manual page are:

SYNOPSIS
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    int open(const char *pathname, int flags);
    int open(const char *pathname, int flags, mode_t mode);

DESCRIPTION
    The open() system call opens the file specified by pathname. If the specified file does not exist, it may optionally (if O_CREAT is specified in flags) be created by open().

    The return value of open() is a file descriptor, a small, nonnegative integer that is used in subsequent system calls (read(2), write(2), lseek(2), fcntl(2), etc.) to refer to the open file. The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process.

    A call to open() creates a new open file description, an entry in the system-wide table of open files. The open file description records the file offset and the file status flags (see below). A file descriptor is a reference to an open file description; this reference is unaffected if pathname is subsequently removed or modified to refer to a different file. For further details on open file descriptions, see NOTES.

    The argument flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-only, write-only, or read/write, respectively.

    In addition, zero or more file creation flags and file status flags can be bitwise-or'd in flags. The file creation flags are O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC. The file status flags are all of the remaining flags listed below. The distinction between these two groups of flags is that the file creation flags affect the semantics of the open operation itself, while the file status flags affect the semantics of subsequent I/O operations. The file status flags can be retrieved and (in some cases) modified; see fcntl(2) for details.

    The full list of file creation flags and file status flags is as follows:

    O_APPEND
        The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2). The modification of the file offset and the write operation are performed as a single atomic step.

        O_APPEND may lead to corrupted files on NFS filesystems if more than one process appends data to a file at once. This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can't be done without a race condition.

    O_CREAT
        If pathname does not exist, create it as a regular file.

        The owner (user ID) of the new file is set to the effective user ID of the process.

        The group ownership (group ID) of the new file is set either to the effective group ID of the process (System V semantics) or to the group ID of the parent directory (BSD semantics). On Linux, the behavior depends on whether the set-group-ID mode bit is set on the parent directory: if that bit is set, then BSD semantics apply; otherwise, System V semantics apply. For some filesystems, the behavior also depends on the bsdgroups and sysvgroups mount options described in mount(8).

        The mode argument specifies the file mode bits be applied when a new file is created. This argument must be supplied when O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT nor O_TMPFILE is specified, then mode is ignored. The effective mode is modified by the process's umask in the usual way: in the absence of a default ACL, the mode of the created file is (mode & ~umask). Note that this mode applies only to future accesses of the newly created file; the open() call that creates a read-only file may well return a read/write file descriptor.

        The following symbolic constants are provided for mode:

        S_IRWXU  00700 user (file owner) has read, write, and execute permission
        S_IRUSR  00400 user has read permission
        S_IWUSR  00200 user has write permission
        S_IXUSR  00100 user has execute permission
        S_IRWXG  00070 group has read, write, and execute permission
        S_IRGRP  00040 group has read permission
        S_IWGRP  00020 group has write permission
        S_IXGRP  00010 group has execute permission
        S_IRWXO  00007 others have read, write, and execute permission
        S_IROTH  00004 others have read permission
        S_IWOTH  00002 others have write permission
        S_IXOTH  00001 others have execute permission

        According to POSIX, the effect when other bits are set in mode is unspecified. On Linux, the following bits are also honored in mode:

        S_ISUID  0004000 set-user-ID bit
        S_ISGID  0002000 set-group-ID bit (see inode(7)).
        S_ISVTX  0001000 sticky bit (see inode(7)).

    O_TRUNC
        If the file already exists and is a regular file and the access mode allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated to length 0. If the file is a FIFO or terminal device file, the O_TRUNC flag is ignored. Otherwise, the effect of O_TRUNC is unspecified.

RETURN VALUE
    open(), openat(), and creat() return the new file descriptor, or -1 if an error occurred (in which case, errno is set appropriately).

The excerpt above outlines how open is used. It shows that calling open requires three headers: sys/types.h, sys/stat.h and fcntl.h. The function has two call forms:

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

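The function opens a file and returns its file descriptor. The first two parameters are the same in both forms: pathname is the path of the file, and flags is a set of flags. Some of the important flags are:

Flag	Meaning
O_RDWR	read and write
O_RDONLY	read only
O_WRONLY	write only
O_APPEND	append
O_CREAT	create the file if it does not exist; this flag requires the final argument `mode`
O_TRUNC	if the file exists, truncate its contents to length 0

When O_CREAT is specified, the third argument mode must be supplied. mode holds the permission bits of the new file:

mode	Permission
S_IRWXU	owner can read, write and execute
S_IRUSR	owner can read
S_IWUSR	owner can write
S_IXUSR	owner can execute
S_IRWXG	group can read, write and execute
S_IRGRP	group can read
S_IWGRP	group can write
S_IXGRP	group can execute
S_IRWXO	others can read, write and execute
S_IROTH	others can read
S_IWOTH	others can write
S_IXOTH	others can execute

Several flags, and several mode values, can be combined with the bitwise OR operator |.

Finally, the return value: on success open returns a new file descriptor; on error it returns -1 and sets errno accordingly.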
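The manual page notes that the permissions of a newly created file are (mode & ~umask) rather than mode itself. A minimal sketch of how this could be checked follows. The helper name create_and_get_mode is hypothetical, and it relies on fstat, a sibling of the stat function covered in section 2.1, to read the permissions back.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates 'path' with the requested permission bits and returns the bits
 * the new file actually has, or (mode_t)-1 on error.
 * O_EXCL makes open() fail if the file already exists. */
mode_t create_and_get_mode(const char *path, mode_t requested)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    if (fd < 0)
        return (mode_t)-1;

    struct stat st;
    mode_t result = (mode_t)-1;
    if (fstat(fd, &st) == 0)
        result = st.st_mode & 0777;     /* keep only the rwx bits */

    close(fd);
    return result;
}
```

With a umask of 022, requesting S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH (0666) should give a file whose mode is 0644, assuming the directory has no default ACL.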

1.2 The close function

SYNOPSIS
    #include <unistd.h>

    int close(int fd);

DESCRIPTION
    close() closes a file descriptor, so that it no longer refers to any file and may be reused. Any record locks (see fcntl(2)) held on the file it was associated with, and owned by the process, are removed (regardless of the file descriptor that was used to obtain the lock).

    If fd is the last file descriptor referring to the underlying open file description (see open(2)), the resources associated with the open file description are freed; if the file descriptor was the last reference to a file which has been removed using unlink(2), the file is deleted.

RETURN VALUE
    close() returns zero on success. On error, -1 is returned, and errno is set appropriately.

The prototype of the function is:

int close(int fd);

close closes an open file. The parameter fd is the descriptor of the open file. The function returns 0 on success and -1 on failure, setting errno accordingly. Note that close needs a different header from open: it is declared in unistd.h.

1.3 The read function

SYNOPSIS
    #include <unistd.h>

    ssize_t read(int fd, void *buf, size_t count);

DESCRIPTION
    read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.

    On files that support seeking, the read operation commences at the file offset, and the file offset is incremented by the number of bytes read. If the file offset is at or past the end of file, no bytes are read, and read() returns zero.

    If count is zero, read() may detect the errors described below. In the absence of any errors, or if read() does not check for errors, a read() with a count of 0 returns zero and has no other effects.

    According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper limit on Linux.

RETURN VALUE
    On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.

    On error, -1 is returned, and errno is set appropriately. In this case, it is left unspecified whether the file position (if any) changes.

read also requires the header unistd.h. Its prototype is:

ssize_t read(int fd, void *buf, size_t count);

read reads up to count bytes from the file referred to by fd into buf. fd is the file descriptor, buf is the address of the buffer, and count is the maximum number of bytes to read. The return value is the number of bytes actually read, which may be smaller than count; 0 means the end of the file has been reached. On failure it returns -1 and sets errno accordingly.
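Because read may return fewer bytes than requested, code that needs exactly count bytes usually calls it in a loop. The following is a minimal sketch of that pattern; the name read_full is an assumption made for this example and is not a library function. It uses errno and EINTR, which are described in section 1.6.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keeps calling read() until 'count' bytes have arrived, end of file is
 * reached, or a real error occurs. Returns the number of bytes stored in
 * 'buf' (less than 'count' only at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n == 0)
            break;                  /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

A caller can compare the return value with count to tell a complete read from one that stopped early at end of file.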

1.4 The write function

SYNOPSIS
    #include <unistd.h>

    ssize_t write(int fd, const void *buf, size_t count);

DESCRIPTION
    write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd.

    The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)

    For a seekable file (i.e., one to which lseek(2) may be applied, for example, a regular file) writing takes place at the file offset, and the file offset is incremented by the number of bytes actually written. If the file was open(2)ed with O_APPEND, the file offset is first set to the end of the file before writing. The adjustment of the file offset and the write operation are performed as an atomic step.

    POSIX requires that a read(2) that can be proved to occur after a write() has returned will return the new data. Note that not all filesystems are POSIX conforming.

    According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper limit on Linux.

RETURN VALUE
    On success, the number of bytes written is returned. On error, -1 is returned, and errno is set to indicate the cause of the error.

    Note that a successful write() may transfer fewer than count bytes. Such partial writes can occur for various reasons; for example, because there was insufficient space on the disk device to write all of the requested bytes, or because a blocked write() to a socket, pipe, or similar was interrupted by a signal handler after it had transferred some, but before it had transferred all of the requested bytes. In the event of a partial write, the caller can make another write() call to transfer the remaining bytes. The subsequent call will either transfer further bytes or may result in an error (e.g., if the disk is now full).

    If count is zero and fd refers to a regular file, then write() may return a failure status if one of the errors below is detected. If no errors are detected, or error detection is not performed, 0 will be returned without causing any other effect. If count is zero and fd refers to a file other than a regular file, the results are not specified.

write needs the same header as read. Its prototype is:

ssize_t write(int fd, const void *buf, size_t count);

write writes the first count bytes of buf to the file referred to by fd. fd is the file descriptor, buf is the buffer holding the data to write, and count is the number of bytes to write. The return value is the number of bytes actually written, which may be smaller than count; on failure it returns -1 and sets errno accordingly.
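The manual page points out that a partial write is not an error and that the caller can call write again for the remaining bytes. A minimal sketch of such a retry loop is shown below; write_all is a hypothetical helper name chosen for this example.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes all 'count' bytes of 'buf' to 'fd', calling write() again after
 * a partial write. Returns 0 once everything is written, -1 on error. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = write(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing anything: retry */
            return -1;
        }
        done += (size_t)n;          /* advance past the bytes already written */
    }
    return 0;
}
```

For small writes to a regular file the loop normally runs once; the retry matters mostly for pipes, sockets and terminals.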

1.5 The lseek function

SYNOPSIS
    #include <sys/types.h>
    #include <unistd.h>

    off_t lseek(int fd, off_t offset, int whence);

DESCRIPTION
    lseek() repositions the file offset of the open file description associated with the file descriptor fd to the argument offset according to the directive whence as follows:

    SEEK_SET
        The file offset is set to offset bytes.

    SEEK_CUR
        The file offset is set to its current location plus offset bytes.

    SEEK_END
        The file offset is set to the size of the file plus offset bytes.

    lseek() allows the file offset to be set beyond the end of the file (but this does not change the size of the file). If data is later written at this point, subsequent reads of the data in the gap (a "hole") return null bytes ('\0') until data is actually written into the gap.

RETURN VALUE
    Upon successful completion, lseek() returns the resulting offset location as measured in bytes from the beginning of the file. On error, the value (off_t) -1 is returned and errno is set to indicate the error.

lseek requires the headers sys/types.h and unistd.h. Its prototype is:

off_t lseek(int fd, off_t offset, int whence);

lseek changes the file offset: it moves the offset of the file referred to by fd to offset bytes from the position selected by whence. fd is the file descriptor, offset is the displacement, and whence is the position the displacement is measured from. The return value is the resulting offset in bytes from the beginning of the file; on failure it returns -1 and sets errno accordingly.

1.6 The errno variable

ERRNO(3)                 Linux Programmer's Manual                 ERRNO(3)

NAME
    errno - number of last error

SYNOPSIS
    #include <errno.h>

DESCRIPTION
    The <errno.h> header file defines the integer variable errno, which is set by system calls and some library functions in the event of an error to indicate what went wrong.

  errno
    The value in errno is significant only when the return value of the call indicated an error (i.e., -1 from most system calls; -1 or NULL from most library functions); a function that succeeds is allowed to change errno. The value of errno is never set to zero by any system call or library function.

    For some system calls and library functions (e.g., getpriority(2)), -1 is a valid return on success. In such cases, a successful return can be distinguished from an error return by setting errno to zero before the call, and then, if the call returns a status that indicates that an error may have occurred, checking to see if errno has a nonzero value.

    errno is defined by the ISO C standard to be a modifiable lvalue of type int, and must not be explicitly declared; errno may be a macro. errno is thread-local; setting it in one thread does not affect its value in any other thread.

  Error numbers and names
    Valid error numbers are all positive numbers. The <errno.h> header file defines symbolic names for each of the possible error numbers that may appear in errno.

    All the error names specified by POSIX.1 must have distinct values, with the exception of EAGAIN and EWOULDBLOCK, which may be the same. On Linux, these two have the same value on all architectures.

    The error numbers that correspond to each symbolic name vary across UNIX systems, and even across different architectures on Linux. Therefore, numeric values are not included as part of the list of error names below. The perror(3) and strerror(3) functions can be used to convert these names to corresponding textual error messages.

    On any particular Linux system, one can obtain a list of all symbolic error names and the corresponding error numbers using the errno(1) command (part of the moreutils package):

        $ errno -l
        EPERM 1 Operation not permitted
        ENOENT 2 No such file or directory
        ESRCH 3 No such process
        EINTR 4 Interrupted system call
        EIO 5 Input/output error
        ...

    The errno(1) command can also be used to look up individual error numbers and names, and to search for errors using strings from the error description, as in the following examples:

        $ errno 2
        ENOENT 2 No such file or directory
        $ errno ESRCH
        ESRCH 3 No such process
        $ errno -s permission
        EACCES 13 Permission denied

  List of error names
    In the list of the symbolic error names below, various names are marked as follows:

    *  POSIX.1-2001: The name is defined by POSIX.1-2001, and is defined in later POSIX.1 versions, unless otherwise indicated.
    *  POSIX.1-2008: The name is defined in POSIX.1-2008, but was not present in earlier POSIX.1 standards.
    *  C99: The name is defined by C99.

    Below is a list of the symbolic error names that are defined on Linux:

    E2BIG           Argument list too long (POSIX.1-2001).
    EACCES          Permission denied (POSIX.1-2001).
    EADDRINUSE      Address already in use (POSIX.1-2001).
    EADDRNOTAVAIL   Address not available (POSIX.1-2001).
    EAFNOSUPPORT    Address family not supported (POSIX.1-2001).
    EAGAIN          Resource temporarily unavailable (may be the same value as EWOULDBLOCK) (POSIX.1-2001).
    EALREADY        Connection already in progress (POSIX.1-2001).
    EBADE           Invalid exchange.
    EBADF           Bad file descriptor (POSIX.1-2001).
    EBADFD          File descriptor in bad state.
    EBADMSG         Bad message (POSIX.1-2001).
    EBADR           Invalid request descriptor.
    EBADRQC         Invalid request code.
    EBADSLT         Invalid slot.
    EBUSY           Device or resource busy (POSIX.1-2001).
    ECANCELED       Operation canceled (POSIX.1-2001).
    ECHILD          No child processes (POSIX.1-2001).
    ECHRNG          Channel number out of range.
    ECOMM           Communication error on send.
    ECONNABORTED    Connection aborted (POSIX.1-2001).
    ECONNREFUSED    Connection refused (POSIX.1-2001).
    ECONNRESET      Connection reset (POSIX.1-2001).
    EDEADLK         Resource deadlock avoided (POSIX.1-2001).
    EDEADLOCK       On most architectures, a synonym for EDEADLK. On some architectures (e.g., Linux MIPS, PowerPC, SPARC), it is a separate error code "File locking deadlock error".
    EDESTADDRREQ    Destination address required (POSIX.1-2001).
    EDOM            Mathematics argument out of domain of function (POSIX.1, C99).
    EDQUOT          Disk quota exceeded (POSIX.1-2001).
    EEXIST          File exists (POSIX.1-2001).
    EFAULT          Bad address (POSIX.1-2001).
    EFBIG           File too large (POSIX.1-2001).
    EHOSTDOWN       Host is down.
    EHOSTUNREACH    Host is unreachable (POSIX.1-2001).
    EHWPOISON       Memory page has hardware error.
    EIDRM           Identifier removed (POSIX.1-2001).
    EILSEQ          Invalid or incomplete multibyte or wide character (POSIX.1, C99). The text shown here is the glibc error description; in POSIX.1, this error is described as "Illegal byte sequence".
    EINPROGRESS     Operation in progress (POSIX.1-2001).
    EINTR           Interrupted function call (POSIX.1-2001); see signal(7).
    EINVAL          Invalid argument (POSIX.1-2001).
    EIO             Input/output error (POSIX.1-2001).
    EISCONN         Socket is connected (POSIX.1-2001).
    EISDIR          Is a directory (POSIX.1-2001).
    EISNAM          Is a named type file.
    EKEYEXPIRED     Key has expired.
    EKEYREJECTED    Key was rejected by service.
    EKEYREVOKED     Key has been revoked.
    EL2HLT          Level 2 halted.
    EL2NSYNC        Level 2 not synchronized.
    EL3HLT          Level 3 halted.
    EL3RST          Level 3 reset.
    ELIBACC         Cannot access a needed shared library.
    ELIBBAD         Accessing a corrupted shared library.
    ELIBMAX         Attempting to link in too many shared libraries.
    ELIBSCN         .lib section in a.out corrupted
    ELIBEXEC        Cannot exec a shared library directly.
    ELNRANGE        Link number out of range.
    ELOOP           Too many levels of symbolic links (POSIX.1-2001).
    EMEDIUMTYPE     Wrong medium type.
    EMFILE          Too many open files (POSIX.1-2001). Commonly caused by exceeding the RLIMIT_NOFILE resource limit described in getrlimit(2).
    EMLINK          Too many links (POSIX.1-2001).
    EMSGSIZE        Message too long (POSIX.1-2001).
    EMULTIHOP       Multihop attempted (POSIX.1-2001).
    ENAMETOOLONG    Filename too long (POSIX.1-2001).
    ENETDOWN        Network is down (POSIX.1-2001).
    ENETRESET       Connection aborted by network (POSIX.1-2001).
    ENETUNREACH     Network unreachable (POSIX.1-2001).
    ENFILE          Too many open files in system (POSIX.1-2001). On Linux, this is probably a result of encountering the /proc/sys/fs/file-max limit (see proc(5)).
    ENOANO          No anode.
    ENOBUFS         No buffer space available (POSIX.1 (XSI STREAMS option)).
    ENODATA         No message is available on the STREAM head read queue (POSIX.1-2001).
    ENODEV          No such device (POSIX.1-2001).
    ENOENT          No such file or directory (POSIX.1-2001). Typically, this error results when a specified pathname does not exist, or one of the components in the directory prefix of a pathname does not exist, or the specified pathname is a dangling symbolic link.
    ENOEXEC         Exec format error (POSIX.1-2001).
    ENOKEY          Required key not available.
    ENOLCK          No locks available (POSIX.1-2001).
    ENOLINK         Link has been severed (POSIX.1-2001).
    ENOMEDIUM       No medium found.
    ENOMEM          Not enough space/cannot allocate memory (POSIX.1-2001).
    ENOMSG          No message of the desired type (POSIX.1-2001).
    ENONET          Machine is not on the network.
    ENOPKG          Package not installed.
    ENOPROTOOPT     Protocol not available (POSIX.1-2001).
    ENOSPC          No space left on device (POSIX.1-2001).
    ENOSR           No STREAM resources (POSIX.1 (XSI STREAMS option)).
    ENOSTR          Not a STREAM (POSIX.1 (XSI STREAMS option)).
    ENOSYS          Function not implemented (POSIX.1-2001).
    ENOTBLK         Block device required.
    ENOTCONN        The socket is not connected (POSIX.1-2001).
    ENOTDIR         Not a directory (POSIX.1-2001).
    ENOTEMPTY       Directory not empty (POSIX.1-2001).
    ENOTRECOVERABLE State not recoverable (POSIX.1-2008).
    ENOTSOCK        Not a socket (POSIX.1-2001).
    ENOTSUP         Operation not supported (POSIX.1-2001).
    ENOTTY          Inappropriate I/O control operation (POSIX.1-2001).
    ENOTUNIQ        Name not unique on network.
    ENXIO           No such device or address (POSIX.1-2001).
    EOPNOTSUPP      Operation not supported on socket (POSIX.1-2001). (ENOTSUP and EOPNOTSUPP have the same value on Linux, but according to POSIX.1 these error values should be distinct.)
    EOVERFLOW       Value too large to be stored in data type (POSIX.1-2001).
    EOWNERDEAD      Owner died (POSIX.1-2008).
    EPERM           Operation not permitted (POSIX.1-2001).
    EPFNOSUPPORT    Protocol family not supported.
    EPIPE           Broken pipe (POSIX.1-2001).
    EPROTO          Protocol error (POSIX.1-2001).
    EPROTONOSUPPORT Protocol not supported (POSIX.1-2001).
    EPROTOTYPE      Protocol wrong type for socket (POSIX.1-2001).
    ERANGE          Result too large (POSIX.1, C99).
    EREMCHG         Remote address changed.
    EREMOTE         Object is remote.
    EREMOTEIO       Remote I/O error.
    ERESTART        Interrupted system call should be restarted.
    ERFKILL         Operation not possible due to RF-kill.
    EROFS           Read-only filesystem (POSIX.1-2001).
    ESHUTDOWN       Cannot send after transport endpoint shutdown.
    ESPIPE          Invalid seek (POSIX.1-2001).
    ESOCKTNOSUPPORT Socket type not supported.
    ESRCH           No such process (POSIX.1-2001).
    ESTALE          Stale file handle (POSIX.1-2001). This error can occur for NFS and for other filesystems.
    ESTRPIPE        Streams pipe error.
    ETIME           Timer expired (POSIX.1 (XSI STREAMS option)). (POSIX.1 says "STREAM ioctl(2) timeout".)
    ETIMEDOUT       Connection timed out (POSIX.1-2001).
    ETOOMANYREFS    Too many references: cannot splice.
    ETXTBSY         Text file busy (POSIX.1-2001).
    EUCLEAN         Structure needs cleaning.
    EUNATCH         Protocol driver not attached.
    EUSERS          Too many users.
    EWOULDBLOCK     Operation would block (may be same value as EAGAIN) (POSIX.1-2001).
    EXDEV           Improper link (POSIX.1-2001).
    EXFULL          Exchange full.

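Note that a program that uses the errno variable must include the header errno.h. When an error occurs, perror prints the corresponding message. To get the string that a given error number stands for, use strerror, which is declared in string.h. Its prototype is:

char *strerror(int errnum);

The argument is an errno value, and the return value is the message that describes that error number.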
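Since a successful call is allowed to change errno, a common habit is to copy errno into a local variable immediately after the failing call, before calling anything else. A minimal sketch follows; try_open is a hypothetical helper name, not a system function.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Tries to open 'path' read-only. Returns 0 on success, otherwise the errno
 * value of the failed open(), saved before any other call can change it. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        int saved = errno;      /* copy first: later calls may overwrite errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

For a path that does not exist the function should return ENOENT, and the message printed through strerror should match what perror would print for the same error.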

1.7 File example 1: reading and writing a file

Here we use Linux system calls to write a program that opens a file, writes data to it with write, and then reads the contents back with read.

// Using open, write, lseek and read
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    printf("filename = [%s]\n", argv[1]);

    // Open (or create) the file and get its file descriptor
    // int open(const char *pathname, int flags);
    // int open(const char *pathname, int flags, mode_t mode);
    int fd = open(argv[1], O_RDWR | O_CREAT, S_IRWXU | S_IRWXG | S_IRWXO);
    // open returns -1 on failure
    if (fd < 0) {
        perror("file open error");
        return -1;
    }
    printf("fd = [%d]\n", fd);

    // Write to the file
    // ssize_t write(int fd, const void *buf, size_t count);
    ssize_t size = write(fd, "hello world", strlen("hello world"));
    printf("write size = [%zd]\n", size);

    // Move the file offset back to the start
    // off_t lseek(int fd, off_t offset, int whence);
    off_t offset = lseek(fd, 0, SEEK_SET);
    printf("offset = [%ld]\n", (long)offset);

    // Read the file
    // ssize_t read(int fd, void *buf, size_t count);
    char buf[128];
    memset(buf, 0, sizeof buf);
    size = read(fd, buf, sizeof buf - 1);   // keep one byte for the terminating '\0'
    printf("read size = [%zd]\n", size);
    printf("read = [%s]\n", buf);

    close(fd);
    return 0;
}

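1.8 File example 2: computing a file's size

Use lseek to compute the size of a file: seeking to the end returns the offset of the end, which is the file size.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("file open error");
        return -1;
    }

    // The offset of the end of the file is its size in bytes
    off_t size = lseek(fd, 0, SEEK_END);
    printf("[%s] size = [%ld]\n", argv[1], (long)size);

    close(fd);
    return 0;
}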

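1.9 File example 3: extending a file

Use lseek to turn a small file into a larger one. The method is to move the file offset to the position the file should grow to and then perform one write. Seeking alone does not change the size; the write does.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("file open error");
        return -1;
    }

    // Move the offset to byte 200, past the end of a small file
    if (lseek(fd, 200, SEEK_SET) < 0) {
        perror("lseek error");
        close(fd);
        return -1;
    }

    // One write at offset 200 makes a shorter file 201 bytes long
    if (write(fd, "a", 1) != 1) {
        perror("write error");
    }

    close(fd);
    return 0;
}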

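1.10 File example 4: using perror

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        int saved = errno;          // save errno before calling anything else
        perror("open error");
        if (saved == ENOENT) {
            printf("same\n");
        }
        return -1;
    }

    // Print the message for each of the first 64 error numbers
    int n = 0;
    for (n = 0; n < 64; n++) {
        errno = n;
        printf("[%d]:[%s]\n", errno, strerror(errno));
    }

    close(fd);
    return 0;
}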

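1.11 Testing blocking and non-blocking reads

When reading files on Linux we talk about blocking and non-blocking reads. How can we tell whether blocking is a property of the file or of the read function? Here we call read on different types of file. If every type of file behaves the same way, blocking or not, then blocking is a property of read. If different types of file behave differently, then blocking is a property of the file rather than of read.

Reading a regular file with read:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <fcntl.h>

// Check whether read blocks on a regular file
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Read the file
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    printf("first: n = [%zd], buf = [%s]\n", n, buf);

    // Read again to see whether read blocks at end of file
    memset(buf, 0, sizeof(buf));
    n = read(fd, buf, sizeof(buf) - 1);
    printf("second: n = [%zd], buf = [%s]\n", n, buf);

    // Close the file
    close(fd);
    return 0;
}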

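Reading a device file with read:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <fcntl.h>

// Check that read blocks on a device file
int main()
{
    // Standard input (the terminal)
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    printf("n = [%zd], buf = [%s]\n", n, buf);
    return 0;
}

In the first test the second read returns immediately at end of file, while in the second test read waits until something is typed on the terminal. From these two tests we conclude that blocking versus non-blocking is a property of the file itself, not of read.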
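Further evidence that blocking belongs to the open file rather than to read is that the same read call stops blocking once the O_NONBLOCK flag is set on the descriptor. The following is a minimal sketch under some assumptions: it uses an empty pipe (created with pipe, which this lesson has not introduced) as a file that would normally block, and fcntl with F_GETFL/F_SETFL to change the flag. The function name is made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Creates an empty pipe, switches its read end to non-blocking mode and
 * reads from it. Returns the errno of the failed read() (expected: EAGAIN),
 * 0 if read() unexpectedly succeeded, or -1 if the setup failed. */
int read_empty_pipe_nonblocking(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    int result = -1;
    int flags = fcntl(fds[0], F_GETFL);
    if (flags >= 0 && fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == 0) {
        char c;
        ssize_t n = read(fds[0], &c, 1);    /* would wait for data without O_NONBLOCK */
        result = (n < 0) ? errno : 0;
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

The read fails at once with EAGAIN instead of waiting, even though the call itself is written exactly as in the blocking examples.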

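2. Files and directories

The sections above quoted a lot of English text. That text comes from the manual pages that Linux provides for developers, which can be viewed with the man command. The command takes these forms:

man <name>
man <section> <name>

There are nine sections in total. Without a section number, man shows the entry from the first section in which the name is found; if the same name appears in several sections, add the section number to pick one. The sections we need most in system programming are these: section 1 covers executable programs and shell commands, section 2 covers system calls, and section 3 covers C library calls. The remaining sections were covered in the Linux and Unix part of the C basics course.

In Linux system programming we use man constantly to look things up. Learning how to search the manual and how to program from it is an important skill. From here on this series no longer reproduces the manual pages for each function; readers who want them should run man on their own Linux system.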

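2.1 File-related functions

The file-related functions are:

Function	Prototype	Parameters	Return value	Purpose
stat	int stat(const char *pathname, struct stat *statbuf);	pathname: path of the file; statbuf: memory that receives the file status	0 on success; -1 on failure with errno set	stores the status of the file pathname in statbuf, following symbolic links
lstat	int lstat(const char *pathname, struct stat *statbuf);	pathname: path of the file; statbuf: memory that receives the file status	0 on success; -1 on failure with errno set	stores the status of the file pathname in statbuf; for a symbolic link, reports the link itself

Calling these functions requires the headers sys/types.h, sys/stat.h and unistd.h. struct stat is defined as follows: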

           struct stat {
               dev_t     st_dev;         /* ID of device containing file */
               ino_t     st_ino;         /* Inode number */
               mode_t    st_mode;        /* File type and mode */
               nlink_t   st_nlink;       /* Number of hard links */
               uid_t     st_uid;         /* User ID of owner */
               gid_t     st_gid;         /* Group ID of owner */
               dev_t     st_rdev;        /* Device ID (if special file) */
               off_t     st_size;        /* Total size, in bytes */
               blksize_t st_blksize;     /* Block size for filesystem I/O */
               blkcnt_t  st_blocks;      /* Number of 512B blocks allocated */

               /* Since Linux 2.6, the kernel supports nanosecond
                  precision for the following timestamp fields.
                  For the details before Linux 2.6, see NOTES. */

               struct timespec st_atim;  /* Time of last access */
               struct timespec st_mtim;  /* Time of last modification */
               struct timespec st_ctim;  /* Time of last status change */

           #define st_atime st_atim.tv_sec      /* Backward compatibility */
           #define st_mtime st_mtim.tv_sec
           #define st_ctime st_ctim.tv_sec
           };

The comments show that st_mode stores both the file type and the permissions, encoded as bit fields. The file types are:

           S_IFMT     0170000   bit mask for the file type bit field
           S_IFSOCK   0140000   socket
           S_IFLNK    0120000   symbolic link
           S_IFREG    0100000   regular file
           S_IFBLK    0060000   block device
           S_IFDIR    0040000   directory
           S_IFCHR    0020000   character device
           S_IFIFO    0010000   FIFO

S_IFMT is the mask for the file type. To find the type of a file, compute st_mode & S_IFMT and compare the result with the values above. For example, (st_mode & S_IFMT) == S_IFDIR tests whether a file is a directory. There is also a second way to test the file type, a set of macros:

           S_ISREG(m)  is it a regular file?
           S_ISDIR(m)  directory?
           S_ISCHR(m)  character device?
           S_ISBLK(m)  block device?
           S_ISFIFO(m) FIFO (named pipe)?
           S_ISLNK(m)  symbolic link?  (Not in POSIX.1-1996.)
           S_ISSOCK(m) socket?  (Not in POSIX.1-1996.)

Here m is st_mode, and each macro evaluates to true or false for that file type. The difference from the first method is that the first can be used in a switch statement, while the macros can only be used in if conditions. For example, S_ISBLK(st_mode) tests whether a file is a block device.

st_mode also holds the permissions for the owner, the group and others:

           S_IRWXU     00700   owner has read, write, and execute permission
           S_IRUSR     00400   owner has read permission
           S_IWUSR     00200   owner has write permission
           S_IXUSR     00100   owner has execute permission
           S_IRWXG     00070   group has read, write, and execute permission
           S_IRGRP     00040   group has read permission
           S_IWGRP     00020   group has write permission
           S_IXGRP     00010   group has execute permission
           S_IRWXO     00007   others (not in group) have read, write, and execute permission
           S_IROTH     00004   others have read permission
           S_IWOTH     00002   others have write permission
           S_IXOTH     00001   others have execute permission

To test a permission, AND st_mode with the corresponding constant using &; a nonzero result means the permission is present. For example, st_mode & S_IRUSR tests whether the owner has read permission.

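For timestamps we usually prefer st_atime, st_mtime and st_ctime, the macros defined at the end of the structure, because they stay compatible with older code. The fields st_atim, st_mtim and st_ctim can be used as well, but they are not plain second counts: each is a struct timespec that holds seconds in tv_sec and nanoseconds in tv_nsec, and the macros simply expand to the tv_sec member. struct timespec is defined as follows:

           struct timespec {
               time_t tv_sec;        /* seconds */
               long   tv_nsec;       /* nanoseconds */
           };

Finally, note that although stat and lstat are called in the same way and do the same job for most files, they differ for symbolic links. stat returns the attributes of the file the link points to, while lstat returns the attributes of the link itself. For regular files there is no difference between the two.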

Here is an example that uses stat:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

// stat test: get a file's size, owner and group
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    // int stat(const char *pathname, struct stat *statbuf);
    struct stat st;
    if (stat(argv[1], &st) < 0) {
        perror("stat error");
        return -1;
    }
    printf("uid = %u\n", (unsigned)st.st_uid);
    printf("gid = %u\n", (unsigned)st.st_gid);
    printf("size = %ld\n", (long)st.st_size);
    printf("inode = %lu\n", (unsigned long)st.st_ino);

    // Method 1: test the file type with the S_IFMT mask
    switch (st.st_mode & S_IFMT) {
    case S_IFSOCK: printf("socket\n");           break;
    case S_IFREG:  printf("regular file\n");     break;
    case S_IFLNK:  printf("symbolic link\n");    break;
    case S_IFBLK:  printf("block device\n");     break;
    case S_IFDIR:  printf("directory\n");        break;
    case S_IFCHR:  printf("character device\n"); break;
    case S_IFIFO:  printf("FIFO\n");             break;
    default:       printf("unknown file\n");
    }

    // Method 2: test the file type with the S_IS* macros.
    // stat follows symbolic links, so S_ISLNK is only ever true after lstat.
    if (S_ISREG(st.st_mode))  { printf("regular file\n"); }
    if (S_ISDIR(st.st_mode))  { printf("directory\n"); }
    if (S_ISCHR(st.st_mode))  { printf("character device\n"); }
    if (S_ISBLK(st.st_mode))  { printf("block device\n"); }
    if (S_ISFIFO(st.st_mode)) { printf("FIFO\n"); }
    if (S_ISLNK(st.st_mode))  { printf("symbolic link\n"); }
    if (S_ISSOCK(st.st_mode)) { printf("socket\n"); }

    // Permissions
    // Owner
    putchar((st.st_mode & S_IRUSR) ? 'r' : '-');
    putchar((st.st_mode & S_IWUSR) ? 'w' : '-');
    putchar((st.st_mode & S_IXUSR) ? 'x' : '-');
    // Group
    putchar((st.st_mode & S_IRGRP) ? 'r' : '-');
    putchar((st.st_mode & S_IWGRP) ? 'w' : '-');
    putchar((st.st_mode & S_IXGRP) ? 'x' : '-');
    // Others
    putchar((st.st_mode & S_IROTH) ? 'r' : '-');
    putchar((st.st_mode & S_IWOTH) ? 'w' : '-');
    putchar((st.st_mode & S_IXOTH) ? 'x' : '-');
    putchar('\n');

    return 0;
}

2.2 Directory operation functions

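The functions for working with directories are as follows:

Function	Prototype	Parameters	Return value	Purpose
opendir	DIR *opendir(const char *name);	name: directory name	Returns a pointer to the directory stream on success; returns NULL and sets errno on failure	Opens a directory
readdir	struct dirent *readdir(DIR *dirp);	dirp: directory stream pointer	Returns a pointer to the next directory entry; returns NULL at the end of the directory (errno unchanged) or on error (errno set)	Reads one entry from the directory stream
closedir	int closedir(DIR *dirp);	dirp: directory stream pointer	Returns 0 on success; returns -1 and sets errno on failure	Closes the directory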

Calling these functions requires the headers sys/types.h and dirent.h. The structure struct dirent is defined as follows:

struct dirent {
    ino_t          d_ino;       /* Inode number */
    off_t          d_off;       /* Not an offset; see below */
    unsigned short d_reclen;    /* Length of this record */
    unsigned char  d_type;      /* Type of file; not supported
                                   by all filesystem types */
    char           d_name[256]; /* Null-terminated filename */
};

Here d_name is the name of the entry and d_type is its file type. The possible file type values are:

DT_BLK      This is a block device.
DT_CHR      This is a character device.
DT_DIR      This is a directory.
DT_FIFO     This is a named pipe (FIFO).
DT_LNK      This is a symbolic link.
DT_REG      This is a regular file.
DT_SOCK     This is a UNIX domain socket.
DT_UNKNOWN  The file type could not be determined.

Example code using the directory functions is shown below.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <dirent.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <directory>\n", argv[0]);
        return -1;
    }

    // Open the directory
    DIR *dir = opendir(argv[1]);
    if (dir == NULL) {
        perror("opendir error");
        return -1;
    }

    // Read the directory entries one by one
    struct dirent *ds = NULL;
    while ((ds = readdir(dir)) != NULL) {
        printf("filename: [%s] ", ds->d_name);

        // Determine the file type
        if (ds->d_type == DT_BLK) {
            printf("This is a block device!\n");
        } else if (ds->d_type == DT_CHR) {
            printf("This is a character device!\n");
        } else if (ds->d_type == DT_DIR) {
            printf("This is a directory!\n");
        } else if (ds->d_type == DT_FIFO) {
            printf("This is a named pipe!\n");
        } else if (ds->d_type == DT_LNK) {
            printf("This is a symbolic link!\n");
        } else if (ds->d_type == DT_REG) {
            printf("This is a regular file!\n");
        } else if (ds->d_type == DT_SOCK) {
            printf("This is a UNIX domain socket!\n");
        } else {
            printf("The file type could not be determined!\n");
        }
    }

    // Close the directory stream
    closedir(dir);
    return 0;
}

2.3 The dup/dup2/fcntl functions

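dup and dup2 are mainly used to duplicate file descriptors, while fcntl can duplicate a file descriptor and can also get and set a file's flags. These flags are the ones passed as the second argument of open when the file was opened. The prototypes of these functions are as follows: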

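Function	Prototype	Parameters	Return value	Purpose
dup	int dup(int oldfd);	oldfd: the file descriptor to duplicate	Returns the new file descriptor; returns -1 and sets errno on failure	Duplicates a file descriptor
dup2	int dup2(int oldfd, int newfd);	oldfd: the existing file descriptor; newfd: the descriptor number to use for the copy	Returns the new file descriptor (newfd) on success; returns -1 and sets errno on failure	Duplicates a file descriptor onto the specified newfd
fcntl	int fcntl(int fd, int cmd, ... /* arg */ );	fd: file descriptor; cmd: the operation to perform; ...: argument that depends on cmd	Depends on cmd	Duplicates file descriptors, gets file flags, sets file flags, and more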

These functions require the header unistd.h, and fcntl additionally requires fcntl.h. The commonly used cmd values of fcntl are listed below.

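cmd	Purpose	Return value
F_DUPFD	Duplicates the file descriptor, using the lowest free number greater than or equal to arg	Returns the new file descriptor on success; returns -1 and sets errno on failure
F_GETFL	Gets the file flags	Returns the file flags on success; returns -1 and sets errno on failure
F_SETFL	Sets the file flags	Returns 0 on success; returns -1 and sets errno on failure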

fcntl has many more cmd values; run man fcntl to read the full documentation.
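One detail of F_DUPFD worth seeing in code is its third argument, which sets a lower bound for the new descriptor number. The following is a minimal sketch; the wrapper name dup_at_least is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical wrapper: duplicates fd onto the lowest free descriptor
 * number that is greater than or equal to min_fd.
 * Returns the new descriptor, or -1 with errno set on failure. */
int dup_at_least(int fd, int min_fd)
{
    return fcntl(fd, F_DUPFD, min_fd);
}
```

Calling dup_at_least(fd, 0) behaves like dup(fd), while a larger bound such as dup_at_least(fd, 100) keeps the copy away from the low-numbered descriptors.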

Common fcntl operations are as follows:

// 1. Duplicate the file descriptor:
int newfd = fcntl(fd, F_DUPFD, 0);
// 2. Get the file status flags
int flag = fcntl(fd, F_GETFL, 0);
// 3. Set the file status flags
flag = flag | O_APPEND;
fcntl(fd, F_SETFL, flag);

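Duplicating a file descriptor works as follows: the new descriptor refers to the same open file as the original, so several file descriptors end up pointing at one file. Calling close on just one of them does not really close the file; the file is closed only after every descriptor that refers to it has been closed. Because all of these descriptors operate on the same open file, they share a single file offset.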
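The shared offset can be observed directly. The following sketch writes through one descriptor and then asks for the current offset through its duplicate; the helper name offset_seen_by_dup is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes through fd, then returns the file
 * offset as seen through a duplicate of fd. Returns (off_t)-1 on failure. */
off_t offset_seen_by_dup(int fd, const char *data, size_t len)
{
    int copy = dup(fd);
    if (copy == -1) {
        return (off_t)-1;
    }

    off_t result = (off_t)-1;
    if (write(fd, data, len) == (ssize_t)len) {
        /* lseek(..., 0, SEEK_CUR) only reports the current offset */
        result = lseek(copy, 0, SEEK_CUR);
    }

    close(copy); /* closes only the duplicate; fd stays open and usable */
    return result;
}
```

On a freshly opened, empty file, writing 5 bytes through fd makes the duplicate report an offset of 5, and fd can still be written to after the duplicate has been closed.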

Because dup2 lets us choose the target descriptor number, it can be used to redirect a program's input or output to a file.

Let's look at some examples of these functions.

Using dup:

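#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }

    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Duplicate the file descriptor
    int newfd = dup(fd);
    if (newfd < 0) {
        perror("dup error");
        close(fd);
        return -1;
    }

    // Write to the file through fd
    write(fd, "helloworld", strlen("helloworld"));

    // Move the file offset back to the start (fd and newfd share it)
    lseek(fd, 0, SEEK_SET);

    // Read the file through newfd
    char buf[1024];
    memset(buf, 0x00, sizeof buf);
    read(newfd, buf, sizeof(buf) - 1);
    printf("%s\n", buf);

    close(fd);
    close(newfd);
    return 0;
}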

Using dup2:

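#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Target descriptor number, assumed to be unused in this program;
    // if it were open, dup2 would silently close it first.
    int newfd = 10;
    // dup2(oldfd, newfd): make newfd refer to the same file as fd
    if (dup2(fd, newfd) < 0) {
        perror("dup2 error");
        close(fd);
        return -1;
    }

    // Write data through fd
    write(fd, "nihaoya,damahou", strlen("nihaoya,damahou"));
    lseek(fd, 0, SEEK_SET);

    // Read the data back through newfd
    char buf[1024];
    memset(buf, 0x00, sizeof buf);
    read(newfd, buf, sizeof buf - 1);
    printf("buf = %s\n", buf);

    close(fd);
    close(newfd);
    return 0;
}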

Using dup2 for redirection:

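// Redirect standard output to a file
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }

    // Open the file, creating it if necessary
    int fd = open(argv[1], O_RDWR | O_CREAT, 0777);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Redirect output: descriptor 1 now refers to the file
    if (dup2(fd, STDOUT_FILENO) < 0) {
        perror("dup2 error");
        close(fd);
        return -1;
    }

    printf("Nice one, 6666\n");
    printf("NB Plus\n");
    printf("Big monkey, you can do it\n");

    // stdout is fully buffered when it is a file, so flush explicitly
    fflush(stdout);

    close(fd);
    return 0;
}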
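The program above leaves standard output redirected until it exits. If the original destination is needed again, one possible approach is to save a copy of descriptor 1 with dup before the redirection and restore it with dup2 afterwards. The sketch below assumes that approach; the function name print_to_file is hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: prints message into the file at path via printf,
 * then restores the original standard output.
 * Returns 0 on success, -1 on failure. */
int print_to_file(const char *path, const char *message)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        return -1;
    }

    fflush(stdout);                     /* flush pending output to the old target */
    int saved = dup(STDOUT_FILENO);     /* remember where stdout pointed */
    if (saved == -1) {
        close(fd);
        return -1;
    }

    if (dup2(fd, STDOUT_FILENO) == -1) {
        close(saved);
        close(fd);
        return -1;
    }
    close(fd);                          /* descriptor 1 still refers to the file */

    printf("%s", message);
    fflush(stdout);                     /* push the stdio buffer into the file */

    int rc = (dup2(saved, STDOUT_FILENO) == -1) ? -1 : 0;  /* restore stdout */
    close(saved);
    return rc;
}
```

The two fflush calls matter: without them, text buffered by stdio could end up on the wrong side of the redirection.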

Using fcntl:

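#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }

    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Get the current flags and add O_APPEND
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        perror("fcntl F_GETFL error");
        close(fd);
        return -1;
    }
    flags = flags | O_APPEND;
    if (fcntl(fd, F_SETFL, flags) < 0) {
        perror("fcntl F_SETFL error");
        close(fd);
        return -1;
    }

    // Write to the file; with O_APPEND the data goes to the end of the file
    write(fd, "hello world", strlen("hello world"));

    // Close the file
    close(fd);
    return 0;
}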