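Linux System Programming, Day 04: File and Directory Operations
- 1. File I/O
- 1.1 The open function
- 1.2 The close function
- 1.3 The read function
- 1.4 The write function
- 1.5 The lseek function
- 1.6 The errno variable
- 1.7 File example 1: reading and writing a file
- 1.8 File example 2: computing a file's size
- 1.9 File example 3: extending a file
- 1.10 File example 4: using perror
- 1.11 Testing blocking and non-blocking reads
- 2. Files and directories
- 2.1 File-related functions
- 2.2 Directory-related functions
- 2.3 The dup, dup2, and fcntl functions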
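1. File I/O
In the C course we learned a family of standard C library functions for working with files, such as fopen, fclose, fread, fwrite, fscanf, and fprintf, all of which start with f. The file I/O functions in this section are Linux system calls instead. On Linux, fopen is implemented on top of the open system call, and fclose on top of close.
Standard C functions and system calls are not the same thing. System calls are the programming interface that the operating system implements and exposes to applications, so a program that relies on Linux system calls may no longer compile or run once it leaves Linux. It becomes less portable. A program that uses only standard C functions can be built on any platform with a conforming C implementation.
When we called fopen earlier, it returned a pointer of type FILE *. The structure behind that pointer maintains three important pieces of state: a file descriptor, the file position, and a file buffer. The default buffer size of a FILE stream depends on the implementation; with glibc, BUFSIZ is 8192 bytes. The Linux I/O system calls have no user-space buffer by default. The file descriptor, mentioned in the previous section, is simply an integer of type int.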
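When a process starts, three file descriptors are open by default:
#define STDIN_FILENO 0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2
A newly opened file receives the lowest-numbered descriptor that is not currently in use in the process's file descriptor table. By default a process can usually hold up to 1024 open descriptors; this is the soft RLIMIT_NOFILE limit, which can be inspected and changed with ulimit -n. Calling open opens or creates a file and returns a file descriptor.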
1.1 The open function
The key parts of the manual page are:
SYNOPSIS
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
DESCRIPTION
The open() system call opens the file specified by pathname. If the specified file does not exist, it may optionally (if O_CREAT is specified in flags) be created by open().
The return value of open() is a file descriptor, a small, nonnegative integer that is used in subsequent system calls (read(2), write(2), lseek(2), fcntl(2), etc.) to refer to the open file. The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process.
A call to open() creates a new open file description, an entry in the system-wide table of open files. The open file description records the file offset and the file status flags (see below). A file descriptor is a reference to an open file description; this reference is unaffected if pathname is subsequently removed or modified to refer to a different file. For further details on open file descriptions, see NOTES.
The argument flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-only, write-only, or read/write, respectively.
In addition, zero or more file creation flags and file status flags can be bitwise-or'd in flags. The file creation flags are O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC. The file status flags are all of the remaining flags listed below. The distinction between these two groups of flags is that the file creation flags affect the semantics of the open operation itself, while the file status flags affect the semantics of subsequent I/O operations. The file status flags can be retrieved and (in some cases) modified; see fcntl(2) for details.
The full list of file creation flags and file status flags is as follows:
O_APPEND
The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2). The modification of the file offset and the write operation are performed as a single atomic step.
O_APPEND may lead to corrupted files on NFS filesystems if more than one process appends data to a file at once. This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can't be done without a race condition.
O_CREAT
If pathname does not exist, create it as a regular file.
The owner (user ID) of the new file is set to the effective user ID of the process.
The group ownership (group ID) of the new file is set either to the effective group ID of the process (System V semantics) or to the group ID of the parent directory (BSD semantics). On Linux, the behavior depends on whether the set-group-ID mode bit is set on the parent directory: if that bit is set, then BSD semantics apply; otherwise, System V semantics apply. For some filesystems, the behavior also depends on the bsdgroups and sysvgroups mount options described in mount(8).
The mode argument specifies the file mode bits be applied when a new file is created. This argument must be supplied when O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT nor O_TMPFILE is specified, then mode is ignored. The effective mode is modified by the process's umask in the usual way: in the absence of a default ACL, the mode of the created file is (mode & ~umask). Note that this mode applies only to future accesses of the newly created file; the open() call that creates a read-only file may well return a read/write file descriptor.
The following symbolic constants are provided for mode:
S_IRWXU 00700 user (file owner) has read, write, and execute permission
S_IRUSR 00400 user has read permission
S_IWUSR 00200 user has write permission
S_IXUSR 00100 user has execute permission
S_IRWXG 00070 group has read, write, and execute permission
S_IRGRP 00040 group has read permission
S_IWGRP 00020 group has write permission
S_IXGRP 00010 group has execute permission
S_IRWXO 00007 others have read, write, and execute permission
S_IROTH 00004 others have read permission
S_IWOTH 00002 others have write permission
S_IXOTH 00001 others have execute permission
According to POSIX, the effect when other bits are set in mode is unspecified. On Linux, the following bits are also honored in mode:
S_ISUID 0004000 set-user-ID bit
S_ISGID 0002000 set-group-ID bit (see inode(7)).
S_ISVTX 0001000 sticky bit (see inode(7)).
O_TRUNC
If the file already exists and is a regular file and the access mode allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated to length 0. If the file is a FIFO or terminal device file, the O_TRUNC flag is ignored. Otherwise, the effect of O_TRUNC is unspecified.
RETURN VALUE
open(), openat(), and creat() return the new file descriptor, or -1 if an error occurred (in which case, errno is set appropriately).
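The text above outlines how open is used. From it we can see that using open requires three headers: sys/types.h, sys/stat.h, and fcntl.h. open can be called in two forms:
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
The function opens a file and returns its file descriptor. The first two parameters are the same in both forms: pathname is the path of the file, and flags is a set of flags. Some of the important flags are:
Flag | Meaning |
---|---|
O_RDWR | Open for reading and writing |
O_RDONLY | Open read-only |
O_WRONLY | Open write-only |
O_APPEND | Append: every write goes to the end of the file |
O_CREAT | Create the file if it does not exist; requires the third argument, mode |
O_TRUNC | If the file exists, truncate it to length 0 |
When O_CREAT is specified, the third argument mode must be supplied. mode gives the permission bits of the new file; the permissions actually applied are mode & ~umask. The available bits are:
mode | Permission |
---|---|
S_IRWXU | Owner can read, write, and execute |
S_IRUSR | Owner can read |
S_IWUSR | Owner can write |
S_IXUSR | Owner can execute |
S_IRWXG | Group can read, write, and execute |
S_IRGRP | Group can read |
S_IWGRP | Group can write |
S_IXGRP | Group can execute |
S_IRWXO | Others can read, write, and execute |
S_IROTH | Others can read |
S_IWOTH | Others can write |
S_IXOTH | Others can execute |
Several flags, and several mode values, can be combined with the bitwise OR operator |.
Finally, the return value: on success open returns a new file descriptor; on error it returns -1 and sets errno accordingly.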
1.2 The close function
SYNOPSIS
#include <unistd.h>
int close(int fd);
DESCRIPTION
close() closes a file descriptor, so that it no longer refers to any file and may be reused. Any record locks (see fcntl(2)) held on the file it was associated with, and owned by the process, are removed (regardless of the file descriptor that was used to obtain the lock).
If fd is the last file descriptor referring to the underlying open file description (see open(2)), the resources associated with the open file description are freed; if the file descriptor was the last reference to a file which has been removed using unlink(2), the file is deleted.
RETURN VALUE
close() returns zero on success. On error, -1 is returned, and errno is set appropriately.
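The prototype is:
int close(int fd);
The function closes an open file. The parameter fd is the open file descriptor. It returns 0 on success; on failure it returns -1 and sets errno. Note that close needs a different header from open: it is declared in unistd.h.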
1.3 The read function
SYNOPSIS
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
DESCRIPTION
read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.
On files that support seeking, the read operation commences at the file offset, and the file offset is incremented by the number of bytes read. If the file offset is at or past the end of file, no bytes are read, and read() returns zero.
If count is zero, read() may detect the errors described below. In the absence of any errors, or if read() does not check for errors, a read() with a count of 0 returns zero and has no other effects.
According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper limit on Linux.
RETURN VALUE
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.
On error, -1 is returned, and errno is set appropriately. In this case, it is left unspecified whether the file position (if any) changes.
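read also needs the header unistd.h. Its prototype is:
ssize_t read(int fd, void *buf, size_t count);
The function reads up to count bytes from the file that fd refers to and stores them in buf. fd is the file descriptor, buf is the address of the buffer, and count is the maximum number of bytes to read. The return value is the number of bytes actually read, which may be smaller than count; 0 means end of file. On failure it returns -1 and sets errno.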
1.4 The write function
SYNOPSIS
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
DESCRIPTION
write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd.
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
For a seekable file (i.e., one to which lseek(2) may be applied, for example, a regular file) writing takes place at the file offset, and the file offset is incremented by the number of bytes actually written. If the file was open(2)ed with O_APPEND, the file offset is first set to the end of the file before writing. The adjustment of the file offset and the write operation are performed as an atomic step.
POSIX requires that a read(2) that can be proved to occur after a write() has returned will return the new data. Note that not all filesystems are POSIX conforming.
According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper limit on Linux.
RETURN VALUE
On success, the number of bytes written is returned. On error, -1 is returned, and errno is set to indicate the cause of the error.
Note that a successful write() may transfer fewer than count bytes. Such partial writes can occur for various reasons; for example, because there was insufficient space on the disk device to write all of the requested bytes, or because a blocked write() to a socket, pipe, or similar was interrupted by a signal handler after it had transferred some, but before it had transferred all of the requested bytes. In the event of a partial write, the caller can make another write() call to transfer the remaining bytes. The subsequent call will either transfer further bytes or may result in an error (e.g., if the disk is now full).
If count is zero and fd refers to a regular file, then write() may return a failure status if one of the errors below is detected. If no errors are detected, or error detection is not performed, 0 will be returned without causing any other effect. If count is zero and fd refers to a file other than a regular file, the results are not specified.
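write needs the same header as read. Its prototype is:
ssize_t write(int fd, const void *buf, size_t count);
The function writes the first count bytes of buf to the file that fd refers to. fd is the file descriptor, buf is the buffer holding the data to write, and count is the number of bytes to write. The return value is the number of bytes actually written, which may be smaller than count; on failure it returns -1 and sets errno.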
1.5 The lseek function
SYNOPSIS
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int whence);
DESCRIPTION
lseek() repositions the file offset of the open file description associated with the file descriptor fd to the argument offset according to the directive whence as follows:
SEEK_SET
The file offset is set to offset bytes.
SEEK_CUR
The file offset is set to its current location plus offset bytes.
SEEK_END
The file offset is set to the size of the file plus offset bytes.
lseek() allows the file offset to be set beyond the end of the file (but this does not change the size of the file). If data is later written at this point, subsequent reads of the data in the gap (a "hole") return null bytes ('\0') until data is actually written into the gap.
RETURN VALUE
Upon successful completion, lseek() returns the resulting offset location as measured in bytes from the beginning of the file. On error, the value (off_t) -1 is returned and errno is set to indicate the error.
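lseek needs the headers sys/types.h and unistd.h. Its prototype is:
off_t lseek(int fd, off_t offset, int whence);
The function changes the file offset: it moves the offset of the file that fd refers to by offset bytes, measured from the position named by whence. fd is the file descriptor, offset is the displacement in bytes, and whence is the starting point (SEEK_SET for the beginning of the file, SEEK_CUR for the current offset, SEEK_END for the end of the file). The return value is the resulting offset in bytes from the beginning of the file; on failure it returns (off_t) -1 and sets errno.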
1.6 The errno variable
ERRNO(3) Linux Programmer's Manual ERRNO(3)
NAME
errno - number of last error
SYNOPSIS
#include <errno.h>
DESCRIPTION
The <errno.h> header file defines the integer variable errno, which is set by system calls and some library functions in the event of an error to indicate what went wrong.
errno
The value in errno is significant only when the return value of the call indicated an error (i.e., -1 from most system calls; -1 or NULL from most library functions); a function that succeeds is allowed to change errno. The value of errno is never set to zero by any system call or library function.
For some system calls and library functions (e.g., getpriority(2)), -1 is a valid return on success. In such cases, a successful return can be distinguished from an error return by setting errno to zero before the call, and then, if the call returns a status that indicates that an error may have occurred, checking to see if errno has a nonzero value.
errno is defined by the ISO C standard to be a modifiable lvalue of type int, and must not be explicitly declared; errno may be a macro. errno is thread-local; setting it in one thread does not affect its value in any other thread.
Error numbers and names
Valid error numbers are all positive numbers. The <errno.h> header file defines symbolic names for each of the possible error numbers that may appear in errno.
All the error names specified by POSIX.1 must have distinct values, with the exception of EAGAIN and EWOULDBLOCK, which may be the same. On Linux, these two have the same value on all architectures.
The error numbers that correspond to each symbolic name vary across UNIX systems, and even across different architectures on Linux. Therefore, numeric values are not included as part of the list of error names below. The perror(3) and strerror(3) functions can be used to convert these names to corresponding textual error messages.
On any particular Linux system, one can obtain a list of all symbolic error names and the corresponding error numbers using the errno(1) command (part of the moreutils package):
$ errno -l
EPERM 1 Operation not permitted
ENOENT 2 No such file or directory
ESRCH 3 No such process
EINTR 4 Interrupted system call
EIO 5 Input/output error
...
The errno(1) command can also be used to look up individual error numbers and names, and to search for errors using strings from the error description, as in the following examples:
$ errno 2
ENOENT 2 No such file or directory
$ errno ESRCH
ESRCH 3 No such process
$ errno -s permission
EACCES 13 Permission denied
List of error names
In the list of the symbolic error names below, various names are marked as follows:
* POSIX.1-2001: The name is defined by POSIX.1-2001, and is defined in later POSIX.1 versions, unless otherwise indicated.
* POSIX.1-2008: The name is defined in POSIX.1-2008, but was not present in earlier POSIX.1 standards.
* C99: The name is defined by C99.
Below is a list of the symbolic error names that are defined on Linux:
E2BIG Argument list too long (POSIX.1-2001).
EACCES Permission denied (POSIX.1-2001).
EADDRINUSE Address already in use (POSIX.1-2001).
EADDRNOTAVAIL Address not available (POSIX.1-2001).
EAFNOSUPPORT Address family not supported (POSIX.1-2001).
EAGAIN Resource temporarily unavailable (may be the same value as EWOULDBLOCK) (POSIX.1-2001).
EALREADY Connection already in progress (POSIX.1-2001).
EBADE Invalid exchange.
EBADF Bad file descriptor (POSIX.1-2001).
EBADFD File descriptor in bad state.
EBADMSG Bad message (POSIX.1-2001).
EBADR Invalid request descriptor.
EBADRQC Invalid request code.
EBADSLT Invalid slot.
EBUSY Device or resource busy (POSIX.1-2001).
ECANCELED Operation canceled (POSIX.1-2001).
ECHILD No child processes (POSIX.1-2001).
ECHRNG Channel number out of range.
ECOMM Communication error on send.
ECONNABORTED Connection aborted (POSIX.1-2001).
ECONNREFUSED Connection refused (POSIX.1-2001).
ECONNRESET Connection reset (POSIX.1-2001).
EDEADLK Resource deadlock avoided (POSIX.1-2001).
EDEADLOCK On most architectures, a synonym for EDEADLK. On some architectures (e.g., Linux MIPS, PowerPC, SPARC), it is a separate error code "File locking deadlock error".
EDESTADDRREQ Destination address required (POSIX.1-2001).
EDOM Mathematics argument out of domain of function (POSIX.1, C99).
EDQUOT Disk quota exceeded (POSIX.1-2001).
EEXIST File exists (POSIX.1-2001).
EFAULT Bad address (POSIX.1-2001).
EFBIG File too large (POSIX.1-2001).
EHOSTDOWN Host is down.
EHOSTUNREACH Host is unreachable (POSIX.1-2001).
EHWPOISON Memory page has hardware error.
EIDRM Identifier removed (POSIX.1-2001).
EILSEQ Invalid or incomplete multibyte or wide character (POSIX.1, C99). The text shown here is the glibc error description; in POSIX.1, this error is described as "Illegal byte sequence".
EINPROGRESS Operation in progress (POSIX.1-2001).
EINTR Interrupted function call (POSIX.1-2001); see signal(7).
EINVAL Invalid argument (POSIX.1-2001).
EIO Input/output error (POSIX.1-2001).
EISCONN Socket is connected (POSIX.1-2001).
EISDIR Is a directory (POSIX.1-2001).
EISNAM Is a named type file.
EKEYEXPIRED Key has expired.
EKEYREJECTED Key was rejected by service.
EKEYREVOKED Key has been revoked.
EL2HLT Level 2 halted.
EL2NSYNC Level 2 not synchronized.
EL3HLT Level 3 halted.
EL3RST Level 3 reset.
ELIBACC Cannot access a needed shared library.
ELIBBAD Accessing a corrupted shared library.
ELIBMAX Attempting to link in too many shared libraries.
ELIBSCN .lib section in a.out corrupted
ELIBEXEC Cannot exec a shared library directly.
ELNRANGE Link number out of range.
ELOOP Too many levels of symbolic links (POSIX.1-2001).
EMEDIUMTYPE Wrong medium type.
EMFILE Too many open files (POSIX.1-2001). Commonly caused by exceeding the RLIMIT_NOFILE resource limit described in getrlimit(2).
EMLINK Too many links (POSIX.1-2001).
EMSGSIZE Message too long (POSIX.1-2001).
EMULTIHOP Multihop attempted (POSIX.1-2001).
ENAMETOOLONG Filename too long (POSIX.1-2001).
ENETDOWN Network is down (POSIX.1-2001).
ENETRESET Connection aborted by network (POSIX.1-2001).
ENETUNREACH Network unreachable (POSIX.1-2001).
ENFILE Too many open files in system (POSIX.1-2001). On Linux, this is probably a result of encountering the /proc/sys/fs/file-max limit (see proc(5)).
ENOANO No anode.
ENOBUFS No buffer space available (POSIX.1 (XSI STREAMS option)).
ENODATA No message is available on the STREAM head read queue (POSIX.1-2001).
ENODEV No such device (POSIX.1-2001).
ENOENT No such file or directory (POSIX.1-2001). Typically, this error results when a specified pathname does not exist, or one of the components in the directory prefix of a pathname does not exist, or the specified pathname is a dangling symbolic link.
ENOEXEC Exec format error (POSIX.1-2001).
ENOKEY Required key not available.
ENOLCK No locks available (POSIX.1-2001).
ENOLINK Link has been severed (POSIX.1-2001).
ENOMEDIUM No medium found.
ENOMEM Not enough space/cannot allocate memory (POSIX.1-2001).
ENOMSG No message of the desired type (POSIX.1-2001).
ENONET Machine is not on the network.
ENOPKG Package not installed.
ENOPROTOOPT Protocol not available (POSIX.1-2001).
ENOSPC No space left on device (POSIX.1-2001).
ENOSR No STREAM resources (POSIX.1 (XSI STREAMS option)).
ENOSTR Not a STREAM (POSIX.1 (XSI STREAMS option)).
ENOSYS Function not implemented (POSIX.1-2001).
ENOTBLK Block device required.
ENOTCONN The socket is not connected (POSIX.1-2001).
ENOTDIR Not a directory (POSIX.1-2001).
ENOTEMPTY Directory not empty (POSIX.1-2001).
ENOTRECOVERABLE State not recoverable (POSIX.1-2008).
ENOTSOCK Not a socket (POSIX.1-2001).
ENOTSUP Operation not supported (POSIX.1-2001).
ENOTTY Inappropriate I/O control operation (POSIX.1-2001).
ENOTUNIQ Name not unique on network.
ENXIO No such device or address (POSIX.1-2001).
EOPNOTSUPP Operation not supported on socket (POSIX.1-2001). (ENOTSUP and EOPNOTSUPP have the same value on Linux, but according to POSIX.1 these error values should be distinct.)
EOVERFLOW Value too large to be stored in data type (POSIX.1-2001).
EOWNERDEAD Owner died (POSIX.1-2008).
EPERM Operation not permitted (POSIX.1-2001).
EPFNOSUPPORT Protocol family not supported.
EPIPE Broken pipe (POSIX.1-2001).
EPROTO Protocol error (POSIX.1-2001).
EPROTONOSUPPORT Protocol not supported (POSIX.1-2001).
EPROTOTYPE Protocol wrong type for socket (POSIX.1-2001).
ERANGE Result too large (POSIX.1, C99).
EREMCHG Remote address changed.
EREMOTE Object is remote.
EREMOTEIO Remote I/O error.
ERESTART Interrupted system call should be restarted.
ERFKILL Operation not possible due to RF-kill.
EROFS Read-only filesystem (POSIX.1-2001).
ESHUTDOWN Cannot send after transport endpoint shutdown.
ESPIPE Invalid seek (POSIX.1-2001).
ESOCKTNOSUPPORT Socket type not supported.
ESRCH No such process (POSIX.1-2001).
ESTALE Stale file handle (POSIX.1-2001). This error can occur for NFS and for other filesystems.
ESTRPIPE Streams pipe error.
ETIME Timer expired (POSIX.1 (XSI STREAMS option)). (POSIX.1 says "STREAM ioctl(2) timeout".)
ETIMEDOUT Connection timed out (POSIX.1-2001).
ETOOMANYREFS Too many references: cannot splice.
ETXTBSY Text file busy (POSIX.1-2001).
EUCLEAN Structure needs cleaning.
EUNATCH Protocol driver not attached.
EUSERS Too many users.
EWOULDBLOCK Operation would block (may be same value as EAGAIN) (POSIX.1-2001).
EXDEV Improper link (POSIX.1-2001).
EXFULL Exchange full.
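Note that to read or set errno you must include the header errno.h. When an error occurs, the perror function prints the corresponding error message. To get the message string that an error number stands for, use strerror (declared in string.h). Its prototype is:
char *strerror(int errnum);
The argument is an error number such as errno, and the return value is the message describing that error number.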
1.7 File example 1: reading and writing a file
Here we use Linux system calls to write a program that opens a file, writes data to it with write, and then reads the content back with read.
// Using open, write, lseek, and read
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    printf("filename = [%s]\n", argv[1]);
    // Open the file (creating it if necessary) and get its file descriptor
    int fd = open(argv[1], O_RDWR | O_CREAT, S_IRWXU | S_IRWXG | S_IRWXO);
    // open returns -1 on failure
    if (fd < 0) {
        perror("file open error");
        return -1;
    }
    printf("fd = [%d]\n", fd);
    // Write to the file
    ssize_t size = write(fd, "hello world", strlen("hello world"));
    printf("write size = [%zd]\n", size);
    // Move the file offset back to the beginning
    off_t offset = lseek(fd, 0, SEEK_SET);
    printf("offset = [%ld]\n", (long)offset);
    // Read the file, leaving room for the terminating '\0'
    char buf[128];
    memset(buf, 0, sizeof buf);
    size = read(fd, buf, sizeof buf - 1);
    printf("read size = [%zd]\n", size);
    printf("read = [%s]\n", buf);
    close(fd);
    return 0;
}
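1.8 File example 2: computing a file's size
Use lseek to compute the size of a file: seeking to the end returns the end offset, which equals the size in bytes.
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // Read-only access is enough to seek
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("file open error");
        return -1;
    }
    off_t size = lseek(fd, 0, SEEK_END);
    if (size == (off_t)-1) {
        perror("lseek error");
        close(fd);
        return -1;
    }
    printf("[%s] size = [%ld]\n", argv[1], (long)size);
    close(fd);
    return 0;
}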
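1.9 File example 3: extending a file
Use lseek to turn a small file into a larger one. Move the file offset to the last byte of the desired size and then perform one write; moving the offset alone does not change the file size.
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("file open error");
        return -1;
    }
    // Extend to 200 bytes: seek to offset 199, the last byte of a 200-byte file
    // (seeking to 200 and writing one byte would give 201 bytes)
    if (lseek(fd, 199, SEEK_SET) == (off_t)-1) {
        perror("lseek error");
        close(fd);
        return -1;
    }
    // One write makes the new size take effect
    if (write(fd, "a", 1) != 1) {
        perror("write error");
    }
    close(fd);
    return 0;
}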
1.10 File example 4: using perror
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        int saved = errno;   // perror itself may change errno
        perror("open error");
        if (saved == ENOENT) {
            printf("same\n");
        }
        return -1;
    }
    // Print the messages for the first 64 error numbers
    int n = 0;
    for (n = 0; n < 64; n++) {
        printf("[%d]:[%s]\n", n, strerror(n));
    }
    close(fd);
    return 0;
}
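1.11 Testing blocking and non-blocking reads
On Linux, reading a file can be blocking or non-blocking. How do we tell whether blocking is a property of the file or of the read function? We use read on different types of files. If every type of file gives the same behavior, blocking is a property of read; if different types of files behave differently, blocking is a property of the file, not of read.
Reading a regular file with read:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <fcntl.h>
// Check whether read blocks on a regular file
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }
    // Read the file, leaving room for the terminating '\0'
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    printf("first: n = [%zd], buf = [%s]\n", n, buf);
    // Read again to see whether read blocks at end of file
    memset(buf, 0, sizeof(buf));
    n = read(fd, buf, sizeof(buf) - 1);
    printf("second: n = [%zd], buf = [%s]\n", n, buf);
    // Close the file
    close(fd);
    return 0;
}
Reading a device file with read:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <fcntl.h>
// Check that read blocks on a device file
int main(void)
{
    // Standard input
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    printf("n = [%zd], buf = [%s]\n", n, buf);
    return 0;
}
From these two tests we can conclude that blocking versus non-blocking is a property of the file being read, not of the read function: reading a regular file returns immediately (with 0 at end of file), while reading from the terminal waits until input arrives.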
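2. Files and directories
The sections above quoted a lot of English text. It comes from the manual pages that Linux provides for system developers, which can be viewed with the man command:
man topic
man section topic
The manual has 9 sections. Without a section number, man shows the first section in which the topic appears; if the same name exists in several sections, add the section number to pick one. The sections we consult most in system programming are: section 1, executable programs and shell commands; section 2, system calls; section 3, C library calls. The others were covered earlier in the Linux and Unix part of the C basics course.
In this Linux system programming course we will use man heavily to consult the documentation. Learning how to look things up and how to program from the documentation is important. From here on the manual pages will no longer be reproduced; run man on Linux yourself when you need them.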
2.1 File-related functions
The file-related functions are:
Function | Prototype | Parameters | Return value | Purpose |
---|---|---|---|---|
stat | int stat(const char *pathname, struct stat *statbuf); | pathname: path of the file; statbuf: memory that receives the file status | 0 on success; -1 on failure, with errno set | Stores the status of the file pathname in statbuf, following symbolic links |
lstat | int lstat(const char *pathname, struct stat *statbuf); | pathname: path of the file; statbuf: memory that receives the file status | 0 on success; -1 on failure, with errno set | Same as stat, except that if pathname is a symbolic link it reports on the link itself |
These functions need the headers sys/types.h, sys/stat.h, and unistd.h. struct stat is defined as follows:
struct stat {
    dev_t st_dev; /* ID of device containing file */
    ino_t st_ino; /* Inode number */
    mode_t st_mode; /* File type and mode */
    nlink_t st_nlink; /* Number of hard links */
    uid_t st_uid; /* User ID of owner */
    gid_t st_gid; /* Group ID of owner */
    dev_t st_rdev; /* Device ID (if special file) */
    off_t st_size; /* Total size, in bytes */
    blksize_t st_blksize; /* Block size for filesystem I/O */
    blkcnt_t st_blocks; /* Number of 512B blocks allocated */
    /* Since Linux 2.6, the kernel supports nanosecond
       precision for the following timestamp fields.
       For the details before Linux 2.6, see NOTES. */
    struct timespec st_atim; /* Time of last access */
    struct timespec st_mtim; /* Time of last modification */
    struct timespec st_ctim; /* Time of last status change */
#define st_atime st_atim.tv_sec /* Backward compatibility */
#define st_mtime st_mtim.tv_sec
#define st_ctime st_ctim.tv_sec
};
From the comments above we can see that the st_mode member stores both the file type and the permissions, encoded as bit fields. The file types are:
S_IFMT 0170000 bit mask for the file type bit field
S_IFSOCK 0140000 socket
S_IFLNK 0120000 symbolic link
S_IFREG 0100000 regular file
S_IFBLK 0060000 block device
S_IFDIR 0040000 directory
S_IFCHR 0020000 character device
S_IFIFO 0010000 FIFO
S_IFMT is the mask for the file type. To determine the type, compute st_mode & S_IFMT; the result equals one of the file type values above. For example, (st_mode & S_IFMT) == S_IFDIR tests whether a file is a directory. There is also a second way to test the file type, using macros:
S_ISREG(m) is it a regular file?
S_ISDIR(m) directory?
S_ISCHR(m) character device?
S_ISBLK(m) block device?
S_ISFIFO(m) FIFO (named pipe)?
S_ISLNK(m) symbolic link? (Not in POSIX.1-1996.)
S_ISSOCK(m) socket? (Not in POSIX.1-1996.)
Here m is st_mode, and each macro evaluates to true or false for one specific type. The difference from the first method is that the masked value can be used in a switch, while the macros have to be tested one by one with if. For example, S_ISBLK(st_mode) tests whether a file is a block device.
st_mode also holds the permissions of the owner, the group, and others:
S_IRWXU 00700 owner has read, write, and execute permission
S_IRUSR 00400 owner has read permission
S_IWUSR 00200 owner has write permission
S_IXUSR 00100 owner has execute permission
S_IRWXG 00070 group has read, write, and execute permission
S_IRGRP 00040 group has read permission
S_IWGRP 00020 group has write permission
S_IXGRP 00010 group has execute permission
S_IRWXO 00007 others (not in group) have read, write, and execute permission
S_IROTH 00004 others have read permission
S_IWOTH 00002 others have write permission
S_IXOTH 00001 others have execute permission
To test a permission, AND st_mode with the corresponding bit; a nonzero result means the permission is granted. For example, st_mode & S_IRUSR tests whether the owner has read permission.
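For timestamps, the macros st_atime, st_mtime, and st_ctime defined at the end of the structure are usually the most convenient, and they keep older code compiling. Each of them expands to the tv_sec member of the corresponding struct timespec field (st_atim, st_mtim, st_ctim), so it yields whole seconds only; use the timespec fields directly when the nanoseconds are needed as well. struct timespec is defined as follows:
struct timespec {
    time_t tv_sec; /* seconds */
    long tv_nsec; /* nanoseconds */
};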
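Finally, although stat and lstat are called in the same way and do the same job, they differ for symbolic links. stat returns the attributes of the file the link points to, while lstat returns the attributes of the link itself. For anything that is not a symbolic link, the two behave identically.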
Here is an example that uses stat:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
// stat test: print a file's owner, group, size, inode, type, and permissions
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    struct stat st;
    if (stat(argv[1], &st) < 0) {
        perror("stat error");
        return -1;
    }
    printf("uid = %ld\n", (long)st.st_uid);
    printf("gid = %ld\n", (long)st.st_gid);
    printf("size = %ld\n", (long)st.st_size);
    printf("inode = %lu\n", (unsigned long)st.st_ino);
    // Method 1: mask st_mode with S_IFMT and switch on the result
    switch (st.st_mode & S_IFMT) {
    case S_IFSOCK: printf("socket\n"); break;
    case S_IFREG: printf("regular file\n"); break;
    case S_IFLNK: printf("symbolic link\n"); break;
    case S_IFBLK: printf("block device\n"); break;
    case S_IFDIR: printf("directory\n"); break;
    case S_IFCHR: printf("character device\n"); break;
    case S_IFIFO: printf("FIFO\n"); break;
    default: printf("unknown file\n");
    }
    // Method 2: the S_ISxxx macros.
    // stat follows symbolic links, so S_ISLNK can only be true after lstat.
    if (S_ISREG(st.st_mode)) printf("regular file\n");
    if (S_ISDIR(st.st_mode)) printf("directory\n");
    if (S_ISCHR(st.st_mode)) printf("character device\n");
    if (S_ISBLK(st.st_mode)) printf("block device\n");
    if (S_ISFIFO(st.st_mode)) printf("FIFO\n");
    if (S_ISLNK(st.st_mode)) printf("symbolic link\n");
    if (S_ISSOCK(st.st_mode)) printf("socket\n");
    // Permissions of the owner
    putchar(st.st_mode & S_IRUSR ? 'r' : '-');
    putchar(st.st_mode & S_IWUSR ? 'w' : '-');
    putchar(st.st_mode & S_IXUSR ? 'x' : '-');
    // Permissions of the group
    putchar(st.st_mode & S_IRGRP ? 'r' : '-');
    putchar(st.st_mode & S_IWGRP ? 'w' : '-');
    putchar(st.st_mode & S_IXGRP ? 'x' : '-');
    // Permissions of others
    putchar(st.st_mode & S_IROTH ? 'r' : '-');
    putchar(st.st_mode & S_IWOTH ? 'w' : '-');
    putchar(st.st_mode & S_IXOTH ? 'x' : '-');
    putchar('\n');
    return 0;
}
2.2 目錄操作相關函數
目錄操作的相關函數如下:
函數名 | 函數原型 | 函數參數 | 函數返回值 | 作用 |
---|---|---|---|---|
opendir | DIR *opendir(const char *name); | name: 目錄名 | 成功返回指向目錄流的指針,失敗返回NULL并設置errno | 打開一個目錄 |
readdir | struct dirent *readdir(DIR *dirp); | dirp: 目錄流指針 | 返回一個指向目錄結構的指針,失敗返回NULL并設置errno | 讀取目錄流的一個目錄結構 |
closedir | int closedir(DIR *dirp); | dirp: 目錄流指針 | 成功返回0,失敗返回-1并設置errno | 關閉目錄 |
上面這些函數的調用需要頭文件sys/types.h
、dirent.h
。其中結構體struct dirent
的定義如下:
struct dirent {ino_t d_ino; /* Inode number */off_t d_off; /* Not an offset; see below */unsigned short d_reclen; /* Length of this record */unsigned char d_type; /* Type of file; not supportedby all filesystem types */char d_name[256]; /* Null-terminated filename */};
其中d_name
是該文件的名字,d_type
是文件類型。文件類型的值如下:
DT_BLK This is a block device.DT_CHR This is a character device.DT_DIR This is a directory.DT_FIFO This is a named pipe (FIFO).DT_LNK This is a symbolic link.DT_REG This is a regular file.DT_SOCK This is a UNIX domain socket.DT_UNKNOWN The file type could not be determined.
Example code using the directory functions is shown below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <dirent.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <directory>\n", argv[0]);
        return -1;
    }

    // Open the directory
    DIR *dir = opendir(argv[1]);
    if (dir == NULL) {
        perror("opendir error");
        return -1;
    }

    // Read the directory entries one by one
    struct dirent *ds = NULL;
    while ((ds = readdir(dir)) != NULL) {
        printf("filename: [%s] ", ds->d_name);
        // Determine the file type
        if (ds->d_type == DT_BLK) {
            printf("This is a block device!\n");
        } else if (ds->d_type == DT_CHR) {
            printf("This is a character device!\n");
        } else if (ds->d_type == DT_DIR) {
            printf("This is a directory!\n");
        } else if (ds->d_type == DT_FIFO) {
            printf("This is a named pipe!\n");
        } else if (ds->d_type == DT_LNK) {
            printf("This is a symbolic link!\n");
        } else if (ds->d_type == DT_REG) {
            printf("This is a regular file!\n");
        } else if (ds->d_type == DT_SOCK) {
            printf("This is a UNIX domain socket!\n");
        } else {
            printf("The file type could not be determined!\n");
        }
    }

    // Close the directory stream
    closedir(dir);
    return 0;
}
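2.3 The dup/dup2/fcntl functions
dup and dup2 are mainly used to duplicate file descriptors. fcntl can also duplicate a file descriptor, and in addition it can get and set a file's flags, where flags is the second argument passed to the open function. The prototypes of these functions are as follows: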
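Function | Prototype | Parameters | Return value | Purpose |
---|---|---|---|---|
dup | int dup(int oldfd); | oldfd: the file descriptor to duplicate | The new file descriptor on success; -1 on failure, with errno set | Duplicates a file descriptor |
dup2 | int dup2(int oldfd, int newfd); | oldfd: the existing file descriptor; newfd: the descriptor number to use for the copy | The new file descriptor (newfd) on success; -1 on failure, with errno set | Duplicates a file descriptor onto the specified newfd |
fcntl | int fcntl(int fd, int cmd, ... /* arg */ ); | fd: file descriptor; cmd: the operation to perform; ...: optional argument that depends on cmd | Depends on cmd | Duplicates file descriptors, gets and sets file flags, and more |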
These functions require the header unistd.h, and fcntl additionally requires fcntl.h. The commonly used cmd values of fcntl are listed below.
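cmd | Purpose | Return value |
---|---|---|
F_DUPFD | Duplicate a file descriptor | The new file descriptor on success; -1 on failure, with errno set |
F_GETFL | Get the file flags | The file flags on success; -1 on failure, with errno set |
F_SETFL | Set the file flags | 0 on success; -1 on failure, with errno set |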
fcntl supports many more cmd values; run man fcntl to read the full documentation.
Common fcntl operations are as follows:
// 1. Duplicate a file descriptor
int newfd = fcntl(fd, F_DUPFD, 0);
// 2. Get the file status flags
int flag = fcntl(fd, F_GETFL, 0);
// 3. Set the file status flags
flag = flag | O_APPEND;
fcntl(fd, F_SETFL, flag);
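The third argument of F_DUPFD is the lowest descriptor number that fcntl may return, and an F_SETFL call should start from the value read by F_GETFL so that the existing flags are preserved. The two helpers below are a small sketch of both points with return values checked. The names dup_at_least and add_status_flag are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd onto the lowest free descriptor
 * that is greater than or equal to minfd. Returns -1 on failure. */
int dup_at_least(int fd, int minfd)
{
    assert(minfd >= 0);
    return fcntl(fd, F_DUPFD, minfd);
}

/* Hypothetical helper: turn on one file status flag (for example O_APPEND
 * or O_NONBLOCK) while keeping the flags that are already set. */
int add_status_flag(int fd, int flag)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) {
        return -1;                 /* fd is not a valid open descriptor */
    }
    return fcntl(fd, F_SETFL, flags | flag);
}
```

Because duplicated descriptors refer to the same open file, a status flag added through one of them is also visible through the other.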
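Duplicating a file descriptor works as follows. After duplication, several file descriptors refer to the same open file. Calling close on one of them does not actually close the file; the file is closed only after every descriptor that refers to it has been closed. Because all of these descriptors operate on the same open file, they share a single file offset.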
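The shared file offset can be checked with a short experiment: write through one descriptor, then query the current offset through a duplicate and through a descriptor obtained by a second, independent open of the same path. This is only a sketch; the function name compare_offsets and its output parameters are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write 5 bytes through fd, then report the offset seen
 * through a dup() copy and through a second open() of the same path. */
int compare_offsets(const char *path, off_t *dup_off, off_t *open_off)
{
    assert(path != NULL && dup_off != NULL && open_off != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        return -1;
    }
    int dupfd = dup(fd);               /* shares the open file with fd */
    int fd2 = open(path, O_RDONLY);    /* independent open of the same file */

    int rc = -1;
    if (dupfd != -1 && fd2 != -1 && write(fd, "hello", 5) == 5) {
        *dup_off = lseek(dupfd, 0, SEEK_CUR);   /* moved along with fd */
        *open_off = lseek(fd2, 0, SEEK_CUR);    /* still at the start */
        rc = 0;
    }

    if (fd2 != -1) close(fd2);
    if (dupfd != -1) close(dupfd);
    close(fd);
    return rc;
}
```

After a successful call the offset reported for the duplicate is 5, while the independently opened descriptor is still at 0, which is the difference between sharing an open file and merely opening the same path twice.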
With dup2 we can choose the target file descriptor, which makes it possible to redirect a file's input or output.
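A redirection made with dup2 can also be undone if the original descriptor is saved first with dup. The sketch below redirects standard output to a file and later restores it. The function names redirect_stdout and restore_stdout are hypothetical, and the fflush calls are there because stdio buffers output in user space before it reaches the descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: send standard output to the file at path.
 * Returns a saved copy of the old stdout descriptor, or -1 on failure. */
int redirect_stdout(const char *path)
{
    assert(path != NULL);

    fflush(stdout);                       /* flush text meant for the old target */
    int saved = dup(STDOUT_FILENO);       /* keep the old stdout alive */
    if (saved == -1) {
        return -1;
    }
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) == -1) {  /* descriptor 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                            /* descriptor 1 keeps the file open */
    return saved;
}

/* Hypothetical helper: undo redirect_stdout using the saved descriptor. */
int restore_stdout(int saved)
{
    fflush(stdout);                       /* push buffered text into the file */
    int rc = dup2(saved, STDOUT_FILENO);
    close(saved);
    return rc == -1 ? -1 : 0;
}
```

Output printed between the two calls ends up in the file, and output printed after restore_stdout goes to the original destination again.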
The following examples show how these functions are used.
- Using the dup function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Duplicate the file descriptor
    int newfd = dup(fd);
    if (newfd < 0) {
        perror("dup error");
        close(fd);
        return -1;
    }

    // Write to the file through fd
    write(fd, "helloworld", strlen("helloworld"));

    // Move the file offset back to the start of the file
    lseek(fd, 0, SEEK_SET);

    // Read through newfd, which shares the file offset with fd
    char buf[1024];
    memset(buf, 0x00, sizeof buf);
    read(newfd, buf, sizeof(buf) - 1);
    printf("%s\n", buf);

    close(fd);
    close(newfd);
    return 0;
}
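- Using the dup2 function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Duplicate fd onto newfd. The first argument of dup2 is the existing
    // descriptor. 10 is an arbitrary number assumed to be unused here;
    // fd itself is normally 3, so using 3 would make dup2 do nothing.
    int newfd = 10;
    if (dup2(fd, newfd) < 0) {
        perror("dup2 error");
        close(fd);
        return -1;
    }

    // Write data through fd
    write(fd, "nihaoya,damahou", strlen("nihaoya,damahou"));
    lseek(fd, 0, SEEK_SET);

    // Read the data back through newfd
    char buf[1024];
    memset(buf, 0x00, sizeof buf);
    read(newfd, buf, sizeof(buf) - 1);
    printf("buf = %s\n", buf);

    close(fd);
    close(newfd);
    return 0;
}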
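- Using dup2 for redirection.
// Redirect standard output to a file
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    // Open the file, creating it if necessary
    int fd = open(argv[1], O_RDWR | O_CREAT, 0777);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Redirect standard output to the file
    if (dup2(fd, STDOUT_FILENO) < 0) {
        perror("dup2 error");
        close(fd);
        return -1;
    }

    // These lines are written to the file instead of the terminal
    printf("Buddy, 6666\n");
    printf("Buddy, NB Plus\n");
    printf("Big monkey, you can do it\n");

    // STDOUT_FILENO still refers to the file, so the buffered output
    // is flushed to it when the program exits
    close(fd);
    return 0;
}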
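- Using the fcntl function.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>

int main(int argc, char *argv[])
{
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open error");
        return -1;
    }

    // Get the current flags and add O_APPEND
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        perror("fcntl F_GETFL error");
        close(fd);
        return -1;
    }
    flags = flags | O_APPEND;
    if (fcntl(fd, F_SETFL, flags) < 0) {
        perror("fcntl F_SETFL error");
        close(fd);
        return -1;
    }

    // Write to the file; with O_APPEND the data goes to the end of the file
    write(fd, "hello world", strlen("hello world"));

    // Close the file
    close(fd);
    return 0;
}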