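Basic workflow of file operations
A computer system consists of three parts: hardware, the operating system, and application programs.
An application written in Python or any other language that wants to keep its data permanently has to store it on disk, which means the application needs to operate hardware. Applications cannot operate hardware directly, so this is where the operating system comes in. The operating system wraps complex hardware operations in simple interfaces for users and applications. A file is the virtual concept the operating system gives applications for working with the disk; by operating on files, users and applications can store their data permanently.
With the concept of a file, we no longer need to think about the details of the disk. We only need to follow the workflow for operating on a file: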
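    # 1. Open the file, get a file handle, and assign it to a variable
    f = open('a.txt', 'r', encoding='utf-8')  # the default open mode is 'r'
    # 2. Operate on the file through the handle
    data = f.read()
    # 3. Close the file
    f.close()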
Notes on closing files:


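An open file involves two kinds of resources: the file opened at the operating-system level, and the variable in the application. When you are done with a file, both must be reclaimed without missing either one:
1. f.close()  # reclaims the file opened at the operating-system level
2. del f  # reclaims the application-level variable
del f must happen after f.close(); otherwise the file opened by the operating system is never closed and occupies resources for nothing. Python's automatic garbage collection means we do not need to think about del f, so the one thing to remember after finishing with a file is f.close().
Even so, f.close() is easy to forget. For that reason we recommend the foolproof approach: use the with keyword to manage the context for us.
    with open('a.txt', 'w') as f:
        pass

    with open('a.txt', 'r') as read_f, open('b.txt', 'w') as write_f:
        data = read_f.read()
        write_f.write(data)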
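To make the point about with concrete, here is a minimal sketch that compares a manual try/finally with the with statement and checks that the file ends up closed even when the body raises. The scratch path and the simulated error are assumptions made only for this demonstration.

```python
import os
import tempfile

# Hypothetical scratch file so the demo does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "a.txt")

# 1) Manual version: try/finally guarantees close() even if the body raises.
f = open(path, "w", encoding="utf-8")
try:
    f.write("hello\n")
finally:
    f.close()
manual_closed = f.closed  # the OS-level file has been released

# 2) The with statement closes the file for us, even on an exception.
try:
    with open(path, "r", encoding="utf-8") as g:
        first_line = g.readline()
        raise RuntimeError("simulated failure inside the with block")
except RuntimeError:
    pass
with_closed = g.closed  # closed despite the exception

print(manual_closed, with_closed, repr(first_line))
```

Both flags come out true, which is why with is the safer default: the close step cannot be forgotten.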
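File encoding
f = open(...) asks the operating system to open the file, so if no encoding is passed to open, the default comes from the operating system's locale settings (Python uses locale.getpreferredencoding(False)). On Simplified Chinese Windows this is typically GBK; on Linux it is typically UTF-8.
    # This relies on the character-encoding knowledge from the previous lesson:
    # to avoid garbled text, open a file with the same encoding it was saved in.
    f = open('a.txt', 'r', encoding='utf-8')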
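The rule "open a file with the encoding it was saved in" can be checked with a small experiment. The sketch below writes a short Chinese string as GBK, reads it back once with the matching encoding and once with UTF-8, and also prints the locale default that open would fall back to. The scratch path and sample text are assumptions for illustration.

```python
import locale
import os
import tempfile

# Hypothetical scratch file and sample text.
path = os.path.join(tempfile.mkdtemp(), "a.txt")
text = "你好"

# Save the text as GBK.
with open(path, "w", encoding="gbk") as f:
    f.write(text)

# Reading with the same encoding gives the original text back.
with open(path, "r", encoding="gbk") as f:
    same_encoding = f.read()

# Reading with a different encoding fails: these GBK bytes are not valid UTF-8.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

# The encoding open() uses when none is given (depends on the machine).
default_encoding = locale.getpreferredencoding(False)
print(same_encoding, mismatch_failed, default_encoding)
```

A mismatch does not always raise; with other byte sequences it can silently produce garbled characters instead, which is another reason to pass encoding explicitly.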
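File open modes
file_handle = open('file path', 'mode')
1. Open modes (text mode by default):
    r   read-only [the default mode; the file must exist, otherwise an exception is raised]
    w   write-only [not readable; created if it does not exist; contents cleared if it does]
    a   append-only [not readable; created if it does not exist; new content is appended if it does]
2. Non-text files can only be handled in b mode. "b" means operating on bytes. All files are stored as bytes anyway, so this mode does not need to consider a text file's character encoding, an image file's jpg format, or a video file's avi format:
    rb
    wb
    ab
   Note: when a file is opened in b mode, reads return bytes, writes must be given bytes, and an encoding cannot be specified.
3. '+' modes (each adds one more capability):
    r+  read and write [readable, writable]
    w+  write and read [writable, readable; an existing file is cleared first]
    a+  append and read [writable, readable; writes go to the end of the file]
4. The same read/write modes operating on bytes:
    r+b  read and write [readable, writable]
    w+b  write and read [writable, readable]
    a+b  append and read [writable, readable]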
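The differences between the modes are easiest to see by running them against one file. The following sketch exercises r, w, a, and the b variants; the scratch path and the strings written are assumptions chosen for the demo.

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# r: the file must already exist.
try:
    open(path, "r", encoding="utf-8")
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

# w: creates the file; opening it with w again clears the old contents.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")
with open(path, "w", encoding="utf-8") as f:
    f.write("second\n")

# a: keeps existing contents and appends at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:
    text_content = f.read()

# rb: reads return bytes, and no encoding argument is passed.
with open(path, "rb") as f:
    raw = f.read()

# Binary modes accept only bytes-like data; writing a str raises TypeError.
try:
    with open(path, "ab") as f:
        f.write("text")
    str_in_binary_ok = True
except TypeError:
    str_in_binary_ok = False

print(missing_raises, repr(text_content), type(raw).__name__, str_in_binary_ok)
```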
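File operation methods
Common methods
read(3):
1. When the file is opened in text mode, it reads 3 characters.
2. When the file is opened in b mode, it reads 3 bytes.
All other cursor movement inside a file is measured in bytes, for example seek, tell, and truncate.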
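A short sketch of the characters-versus-bytes distinction for read(3), assuming a UTF-8 file in which each Chinese character takes three bytes. The scratch path and sample text are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file containing multi-byte characters.
path = os.path.join(tempfile.mkdtemp(), "read.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("你好世界abc")

# Text mode: read(3) returns three characters.
with open(path, "r", encoding="utf-8") as f:
    three_chars = f.read(3)

# Binary mode: read(3) returns three bytes, which is one Chinese
# character in UTF-8.
with open(path, "rb") as f:
    three_bytes = f.read(3)

print(three_chars)                   # 你好世
print(three_bytes.decode("utf-8"))   # 你
```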
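Note:
1. seek takes a second argument (whence) with three possible values: 0 (from the start of the file), 1 (from the current position), and 2 (from the end of the file). A non-zero offset with 1 or 2 works only in b mode. In every mode the offset is counted in bytes.
2. truncate cuts the file off, so the file must be opened in a writable mode. Do not use w or w+, though, because those modes clear the file as soon as it is opened; to see the effect of truncate, use r+, a, or a+.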
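The two notes above can be sketched as follows. The example assumes a UTF-8 file of 11 bytes (two Chinese characters at 3 bytes each, plus "hello"); the scratch path is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file: 3 + 3 + 5 = 11 bytes in UTF-8.
path = os.path.join(tempfile.mkdtemp(), "seek.txt")
with open(path, "wb") as f:
    f.write("你好hello".encode("utf-8"))

with open(path, "rb") as f:
    f.seek(3, 0)               # whence 0: 3 bytes from the start
    after_start = f.tell()     # 3
    f.seek(3, 1)               # whence 1: 3 more bytes from the current position
    after_current = f.tell()   # 6
    f.seek(-5, 2)              # whence 2: 5 bytes before the end
    tail = f.read()            # b"hello"

# In text mode a non-zero offset relative to the current position is rejected.
with open(path, "r", encoding="utf-8") as f:
    try:
        f.seek(3, 1)
        text_relative_ok = True
    except OSError:            # io.UnsupportedOperation is a subclass of OSError
        text_relative_ok = False

# truncate needs a writable mode that does not clear the file, such as r+b.
with open(path, "r+b") as f:
    f.truncate(6)              # keep only the first 6 bytes

with open(path, "rb") as f:
    truncated = f.read().decode("utf-8")

print(after_start, after_current, tail, text_relative_ok, truncated)
```

Truncating at a byte offset that falls inside a multi-byte character would leave the file undecodable, so offsets in text files are best taken from tell().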
All methods


Python 2.x:
    class file(object):
        def close(self):  # close the file
            """close() -> None or (perhaps) an integer.  Close the file.

            Sets data attribute .closed to True.  A closed file cannot be used for
            further I/O operations.  close() may be called more than once without
            error.  Some kinds of file objects (for example, opened by popen())
            may return an exit status upon closing.
            """

        def fileno(self):  # file descriptor
            """fileno() -> integer "file descriptor".

            This is needed for lower-level file interfaces, such as os.read().
            """
            return 0

        def flush(self):  # flush the file's internal buffer
            """flush() -> None.  Flush the internal I/O buffer."""
            pass

        def isatty(self):  # whether the file is connected to a tty device
            """isatty() -> true or false.  True if the file is connected to a tty device."""
            return False

        def next(self):  # get the next line; raises an error if there is none
            """x.next() -> the next value, or raise StopIteration"""
            pass

        def read(self, size=None):  # read up to the given number of bytes
            """read([size]) -> read at most size bytes, returned as a string.

            If the size argument is negative or omitted, read until EOF is reached.
            Notice that when in non-blocking mode, less data than what was requested
            may be returned, even if no size parameter was given.
            """
            pass

        def readinto(self):  # read into a buffer; do not use, it is going away
            """readinto() -> Undocumented.  Don't use this; it may go away."""
            pass

        def readline(self, size=None):  # read a single line
            """readline([size]) -> next line from the file, as a string.

            Retain newline.  A non-negative size argument limits the maximum
            number of bytes to return (an incomplete line may be returned then).
            Return an empty string at EOF.
            """
            pass

        def readlines(self, size=None):  # read all the data and return it as a list of lines
            """readlines([size]) -> list of strings, each a line from the file.

            Call readline() repeatedly and return a list of the lines so read.
            The optional size argument, if given, is an approximate bound on the
            total number of bytes in the lines returned.
            """
            return []

        def seek(self, offset, whence=None):  # set the position of the file pointer
            """seek(offset[, whence]) -> None.  Move to new file position.

            Argument offset is a byte count.  Optional argument whence defaults to
            0 (offset from start of file, offset should be >= 0); other values are 1
            (move relative to current position, positive or negative), and 2 (move
            relative to end of file, usually negative, although many platforms allow
            seeking beyond the end of a file).  If the file is opened in text mode,
            only offsets returned by tell() are legal.  Use of other offsets causes
            undefined behavior.
            Note that not all file objects are seekable.
            """
            pass

        def tell(self):  # get the current pointer position
            """tell() -> current file position, an integer (may be a long integer)."""
            pass

        def truncate(self, size=None):  # truncate the data, keeping only what precedes the given position
            """truncate([size]) -> None.  Truncate the file to at most size bytes.

            Size defaults to the current file position, as returned by tell().
            """
            pass

        def write(self, p_str):  # write content
            """write(str) -> None.  Write string str to file.

            Note that due to buffering, flush() or close() may be needed before
            the file on disk reflects the data written.
            """
            pass

        def writelines(self, sequence_of_strings):  # write a list of strings to the file
            """writelines(sequence_of_strings) -> None.  Write the strings to the file.

            Note that newlines are not added.  The sequence can be any iterable object
            producing strings.  This is equivalent to calling write() for each string.
            """
            pass

        def xreadlines(self):  # can be used to read the file line by line rather than all at once
            """xreadlines() -> returns self.

            For backward compatibility.  File objects now include the performance
            optimizations previously implemented in the xreadlines module.
            """
            pass


Python 3.x:
    class TextIOWrapper(_TextIOBase):
        """Character and line based layer over a BufferedIOBase object, buffer.

        encoding gives the name of the encoding that the stream will be
        decoded or encoded with.  It defaults to locale.getpreferredencoding(False).

        errors determines the strictness of encoding and decoding (see
        help(codecs.Codec) or the documentation for codecs.register) and
        defaults to "strict".

        newline controls how line endings are handled.  It can be None, '',
        '\n', '\r', and '\r\n'.  It works as follows:

        * On input, if newline is None, universal newlines mode is
          enabled.  Lines in the input can end in '\n', '\r', or '\r\n', and
          these are translated into '\n' before being returned to the
          caller.  If it is '', universal newline mode is enabled, but line
          endings are returned to the caller untranslated.  If it has any of
          the other legal values, input lines are only terminated by the given
          string, and the line ending is returned to the caller untranslated.

        * On output, if newline is None, any '\n' characters written are
          translated to the system default line separator, os.linesep.  If
          newline is '' or '\n', no translation takes place.  If newline is any
          of the other legal values, any '\n' characters written are translated
          to the given string.

        If line_buffering is True, a call to flush is implied when a call to
        write contains a newline character.
        """

        def close(self, *args, **kwargs):  # close the file
            pass

        def fileno(self, *args, **kwargs):  # file descriptor
            pass

        def flush(self, *args, **kwargs):  # flush the file's internal buffer
            pass

        def isatty(self, *args, **kwargs):  # whether the file is connected to a tty device
            pass

        def read(self, *args, **kwargs):  # read up to the given number of characters
            pass

        def readable(self, *args, **kwargs):  # whether the file can be read
            pass

        def readline(self, *args, **kwargs):  # read a single line
            pass

        def seek(self, *args, **kwargs):  # set the position of the file pointer
            pass

        def seekable(self, *args, **kwargs):  # whether the pointer can be moved
            pass

        def tell(self, *args, **kwargs):  # get the pointer position
            pass

        def truncate(self, *args, **kwargs):  # truncate the data, keeping only what precedes the given position
            pass

        def writable(self, *args, **kwargs):  # whether the file can be written
            pass

        def write(self, *args, **kwargs):  # write content
            pass

        def __getstate__(self, *args, **kwargs):
            pass

        def __init__(self, *args, **kwargs):
            pass

        @staticmethod  # known case of __new__
        def __new__(*args, **kwargs):
            """Create and return a new object.  See help(type) for accurate signature."""
            pass

        def __next__(self, *args, **kwargs):
            """Implement next(self)."""
            pass

        def __repr__(self, *args, **kwargs):
            """Return repr(self)."""
            pass

        buffer = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        closed = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        encoding = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        errors = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        line_buffering = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        name = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        newlines = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        _CHUNK_SIZE = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        _finalizing = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
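Modifying files
File data lives on disk, where content can only be overwritten; there is no such thing as modifying it in place. The file "modifications" we normally see are simulated, and there are two ways to do it:
Method 1: load the entire contents of the file from disk into memory, make the changes in memory, and then write the result from memory back over the file on disk (this is how editors such as Word, vim, and Notepad++ work).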


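    import os  # operating-system module

    with open('a.txt') as read_f, open('.a.txt.swap', 'w') as write_f:
        data = read_f.read()  # read everything into memory; slow if the file is very large
        data = data.replace('alex', 'SB')  # make the change in memory
        write_f.write(data)  # write the result to the new file in one go

    os.remove('a.txt')  # delete the original file
    os.rename('.a.txt.swap', 'a.txt')  # rename the new file to the original name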
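Method 2: read the file from disk into memory one line at a time, write each line to a new file as soon as it has been modified, and finally replace the original file with the new one.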


    import os

    with open('a.txt') as read_f, open('.a.txt.swap', 'w') as write_f:
        for line in read_f:  # only one line is held in memory at a time
            line = line.replace('alex', 'SB')
            write_f.write(line)

    os.remove('a.txt')
    os.rename('.a.txt.swap', 'a.txt')
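The line-by-line approach can be wrapped in a reusable function. What follows is a sketch, not the original author's code: replace_in_file is a hypothetical helper, and it swaps in tempfile.mkstemp and os.replace in place of the fixed swap-file name and the remove/rename pair, so the final replacement is a single step and the temporary file is cleaned up if anything fails.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Hypothetical helper: rewrite `path` line by line, replacing old with new."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that os.replace
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".swap")
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", encoding=encoding, newline="") as write_f, \
                open(path, "r", encoding=encoding, newline="") as read_f:
            for line in read_f:  # only one line is held in memory at a time
                write_f.write(line.replace(old, new))
        os.replace(tmp_path, path)  # overwrite the original with the new file
    except BaseException:
        os.remove(tmp_path)  # do not leave the temporary file behind
        raise


# Usage on a scratch file (path and contents are made up for the demo).
demo = os.path.join(tempfile.mkdtemp(), "a.txt")
with open(demo, "w", encoding="utf-8") as f:
    f.write("alex likes python\nhello alex\n")

replace_in_file(demo, "alex", "bob")

with open(demo, encoding="utf-8") as f:
    result = f.read()
print(result)
```

os.replace overwrites an existing destination on both Windows and POSIX systems, which is why the separate os.remove step is not needed here.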