Based on the type of their contents, streams are divided into text and binary streams. Text streams are structured in lines; that is, they contain typographical characters (letters, digits, punctuation, etc.) arranged in rows (lines), as you see them when you look at the contents of the file in an editor. Such a file is written (or read) mostly character by character, or line by line.
Binary streams don’t contain text but a sequence of bytes of any value. This sequence can be, for example, an executable program, an image, an audio or a video clip, a database file, etc.
Because these files don’t contain lines, reads and writes relate to portions of data of any size. Hence the data is read or written byte by byte, or block by block, where the size of the block usually ranges from one to an arbitrarily chosen value.
1. A file needs to be opened before it can be processed by a program, and it should be closed when the processing is finished.
Opening the file associates it with a stream, which is an abstract representation of the physical data stored on the media. The way in which the stream is processed is called the open mode. Three open modes exist:
- read mode – only read operations are allowed; trying to write to the stream will cause an exception (the exception is named UnsupportedOperation, which inherits OSError and ValueError, and comes from the io module);
- write mode – only write operations are allowed; attempting to read the stream will cause the exception mentioned above;
- update mode – both writes and reads are allowed.
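The restriction on read mode can be checked with a minimal sketch like the one below. The helper name `try_write_to_read_only_stream` and the file name `demo.txt` are hypothetical, and the file is created in a temporary directory only so that the example is self-contained.

```python
import io
import os
import tempfile


def try_write_to_read_only_stream():
    """Open a file in read mode, attempt a write, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical file name

        # Create the file first so that it can be opened for reading.
        stream = open(path, "wt")
        stream.write("hello\n")
        stream.close()

        stream = open(path, "rt")  # read mode: writing is not allowed
        try:
            stream.write("more")
        except io.UnsupportedOperation as exc:
            # UnsupportedOperation derives from both OSError and ValueError.
            return isinstance(exc, OSError) and isinstance(exc, ValueError)
        finally:
            stream.close()
    return False


print(try_write_to_read_only_stream())
```

Catching `io.UnsupportedOperation` specifically, rather than a general `Exception`, keeps this kind of mistake apart from real input/output failures.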
2. Depending on the physical file content, different Python classes can be used to process files. In general, BufferedIOBase is able to process any file, while TextIOBase is a specialized class dedicated to processing text files (i.e. files containing human-readable text divided into lines using newline markers). Thus, streams can be divided into binary and text ones.
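As a rough illustration of the two class families, the sketch below opens the same file in text and in binary mode and checks which base class each stream object belongs to. The function name `stream_classes` and the file name `sample.txt` are made up for this example.

```python
import io
import os
import tempfile


def stream_classes():
    """Return isinstance() checks for a text stream and a binary stream."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")  # hypothetical file name
        stream = open(path, "wt")
        stream.write("some text\n")
        stream.close()

        text_stream = open(path, "rt")
        binary_stream = open(path, "rb")
        result = (
            isinstance(text_stream, io.TextIOBase),        # text stream class
            isinstance(binary_stream, io.BufferedIOBase),  # binary stream class
            isinstance(text_stream, io.BufferedIOBase),    # text is not "buffered binary"
        )
        text_stream.close()
        binary_stream.close()
    return result


print(stream_classes())  # (True, True, False)
```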
3. The following open() function syntax is used to open a file:
```python
open(file_name, mode=open_mode, encoding=text_encoding)
```
The invocation creates a stream object and associates it with the file named file_name, using the specified open_mode and setting the specified text_encoding, or it raises an exception in the case of an error.
Example:
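```python
try:
    # A raw string (r"...") keeps the backslashes of the Windows path intact;
    # in an ordinary string literal, "\U" would be read as an escape sequence.
    stream = open(r"C:\Users\User\Desktop\file.txt", "rt")
    # Processing goes here.
    stream.close()
except Exception as exc:
    print("Cannot open the file:", exc)
```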
If the opening is successful, the function returns a stream object; otherwise, an exception is raised (e.g., FileNotFoundError if the file you’re going to read doesn’t exist).
Note: the mode and encoding arguments may be omitted; their default values are then assumed. The default opening mode is reading in text mode, while the default encoding depends on the platform used. As text is the default setting, we can skip the t in the mode string.
Opening the streams: modes
r open mode: read
- the stream will be opened in read mode;
- the file associated with the stream must exist and has to be readable, otherwise the open() function raises an exception.
w open mode: write
- the stream will be opened in write mode;
- the file associated with the stream doesn’t need to exist; if it doesn’t exist, it will be created; if it exists, it will be truncated to the length of zero (erased); if the creation isn’t possible (e.g., due to system permissions), the open() function raises an exception.
a open mode: append
- the stream will be opened in append mode;
- the file associated with the stream doesn’t need to exist; if it doesn’t exist, it will be created; if it exists, the virtual recording head will be set at the end of the file (the previous content of the file remains untouched).
r+ open mode: read and update
- the stream will be opened in read and update mode;
- the file associated with the stream must exist and has to be writeable, otherwise the open() function raises an exception;
- both read and write operations are allowed for the stream.
w+ open mode: write and update
- the stream will be opened in write and update mode;
- the file associated with the stream doesn’t need to exist; if it doesn’t exist, it will be created; if it exists, it will be truncated to the length of zero (erased), just as in the w mode;
- both read and write operations are allowed for the stream.
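The differences between the w, a and r+ modes can be observed with a short sketch. The helper `write_then_read` and the file name `modes.txt` are hypothetical; the file lives in a temporary directory so that nothing on your disk is touched.

```python
import os
import tempfile


def write_then_read(path, mode, text):
    """Write text using the given mode, then return the whole file content."""
    stream = open(path, mode)
    stream.write(text)
    stream.close()

    stream = open(path, "rt")
    content = stream.read()
    stream.close()
    return content


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")  # hypothetical file name

    after_w = write_then_read(path, "wt", "first\n")          # file is created
    after_a = write_then_read(path, "at", "second\n")         # content is kept, text is added
    after_w_again = write_then_read(path, "wt", "third\n")    # old content is erased
    after_r_plus = write_then_read(path, "r+t", "T")          # first character is overwritten

print(repr(after_w))        # 'first\n'
print(repr(after_a))        # 'first\nsecond\n'
print(repr(after_w_again))  # 'third\n'
print(repr(after_r_plus))   # 'Third\n'
```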
Selecting text and binary modes
If there is a letter b at the end of the mode string it means that the stream is to be opened in the binary mode.
If the mode string ends with a letter t the stream is opened in the text mode.
Text mode is the default behaviour assumed when no binary/text mode specifier is used.
Finally, the successful opening of the file will set the current file position (the virtual reading/writing head) before the first byte of the file if the mode is not a and after the last byte of file if the mode is set to a.
| Text mode | Binary mode | Description |
|---|---|---|
| rt | rb | read |
| wt | wb | write |
| at | ab | append |
| r+t | r+b | read and update |
| w+t | w+b | write and update |
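The initial file position described above can be inspected with the stream's tell() method. The sketch below is only an illustration: the helper `initial_position` and the file name `position.bin` are invented, and binary modes are used so that the reported position is a plain byte offset.

```python
import os
import tempfile


def initial_position(path, mode):
    """Return the file position reported right after opening the file."""
    stream = open(path, mode)
    position = stream.tell()
    stream.close()
    return position


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "position.bin")  # hypothetical file name
    stream = open(path, "wb")
    stream.write(b"12345")                     # the file now holds five bytes
    stream.close()

    # The w modes are left out on purpose: they would erase the file first.
    positions = {mode: initial_position(path, mode) for mode in ("rb", "r+b", "ab")}

print(positions)  # {'rb': 0, 'r+b': 0, 'ab': 5}
```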
EXTRA
You can also open a file for its exclusive creation. You can do this using the x open mode. If the file already exists, the open() function will raise an exception (FileExistsError).
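A minimal sketch of exclusive creation follows. The function name `create_exclusively` and the file name `unique.txt` are assumptions made for the example.

```python
import os
import tempfile


def create_exclusively(path):
    """Try to create the file in x mode; return True if it was created."""
    try:
        stream = open(path, "xt")
    except FileExistsError:
        return False  # the file was already there, nothing was changed
    stream.close()
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "unique.txt")  # hypothetical file name
    first_attempt = create_exclusively(path)
    second_attempt = create_exclusively(path)

print(first_attempt, second_attempt)  # True False
```

Unlike the w mode, a failed x open leaves an existing file untouched.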
Closing streams
The last operation performed on a stream should be closing it (this doesn’t apply to the stdin, stdout, and stderr streams, which don’t require it).
That action is performed by a method invoked on the open stream object: stream.close().
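Whether a stream has been closed can be checked through its closed attribute. The sketch below shows an explicit close() placed in a finally clause and, as an alternative that the text above does not cover, the with statement, which closes the stream when its block ends. The file name `closing.txt` is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "closing.txt")  # hypothetical file name

    # Variant 1: explicit close(), executed even if the write fails.
    stream = open(path, "wt")
    try:
        stream.write("some text\n")
    finally:
        stream.close()
    closed_after_finally = stream.closed

    # Variant 2: the with statement closes the stream on leaving the block.
    with open(path, "rt") as stream:
        closed_inside_with = stream.closed
    closed_after_with = stream.closed

print(closed_after_finally, closed_inside_with, closed_after_with)  # True False True
```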
4. Three predefined streams are already open when the program starts:
- sys.stdin (as standard input)
  - the stdin stream is normally associated with the keyboard, pre-opened for reading and regarded as the primary data source for the running programs;
  - the well-known input() function reads data from stdin by default.
- sys.stdout (as standard output)
  - the stdout stream is normally associated with the screen, pre-opened for writing, and regarded as the primary target for outputting data by the running program;
  - the well-known print() function outputs the data to the stdout stream.
- sys.stderr (as standard error output)
  - the stderr stream is normally associated with the screen, pre-opened for writing, and regarded as the primary place where the running program should send information on the errors encountered during its work;
  - the separation of stdout (useful results produced by the program) from stderr (error messages, undeniably useful but not results in themselves) makes it possible to redirect these two types of information to different targets.
When your program starts, the three streams are already open and don’t require any extra preparation. What’s more, your program can use these streams explicitly if you take care to import the sys module:
```python
import sys
```
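Once sys is imported, the streams can be used like any other open text stream. The following sketch sends a result to stdout and a diagnostic message to stderr; the function name `report` and both messages are made up for illustration.

```python
import sys


def report(result, problem):
    """Send the useful result to stdout and the diagnostic to stderr."""
    sys.stdout.write(result + "\n")
    sys.stderr.write(problem + "\n")


report("42", "warning: the input was rounded")
```

Because the two messages travel through different streams, a shell redirection such as `python script.py > results.txt` (with a hypothetical script name) would store only the result, while the warning would still appear on the screen.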
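5. The IOError object (in Python 3, IOError is an alias of OSError) is equipped with an attribute named errno, and you can access it as follows:
```python
try:
    # Some stream operations.
    pass  # placeholder so that the block is syntactically complete
except IOError as exc:
    print(exc.errno)
```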
The value of the errno attribute can be compared with one of the predefined symbolic constants defined in the errno module.
Let’s take a look at some selected constants useful for detecting stream errors:
- errno.EACCES → Permission denied. The error occurs when you try, for example, to open a file with the read-only attribute for writing.
- errno.EBADF → Bad file number. The error occurs when you try, for example, to operate on a stream that is not open.
- errno.EEXIST → File exists. The error occurs when you try, for example, to rename a file to a name that is already in use.
- errno.EFBIG → File too large. The error occurs when you try to create a file that is larger than the maximum allowed by the operating system.
- errno.EISDIR → Is a directory. The error occurs when you try to treat a directory name as the name of an ordinary file.
- errno.EMFILE → Too many open files. The error occurs when you try to simultaneously open more streams than your operating system accepts.
- errno.ENOENT → No such file or directory. The error occurs when you try to access a non-existent file/directory.
- errno.ENOSPC → No space left on device. The error occurs when there is no free space on the media.
The complete list is much longer (it also includes some error codes not related to stream processing).
Example:
```python
import errno

try:
    s = open("c:/users/user/Desktop/file.txt", "rt")
    # Actual processing goes here.
    s.close()
except OSError as exc:
    # OSError (and its alias IOError) is guaranteed to carry the errno attribute;
    # a general Exception is not.
    if exc.errno == errno.ENOENT:
        print("The file doesn't exist.")
    elif exc.errno == errno.EMFILE:
        print("You've opened too many files.")
    else:
        print("The error number is:", exc.errno)
```
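The errno module also provides the errorcode dictionary, which maps an error number back to its symbolic name. The sketch below, built around a hypothetical helper `error_name_for_missing_file`, tries to open a file that certainly doesn't exist and reports the name of the resulting error code.

```python
import errno
import os
import tempfile


def error_name_for_missing_file():
    """Open a non-existent file and describe the resulting error."""
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "no_such_file.txt")  # never created
        try:
            stream = open(missing, "rt")
            stream.close()
        except OSError as exc:
            # errno.errorcode maps the number to its symbolic name.
            return errno.errorcode[exc.errno], isinstance(exc, FileNotFoundError)
    return None


print(error_name_for_missing_file())  # ('ENOENT', True)
```

The second element shows that the ENOENT case also has a dedicated exception class, FileNotFoundError, which can be caught directly.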
6. There is a function that can dramatically simplify the error handling code. Its name is strerror(), and it comes from the os module and expects just one argument – an error number.
Its role is simple: you give an error number and get a string describing the meaning of the error.
Note: if you pass a non-existent error code (a number which is not bound to any actual error), the result depends on the platform: the function may raise a ValueError exception, while on many systems it returns a generic message such as "Unknown error".
```python
from os import strerror

try:
    s = open("c:/users/user/Desktop/file.txt", "rt")
    # Actual processing goes here.
    s.close()
except OSError as exc:
    # OSError always carries errno, so it is safe to pass it to strerror().
    print("The file could not be opened:", strerror(exc.errno))
```
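To see what strerror() produces for a few of the constants listed earlier, you could run a sketch like this one. The helper name `describe` is hypothetical, and the exact wording of the messages comes from the operating system, so it may differ between platforms.

```python
import errno
from os import strerror


def describe(codes):
    """Map each error number to its symbolic name and system message."""
    return {errno.errorcode[code]: strerror(code) for code in codes}


descriptions = describe([errno.ENOENT, errno.EACCES, errno.EEXIST])
for name, text in descriptions.items():
    print(name, "->", text)
```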
7. To read a file’s contents, the following stream methods can be used:
- read(number) – reads number characters/bytes from the file and returns them as a string (or as a bytes object if the stream is binary); when invoked without an argument, it is able to read the whole file at once. If applied to a text file, the function is able to:
  - read a desired number of characters (including just one) from the file, and return them as a string;
  - read all the file contents, and return them as a string;
  - if there is nothing more to read (the virtual reading head reaches the end of the file), the function returns an empty string.
```python
# Opening tzop.txt in read mode, returning it as a file object:
stream = open("tzop.txt", "rt", encoding="utf-8")

print(stream.read())  # printing the content of the file
stream.close()
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
or
```python
from os import strerror

try:
    cnt = 0
    s = open('text.txt', "rt")
    ch = s.read(1)
    while ch != '':
        print(ch, end='')
        cnt += 1
        ch = s.read(1)
    s.close()
    print("\n\nCharacters in file:", cnt)
except IOError as e:
    print("I/O error occurred: ", strerror(e.errno))
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Characters in file: 131
- readline() – reads a single line from the text file;
```python
from os import strerror

try:
    ccnt = lcnt = 0
    s = open('text.txt', 'rt')
    line = s.readline()
    while line != '':
        lcnt += 1
        for ch in line:
            print(ch, end='')
            ccnt += 1
        line = s.readline()
    s.close()
    print("\n\nCharacters in file:", ccnt)
    print("Lines in file: ", lcnt)
except IOError as e:
    print("I/O error occurred:", strerror(e.errno))
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Characters in file: 131
Lines in file: 4
- readlines(number) – reads lines from the text file and returns them as a list of strings, one element per file line; the optional number argument is a size hint: no more lines are read once the total size of the lines read so far reaches it, so it limits the amount of data rather than the exact number of lines; when invoked without arguments, the method tries to read all the file contents.
```python
from os import strerror

try:
    ccnt = lcnt = 0
    s = open('text.txt', 'rt')
    lines = s.readlines(20)
    while len(lines) != 0:
        for line in lines:
            lcnt += 1
            for ch in line:
                print(ch, end='')
                ccnt += 1
        lines = s.readlines(10)
    s.close()
    print("\n\nCharacters in file:", ccnt)
    print("Lines in file: ", lcnt)
except IOError as e:
    print("I/O error occurred:", strerror(e.errno))
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Characters in file: 131
Lines in file: 4
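The way the argument of readlines() splits a file into batches can be explored with a small sketch. The helper `read_in_batches`, the file name `lines.txt` and the hint value of 10 are all assumptions chosen for the demonstration; a hint of -1 is used to request all the lines at once.

```python
import os
import tempfile


def read_in_batches(path, hint):
    """Collect the lists returned by successive readlines(hint) calls."""
    batches = []
    stream = open(path, "rt")
    lines = stream.readlines(hint)
    while len(lines) != 0:
        batches.append(lines)
        lines = stream.readlines(hint)
    stream.close()
    return batches


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines.txt")  # hypothetical file name
    stream = open(path, "wt")
    for i in range(4):
        stream.write("line " + str(i + 1) + "\n")
    stream.close()

    batches = read_in_batches(path, 10)     # small hint: several batches are possible
    everything = read_in_batches(path, -1)  # no effective hint: one batch with all lines

print([len(batch) for batch in batches])
print(everything)
```

Whatever the batch sizes turn out to be, joining the batches always gives back the complete list of lines, which is why the loop in the earlier example counts the same 4 lines.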
8. To write new content into a file, the following stream methods can be used:
- write(string) – writes a string to a text file;
```python
from os import strerror

try:
    fo = open('newtext.txt', 'wt')  # A new file (newtext.txt) is created.
    for i in range(10):
        s = "line #" + str(i+1) + "\n"
        for ch in s:
            fo.write(ch)
    fo.close()
except IOError as e:
    print("I/O error occurred: ", strerror(e.errno))
```
The code creates a file filled with the following text:
line #1
line #2
line #3
line #4
line #5
line #6
line #7
line #8
line #9
line #10
Writing whole lines to the text file:
```python
from os import strerror

try:
    fo = open('newtext.txt', 'wt')
    for i in range(10):
        fo.write("line #" + str(i+1) + "\n")
    fo.close()
except IOError as e:
    print("I/O error occurred: ", strerror(e.errno))
```
If you want to send a message string to stderr to distinguish it from normal program output, it may look like this:
```python
import sys

sys.stderr.write("Error message")
```
9. The open() function returns an iterable object which can be used to iterate through all the file’s lines inside a for loop. For example:
```python
for line in open("file", "rt"):
    print(line, end='')
```
The code copies the file’s contents to the console, line by line.
Note: this code never closes the stream explicitly. In CPython the file object is typically closed once it is no longer referenced (here, when the loop ends), but reaching the end of the file does not by itself close the stream, so an explicit close() remains the safer choice.
```python
from os import strerror

try:
    ccnt = lcnt = 0
    for line in open('text.txt', 'rt'):
        lcnt += 1
        for ch in line:
            print(ch, end='')
            ccnt += 1
    print("\n\nCharacters in file:", ccnt)
    print("Lines in file: ", lcnt)
except IOError as e:
    print("I/O error occurred: ", strerror(e.errno))
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Characters in file: 131
Lines in file: 4
10. Amorphous data is data which has no specific shape or form; it is just a series of bytes. Amorphous data cannot be stored using any of the previously presented means, as it is neither a string nor a list. There should be a special container able to handle such data. Python has more than one such container. One of them is a specialized class named bytearray; as the name suggests, it’s an array containing (amorphous) bytes.
```python
data = bytearray(10)
```
Such an invocation creates a bytearray object able to store ten bytes.
Note: such a constructor fills the whole array with zeros.
Bytearrays resemble lists in many respects. For example, they are mutable, they can be passed to the len() function, and you can access any of their elements using conventional indexing.
There is one important limitation: you mustn’t set any byte array element to a value which is not an integer (violating this rule will cause a TypeError exception), and you’re not allowed to assign a value that doesn’t come from the range 0 to 255 inclusive (unless you want to provoke a ValueError exception).
You can treat any byte array elements as integer values, just like in the following example.
```python
data = bytearray(10)

for i in range(len(data)):
    data[i] = 10 - i

for b in data:
    print(hex(b))
```
0xa
0x9
0x8
0x7
0x6
0x5
0x4
0x3
0x2
0x1
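The two restrictions on byte array elements described above can be checked with a short sketch. The helper name `try_assign` is hypothetical; it stores a value in a one-element bytearray and reports which exception, if any, was raised.

```python
def try_assign(value):
    """Try to store value in a bytearray element and report the outcome."""
    data = bytearray(1)
    try:
        data[0] = value
    except TypeError:
        return "TypeError"   # the value is not an integer
    except ValueError:
        return "ValueError"  # the integer is outside the range 0..255
    return data[0]


print(try_assign(255))   # 255
print(try_assign(256))   # ValueError
print(try_assign(-1))    # ValueError
print(try_assign("a"))   # TypeError
print(try_assign(1.5))   # TypeError
```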
- write(bytearray) – writes all the bytes of bytearray to a file;
```python
from os import strerror

data = bytearray(10)

for i in range(len(data)):
    data[i] = 10 + i

try:
    bf = open('file.bin', 'wb')
    bf.write(data)
    bf.close()
except IOError as e:
    print("I/O error occurred:", strerror(e.errno))

# Your code that reads bytes from the stream should go here.
```
- Reading from a binary file requires the use of a specialized method named readinto(bytearray), as the method doesn’t create a new byte array object, but fills a previously created one with the values taken from the binary file.
```python
from os import strerror

data = bytearray(10)

try:
    bf = open('file.bin', 'rb')
    bf.readinto(data)
    bf.close()

    for b in data:
        print(hex(b), end=' ')
except IOError as e:
    print("I/O error occurred:", strerror(e.errno))
```
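The readinto() method also returns the number of bytes it has actually stored, which matters when the file is shorter than the byte array. The sketch below assumes a hypothetical three-byte file named `short.bin` and a ten-byte buffer.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "short.bin")  # hypothetical file name
    bf = open(path, "wb")
    bf.write(bytearray([1, 2, 3]))          # the file holds only three bytes
    bf.close()

    buffer = bytearray(10)                  # room for ten bytes, all zeros
    bf = open(path, "rb")
    count = bf.readinto(buffer)             # number of bytes actually read
    count_at_end = bf.readinto(buffer)      # nothing is left to read
    bf.close()

valid = buffer[:count]                      # only this part comes from the file
print(count, count_at_end, list(valid))     # 3 0 [1, 2, 3]
```

Slicing the buffer with the returned count avoids treating the untouched zeros at its end as file data.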
An alternative way of reading the contents of a binary file is offered by the method named read().
```python
from os import strerror

try:
    bf = open('file.bin', 'rb')
    data = bytearray(bf.read())
    bf.close()

    for b in data:
        print(hex(b), end=' ')

except IOError as e:
    print("I/O error occurred:", strerror(e.errno))
```
Invoked without arguments, read() tries to read the whole file contents into memory and returns them as a newly created object of the bytes class. This class has some similarities to bytearray, with one significant difference: it’s immutable. Be careful: don’t use this kind of read if you’re not sure that the file’s contents will fit in the available memory.
If the read() method is invoked with an argument, it specifies the maximum number of bytes to be read.
```python
from os import strerror  # needed by the except branch below

try:
    bf = open('file.bin', 'rb')
    data = bytearray(bf.read(5))
    bf.close()

    for b in data:
        print(hex(b), end=' ')

except IOError as e:
    print("I/O error occurred:", strerror(e.errno))
```
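Calling read() repeatedly with the same argument lets you process a binary file in pieces of limited size. This is only a sketch: the helper `read_in_chunks`, the file name `chunks.bin` and the chunk size of 5 are assumptions made for the example.

```python
import os
import tempfile


def read_in_chunks(path, size):
    """Read a binary file piece by piece, at most size bytes at a time."""
    chunks = []
    bf = open(path, "rb")
    chunk = bf.read(size)
    while chunk != b"":        # an empty bytes object signals the end of the file
        chunks.append(chunk)
        chunk = bf.read(size)
    bf.close()
    return chunks


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chunks.bin")  # hypothetical file name
    bf = open(path, "wb")
    bf.write(bytearray(range(12)))           # twelve bytes: 0, 1, ..., 11
    bf.close()

    chunks = read_in_chunks(path, 5)

print([len(chunk) for chunk in chunks])  # [5, 5, 2]
```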
Copying files
```python
from os import strerror

srcname = input("Enter the source file name: ")
try:
    src = open(srcname, 'rb')
except IOError as e:
    print("Cannot open the source file: ", strerror(e.errno))
    exit(e.errno)

dstname = input("Enter the destination file name: ")
try:
    dst = open(dstname, 'wb')
except IOError as e:
    print("Cannot create the destination file: ", strerror(e.errno))
    src.close()
    exit(e.errno)

buffer = bytearray(65536)  # the data is copied in blocks of up to 64 KiB
total = 0
try:
    readin = src.readinto(buffer)
    while readin > 0:
        # Write only the part of the buffer that was actually filled.
        written = dst.write(buffer[:readin])
        total += written
        readin = src.readinto(buffer)
except IOError as e:
    print("Cannot copy the data: ", strerror(e.errno))
    src.close()
    dst.close()
    exit(e.errno)

print(total, 'byte(s) successfully written')
src.close()
dst.close()
```
Exercise 1
How do you encode an open() function’s mode argument value if you’re going to create a new text file to only fill it with an article?
"wt" or "w"
Exercise 2
What is the meaning of the value represented by errno.EACCES?
Permission denied: you’re not allowed to access the file’s contents.
Exercise 3
What is the expected output of the following code, assuming that the file named file does not exist?
```python
import errno

try:
    stream = open("file", "rb")
    print("exists")
    stream.close()
except IOError as error:
    if error.errno == errno.ENOENT:
        print("absent")
    else:
        print("unknown")
```
absent
Exercise 4
What do we expect from the readlines() method when the stream is associated with an empty file?
An empty list (a zero-length list).
Exercise 5
What is the following code intended to do?
|
1 2 3 4 |
for line in open("file", "rt"): for char in line: if char.lower() not in "aeiouy ": print(char, end='') |
It copies the file‘s contents to the console, ignoring all vowels.
Exercise 6
You’re going to process a bitmap stored in a file named image.png, and you want to read its contents as a whole into a bytearray variable named image. Add a line to the following code to achieve this goal.
```python
try:
    stream = open("image.png", "rb")
    # Insert a line here.
    stream.close()
except IOError:
    print("failed")
else:
    print("success")
```
image = bytearray(stream.read())

