Python comes with many built-in modules and packages. When writing a program, you may need to work with the contents of a file. It could be a .doc, .docx, .jpg, .png, .pdf, .mp4, .txt or .csv file, to mention but a few. Getting the file size in Python can be crucial in determining the flow of your program. It can help you estimate the computational resources a task will need, or make sure a user doesn’t submit a file larger than an allowed size threshold. Checking the file size can also confirm that a file is not empty. In the following sections, we will look at different approaches to getting the size of files in Python.
The following are some of the ways you can get the size of files in Python.
As shown in the code snippet below, the stat function of the os module is an easy way to get the file size in bytes in Python. The path to the file, which can be absolute or relative, is passed as an argument to the stat function. The function returns an object holding the statistics of the file, and the st_size attribute of that object is the size in bytes. In the example below, a .csv file was used. However, a file of any other format can be used and the stat function will still get the file size.
# Using the os module's stat function
import os

# os.stat returns a stat result object; st_size holds the size in bytes
file_stats = os.stat('/content/sample_file.csv')
print("file size :", file_stats.st_size, "bytes")
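Because st_size is always reported in bytes, larger files produce long numbers that are hard to read. The sketch below shows one possible way to convert the value into a friendlier unit. The helper name format_size and the choice of 1 KB = 1024 bytes are assumptions made for illustration and are not part of the os module. A temporary file is created so the example runs without depending on the sample file used above.

```python
import os
import tempfile


def format_size(num_bytes):
    """Convert a byte count to a short readable string (assumes 1 KB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        # Stop at the first unit where the value is below 1024, or at GB.
        if size < 1024 or unit == "GB":
            break
        size /= 1024
    if unit == "bytes":
        return f"{num_bytes} bytes"
    return f"{size:.1f} {unit}"


# Create a throwaway 2048-byte file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 2048)
    tmp_path = tmp.name

stat_result = os.stat(tmp_path)
print("file size :", format_size(stat_result.st_size))  # file size : 2.0 KB

os.remove(tmp_path)
```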
Still within the os module, the getsize function of the os.path submodule can be used to get the file size in Python. It is straightforward: it takes a single argument, the file path, and returns the size of the file in bytes as an integer, as shown below.
# Using the os.path module's getsize function
import os

file_size = os.path.getsize('/content/sample_file.csv')
print("file size :", file_size, "bytes")
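The two use cases mentioned at the start of this article, rejecting files above a size limit and detecting empty files, can be sketched with getsize as follows. The function name check_upload and the 5 MB limit are hypothetical values chosen for illustration, and the files are created in a temporary directory so the example runs anywhere.

```python
import os
import tempfile

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical 5 MB limit


def check_upload(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return 'empty', 'too large' or 'ok' based on the file's size in bytes."""
    # getsize raises OSError if the file is missing or inaccessible.
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    if size > max_bytes:
        return "too large"
    return "ok"


with tempfile.TemporaryDirectory() as tmp_dir:
    empty_path = os.path.join(tmp_dir, "empty.csv")
    small_path = os.path.join(tmp_dir, "small.csv")

    # An empty file and a 14-byte file to test against.
    open(empty_path, "wb").close()
    with open(small_path, "wb") as f:
        f.write(b"id,name\n1,Ada\n")

    empty_result = check_upload(empty_path)
    small_result = check_upload(small_path)
    tiny_limit_result = check_upload(small_path, max_bytes=4)

print(empty_result, small_result, tiny_limit_result)  # empty ok too large
```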
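The pathlib module provides an object-oriented interface to file system paths and wraps many of the functions in the os module. Its Path objects have their own stat method, which returns the same kind of result as os.stat. As shown below, using the pathlib module to get the size of a file in Python gives a result consistent with the previous approaches.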
# Using the pathlib module's Path.stat method
import pathlib

file_path = pathlib.Path('/content/sample_file.csv')
file_size = file_path.stat().st_size
print("file size :", file_size, "bytes")
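All of the stat-based approaches raise an exception when the path does not exist. The following is a minimal sketch of one way to handle that case with pathlib. The helper name size_or_none and its choice to return None for a missing file are assumptions for illustration; other programs may prefer to let the exception propagate.

```python
import pathlib
import tempfile


def size_or_none(path):
    """Return the size in bytes of the file at path, or None if it does not exist."""
    try:
        return pathlib.Path(path).stat().st_size
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = pathlib.Path(tmp_dir) / "sample_file.csv"
    existing.write_bytes(b"hello")  # 5 bytes
    missing = pathlib.Path(tmp_dir) / "no_such_file.csv"

    existing_size = size_or_none(existing)
    missing_size = size_or_none(missing)

print(existing_size, missing_size)  # 5 None
```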
The file object can also be used in Python to get the file size. This approach takes more steps than the ones discussed previously. The file is first opened, after which the seek method is used to move the cursor. The seek method takes two arguments: an offset and a reference point that the offset is measured from. Calling seek with an offset of 0 and os.SEEK_END as the reference point places the cursor at the end of the file. The tell method then returns the current cursor position, which, for a file opened in binary mode, is the number of bytes from the beginning of the file and therefore the file size.
# Using the file object
import os

# open the file in binary mode so that tell() reports a byte offset
with open('/content/sample_file.csv', 'rb') as file:
    # position the cursor at the end
    file.seek(0, os.SEEK_END)
    # use tell to get size of file
    print("Size of file is :", file.tell(), "bytes")
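One situation where the seek and tell approach is useful is when the data is only available as an open stream with no path on disk, for example an in-memory buffer. The sketch below assumes a seekable binary stream. The helper name stream_size is hypothetical, and restoring the original cursor position afterwards is a design choice made here so that the caller can keep reading from where it left off.

```python
import io
import os


def stream_size(stream):
    """Return the total size in bytes of a seekable binary stream.

    The cursor is moved back to where it was before the call.
    """
    current = stream.tell()        # remember the current position
    stream.seek(0, os.SEEK_END)    # jump to the end of the stream
    size = stream.tell()           # offset from the start == size in bytes
    stream.seek(current)           # restore the original position
    return size


buffer = io.BytesIO(b"id,name\n1,Ada\n")  # 14 bytes held in memory
buffer.read(3)                            # simulate a partially read stream

buffer_size = stream_size(buffer)
print(buffer_size, buffer.tell())  # 14 3
```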
Having gone through the different methods used to get the size of files in Python, you can now choose the one that works best for you. Of the approaches covered, the getsize function of the os.path module was the fastest, followed by the stat function. Since os.path.getsize calls os.stat internally, the gap between these two is small. The file object approach was the slowest, as it has to open the file before it can seek to the end. Knowing how each method performs helps you pick the one that gives the best performance in the least amount of time, so that your code makes efficient use of resources.
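To check how the methods compare on your own machine, a rough benchmark along the following lines can be used. This is only a sketch: the wrapper function names, the 100,000-byte test file and the 2,000 repetitions are arbitrary choices for illustration, and the timings will vary with the operating system, file system and Python version.

```python
import os
import pathlib
import tempfile
import timeit


def size_via_stat(path):
    return os.stat(path).st_size


def size_via_getsize(path):
    return os.path.getsize(path)


def size_via_pathlib(path):
    return pathlib.Path(path).stat().st_size


def size_via_file_object(path):
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        return f.tell()


# Create a throwaway file of known size to measure against.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 100_000)
    bench_path = tmp.name

methods = {
    "os.stat": size_via_stat,
    "os.path.getsize": size_via_getsize,
    "pathlib": size_via_pathlib,
    "file object": size_via_file_object,
}

timings = {}
for name, func in methods.items():
    # Every method should agree on the size before it is timed.
    assert func(bench_path) == 100_000
    timings[name] = timeit.timeit(lambda: func(bench_path), number=2000)

os.remove(bench_path)

# Print the methods from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:<16} {seconds:.4f} s for 2000 calls")
```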
Arinze is an experienced Data Scientist (ML), driven by a strong desire to solve business challenges with Advanced technologies. He is also passionate about sharing knowledge through technical writing.