How to Download Files in Python

Learn how to download files from the Internet over HTTP in Python using the requests library, with tqdm to print nice progress bars.
4 min read · Updated Jul 2020 · General Python Topics

Downloading files from the Internet is one of the most common everyday tasks on the Web. It also matters because many successful applications let their users download files from the Internet. In this tutorial, you will learn how to download files over HTTP in Python using the requests library.

Related: How to Use Hash Algorithms in Python using hashlib.

Let's get started by installing the required dependencies:

pip3 install requests tqdm

We are going to use the tqdm module here just to print a good-looking progress bar during the download.

Open up a new Python file and import:

from tqdm import tqdm
import requests

Choose any file from the Internet to download; just make sure the URL ends with a file name and extension (.exe, .pdf, .png, etc.):

# the url of file you want to download
url = ""

The function we are going to use to download content from the web is requests.get(). The problem is that, by default, it downloads the whole response body into memory before returning, which we don't want: large files will make it hang and fill up memory. Luckily, requests.get() accepts a stream parameter that we can set to True:

# read 1024 bytes every time 
buffer_size = 1024
# download the body of response by chunk, not immediately
response = requests.get(url, stream=True)
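A streamed request can still return an error status or hang on a slow server. The sketch below shows one possible way to guard against both. The helper name open_stream and the 10-second timeout are assumptions made for illustration and are not part of this tutorial's code; the calls it relies on are the timeout argument of requests.get() and the response's raise_for_status() method.

```python
import requests


def open_stream(url, timeout=10):
    """Start a streamed GET request and return the open response.

    Only the status line and headers are read here; the body is fetched
    later, when the caller iterates over response.iter_content().
    """
    response = requests.get(url, stream=True, timeout=timeout)
    try:
        # Raise requests.HTTPError on 4xx/5xx instead of saving an error page.
        response.raise_for_status()
    except requests.HTTPError:
        # Release the connection before propagating the error.
        response.close()
        raise
    return response
```

Since a requests response can be used as a context manager, writing `with open_stream(url) as response:` closes the connection once the block exits.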

Now only the response headers are downloaded and the connection remains open, which lets us control the download with the iter_content() method. Before we see it in action, we first need to retrieve the total file size and the file name:

# get the total file size
file_size = int(response.headers.get("Content-Length", 0))
# get the file name
filename = url.split("/")[-1]
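Note that url.split("/")[-1] keeps any query string or fragment in the name and returns an empty string when the URL ends with a slash. Here is a small sketch of a more defensive alternative using only the standard library; the function name filename_from_url and the fallback name download.bin are hypothetical choices for this example.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of url, ignoring query string and fragment."""
    # urlparse() separates the path from "?query" and "#fragment".
    path = urlparse(url).path
    # Decode percent-escapes such as %20, then keep only the final segment.
    name = posixpath.basename(unquote(path))
    # Fall back to a default when the path ends with "/" or is empty.
    return name or default
```

For example, a URL such as https://example.com/files/report.pdf?token=abc would give report.pdf rather than report.pdf?token=abc.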

The Content-Length header gives the total size of the response body in bytes. If the server does not send it, file_size falls back to 0.
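Some servers omit Content-Length or send a value that is not a number. As a hedged sketch, the hypothetical helper below (total_size is not part of requests or of this tutorial's code) returns None in those cases. tqdm accepts total=None and then shows a running count without a percentage or ETA.

```python
def total_size(headers):
    """Return Content-Length as an int, or None when it is missing or malformed."""
    value = headers.get("Content-Length")
    try:
        return int(value)
    except (TypeError, ValueError):
        # TypeError: header absent (value is None); ValueError: not a number.
        return None
```

It could be called as total_size(response.headers), since response.headers behaves like a dictionary.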

Let's download the file now:

# progress bar, changing the unit to bytes instead of iterations (the tqdm default)
progress = tqdm(response.iter_content(buffer_size), f"Downloading {filename}", total=file_size, unit="B", unit_scale=True, unit_divisor=1024)
with open(filename, "wb") as f:
    # loop over the wrapped iterable itself so tqdm does not also count each chunk as one unit
    for data in progress.iterable:
        # write data read to the file
        f.write(data)
        # update the progress bar manually
        progress.update(len(data))

The iter_content() method iterates over the response data in chunks, which avoids reading the whole content into memory at once for large responses. We specified buffer_size as the number of bytes it should read into memory in every loop.

We then wrapped the iteration with a tqdm object, which will print a fancy progress bar. We also changed the tqdm default unit from iterations to bytes.

After that, in each iteration, we read a chunk of data, write it to the opened file, and advance the progress bar by the number of bytes written.
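The loop above does not check whether the whole file actually arrived. Below is a minimal sketch of one way to detect a truncated download. The helper name write_chunks is hypothetical, and the check assumes the number of bytes yielded matches Content-Length, which may not hold when the server compresses the response.

```python
def write_chunks(chunks, path, expected_size=None):
    """Write an iterable of bytes chunks to path and return the byte count.

    chunks can be any iterable of bytes objects, for example
    response.iter_content(buffer_size) from the code above.
    """
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
    # An expected_size of None or 0 means "unknown", so the check is skipped.
    if expected_size and written != expected_size:
        raise OSError(f"expected {expected_size} bytes, got {written}")
    return written
```

With the variables from this tutorial, it could be called as write_chunks(response.iter_content(buffer_size), filename, file_size).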

Here is my result:

Downloading winzip24-downwz.exe:   6%|█████▏                                                                         | 779k/11.8M [00:03<00:55, 210kB/s]

It is working!

Alright, we are done. As you can see, downloading files in Python is pretty easy using powerful libraries like requests. You can now use this in your own Python applications. Good luck!

Here are some ideas you can implement:

By the way, if you wish to download files via torrent, check this tutorial.


Happy Coding ♥
