How to Download Files from URL in Python

Abdeladim Fadheli · 3 min read · Updated May 2024 · General Python Tutorials

Downloading files from the Internet is one of the most common tasks on the Web, and a lot of successful software lets its users do exactly that.

In this tutorial, you will learn how to download files over HTTP in Python using the requests library.

Let's get started by installing the required dependencies:

pip3 install requests tqdm

We'll use the tqdm module here only to print a good-looking progress bar while the file downloads.

Open up a new Python file and import:

from tqdm import tqdm
import requests
import cgi
import sys

We'll be getting the file URL from the command line arguments:

# the URL of the file you want to download, passed as a command-line argument
url = sys.argv[1]
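
If the script is run without an argument, sys.argv[1] raises an IndexError. As a minimal sketch, assuming you would rather see a usage message, the standard argparse module can read the same argument. The parse_args() helper and the "url" argument name are illustrative choices, not part of the original script:

```python
import argparse


def parse_args(argv=None):
    """Parse the download URL from the command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Download a file over HTTP")
    parser.add_argument("url", help="URL of the file to download")
    # argv=None makes argparse read sys.argv[1:], like the script above
    return parser.parse_args(argv)


# passing a list explicitly is handy for trying the parser out
args = parse_args(["https://example.com/file.pdf"])
print(args.url)
```

In the real script you would call parse_args() with no arguments and use args.url in place of url.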

The function we'll use to download content from the web is requests.get(). By default, though, it downloads the whole response body into memory right away, which we don't want: large files would make the script hang and fill up memory. Luckily, requests.get() accepts a stream parameter that we can set to True:

# read 1024 bytes at a time
buffer_size = 1024
# download the body of the response in chunks, not all at once
response = requests.get(url, stream=True)

Now only the response headers have been downloaded and the connection remains open, which lets us control how the body is fetched with the iter_content() method. Before we see it in action, we first need to retrieve the total file size and the file name:

# get the total file size
file_size = int(response.headers.get("Content-Length", 0))
# get the default filename
default_filename = url.split("/")[-1]
# get the content disposition header
content_disposition = response.headers.get("Content-Disposition")
if content_disposition:
    # parse the header using cgi
    value, params = cgi.parse_header(content_disposition)
    # extract filename from content disposition
    filename = params.get("filename", default_filename)
else:
    # if content disposition is not available, just use default from URL
    filename = default_filename

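We read the file size in bytes from the Content-Length response header. The file name comes from the Content-Disposition header when the server sends one, and we parse it with the cgi.parse_header() function; otherwise we fall back to the last segment of the URL. Note that the cgi module was deprecated in Python 3.11 and removed in Python 3.13, so this import only works on older versions.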
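
On Python versions without cgi, one possible replacement is the standard library's email.message.Message, which can parse the same header. The sketch below is a swapped-in technique rather than the code used in this tutorial, and both helper names are hypothetical. It also ignores any query string when deriving the default name, and keeps only the final path component of a server-supplied name:

```python
import os
from email.message import Message
from urllib.parse import urlsplit


def default_filename_from_url(url):
    """Return the last path segment of the URL, ignoring any query string."""
    path = urlsplit(url).path
    return path.rsplit("/", 1)[-1]


def filename_from_header(content_disposition, default):
    """Pick the filename from a Content-Disposition value, else the default."""
    if not content_disposition:
        return default
    message = Message()
    message["Content-Disposition"] = content_disposition
    name = message.get_filename()  # None when there is no filename parameter
    if not name:
        return default
    # keep only the final component so the server cannot choose a directory
    name = os.path.basename(name.replace("\\", "/"))
    return name or default


print(filename_from_header('attachment; filename="setup.exe"', "download.bin"))
```

With the variables from the script above, this would be called as filename_from_header(response.headers.get("Content-Disposition"), default_filename_from_url(url)).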

Let's download the file now:

# progress bar, measured in bytes instead of iterations (tqdm's default)
progress = tqdm(response.iter_content(buffer_size), f"Downloading {filename}", total=file_size, unit="B", unit_scale=True, unit_divisor=1024)
with open(filename, "wb") as f:
    for data in progress.iterable:
        # write data read to the file
        f.write(data)
        # update the progress bar manually
        progress.update(len(data))

The iter_content() method iterates over the response data in chunks, which avoids reading the whole content into memory at once for large responses. We pass buffer_size as the number of bytes to read into memory on each iteration.

We also hand that iterator to a tqdm object, which prints a fancy progress bar, and change tqdm's default unit from iterations to bytes.

In each iteration, we read a chunk of data, write it to the open file, and update the progress bar. Because the loop runs over progress.iterable rather than over the tqdm object itself, the bar does not advance on its own, which is why we call progress.update() with the number of bytes just written.
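
The same loop can be wrapped in a reusable function. This is a minimal sketch, not the tutorial's code: stream_to_file() and its on_chunk callback are hypothetical names, and the function only assumes that the response object offers raise_for_status() and iter_content(), as requests responses do. Checking the status first is an addition, so that a 404 error page is not saved as if it were the file:

```python
def stream_to_file(response, path, chunk_size=1024, on_chunk=None):
    """Write a streamed response to `path` chunk by chunk; return bytes written."""
    # fail early on 4xx/5xx responses instead of saving an error page
    response.raise_for_status()
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size):
            f.write(chunk)
            written += len(chunk)
            # report progress to the caller, e.g. a tqdm bar's update method
            if on_chunk is not None:
                on_chunk(len(chunk))
    return written


# Possible usage with the objects from this tutorial:
#   response = requests.get(url, stream=True)
#   with tqdm(total=file_size, unit="B", unit_scale=True, unit_divisor=1024) as bar:
#       stream_to_file(response, filename, buffer_size, on_chunk=bar.update)
```

Passing the progress update as a callback keeps the download logic independent of tqdm, so the function can also be used without a progress bar.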

Here is my result after trying to download a file. You can choose any file you want; just make sure the URL ends with the file name and its extension (.exe, .pdf, etc.):

C:\file-downloader>python download.py https://download.virtualbox.org/virtualbox/6.1.18/VirtualBox-6.1.18-142142-Win.exe
Downloading VirtualBox-6.1.18-142142-Win.exe:   8%|██▍                             | 7.84M/103M [00:06<01:14, 1.35MB/s]

It is working!

Alright, we are done! As you can see, downloading files in Python is pretty easy with powerful libraries like requests. You can now use this in your own Python applications. Good luck!

Here are some ideas you can implement:

Downloading all images from a web page.
A Python script to download compressed archive files from the Internet and extract them automatically.

By the way, if you wish to download torrent files, check this tutorial.

Happy Coding ♥
