PDF File Handling Tutorials


Learn how to handle PDF files in Python, from extracting links and images to inserting watermarks and manipulating text.

How to Convert PDF to Images in Python

Learn how to use the PyMuPDF library to convert each page of a PDF file into a separate image in Python.

How to Convert PDF to Docx in Python

Learn how to use the pdf2docx library to convert PDF files to Word (.docx) files in Python.

How to Extract Text from Images in PDF Files with Python

Learn how to use Tesseract, OpenCV, PyMuPDF, and other libraries to extract text from images in PDF files with Python.

How to Highlight and Redact Text in PDF Files with Python

Learn how to use the PyMuPDF library to highlight, frame, underline, strike out, and redact text in PDF files with Python.

How to Watermark PDF Files in Python

Learn how to add watermarks to PDF files, and remove them, with the PyPDF4 and reportlab libraries in Python.

How to Extract Images from PDF in Python

Learn how to extract and save images from PDF files in Python using the PyMuPDF and Pillow libraries.

How to Extract All PDF Links in Python

Learn how to extract links and URLs from PDF files with Python using the pikepdf and PyMuPDF libraries.

How to Crack PDF Files in Python

Learn how to use pikepdf, pdf2john, and other tools to crack password-protected PDF files in Python.

How to Extract Tables from PDF in Python

Learn how to extract tables from PDF files in Python using the camelot and tabula libraries and export them to formats such as CSV, Excel, pandas DataFrame, and HTML.