Today we will show you how to create a simple Python script that converts a PDF file to a txt file.
If you want to gain the skills needed to work with Python at an expert level, the Master in Advanced Programming in Python for Hacking, BigData and Machine Learning will train you in just 12 months.
Steps to follow
The first step is to create a PDF file or find one we already have. We can create one in Word by saving any document as a PDF via File > Save as…
Next, we need to install PyPDF2, a Python PDF library that can split, merge, crop, and transform PDF files. According to the PyPDF2 website, it can also add data, viewing options, and passwords to PDF files.
To install the PyPDF2 package, we only need to run the following command in the Windows command prompt or in the terminal of our favorite IDE: pip install PyPDF2
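To confirm that the installation worked before going any further, a small check like the following can help. It is only a sketch: the helper name is_installed is our own and is not part of PyPDF2, and it relies solely on the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be imported in this environment."""
    # find_spec returns None when the module cannot be located.
    return importlib.util.find_spec(module_name) is not None


if is_installed("PyPDF2"):
    print("PyPDF2 is available")
else:
    print("PyPDF2 is missing; run: pip install PyPDF2")
```

If the second message appears, the package was most likely installed for a different Python interpreter than the one running the script.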

Then we create a new Python file in the same folder where we saved the PDF file and write our code in it.

Although the code is commented in detail, here is a quick explanation of it:
- First we create a Python file object by opening the PDF file in read-binary (rb) mode.
- We create the PdfFileReader object (named PdfReader in PyPDF2 3.0 and later), which reads the open file.
- We use a variable to store the number of pages in the file.
- Finally we indicate the path of the txt file where the text extracted from the PDF file will be written.
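The steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact script from the original post: it assumes PyPDF2 3.0 or later (where the reader class is PdfReader and text is extracted with extract_text()), and the function names and the example file names are our own placeholders.

```python
def extract_page_texts(pdf_path):
    """Return the text of each page of the PDF as a list of strings."""
    # Imported here so the rest of the script still loads without PyPDF2.
    # Assumes PyPDF2 3.0+; older releases call this class PdfFileReader.
    from PyPDF2 import PdfReader

    # Open the PDF file in read-binary ("rb") mode.
    with open(pdf_path, "rb") as pdf_file:
        # Create the reader object that reads the open file.
        reader = PdfReader(pdf_file)
        # Store the number of pages in a variable.
        num_pages = len(reader.pages)
        # Pages without a text layer (e.g. scans) may yield no text.
        return [reader.pages[i].extract_text() or "" for i in range(num_pages)]


def write_texts(page_texts, txt_path):
    """Write each page's text to txt_path, followed by a newline."""
    with open(txt_path, "w", encoding="utf-8") as txt_file:
        for text in page_texts:
            txt_file.write(text)
            txt_file.write("\n")
    return txt_path


def pdf_to_txt(pdf_path, txt_path):
    """Convert the PDF at pdf_path into a plain-text file at txt_path."""
    return write_texts(extract_page_texts(pdf_path), txt_path)


# Example call (hypothetical file names, both in the script's folder):
# pdf_to_txt("example.pdf", "example.txt")
```

Keeping the extraction and the writing in separate functions makes it possible to try out the writing step without a PDF at hand. The quality of the extracted text depends on how the PDF was produced, so scanned documents may come out empty.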