nikipr1999/PDF-data-extraction

PDF-data-extraction

Experiments with different Python packages for extracting data from PDF files.

Several Python packages are available for working with PDF files, including:

  1. PyPDF2
  2. PDFMiner
  3. Slate
  4. Tabula

PyPDF2

A Python library built as a PDF toolkit. It is capable of:
  • Extracting document information (title, author, …)
  • Splitting documents page by page
  • Merging documents page by page
  • Cropping pages
  • Merging multiple pages into a single page
  • Encrypting and decrypting PDF files
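
As a rough illustration of the first two capabilities above (reading document information and splitting page by page), here is a minimal sketch. It assumes a PyPDF2 2.x or 3.x release, where the classes are named PdfReader and PdfWriter; older releases use PdfFileReader and PdfFileWriter instead. The function names split_pdf and page_file_name are hypothetical helpers made up for this example and are not part of PyPDF2 or of this repository.

```python
from pathlib import Path
from typing import List


def page_file_name(stem: str, index: int, total: int) -> str:
    """Build a zero-padded, 1-based output name such as 'report_page_03.pdf'."""
    width = len(str(total))
    return f"{stem}_page_{index + 1:0{width}d}.pdf"


def split_pdf(source: str, out_dir: str) -> List[Path]:
    """Write each page of `source` to its own PDF file inside `out_dir`."""
    # Assumption: PyPDF2 >= 2.0. Imported here so the naming helper above
    # can be used even when PyPDF2 is not installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(source)

    # Document information; this is None when the file has no info dictionary.
    info = reader.metadata
    if info is not None:
        print("Title:", info.title, "| Author:", info.author)

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    total = len(reader.pages)
    written = []
    for index, page in enumerate(reader.pages):
        # One writer per page, so every output file holds exactly one page.
        writer = PdfWriter()
        writer.add_page(page)
        out_path = target / page_file_name(Path(source).stem, index, total)
        with open(out_path, "wb") as handle:
            writer.write(handle)
        written.append(out_path)
    return written
```

Calling split_pdf("report.pdf", "pages") would, under these assumptions, create one file per page in a pages directory; "report.pdf" is only a placeholder path. Merging works the other way round: add pages from several readers to a single PdfWriter before writing it out.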

Check Working with PyPDF2

PDFMiner

Check Working with PDFMiner
