https://stackoverflow.com/questions/3203790/parsing-pdf-file...
https://excalibur-py.readthedocs.io/en/master/
https://ledgerbox.io/blog/extract-tables-with-tesseract-ocr
https://www.johnsnowlabs.com/extract-tabular-data-from-pdf-i...
A somewhat more in-depth review: https://dev.to/upsilon_it/how-to-extract-tabular-data-from-p...