by convivialdingo
1072 days ago
I second this suggestion. I tested numerous Python tools for text extraction, and nothing matches Tika for general extraction from just about any data format. However, if you know the format beforehand, Python is the better choice, since the appropriate format-specific tool can extract higher-quality data (tables, lists).
|
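To make the trade-off concrete, here is a rough sketch of that routing: use a format-specific extractor when the format is known, and fall back to Tika otherwise. The names `extract_csv`, `extract_with_tika`, `SPECIALIZED`, and `extract_text` are hypothetical, CSV stands in for "a format you expect", and the fallback assumes the third-party `tika` Python package plus a Java runtime are installed. Treat it as an illustration, not a tested pipeline.

```python
import csv
from pathlib import Path
from typing import Callable, Dict


def extract_csv(path: str) -> str:
    """Format-specific path: keep the table structure as tab-separated rows."""
    with open(path, newline="", encoding="utf-8") as f:
        return "\n".join("\t".join(row) for row in csv.reader(f))


def extract_with_tika(path: str) -> str:
    """General fallback (assumption: the `tika` package and Java are available)."""
    from tika import parser  # imported lazily so known formats do not need Tika

    parsed = parser.from_file(path)
    return parsed.get("content") or ""


# Hypothetical registry: file suffix -> specialized extractor.
SPECIALIZED: Dict[str, Callable[[str], str]] = {".csv": extract_csv}


def extract_text(path: str, fallback: Callable[[str], str] = extract_with_tika) -> str:
    """Use the specialized extractor if the suffix is known, else the fallback."""
    handler = SPECIALIZED.get(Path(path).suffix.lower(), fallback)
    return handler(path)
```

In practice the registry would map each expected format to whatever tool handles it best, and anything unrecognized would go through Tika.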