Would you consider rolling your own? Python’s goose3 has worked well for me in article extraction. It seemed to be successful more often than trafilatura and newspaper3k.
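In case it helps, here is a rough sketch of what "rolling your own" could look like: try several extractors in order and keep the first one that returns enough text. The helper names (`first_successful`, `extract_with_goose3`, `extract_with_trafilatura`) and the `min_chars` threshold are my own for illustration, not part of either library, and the two wrappers assume `goose3` and `trafilatura` are installed.

```python
from typing import Callable, Iterable, Optional, Tuple

# An extractor takes raw HTML and returns the article text (or None/"" on failure).
Extractor = Callable[[str], Optional[str]]


def extract_with_goose3(html: str) -> Optional[str]:
    # Imported lazily so the rest of the sketch runs without goose3 installed.
    from goose3 import Goose

    with Goose() as g:
        article = g.extract(raw_html=html)
    return article.cleaned_text


def extract_with_trafilatura(html: str) -> Optional[str]:
    # Imported lazily for the same reason as above.
    import trafilatura

    return trafilatura.extract(html)


def first_successful(
    html: str,
    extractors: Iterable[Tuple[str, Extractor]],
    min_chars: int = 200,  # hypothetical threshold; tune for your pages
) -> Optional[Tuple[str, str]]:
    """Return (extractor_name, text) from the first extractor that yields
    at least `min_chars` characters of text, or None if none do."""
    for name, extract in extractors:
        try:
            text = extract(html)
        except Exception:
            # One extractor failing should not stop the others from trying.
            continue
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    return None
```

You would call it with something like `first_successful(html, [("goose3", extract_with_goose3), ("trafilatura", extract_with_trafilatura)])`, ordering the list by whichever extractor does best on your own pages.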
I was not aware of any of those projects - thank you for pointing me in the right direction!
goose3, trafilatura, newspaper3k (and newspaper4k even) all look like great tools. We were not planning on rolling our own, but that might be the right way to go after all. Thanks again.