by JohannesKauf
731 days ago
Cool to see another library in this space! I see that you took the test cases from Turndown. However, Turndown isn't actually that accurate, which is especially noticeable when converting entire websites. The best comparison would be against Pandoc, which is (in my opinion) the best HTML to Markdown converter right now. That said, it is extremely difficult to handle every edge case. As an example, this usually causes problems: <p>nitty<em>-gritty-</em>details</p>
Note: Six years ago I open sourced a Golang library [1]. Currently I am rewriting it completely with the aim of getting even better than Pandoc, and I wrote about the edge cases I encountered [2].

[1] https://github.com/JohannesKaufmann/html-to-markdown

[2] https://html-to-markdown.com/edge-cases
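
To illustrate why the <em> example above is tricky, here is a minimal Go sketch of the CommonMark "flanking" check for a `*` delimiter. It is not code from either library; the function names are hypothetical and it only looks at the single character on each side of the delimiter. A naive conversion would emit nitty*-gritty-*details, but under these rules neither `*` qualifies as an emphasis delimiter, so the asterisks would render literally.

```go
package main

import (
	"fmt"
	"unicode"
)

// isPunct approximates CommonMark's "Unicode punctuation character"
// (general categories P and S). Hypothetical helper for this sketch.
func isPunct(r rune) bool {
	return unicode.IsPunct(r) || unicode.IsSymbol(r)
}

// isLeftFlanking reports whether a delimiter run between `before` and
// `after` is left-flanking, i.e. whether a "*" there can open emphasis.
// Pass ' ' for the start or end of a line.
func isLeftFlanking(before, after rune) bool {
	if unicode.IsSpace(after) {
		return false
	}
	if !isPunct(after) {
		return true
	}
	// Followed by punctuation: must be preceded by whitespace or punctuation.
	return unicode.IsSpace(before) || isPunct(before)
}

// isRightFlanking is the mirror image: whether a "*" there can close emphasis.
func isRightFlanking(before, after rune) bool {
	if unicode.IsSpace(before) {
		return false
	}
	if !isPunct(before) {
		return true
	}
	// Preceded by punctuation: must be followed by whitespace or punctuation.
	return unicode.IsSpace(after) || isPunct(after)
}

func main() {
	// Naive output for <p>nitty<em>-gritty-</em>details</p>:
	//   nitty*-gritty-*details
	fmt.Println(isLeftFlanking('y', '-'))  // opening "*" sits between "y" and "-"
	fmt.Println(isRightFlanking('-', 'd')) // closing "*" sits between "-" and "d"
}
```

Both calls print false, so a converter has to do something other than wrap the text in asterisks here, for example moving the delimiters or escaping. Which strategy is best is a design decision this sketch does not cover.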