by mathis-l
434 days ago
You might want to take a look at https://github.com/segment-any-text/wtpsplit. It uses a similar approach, but it targets sentence and paragraph segmentation in general rather than RAG specifically. It also has some benchmarks. It might be a good source of inspiration for where to take chonky next.