by truskovskiyk
808 days ago
This is a great project, a little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation. A few questions regarding the LLM testing aspect: how extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem? Would love to see more progress in this direction!
As for test coverage, the toolkit currently includes property-based unit tests. For instance, for an LLM fine-tuned on summarization, a property test checks that the summarized text is shorter than the input text.
We have a handful of similar property-based tests. Of course, the list is not exhaustive at this time. As the testing side progresses, we aim to distill the most relevant tests for each use case.
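To make the summarization property concrete, here is a rough sketch in plain Python. It is not the toolkit's actual API: `naive_summarizer` is a hypothetical stand-in for a fine-tuned model, and `check_summary_is_shorter` and `samples` are names made up for illustration.

```python
from typing import Callable, Iterable, List


def naive_summarizer(text: str) -> str:
    """Hypothetical stand-in for a fine-tuned model: keep only the first sentence."""
    first, _, _ = text.partition(". ")
    return first


def check_summary_is_shorter(
    summarize: Callable[[str], str], inputs: Iterable[str]
) -> List[str]:
    """Return every input whose summary is NOT strictly shorter than the input."""
    failures = []
    for text in inputs:
        if len(summarize(text)) >= len(text):
            failures.append(text)
    return failures


# Illustrative inputs (assumed, not taken from the toolkit).
samples = [
    "The model was fine-tuned on news articles. It was then evaluated on held-out data.",
    "Property tests check invariants. They do not check exact outputs.",
]

failures = check_summary_is_shorter(naive_summarizer, samples)
print("property holds" if not failures else f"{len(failures)} violation(s)")
```

The point is that the check asserts an invariant of the output (shorter than the input) rather than comparing against a reference summary, so the same test can be reused across models and datasets.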
Hope this helps.