Hacker News
Show HN: Prompt-Engineering Tool: AI-to-AI Testing for LLM (github.com)
37 points by artas728 965 days ago
The Spelltest framework simulates conversations between AI "synthetic users" in an environment to test and refine LLM-based applications. It helps ensure your app converses with accuracy and relevance. After each chat, Spelltest assesses the responses and provides qualitative and quantitative feedback on performance. It is suitable for both chat and completion modes.
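
To make the idea concrete, here is a minimal sketch of a synthetic-user conversation loop. It is not Spelltest's actual API: `simulate_chat`, `fake_user`, and `fake_app` are hypothetical names, and both "models" are plain Python stubs standing in for real LLM calls, on the assumption that the synthetic user and the app take turns over a shared transcript.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]                 # (role, text)
Model = Callable[[List[Message]], str]    # takes the transcript, returns the next message


def simulate_chat(synthetic_user: Model, app: Model, opening: str, turns: int = 2) -> List[Message]:
    """Alternate between the app under test and a synthetic user for `turns` app replies."""
    transcript: List[Message] = [("user", opening)]
    for turn in range(turns):
        transcript.append(("app", app(transcript)))
        # Let the synthetic user speak again unless this was the last app reply.
        if turn < turns - 1:
            transcript.append(("user", synthetic_user(transcript)))
    return transcript


# Hypothetical stand-ins: a real setup would call an LLM in both functions.
def fake_user(transcript: List[Message]) -> str:
    return f"Follow-up question {len(transcript) // 2}"


def fake_app(transcript: List[Message]) -> str:
    return f"Answer to: {transcript[-1][1]}"


transcript = simulate_chat(fake_user, fake_app, "How do I reset my password?")
for role, text in transcript:
    print(f"{role}: {text}")
```

The resulting transcript is what a post-chat evaluation step would then score.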

When to use:
- After modifying your prompt.
- When your LLM provider updates.
- As a CI step for your repo.

All feedback and collaborations appreciated!

1 comment

Super interesting. We've been experimenting with promptfoo[1] at my work, and this looks very similar.

[1]: https://github.com/promptfoo/promptfoo

Thanks! While promptfoo offers great strict metrics, Spelltest differentiates by allowing custom prompt-based metrics. This lets you measure unique attributes like empathy or creativity in dialogues and completions.
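
As a rough illustration of what a prompt-based metric could look like (a sketch, not Spelltest's real interface): the metric is described in plain language, a judge model is asked to rate the conversation against that description, and the reply is parsed into a number. The names `build_judge_prompt`, `parse_score`, `score_metric`, and `fake_judge` are hypothetical, and the 0 to 10 scale is an assumption.

```python
import re
from typing import Callable


def build_judge_prompt(metric_name: str, metric_description: str, conversation: str) -> str:
    """Turn a free-form metric description into an instruction for a judge model."""
    return (
        f"Rate the following conversation for {metric_name} on a scale from 0 to 10.\n"
        f"Definition: {metric_description}\n"
        f"Conversation:\n{conversation}\n"
        "Answer with a single integer."
    )


def parse_score(judge_reply: str) -> float:
    """Extract the first integer from the judge's reply and normalize it to 0.0-1.0."""
    match = re.search(r"\d+", judge_reply)
    if match is None:
        raise ValueError(f"No score found in judge reply: {judge_reply!r}")
    return min(int(match.group()), 10) / 10


def score_metric(judge: Callable[[str], str], metric_name: str,
                 metric_description: str, conversation: str) -> float:
    return parse_score(judge(build_judge_prompt(metric_name, metric_description, conversation)))


# Hypothetical stand-in for an LLM judge; a real one would send the prompt to a model.
def fake_judge(prompt: str) -> str:
    return "Score: 8"


empathy = score_metric(
    fake_judge,
    "empathy",
    "The assistant acknowledges the user's feelings before offering a solution.",
    "user: I'm locked out and frustrated.\napp: Sorry about that, let's fix it together.",
)
print(empathy)
```

Because the metric is just text, swapping "empathy" for "creativity" only means changing the name and definition passed in.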