We were frustrated with dataset generation DX: ad‑hoc scripts and JSON templates break as flows branch, tool‑calls drift, and reproducibility is hard. So we built Torque: a schema‑first, declarative, fully typesafe DSL to generate conversational datasets.

What it is:
Declarative DSL - Compose conversations like React components
Fully Typesafe - Zod schemas with complete type inference
Provider Agnostic - Generate with any AI SDK provider (OpenAI, Anthropic, DeepSeek, vLLM, LLaMA.cpp etc.)
AI-Powered Content - Generate realistic, varied datasets automatically, without complicated scripts
Faker Integration - Built-in Faker.js with automatic seed synchronization for reproducible fake data
Cache Optimized - Reuses context across generations to reduce costs
Prompt Optimized - Concise, optimized structures, prompts, and generation workflow let you use smaller, cheaper models
Concurrent Generation - Beautiful async CLI with real-time progress tracking while generating concurrently

Quick example

  import {
    generateDataset,
    generatedUser,
    generatedAssistant,
    oneOf,
    assistant,
  } from "@qforge/torque";
  import { openai } from "@ai-sdk/openai";

  await generateDataset(
    () => [
      generatedUser({ prompt: "Friendly greeting" }),
      oneOf([
        assistant({ content: "Hello!" }),
        generatedAssistant({ prompt: "Respond to greeting" }),
      ]),
    ],
    { count: 2, model: openai("gpt-5-mini"), seed: 42 }
  );

Links
• GitHub: https://github.com/qforge-dev/torque
• Try in browser: https://stackblitz.com/github/qforge-dev/torque/tree/main/st...
• npm: https://www.npmjs.com/package/@qforge/torque

What feedback would help most:
• What dataset would you like us to create / recreate?
• Do you like the API? Any suggestions on how to change it?

License: MIT. Happy to answer questions in the thread!
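
A note on the seed option, for readers new to seeded generation: below is a rough, dependency-free sketch of the general idea behind making branch picks (like the oneOf choice in the quick example) repeatable from a single seed. It is not Torque's code. The names mulberry32, pickOne and planBranches are illustrative assumptions, and how Torque actually derives and synchronizes its seeds is not shown here.

```typescript
// Illustrative only: a small seeded PRNG (mulberry32), not Torque's internals.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    // Scale the 32-bit result into the range [0, 1).
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: pick one option using the supplied random source.
function pickOne<T>(options: readonly T[], random: () => number): T {
  return options[Math.floor(random() * options.length)];
}

// Choose a branch for each of `count` conversations from a single seed.
function planBranches(seed: number, count: number): string[] {
  const random = mulberry32(seed);
  const branches = ["static assistant reply", "generated assistant reply"];
  return Array.from({ length: count }, () => pickOne(branches, random));
}

const firstRun = planBranches(42, 5);
const secondRun = planBranches(42, 5);
// Same seed, same sequence of picks.
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun)); // true
```

Because every pick draws from the same seeded source, rerunning with the same seed replays the same branch choices, while a different seed gives a different (but equally repeatable) plan.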