Traditional data workflows rely on RAG with vector databases, or on SQL-based transformations and analytics. But can they preserve per-row contextual understanding?

We’ve released Agents as part of Datatune: https://github.com/vitalops/datatune

In a single prompt, you can define multiple data transformation tasks, and Datatune applies them to your data row by row, with contextual understanding.

Example prompt:

"Extract categories from the product description and name. Keep only electronics products. Add a column called ProfitMargin = (Total Profit / Revenue) * 100"

Datatune interprets the prompt and applies the right operation (map, filter, or an LLM-powered agent pipeline) to your data using OpenAI, Azure, Ollama, or other LLMs via LiteLLM.

Key Features

- Row-level map() and filter() operations using natural language
- Agent interface for auto-generating multi-step transformations
- Built-in support for Dask DataFrames (for scalability)
- Works with multiple LLM backends (OpenAI, Azure, Ollama, etc.)
- Compatible with LiteLLM for flexibility across providers
- Automatic token batching, metadata tracking, and smart pipeline composition

Token & Cost Optimization

Datatune gives you explicit control over which columns are sent to the LLM, reducing token usage and API cost:

- Use input_fields to send only relevant columns
- Automatically handles batching and metadata internally
- Supports setting tokens-per-minute and requests-per-minute limits
- Defaults to known model limits (e.g., GPT-3.5) if not specified

This makes it possible to run LLM-based transformations over large datasets without incurring runaway costs.
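
To make the row-level idea concrete, here is a minimal sketch in plain Python of what the example prompt asks for: a per-row map, a filter, and the ProfitMargin column. It does not use Datatune's API. The helpers `select_fields`, `map_rows`, `filter_rows`, and `add_profit_margin` are hypothetical names invented for illustration, the sample rows are made up, and `stub_categorize` is a keyword lookup standing in for a real LLM call so the example runs offline.

```python
from typing import Any, Callable

Row = dict[str, Any]


def select_fields(row: Row, input_fields: list[str]) -> Row:
    """Keep only the columns that would be sent to the model (fewer tokens)."""
    return {name: row[name] for name in input_fields}


def stub_categorize(payload: Row) -> str:
    """Hypothetical stand-in for an LLM call: a keyword lookup, not a model."""
    text = " ".join(str(value) for value in payload.values()).lower()
    electronics_terms = ("laptop", "phone", "headphone", "camera")
    return "Electronics" if any(t in text for t in electronics_terms) else "Other"


def map_rows(rows: list[Row], fn: Callable[[Row], Any],
             input_fields: list[str], output_field: str) -> list[Row]:
    """Per-row map: call fn on the selected fields and store the result."""
    return [{**row, output_field: fn(select_fields(row, input_fields))}
            for row in rows]


def filter_rows(rows: list[Row], keep: Callable[[Row], bool]) -> list[Row]:
    """Per-row filter: keep only the rows for which keep(row) is true."""
    return [row for row in rows if keep(row)]


def add_profit_margin(rows: list[Row]) -> list[Row]:
    """ProfitMargin = (Total Profit / Revenue) * 100, or None if Revenue is 0."""
    out = []
    for row in rows:
        revenue = row["Revenue"]
        margin = row["Total Profit"] / revenue * 100 if revenue else None
        out.append({**row, "ProfitMargin": margin})
    return out


# Made-up sample data; "Internal Notes" is never passed to the categorizer.
rows = [
    {"Name": "Noise-cancelling headphones",
     "Description": "Wireless over-ear headphones",
     "Total Profit": 50.0, "Revenue": 200.0,
     "Internal Notes": "Supplier contract renews in March"},
    {"Name": "Oak dining table",
     "Description": "Solid wood table for six",
     "Total Profit": 120.0, "Revenue": 600.0,
     "Internal Notes": "Ships flat-packed"},
    {"Name": "Mirrorless camera",
     "Description": "24 MP camera body",
     "Total Profit": 300.0, "Revenue": 1200.0,
     "Internal Notes": "Bundle with lens in Q4"},
]

categorized = map_rows(rows, stub_categorize, ["Name", "Description"], "Category")
electronics = filter_rows(categorized, lambda r: r["Category"] == "Electronics")
result = add_profit_margin(electronics)

for row in result:
    print(row["Name"], row["Category"], row["ProfitMargin"])
```

The point of `select_fields` is the same as the input_fields idea above: only Name and Description reach the categorizer, so unrelated columns such as the notes field would not add to the token count. In a real run the stub would be replaced by a model call, and the arithmetic column would still be computed directly rather than by the LLM.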