We've been building TuFT (Tenant-unified FineTuning), an open-source platform that lets multiple users fine-tune LLMs on shared GPU infrastructure through a unified API. It's MIT licensed.

*The problem we're solving:* If you have a team or org where multiple people need to fine-tune models, the typical setup is that everyone gets their own GPU allocation and manages their own training stack. That's expensive and wasteful: GPUs sit idle between runs, and everyone is reinventing the same wheel. TuFT provides a single server that manages base models, LoRA adapters, and checkpoint storage, so multiple users can share the same GPU(s) and run training and sampling jobs through a clean API.

*Why Tinker compatibility matters:* We expose a native Tinker-compatible API, so if you're already using the Tinker SDK for fine-tuning, you can point it at a TuFT server and it just works, with no code changes needed. This was a deliberate choice to lower the adoption barrier.

*What works today:*

- Single-machine setup with multi-GPU support
- LoRA fine-tuning (SFT and RL with GRPO-style training)
- Sampling/inference from fine-tuned models
- Checkpoint management (save/restore training state and sampler weights)
- Redis-based persistence for crash recovery
- OpenTelemetry integration for observability
- One-line install script, Docker image, or pip install

You can get it running with:

```
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/agentscope-ai/tuft/main/sc...)"
```

*Where we want to go (and where we'd love feedback):* Our roadmap focuses on post-training for agentic models, specifically the RL training loop where rollouts involve reasoning, multi-turn conversations, and tool use.

Near-term priorities:

- Multi-machine distributed training (FSDP, DeepSpeed, etc.)
- Cloud-native deployment on AWS/GCP/Azure/Kubernetes
- Serverless GPU runtime with better multi-tenant resource sharing
- Longer term: standardized interfaces with agent training environments (WebShop, BrowserEnv, etc.) and automated training pipelines

*What we'd like to hear from you:*

- Does the multi-tenant framing match a real pain point you've experienced?
- If you've done RL-based fine-tuning for agents, what were the biggest infrastructure headaches?
- Are there integration points or features that would make this useful for your workflow?

We're early and actively iterating, so honest feedback (including "this doesn't solve my problem because X") is exactly what we need.

Docs: https://agentscope-ai.github.io/TuFT
Discord: https://discord.gg/BCNCaQGxBH