by carpo
18 days ago
Yeah, it's been awesome! I'm so excited about tool calls and function use; the possibilities are huge. I ran it over 1494 videos that range in length from a few seconds to over 3 hours: 260 hours total duration and 3795 GB total size. I don't know exactly how long the run took, because I had to stop and fix some bugs in the mkv processing, but it was probably around 24 hours in total. That wasn't all LLM requests; it also includes the local Whisper transcription and the frame extraction / analysis.

I used gemini-3.1-flash-lite-preview for the content analysis and tagging. Analysis cost $9.22 and tagging cost $2.72, and the results seem great (for comparison, I did 885 videos a few weeks ago with Sonnet and it cost $130 in total). Gemini seems much less verbose than Sonnet, even with the same prompt, so the descriptions are much shorter, but they seem very good. The tagging is great. An added bonus is that with the larger screenshots being sent, the LLM can now read much more of the on-screen text. Some of my videos are top-down shots of me drawing and writing, and it now picks that up, so it's all indexed and searchable.

I tested a few models with the RAG Chat feature, and the best one so far is GPT4.1-Mini. Before, asking a question about the library or a video cost around 4 cents per query; now it's averaging about half a cent.
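For a rough sense of the per-video difference, here's the arithmetic as a quick sketch. It only uses the totals quoted above; the function and variable names are made up for illustration. It also isn't an apples-to-apples comparison, since the two runs covered different sets of videos and the screenshot sizes changed in between.

```python
def cost_per_video(total_cost_usd: float, video_count: int) -> float:
    """Average LLM spend per video, in US dollars."""
    return total_cost_usd / video_count


# Gemini run: analysis + tagging over 1494 videos (figures from the comment).
gemini_total = 9.22 + 2.72
gemini_per_video = cost_per_video(gemini_total, 1494)

# Earlier Sonnet run: $130 total over 885 videos (different video set).
sonnet_per_video = cost_per_video(130.0, 885)

# How many times more the Sonnet run cost per video, on average.
ratio = sonnet_per_video / gemini_per_video

print(f"Gemini total:     ${gemini_total:.2f}")
print(f"Gemini per video: ${gemini_per_video:.4f}")
print(f"Sonnet per video: ${sonnet_per_video:.4f}")
print(f"Ratio:            {ratio:.1f}x")
```

So that's roughly 0.8 cents per video now versus about 15 cents before, somewhere around 18x cheaper on average, with the caveat about the differing runs.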