by spstoyanov 1054 days ago
Thank you for your feedback!

Building apps with LLMs is fundamentally an exercise in manipulating strings and making API calls (both LLM and vector DB APIs). When apps become more complex and start to include multiple LLM calls and prompts that can change dynamically, some additional challenges emerge, e.g. figuring out:

- When was a particular LLM called?

- How much time did it take?

- What were the input variables?

- What did the prompt look like?

- What was the exact configuration of each LLM?

- How many times did we retry the request?

- What was the raw data the API returned?

- What exactly was returned from the vector store and fed into an LLM?

- How many tokens were used?

- What was the final result for each call?

- How do you make API calls in parallel?
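To make the list above concrete, here is a minimal sketch of the kind of per-call record a framework could keep. It is not LLMFlows' actual API: `CallRecord`, `traced_call`, `run_parallel`, and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in that only mimics the rough shape of an API response so the example runs without a key.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class CallRecord:
    """One LLM call, with the details from the list above (hypothetical schema)."""
    started_at: datetime        # when the LLM was called
    duration_s: float           # how long it took, retries included
    inputs: dict[str, str]      # input variables
    prompt: str                 # the rendered prompt
    config: dict[str, Any]      # exact model configuration
    retries: int                # how many times the request was retried
    raw_response: Any           # raw data returned by the API
    total_tokens: int           # tokens used
    result: str                 # final result of the call


def fake_llm(prompt: str, **config: Any) -> dict:
    """Stand-in for a real client (assumption): echoes the prompt in upper case."""
    return {"text": prompt.upper(), "usage": {"total_tokens": len(prompt.split())}}


def traced_call(llm: Callable[..., dict], template: str, inputs: dict[str, str],
                config: dict[str, Any], max_retries: int = 2) -> CallRecord:
    prompt = template.format(**inputs)
    started_at = datetime.now(timezone.utc)
    t0 = time.perf_counter()
    retries = 0
    while True:
        try:
            raw = llm(prompt, **config)
            break
        except Exception:
            # Give up once the retry budget is spent; otherwise try again.
            if retries >= max_retries:
                raise
            retries += 1
    return CallRecord(
        started_at=started_at,
        duration_s=time.perf_counter() - t0,
        inputs=dict(inputs),
        prompt=prompt,
        config=dict(config),
        retries=retries,
        raw_response=raw,
        total_tokens=raw["usage"]["total_tokens"],
        result=raw["text"],
    )


def run_parallel(llm: Callable[..., dict], template: str,
                 batch: list[dict[str, str]], config: dict[str, Any]) -> list[CallRecord]:
    """Run one traced call per input dict on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda inputs: traced_call(llm, template, inputs, config), batch))
```

Calling `traced_call(fake_llm, "Summarize {topic}", {"topic": "cats"}, {"temperature": 0.0})` returns a record whose `prompt` field holds the rendered string, so it can be logged or inspected later. Threads are a reasonable fit here because the calls are I/O-bound; an async client would be another option.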

I think a framework like this should provide abstractions that allow people to focus on the important part like prompt engineering and productionizing the app and worry less about how to figure out stuff like this.

Right now LLMFlows supports only OpenAI models and Pinecone, but I am working on classes for Chroma and Weaviate. I would also like to provide support for Bard and Claude once I get access.

2 comments

> I think a framework like this should provide abstractions that allow people to focus on the important part like prompt engineering and productionizing the app and worry less about how to figure out stuff like this.

I appreciate the detailed explanation, thank you!

If you'll permit another noob question, can/should frameworks like this include source attribution for responses? Based on articles like https://jamesg.blog/2023/04/02/llm-prompts-source-attributio... I'd guess the strict answer would be no, and that any strategies you might use to elicit them might hallucinate citations.

This is a very hard problem. If you base your response on a text that has been returned from a vector DB you can reference this text. But if you rely on a citation generated directly by the LLM, it is probably impossible to reliably guarantee that the source is real. I guess if it’s a link you can check whether it’s actually a real link, scrape the content, and do some extra trickery to figure out whether it contains the text produced by the LLM, but I’ve never tried this myself.

> If you base your response on a text that has been returned from a vector DB you can reference this text.

Oh, sweet! That would be a wonderful feature to have in a framework. For personal or corporate knowledge bases, sources would be really helpful for people who want to dig deeper.
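A rough sketch of what that could look like, assuming each chunk returned by the vector DB carries a `source` field in its metadata. `Chunk`, `build_prompt`, `answer_with_sources`, and `quote_is_supported` are hypothetical names, not part of LLMFlows or any vector DB client, and `llm` is any callable that maps a prompt string to an answer string.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Chunk:
    text: str    # passage returned by the vector store
    source: str  # e.g. a URL or document id stored alongside the vector (assumption)


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Number the retrieved passages so the prompt makes the context explicit."""
    context = "\n".join(f"[{i}] {c.text}" for i, c in enumerate(chunks, start=1))
    return f"Answer using only the numbered context.\n{context}\nQuestion: {question}"


def answer_with_sources(question: str, chunks: list[Chunk],
                        llm: Callable[[str], str]) -> dict:
    answer = llm(build_prompt(question, chunks))
    # Sources come from the retrieval step, never from the model's own output,
    # so they cannot be hallucinated. dict.fromkeys de-duplicates, keeping order.
    sources = list(dict.fromkeys(c.source for c in chunks))
    return {"answer": answer, "sources": sources}


def quote_is_supported(quote: str, chunks: list[Chunk]) -> bool:
    """True if the quote appears in some chunk, ignoring case and whitespace runs."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return any(norm(quote) in norm(c.text) for c in chunks)
```

This only shows which passages were fed to the model, not which ones it actually relied on. The `quote_is_supported` check is a crude way to confirm that a passage the model claims to quote really exists in the retrieved text.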

> What did the prompt look like?

This is the main thing I'm struggling with in LangChain. I need to see this info for troubleshooting. I don't know if it's easily visible and I'm being thick, or if it's just hidden away.