by tikkun 1077 days ago
Doing this. We soft-launched yesterday with a paid Falcon-40B playground. Three models for now: Falcon-40B instruct, uncensored, and base. Adding an API and per-token pricing this week.

https://api.llm-utils.org/

And more models coming soon.

Vector storage isn’t on the roadmap (what stops using a separate vector store from working well? Could add it to the roadmap but want to understand more first), and we could add fine-tuning if it’s a common request.

3 comments

Lots of people are using LLMs to make chatbots from their existing datasets: customer service troubleshooting, FAQs, billing, scheduling. Being able to upload their own PDFs, spreadsheets, and docx files, or crawl their home page, lets the chatbot become personalized to their use case. While you could locally query your own vector DB before prompting, people buy a paid service so they won't have to manage any of the technical details.

If people can drag and drop some files from their NAS and you parse them with Apache Tika or similar (https://tika.apache.org/), they can start using personalized, branded bots. It also lets you do things like refusing to answer if the vector database returns nothing and the use case requires a specific answer from the docs only (rather than letting the LLM make stuff up).
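
To make the "refuse if the vector database returns nothing" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the bag-of-words `embed` stands in for a real embedding model, the in-memory list stands in for a vector store, and `retrieve`, `answer`, `min_score`, and the `llm` callable are hypothetical names, not anything from the service discussed above.

```python
import math
import re
from collections import Counter

REFUSAL = "Sorry, I can't find that in the uploaded documents."


def embed(text):
    """Toy bag-of-words vector (assumption: a real service would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, min_score=0.3, top_k=2):
    """Return up to top_k chunks scoring at least min_score, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    hits = sorted((p for p in scored if p[0] >= min_score),
                  key=lambda p: p[0], reverse=True)
    return [c for _, c in hits[:top_k]]


def answer(question, chunks, llm):
    """Only call the LLM when retrieval found supporting text; otherwise refuse."""
    context = retrieve(question, chunks)
    if not context:
        return REFUSAL  # nothing relevant in the docs, so don't let the model guess
    prompt = (
        "Answer using only the context below.\n\n"
        "Context:\n" + "\n".join(context) + "\n\n"
        "Question: " + question
    )
    return llm(prompt)


if __name__ == "__main__":
    docs = [
        "Invoices are emailed on the first day of each month.",
        "Support is available Monday to Friday, 9am to 5pm.",
    ]
    fake_llm = lambda prompt: "LLM saw: " + prompt  # stand-in for a real model call
    print(answer("When are invoices emailed?", docs, fake_llm))
    print(answer("Do you sell gift cards?", docs, fake_llm))
```

The threshold is the part that needs tuning per dataset: set it too low and the bot answers from loosely related chunks, too high and it refuses questions the docs actually cover.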

For those use cases the “custom ChatGPT” tools I linked here might be better: https://news.ycombinator.com/item?id=36649777

Shouldn't you use a .com TLD?

Will your pricing be competitive with Replicate?

Not secure... NET::ERR_CERT_COMMON_NAME_INVALID

Subject: *.safezone.mcafee.com

Issuer: McAfee OV SSL CA 2

Expires on: Aug 3, 2023

Current date: Jul 8, 2023

PEM encoded chain: -----BEGIN CERTIFICATE----- MIIGfzCCBWegAwIBAgIQKt9VNrFtaozA1bILX1OcfzANBgkqhkiG9w0BAQsFADBk MQswCQYDVQQGEwJVUzELMAkGA1UECBMCQ0