| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by psyklic 821 days ago

These have been on HN:

- https://www.bookai.chat/

- https://www.myreader.ai/

1 comments

blizzardman 821 days ago

This is so cool, does anyone know how to build an app that answer question from the book? What would you need to learn to build an llm app like this? Can you use llama directly?

link

psyklic 821 days ago

Building a prototype of this would be very straightforward. The difficulty would be ensuring high-quality results.

1. You'd only have to know basic web dev. Enough to get input from the user, send it to an API (e.g. https://openai.com/blog/openai-api), and display results. There are many tutorials online showing how to do this.

2. You'd then design a prompt ("prompt engineering") that tells the GPT it's read the book, asking it to converse in a friendly manner as an expert literary critic, etc. Place this prompt and the prior conversation before the user's new question. Then send it to the API. (It's likely the GPT at least knows something about most popular books.)

3. You can now try to improve the results. A popular method: If you have the full text for the book, you can do a text search for the user's query. Then just include relevant parts of the book in the prompt.

link

blizzardman 820 days ago

Thank you

I am imagine something more complicated, like having a chatbot with the personality of the character in the book answering hypothetical questions. Like asking gandolf why he didn't send the eagles to drop the ring!!!! Or asking dumbledledore why don't he create horcrux himself and fight voldemore, since he was able to defeat him once at the ministry of magic.

So here is kind of my understanding of pre gpt3 like models like BERT

1. Bert or any sentence transformer models generate embedding on the entire book (search space)

2. You pipe in your query to the same model generate embedding (query)

3. you do ANN or bruteforce KNN (lsh, pq) on top of the search space embedding with your query, essentially finding dot product with lowest value

What I am having trouble understanding is using sentence transformer does not give you answer using the character of the book, but LLM does.

How do I build a chat app that do that? Do I just use openai api? Or can I train my own llm or use off the shelf llm like llama?

link

vunderba 820 days ago

There is an AI app for this already:

https://chatfai.com/characters/book

I haven't tried it, but if I had to guess how it is built they're probably just setting up RAG vector databases at a per-book level and then augmenting a given character's context window with information from the vector database relevant to the conversation.

It would be relatively trivial (weekend project) to roll your own using streamlit + quant/pgvector + ggerganov llama.cpp and a suitable model such as Vicuna/Mistral/etc. Hardest part would be separating an entire book into a well representated set of embeddings.

link

blizzardman 819 days ago

Thank you for the recommendation. RAG would make sense Here is my understanding of how to do it

1. Use sentence transformer, transform the entire harry potter or lord of the ring book into embeddings

2. transform query into embeddings -> "why don't gandolf sent the eagles"

3. Find most relevant text using ANN through the query embeddings

4. pipe in the context + query to llama

However the result is not very good, am I missing something in RAG?

link

vunderba 818 days ago

Make sure you're using a SOTA embedding model (UAE, embedding-ada-002, etc) that is capable of creating a vector from a reasonably large token size, see here for comparisons: https://huggingface.co/spaces/mteb/leaderboard

Experiment with a "sliding scale" around the book (paragraphs, pages, etc). Try to use a graph to relate book sections, etc.

Consider setting up a tuner with well defined questions and answers to search for optimality around embeddings.

link