by swalsh 1206 days ago
A better way to do this might be to use the embedding API. That allows you to upload a text corpus and get vectors back. You can then calculate the cosine similarity between the vector for a search string and those vectors to get relevant chunks of text from the uploaded corpus.
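
For anyone who wants to see the similarity step spelled out, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions. Returning 0.0 for a zero vector is a convention chosen here, not something any particular API prescribes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a zero vector as "not similar"
    return dot / (norm_a * norm_b)

# Made-up "embeddings", purely for illustration.
query = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # points in the same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

score_close = cosine_similarity(query, doc_close)
score_far = cosine_similarity(query, doc_far)
```

Only the direction of the vectors matters here, not their length, which is why `doc_close` scores as a near-perfect match even though it is twice as long as `query`.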
2 comments

I don't get why people bother with chat interface and textual prompts. The whole concept of "prompt engineering" sounds to me like a practical joke that got out of hand.

It's like, imagine there's a complex machine with large panels full of buttons and levers - and then, someone covered the panels with tapestry. Beautiful tapestry, showing artistic interpretations of things mundane and holy, trivialities of everyday life next to impossible dreams. And then, people were told the machine is to be operated by touching that tapestry, and that the artworks are the guide to understanding it and using it effectively. And then a whole religion formed around studying patterns in the tapestry. To me, prompt engineering is that religion.

There's an actual interface to the machine hidden under all the clever wordplay. A precise, formalized one. An interface that eats tokens and spits out probabilities. I just don't get why most talk - even seemingly specialist talk - about LLMs is ignoring it entirely, and focuses on the tapestry that's just obscuring the nature of the model, effectively making everything more difficult.

>The whole concept of "prompt engineering" sounds to me like a practical joke that got out of hand.

I was on a call this morning and heard someone refer to two of their team members as "Prompt Engineers" as if that were an actual role.

My impression is that the industry in aggregate is actually trying to make it into an actual role.

Which would make sense if we were talking about humanity discovering magic is real and trying to reverse engineer it based on ancient spell books[0] - but we're not. We're talking about deep learning models made by other people, using publicly available knowledge and techniques, and often with source code and training set being publicly available too. Prompt engineering feels like people purposefully trying to treat technology as magic.

--

[0] - Or any of the scenarios equivalent to it under Clarke's third law, such as finding a crashed alien starship with a working black-box AI in it, built on a computing substrate we can't even identify, much less prod with a signal generator.

Because everyone can use a text interface without knowing how to configure the low-level one.
This makes sense at the UI layer, if you're making a chatbot or an NPC for a game. But if you're at the point of prompt engineering, it makes no sense to stick to the natural language interface. It's like another iteration of the idea of "programming via conversations in natural language instead of writing code" - it sounds like it makes sense, until you realize that programming languages and the mathematics underpinning them were developed specifically because natural language is nowhere near precise enough for the job.

Or, put another way, using text/conversation as the user interface to a model turns a normal engineering problem into a much harder reverse engineering problem. Why would anyone want to make life difficult for themselves this way, and ultimately turn engineering into voodoo?

Would you mind explaining this and maybe dumbing it down? Sounds useful
You can use models (OpenAI have some, there are other open-source self-hostable ones that are better if I recall correctly) that will take a sentence or a paragraph and spit out a vector. These vectors are called 'embeddings'.

You then put those vectors in a vector database (e.g. pinecone, pgvector, chroma).

To run searches, you generate an embedding of the search term (could be the raw user search, could be something a model like ChatGPT was asked to transform the user's search into), then query the vector database for the n closest vectors. The trick is getting a model that generates good vectors for search (and transforming the user's query into some text that'd be useful vector(s) to search against). If feeding that into an LLM context, the next step is making sure that you get your prompt right, and don't overload the model with unrelated information (i.e. bad search results).
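
A rough sketch of that search loop, with loud caveats: `toy_embed` is a hypothetical stand-in that just counts words over a fixed vocabulary, and the "vector database" is a plain Python list. A real setup would call an embedding model and one of the stores mentioned above; this only shows the shape of "embed the query, rank by closeness, keep the n best".

```python
import math
from collections import Counter

def toy_embed(text, vocabulary):
    """Hypothetical stand-in for an embedding model: word counts per vocabulary entry."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, store, vocabulary, n=2):
    """Return the n (score, text) pairs whose vectors are closest to the query."""
    q = toy_embed(query, vocabulary)
    scored = [(cosine(q, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]

documents = [
    "the customer asked for a refund",
    "the delivery arrived late",
    "refund issued to customer",
]
vocabulary = sorted({w for d in documents for w in d.lower().split()})

# The in-memory "vector database": each text stored next to its vector.
store = [(d, toy_embed(d, vocabulary)) for d in documents]

results = search("customer refund", store, vocabulary, n=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Word counts only match exact words, which is exactly the limitation a learned embedding model is meant to remove; the ranking logic stays the same either way.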

The key is that the vector representation embeds language concepts in how close vectors are to one another. An easy way to gain a feel for this is to look at single-word embeddings. Computerphile have a great episode on it[1]. You can take a vector for 'King', subtract the vector for 'Man' and add the vector for 'Woman' and the closest vector in that search will likely be 'Queen'. Scale up this idea to whole paragraphs (and larger vectors as a result).
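
To make the King/Man/Woman arithmetic concrete, here is a toy version. The four-dimensional vectors are hand-written (the dimensions loosely meaning royalty, male, female, fruit), so the result is rigged by construction; learned embeddings only approximate this, and the input words are excluded from the candidates, which is an assumption of this sketch.

```python
import math

# Hand-crafted vectors, NOT learned: [royalty, male, female, fruit]
vectors = {
    "king":  [1.0, 1.0, 0.0, 0.0],
    "queen": [1.0, 0.0, 1.0, 0.0],
    "man":   [0.0, 1.0, 0.0, 0.0],
    "woman": [0.0, 0.0, 1.0, 0.0],
    "apple": [0.0, 0.0, 0.0, 1.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

result = analogy("king", "man", "woman", vectors)
print(result)
```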

LangChain has an example of searching a database of facts[2] (although I find their documentation pretty inaccessible - they explain their library, but don't step back from inside the weeds of what they're doing to really explain why / what's going on). Many of the features LangChain implements are distilling (or sometimes simply lifting and providing a toolkit to directly apply) LLM papers.

1: Computerphile Word Embeddings https://www.youtube.com/watch?v=gQddtTdmG_8

2: https://langchain.readthedocs.io/en/latest/use_cases/questio...

+1 to this. Maybe even some basic code to share on how to use embeddings to query ChatGPT with bigger data sets. Like thousands of phone call transcriptions, hundreds of documents or millions of user reviews? Thank you!
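
Not a full answer, but a minimal sketch of the two pieces around the search step for bigger data sets: splitting long texts into chunks before embedding them, and packing the best-ranked chunks into a prompt without overloading the model. The function names, the word-window sizes, and the character budget are all assumptions made for illustration; a real pipeline would count tokens rather than characters and would get `ranked_chunks` from a vector search.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping word windows so each piece can be embedded separately."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def build_prompt(question, ranked_chunks, max_chars=500):
    """Pack the best-ranked chunks into a prompt until a character budget is hit."""
    picked, used = [], 0
    for chunk in ranked_chunks:  # assumed to be sorted best match first
        if used + len(chunk) > max_chars:
            break
        picked.append(chunk)
        used += len(chunk)
    context = "\n---\n".join(picked)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Tiny usage example with a made-up transcript of 12 "words".
transcript = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(transcript, size=5, overlap=2)
prompt = build_prompt("What was said?", chunks, max_chars=40)
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk.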