| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by sansseriff 712 days ago

I find it interesting that their entity extraction method for building a knowledge graph does not use or require one of the 'in-vogue' extraction libraries like instructor, Marvin, or Guardrails (all of which build off of pydantic). They just tell the llm to list graph nodes and edges in a list, and do some basic delimiter parsing, and load the result right into a networkx graph [1]. Is this because GPT-4 and the like have become very reliable at following specific formatting instructions, like a certain .json schema?

It looks like they just provide in the prompt a number of examples that follow the schema they want [2].

[1] https://github.com/microsoft/graphrag/blob/main/graphrag/ind...

[2] https://github.com/microsoft/graphrag/blob/main/graphrag/ind...

4 comments

ftkftk 712 days ago

LLMs are very good at knowledge extraction and following instructions, especially if you provide examples. What you see in the prompt you linked is an example of in-context learning, in particular a method called Few-Shot prompting [1]. You provide the model some specific examples of input and desired output, and it will follow the example as best it can. Which, with the latest frontier models, is pretty darn well.

[1] https://www.promptingguide.ai/techniques/fewshot

link

yunohn 711 days ago

Using a library like Instructor adds a significant token overhead. While that overhead can be justified, if their use case performs fine without it, I don’t see any reason to add such a dependency.

link

sansseriff 711 days ago

That's what I was wondering. How the token overhead of function calling with, say, Instructor compares with a regular chat response that includes few-shot learning (schema examples).

Maybe Instructor makes the most sense only when you're working with potentially malicious user data

link

inhumantsar 711 days ago

anecdotally i found gpt3.5 was decently reliable at returning structured data but it took some prompt tuning and gpt4 has never returned invalid JSON when asked for it, though it doesn't always return the same fields or field names. it performs better when the schema is explicit and included in the prompt, but that can add a non trivial number of tokens to each request.

link

swyx 712 days ago

regardless how reliable llms are they will never be perfect, so graphrag will error when the RNG gods feel like ruining your day. use pydantic.

link

Dowwie 711 days ago

Have you any insights about whether these python libraries are using extensions for the heavy lifting? Current situation involves layers upon layers of parsing and serializing/deserializing in Python.

link

swyx 711 days ago

sorry what kind of extensions do you mean? im not aware of any. this stuff is just boilerplate

link

darkteflon 712 days ago

Was looking forward to this. Haven’t looked at the code yet, but yeah - it’s a bit of a red flag that it doesn’t use an established parsing and validation library.

link

swyx 711 days ago

i mean dont throw baby out with bathwater. its research code, take the good chuck the bad.

link