Hacker News
by avereveard 1180 days ago
my biggest problem with these models is that they cannot reliably produce structured data.

even davinci can be used as part of a chain, because you can direct it to structure and unstructure data, then extract the individual components and build them into tasks. cohere, llama et al currently struggle to produce these results reliably, even if you can chat with them, and frankly it's not about the chat

example from a stack overflow question, where the prompt splits out the questions before sending them down the chain so that every point gets answered individually:

This is a customer question:

I'm a beginner RoR programmer who's planning to deploy my app using Heroku. Word from my other advisor friends says that Heroku is really easy, good to use. The only problem is that I still have no idea what Heroku does...

I've looked at their website and in a nutshell, what Heroku does is help with scaling but... why does that even matter? How does Heroku help with:

    Speed - My research implied that deploying AWS on the US East Coast would be the fastest if I am targeting a US/Asia-based audience.

    Security - How secure are they?

    Scaling - How does it actually work?

    Cost efficiency - There's something like a dyno that makes it easy to scale.

    How do they fare against their competitors? For example, Engine Yard and bluebox?
Please use layman English terms to explain... I'm a beginner programmer.

Extract the scenario from the question including a summary of every detail, list every question, in JSON:

{ "scenario": "A beginner RoR programmer is planning to deploy their app using Heroku and is seeking advice about deploying it.", "questions": [ "What does Heroku do?", "How does deploying AWS on the US East Coast help with speed?", "How secure is Heroku?", "How does scaling with Heroku work?", "What is a dyno and why is it cost efficient?", "How does Heroku compare to its competitors, such as Engine Yard and Bluebox?" ] }
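For illustration, here is a minimal sketch of the next link in such a chain: check that the reply really is the JSON that was asked for, then turn each question into its own follow-up prompt. The function names (`parse_extraction`, `build_tasks`) and the follow-up prompt wording are made up for this example, not taken from any particular library.

```python
import json

def parse_extraction(raw):
    """Parse the model's reply and check it has the expected shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    scenario = data.get("scenario")
    questions = data.get("questions")
    if not isinstance(scenario, str):
        raise ValueError("'scenario' must be a string")
    if not isinstance(questions, list) or not all(isinstance(q, str) for q in questions):
        raise ValueError("'questions' must be a list of strings")
    return {"scenario": scenario, "questions": questions}

def build_tasks(extraction):
    """Turn each extracted question into its own follow-up prompt (hypothetical wording)."""
    return [
        f"Scenario: {extraction['scenario']}\nQuestion: {q}\nAnswer in layman's terms:"
        for q in extraction["questions"]
    ]

# Shortened version of the reply shown above.
raw_reply = (
    '{"scenario": "A beginner RoR programmer is planning to deploy their app using Heroku.", '
    '"questions": ["What does Heroku do?", "How secure is Heroku?"]}'
)
tasks = build_tasks(parse_extraction(raw_reply))
```

If the model wraps the JSON in chatty text or drops a bracket, `json.loads` fails and the chain stops right there, which is why reliable structured output matters more than chat quality.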

2 comments

Last weekend I built some tooling that you can integrate with huggingface transformers to force a given model to _only_ output content that validates against a JSON schema [1].

The challenge is that for it to work cost-effectively you need to be able to append what is basically a final, algorithmically designed network layer to the model, and until OpenAI exposes the full logits and/or some way to modify them on the fly you're going to be stuck with open source models. I've mostly run things against GPT-2, but it's on my list to try LLaMA.

[1] "Structural Alignment: Modifying Transformers (like GPT) to Follow a JSON Schema" @ https://github.com/newhouseb/clownfish
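To make the general idea concrete, here is a toy sketch of logit masking. This is not the code from [1], just an illustration of why access to the logits is needed. At every step, any token the grammar does not allow gets its score set to negative infinity before the next token is picked. The vocabulary, the scores, the `fake_logits` stand-in for a model, and the hand-written `GRAMMAR` table (which only accepts `{"ok":true}` or `{"ok":false}`) are all assumptions for the example. A real implementation would derive the allowed set from a JSON schema and work on tokenizer ids.

```python
import json
import math

# Hypothetical scores from a "chatty" model that would rather say "Sure!" than emit JSON.
VOCAB_SCORES = {"Sure": 5.0, "!": 4.0, "true": 2.0, "false": 1.0,
                "{": 0.5, '"ok"': 0.4, ":": 0.3, "}": 0.2}

def fake_logits(prefix):
    """Stand-in for a model forward pass; ignores the prefix in this toy."""
    return dict(VOCAB_SCORES)

# Toy grammar: maps the text produced so far to the set of tokens allowed next.
GRAMMAR = {
    "": {"{"},
    "{": {'"ok"'},
    '{"ok"': {":"},
    '{"ok":': {"true", "false"},
    '{"ok":true': {"}"},
    '{"ok":false': {"}"},
}

def constrained_argmax(logits, allowed):
    """Mask disallowed tokens to -inf, then pick the best remaining token."""
    masked = {tok: (score if tok in allowed else -math.inf)
              for tok, score in logits.items()}
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no allowed token in the vocabulary")
    return best

def generate():
    out = ""
    while out in GRAMMAR:  # stop once the grammar has nothing left to add
        out += constrained_argmax(fake_logits(out), GRAMMAR[out])
    return out

result = generate()
parsed = json.loads(result)
```

Without the mask, the greedy choice at every step would be `Sure`. With it, the output can only ever be one of the two valid documents, and the model's scores just decide between them.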

This feels solvable to me. I wonder if you could use fine tuning against LLaMA to teach it to do this better?

GPT-3 etc can only do this because they had a LOT of code included in their training sets.

The LLaMA paper says Github was 4.5% of the training corpus, so maybe it does have that stuff baked in and just needs extra tuning or different prompts to tap into that knowledge.

I have done it through stages: the first stage emits natural language in the format "context: ... and question: ....", and then the second stage collects it as JSON, but then the wait time doubles.
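One possible way around the doubled wait, sketched here as an alternative and not as what the setup above does: if the first stage can be prompted to put each field on its own labelled line, the collection step can be plain string parsing with no second model call. The one-field-per-line layout, the `collect` function, and the output keys are assumptions for this example.

```python
import json
import re

# Assumes stage one emits lines like "context: ..." and "question: ...".
LINE = re.compile(r"^(context|question):\s*(.+)$", re.IGNORECASE)

def collect(stage_one_text):
    """Gather labelled lines from the first stage into a JSON string."""
    context_parts, questions = [], []
    for line in stage_one_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a labelled line
        label, value = match.group(1).lower(), match.group(2).strip()
        if label == "context":
            context_parts.append(value)
        else:
            questions.append(value)
    return json.dumps({"scenario": " ".join(context_parts), "questions": questions})

stage_one = (
    "context: A beginner wants to deploy a Rails app on Heroku.\n"
    "question: What does Heroku do?\n"
    "question: How secure is Heroku?"
)
collected = json.loads(collect(stage_one))
```

This only helps if the model sticks to the line format, which is a much easier ask than nested JSON but still not guaranteed.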