| There will be a Hackathon at work and with my team mate we are preparing with some kind of hierarchical memory/knowledge solution. Briefly: we tell ChatGPT what API based tools we have, explaines them in 1 sentence and where it can reach their documentation. We added documentations as endpoint. example.com/docs/main is always the starting point that returns high level overview of the app and all available endpoints to call. Every endpoint has its own documentation as well. E.g.: /geocoder has /docs/geocoder documentation endpoint that describes what it does, what input it expects and what it will return. We also provieded ChatGPT with actions like read_docs, call_endpoint and end_conversation. An action is a structured JSON object with a set of parameters. If ChatGPT wants to interact with the mentioned resources, it emits an action, it gets executed and the answer fed back to it. With this I can do a task like: "Get a 30 minutes drivetime polygon around 15 Bond Street, London and send it to Foster." It plans and executes the following all alone. First it calls the geocoder to get the coordinates for the isochrone endpoint, then gets the isochrone by calling the isochrone
endpoint and saves it, calls Microsoft Graph API and queries my top 50 connections to find out who Foster is and calls the MS Graph API's send mail endpoint to send the email with attachment to Foster. It can hierarchically explore the available resources so we don't need a huge context window and we don't have to train the model either. Also we could implement multiple agents. 1 would be a manager and there could me multiple agents to perform each task and return the results to the manager. It would furthet reduce reduce the required context window. Very likely some BS app will win the Hackathon like always like a market price predictor using Weka's multilayer perceptron with default settings but we believe our solution could be extremely powerful. |
I do think this will be way less than having all of the functions listed to begin with though. I think the discoverability is a novel approach. Honestly, I'm surprised ChatGPT with plugins doesn't do something like this by default rather than making you pick which plugins you want at the beginning of the conversation.