ChatGPT isn’t the limiting factor here; a good way to expose the toggles is. I recently tried to expose our company CRM to employees through a Teams bot they could ask for things in natural language (like „send an invite link to newlead@example.org“ or „how many MAUs did customer Foo have in June“). I almost got there, but communicating an ever-growing set of actionable commands (each with an arbitrary number of arguments) to the model was more complex than I thought.
Care to share what made it complex? My comment above was most likely ignorant, but my general thought was to write a header prompt listing the available actions the LLM could map to, and then ask it whether a given input text matches a pre-defined action. Much like what TypeChat does.
Does this sound similar enough to what you were doing? Was there something difficult in this that you could explain?
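To make the header-prompt idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Action` type, the two catalog entries (borrowed from the CRM examples above), and the JSON reply shape are hypothetical, and the actual model call is left out. It only shows building the prompt from a catalog and validating whatever the model sends back. This is not how TypeChat itself is implemented.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    description: str
    params: tuple  # names of the required arguments


# Hypothetical catalog, modeled on the examples in this thread.
ACTIONS = {
    a.name: a
    for a in (
        Action("send_invite", "Send an invite link to an email address", ("email",)),
        Action("get_mau", "Monthly active users for a customer in a month", ("customer", "month")),
    )
}


def build_header_prompt(actions):
    """Render the catalog as the header that precedes the user's text."""
    lines = [
        "Map the user's request to exactly one of these actions.",
        'Reply with JSON only: {"action": <name or null>, "args": {...}}',
        "",
    ]
    for a in actions.values():
        lines.append(f"- {a.name}({', '.join(a.params)}): {a.description}")
    return "\n".join(lines)


def parse_reply(reply, actions):
    """Validate the model's reply; return (Action, args) or None if it does not match."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    name, args = data.get("action"), data.get("args")
    if not isinstance(name, str) or not isinstance(args, dict):
        return None
    action = actions.get(name)
    # Reject unknown actions and replies with missing or extra arguments.
    if action is None or set(args) != set(action.params):
        return None
    return action, args
```

Adding a command then means adding one catalog entry; the prompt and the validation both follow from it. The catch is that the prompt grows with the catalog, which is roughly the scaling problem described at the top of the thread.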
Aside from my hypothetical, guesstimated implementation being completely hand-wavy, I had figured the most difficult part would be piping complex actions together. "Remind me tomorrow about any events I have on my calendar" would be a conditional action based on lookups, so the order of operations would also have to be parsed somehow. I suspect a looping "thinking" mechanism would be necessary, and while I know that's not a novel idea, I'm unsure whether I would nonetheless have to reinvent it in my own tech for the way I wanted to deploy.
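A rough sketch of what such a loop could look like, again in Python and again under loud assumptions: `list_events`, `set_reminder`, and the hard-coded calendar are hypothetical stand-ins for real services, and `scripted_planner` stands in for the model so the example runs without one. The point is only the shape: pick a step, run it, feed the result back, repeat until the planner says it is done.

```python
# Hypothetical tools; real ones would call a calendar and a reminder service.
def list_events(day):
    calendar = {"tomorrow": ["Standup 09:00", "Dentist 15:00"]}
    return calendar.get(day, [])


reminders = []


def set_reminder(text):
    reminders.append(text)
    return "ok"


TOOLS = {"list_events": list_events, "set_reminder": set_reminder}


def run_loop(request, choose_next_step, max_steps=10):
    """Ask the planner for one step at a time, run it, and record the result."""
    history = []
    for _ in range(max_steps):  # hard cap so a confused planner cannot loop forever
        step = choose_next_step(request, history)
        if step is None:  # planner signals that the request is satisfied
            break
        tool, arg = step
        result = TOOLS[tool](arg)
        history.append((tool, arg, result))
    return history


def scripted_planner(request, history):
    """Stand-in for the model: look up events first, then one reminder per event."""
    if not history:
        return ("list_events", "tomorrow")
    events = history[0][2]
    reminded = len(history) - 1
    if reminded < len(events):
        return ("set_reminder", events[reminded])
    return None
```

Calling `run_loop("Remind me tomorrow about any events I have on my calendar", scripted_planner)` does the lookup first and only then sets reminders, so an empty calendar produces none. In a real deployment the planner would be a model call that sees `history`, which is where the order of operations gets decided instead of being parsed up front.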
Interacting with APIs is the old style. The magic of ChatGPT is the same magic that Google had back in the day: you ask it in plain English and it has an answer.
I'm guessing the solution looks like a model trained to take actions on the internet. Kinda sucks for those of us on the outside, because whatever we make is going to be the same brittle, chewing-gum-and-duct-tape approach as usual. Best to wait for the bleeding edge, like what that MinecraftGPT project was aiming at.