| Hi Hacker News, What do you think about an "AI-Computer Interface"? Let me explain a bit. The "human-computer interface (HCI)" was first developed around WWII. Since then it has evolved a great deal, and today we have very sophisticated and intuitive interfaces. It's the AI era, and there are many projects and products trying to automate various tasks that humans currently do. However, the main approach is to adapt LLM-based products to the current HCI, which was developed for humans, not LLMs. This approach does work with some workarounds, such as taking screenshots to understand what actions are available. Example project: https://github.com/OpenInterpreter/open-interpreter I wonder whether creating UIs specifically for LLMs/AIs, instead of adapting LLMs to human-friendly UIs, could make automation more efficient. Do you know of any such projects or products? Happy holidays! |
But it seems to me that if you're going to use an LLM to "use" some other software, the way to go is to use tool-calling support to call an API, and/or something like Anthropic's MCP (Model Context Protocol). There's some existing work too, around "agent to agent" communications, that one could use to integrate one kind of AI system with another computer system (whether or not the other system has any AI abilities). It ranges from things like FIPA-ACL, KQML, KIF, etc., through all the Semantic Web standards, to some more recent specs that are being worked on. For example, the forthcoming ECMA TC56[1][2] standard for natural language communication between agents. And a similar-ish effort is mentioned in a recent arXiv paper[3].
[1]: https://ecma-international.org/technical-committees/tc56/
[2]: https://github.com/nlip-project
[3]: https://arxiv.org/abs/2411.05828v1
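
To make the tool-calling idea above a bit more concrete, here is a rough sketch of what "an interface for the LLM" can look like: a machine-readable description of an action plus a dispatcher that runs whatever structured call the model emits. Everything here is made up for illustration: `create_invoice`, the `TOOLS` registry, `dispatch`, and the JSON shape of the call are assumptions, not the API of MCP or of any particular LLM vendor, and no model is actually invoked.

```python
import json
from typing import Any

# Hypothetical tool: stands in for whatever API the target software exposes.
def create_invoice(customer: str, amount: float) -> dict[str, Any]:
    return {"status": "created", "customer": customer, "amount": amount}

# Machine-readable description handed to the model instead of a screenshot.
# The schema layout is an assumption loosely modeled on JSON Schema.
TOOLS: dict[str, dict[str, Any]] = {
    "create_invoice": {
        "description": "Create an invoice for a customer.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer": {"type": "string"},
                "amount": {"type": "number"},
            },
            "required": ["customer", "amount"],
        },
        "handler": create_invoice,
    }
}

def dispatch(tool_call_json: str) -> dict[str, Any]:
    """Parse a model-emitted tool call and run the matching handler."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    spec = TOOLS.get(name)
    if spec is None:
        return {"error": f"unknown tool: {name}"}
    args = call.get("arguments", {})
    # Check required parameters before touching the real API.
    missing = [p for p in spec["parameters"]["required"] if p not in args]
    if missing:
        return {"error": f"missing arguments: {missing}"}
    return spec["handler"](**args)

if __name__ == "__main__":
    # Pretend this string came back from the model.
    model_output = (
        '{"name": "create_invoice", '
        '"arguments": {"customer": "ACME", "amount": 120.5}}'
    )
    print(dispatch(model_output))
```

The point of the sketch is only that the model sees a list of named actions with typed parameters and answers with structured data, so nothing has to be inferred from pixels.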