Hacker News
by ProjectArcturis 275 days ago
I'm confused why this is addressed to Azure instead of OpenAI. Isn't Azure just offering a wrapper around ChatGPT?

That said, I would also love to see some examples or data, instead of just "it's getting worse".

2 comments

I know that OpenAI has made compute deals with other companies, and as time goes on, the share of inference that runs on each provider will shift, but I doubt that much, if any, of it has moved off Microsoft Azure data centers yet, so that's not a reason for a difference in model performance.

With that said, Microsoft has a different level of responsibility, both to its customers and to its stakeholders, to provide safety than OpenAI or any other frontier provider. That's not a criticism of OpenAI or Anthropic or anyone else, who I believe are all trying their best to provide safe usage. (Well, other than xAI and Grok, for which the lack of safety is a feature, not a bug.)

The risk to Microsoft of getting this wrong is simply higher than it is for other companies, and that's why they have a strong focus on Responsible AI (RAI) [1]. I don't know the details, but I have to assume there's a layer of RAI processing on models served through Azure OpenAI that's not there when you use OpenAI models directly through the OpenAI API. That layer is valuable for the companies that choose to run their inference through Azure and also want to maximize safety.

I wonder if that's where some of the observed changes are coming from. I hope the commenter posts their proof for further inspection. It would help everyone.

[1]: https://www.microsoft.com/en-us/ai/responsible-ai

I don't remember where I saw it, but I remember a claim that Azure-hosted models performed worse than those hosted by OpenAI.
Explains why the enterprise Copilot ChatGPT wrapper that they shoehorn into every piece of Office 365 performs worse than a badly configured local LLM.
They most definitely do. They have been lobotomized in some way to be ultra corporate friendly. I can only use their M365 Copilot at work and it's absolute dogshit at writing code longer than maybe 100 lines. It can barely write correct PowerShell. Luckily, I really only need it for quick and dirty short PS scripts.
I agree. I asked it for some help refactoring a database and some of the SQL is quite broken. It also doesn't help that their streaming code is broken, so LLM responses sometimes end up garbled in the web browser (in both Firefox and Edge, so it is not a browser issue). You need to refresh after a response to make sure the broken output wasn't a sign of a drunk LLM.