| HN Mirror



	by courseofaction 714 days ago
	Really interesting. Could the potentially controversial content of the target news article have an effect on ChatGPT's ability to summarize it?

2 comments

gillesjacobs 714 days ago

I use LLMs for information extraction on financial news articles via Azure OpenAI, and this is a huge problem for me.

I get a 404 content moderation response on 4% of articles, and this is just financial news text.

It is a prime reason we are considering open models.


strickvl 714 days ago

I think not. Normally, if you hit those kinds of errors, you wouldn’t get any output at all. In the blog I show that all 724 test cases got proper JSON output for the queries, so I don’t think this was an issue. I think these kinds of topics would have been well covered in the training data, and the OSS models probably used similar data, so I don’t think there’s even a disparity to be found between proprietary and OSS models here.


resource_waste 714 days ago

>Normally if you get those kinds of errors you wouldn’t get any output at all

I'm not so sure; I disagree. If there is a pro-ChatGPT user, I'm probably it.

I've often seen it put significantly less effort into answering the question.


strickvl 714 days ago

Interesting. I can maybe try finetuning one or two of the so-called 'uncensored' open models and see if that makes a difference. It's a bit harder to switch out the dataset completely, as that's really what I'm interested in :) I think the general point that finetuning a model for some custom task works is fairly uncontroversial, but if OpenAI's poor performance was on account of these kinds of guardrails, it'd be yet another reason someone might want to finetune their own models, I guess.
