Couldn't find any mention of this, please provide a source.
Their ToS mentions scraping but it pertains to scraping their frontend instead of using their API, which they don't want you to do.
Also - this library requests the HTML by itself [0] and ships it as a prompt but with preset system messages as the instruction [1].
I don't think this is correct at all. It's one of the main use cases for GPT-4 – so long as the scraped data or outputs from their LLMs aren't used to train competing LLMs.
> OpenAI is actively blocking the scraping use case.
How? And since when? Scraping is identical to retrieval except in terms of what you do with the data after you have it, and to differentiate them when you are using the API, OpenAI would need to analyze the code calling the API, which doesn’t seem likely.
Also - this library requests the HTML by itself [0] and ships it as a prompt but with preset system messages as the instruction [1].
[0] - https://github.com/jamesturk/scrapeghost/blob/main/src/scrap...
[1] - https://github.com/jamesturk/scrapeghost/blob/main/src/scrap...