by ailicious 962 days ago
I'm the author of the article. Thanks for posting it; I appreciate all the feedback.

Indeed, the cost of using the OpenAI API for scrapers at scale seems high. However, in my opinion, optimization is key. As some comments suggested, scrapers could employ smaller, fine-tuned models (perhaps distilled from ChatGPT) to achieve similar results at a lower cost.

One takeaway from this article might be that obscuring text is ineffective (and potentially always has been) if all the data is centralized in one place. In such instances, a language model is just as powerful as a human.

1 comment

It likely always has been, but the limiting factor has always been cost.

But we know from the ever-increasing power of compute that problems limited by the cost of compute get solved all the time. "Way back in the day" we'd never really have tried to crack passwords on 486s. These days, for example, we're throwing ever more complicated algorithms and requirements at the user to ensure the password isn't quickly broken if the ciphertext is stolen.
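
To make the password point concrete, here is a minimal sketch of a tunable work factor, assuming PBKDF2-HMAC-SHA256 from Python's standard library. The choice of PBKDF2, the function names hash_password and verify_password, and the default iteration count are all illustrative assumptions, not anything from the article. The idea is only that the defender can raise the cost of each guess as attacker hardware gets cheaper.

```python
import hashlib
import hmac
import os
from typing import Tuple


def hash_password(password: str, iterations: int = 100_000) -> Tuple[bytes, int, bytes]:
    """Derive a slow hash of the password (hypothetical helper).

    The iteration count is the knob: each extra iteration makes every
    guess more expensive for someone holding the stolen hash. The default
    here is an assumed value chosen for illustration only.
    """
    salt = os.urandom(16)  # random per-password salt, stored next to the hash
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and iteration count, then compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison
```

Storing the iteration count alongside the salt means it can be raised later for new passwords without breaking verification of old ones.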