

by jasonpriem 1509 days ago

We emphasize the API because imho a well-documented, high-throughput API is something really lacking in the ecosystem right now. Dealing with a dataset this size (200M works, 200M authors) is a pain, especially since many end users of this data don't have a lot of technical expertise. Often people have really basic questions (like the ones from the linked post), or they want to build simple applications like monitoring dashboards, recommender systems, and scholarly search engines.

With the API, folks can outsource the heavy data engineering to us for free, and just do the fun parts themselves. We want to make building real-world apps on the global research graph fun and easy, the kind of thing you can do as a hackathon project, instead of with a six-figure grant.

That said, I agree that it's absolutely essential that the entire dataset be easy to download and mirror as well. It's called OpenAlex because it's _open_, soup to nuts (the "Alex" part is an homage to the ancient Library of Alexandria). All the data is open, the code is open, and our governance is as open as we can make it. [1]

[1] https://ourresearch.org/transparency