Hacker News
by joemaller1 4294 days ago
What tools were used to discover and digest the data? I'd be very interested in the process behind this post.

Short version: node.js of course!

A bit longer one:

You can fetch the `jsverify` package metadata from http://registry.npmjs.org/jsverify, and all current packages are listed at http://registry.npmjs.org/-/all (this one is special: its size is around 50 MiB). Please cache your results; let's not DDoS the registry.
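A minimal sketch of the "cache your results" part, not the exact script used for the post. The helper names `cachedJson` and `defaultFetchText` and the cache directory layout are made up for illustration, and the default fetcher assumes a Node version with a global `fetch` (18 or later):

```javascript
const fs = require('fs');
const path = require('path');
const crypto = require('crypto');

// Hypothetical default fetcher; assumes global fetch (Node 18+).
async function defaultFetchText(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.text();
}

// Hypothetical helper: return the JSON at `url`, hitting the network only
// when there is no cached copy under `cacheDir`.
async function cachedJson(url, cacheDir, fetchText = defaultFetchText) {
  fs.mkdirSync(cacheDir, { recursive: true });
  // One cache file per URL, named by a hash of the URL.
  const key = crypto.createHash('sha1').update(url).digest('hex');
  const file = path.join(cacheDir, key + '.json');
  if (fs.existsSync(file)) {
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  }
  const body = await fetchText(url);
  const parsed = JSON.parse(body); // throws before anything bad gets cached
  fs.writeFileSync(file, body);
  return parsed;
}
```

Something like `cachedJson('http://registry.npmjs.org/jsverify', './cache')` would then touch the registry once and read from disk on later runs. The `fetchText` parameter is there so the network call can be swapped out.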

There is around one gigabyte of nice JSON data. After the initial fetch you can traverse it using any tools you want. I naturally used node.js for that too.
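To give an idea of what traversing the data could look like, here is a rough sketch that tallies license strings. The input shape is an assumption: an object keyed by package name whose values may carry `license` (a string or a `{type}` object) or `licenses` (an array of those). `licenseOf` and `countLicenses` are names invented for this example.

```javascript
// Pull a license string out of one package record (assumed shape, see above).
function licenseOf(pkg) {
  const raw = pkg.license !== undefined ? pkg.license : pkg.licenses;
  const first = Array.isArray(raw) ? raw[0] : raw; // only the first entry is used
  if (typeof first === 'string') return first;
  if (first && typeof first.type === 'string') return first.type;
  return 'UNKNOWN';
}

// Count how many packages report each license string.
function countLicenses(all) {
  const counts = new Map();
  for (const pkg of Object.values(all)) {
    // Skip entries that are not package objects (e.g. a numeric timestamp).
    if (pkg === null || typeof pkg !== 'object') continue;
    const lic = licenseOf(pkg);
    counts.set(lic, (counts.get(lic) || 0) + 1);
  }
  return counts;
}
```

This does no normalization at all ("MIT" and "mit" are counted separately), so it is a starting point rather than a proper license parser.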

If you'd rather work with a SQL database, here's a module that attempts to put NPM into a Postgres database: https://www.npmjs.org/package/npm-postgres-mashup
Wow, someone had the time and motivation to write all of that boilerplate. Yet the license parsing, for example, is very naive: compare https://github.com/npm/npm-www/blob/99020b5b3e21607dab24cd69... and https://github.com/rickbergfalk/npm-postgres-mashup/blob/56d...