by steeve 4099 days ago
It visits each article on Legifrance and creates a JSON file with all the articles, which book/section they belong to, and, for each article, its versions (identified by their dates). The crawler is written in Go.

Then there is a Python script that takes that JSON file, creates the .md files, and runs the git commands in the shell.
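
For anyone curious what that second step could look like, here is a rough sketch, not the actual script. The JSON layout (a list of articles with `id`, `book`, `section`, and `versions` carrying an ISO `date` and a `text`) is an assumption made for illustration, as are the function names `load_articles`, `plan_commits`, and `replay`. The idea is to group versions by date, then write the files and make one commit per date, oldest first.

```python
import json
import subprocess
from collections import defaultdict
from pathlib import Path


def load_articles(json_path):
    """Read the crawler output (hypothetical layout: a JSON list of articles)."""
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)


def plan_commits(articles):
    """Return [(date, [(relative_path, text), ...]), ...] sorted by date.

    Dates are assumed to be ISO strings (YYYY-MM-DD), so plain string
    sorting puts them in chronological order.
    """
    by_date = defaultdict(list)
    for art in articles:
        # Assumed mapping: one .md file per article, nested by book/section.
        path = Path(art["book"]) / art["section"] / f"{art['id']}.md"
        for version in art["versions"]:
            by_date[version["date"]].append((path, version["text"]))
    return sorted(by_date.items())


def replay(articles, repo_dir):
    """Write each version to disk and commit once per date, oldest first."""
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)
    subprocess.run(["git", "init", "-q"], cwd=repo, check=True)
    for date, changes in plan_commits(articles):
        for rel_path, text in changes:
            target = repo / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(text + "\n", encoding="utf-8")
        subprocess.run(["git", "add", "-A"], cwd=repo, check=True)
        # --allow-empty keeps the run going if a version is textually
        # identical to the previous one.
        subprocess.run(
            ["git", "commit", "-q", "--allow-empty", "-m", f"Version of {date}"],
            cwd=repo,
            check=True,
        )
```

Usage would be something like `replay(load_articles("articles.json"), "codes-repo")`, where both names are placeholders. The date only goes into the commit message here; it could also be passed to `git commit --date`, though I have not checked how that behaves with very old dates.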

Ultimately the sad thing is that I had to scrape this information. There were lots of pitfalls due to bad formatting and so on... Well, scraping.

1 comments

It might not be necessary to scrape anything since this data was finally released as OpenData last year: https://www.data.gouv.fr/fr/datasets/legi-codes-lois-et-regl...
I tried using this, but it seems to be just a snapshot, not the full dataset with history (I hope I'm wrong, though).