I had to use a similar approach when creating a cluster analysis of the amendments in the Italian Senate [0].
The Italian Senate offers a SPARQL endpoint [1], but unfortunately it doesn't expose the texts of the amendments, so I had to write a small spider with Scrapy [2] to collect them.