by tianhe 4702 days ago
XML? This should be in JSON!
7 comments

As the schema guide explains, in 1999 they picked XML based on the study from 1996:

"Following a 1996 feasibility study on XML/SGML, the Committee on House Administration adopted XML as a data standard for the exchange of legislative documents" http://uscodebeta.house.gov/download/resources/USLM-User-Gui...

It took the US Govt 17 years to release 200,000 pages of the US code in XML.

I think the most time-consuming part of the conversion was going from text to digital. That's what took so long.

Converting between formats should be relatively easy. Back in 1999, JSON wasn't even around.

IMO, in today's API-centric, JavaScript-ruled world, JSON would be a lot more useful.

Given that it can be mechanically converted to JSON, I have a hard time seeing how JSON would be a lot more useful. "Very slightly more convenient" seems like the most we could reasonably say.
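For what it's worth, here is a minimal sketch of what "mechanically converted" could look like, using only the Python standard library. The sample markup and its element names (`section`, `num`, `content`, `ref`) are made up for illustration and are not taken from the actual schema; `element_to_dict` and `xml_to_json` are hypothetical helper names. Note that it has to keep text and child elements in one ordered list, because legal text is mixed content.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Turn an element into a JSON-serializable dict.

    Text and child elements go into one ordered "children" list so that
    mixed content (text interleaved with inline tags) is not lost.
    """
    node = {"tag": elem.tag}
    if elem.attrib:
        node["attrs"] = dict(elem.attrib)

    children = []
    # Text that appears before the first child element.
    if elem.text and elem.text.strip():
        children.append(elem.text)
    for child in elem:
        children.append(element_to_dict(child))
        # Text that follows this child, up to the next sibling or the end tag.
        if child.tail and child.tail.strip():
            children.append(child.tail)
    if children:
        node["children"] = children
    return node


def xml_to_json(xml_text):
    """Parse an XML string and return the equivalent JSON string."""
    return json.dumps(element_to_dict(ET.fromstring(xml_text)), indent=2)


# Hypothetical sample, NOT real markup from the published files.
SAMPLE = (
    '<section id="s1"><num>1.</num>'
    '<content>See <ref href="/x">section 2</ref> for details.</content>'
    "</section>"
)

if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

The conversion is mechanical, but the output is a tree of `tag`/`attrs`/`children` objects rather than anything you would design by hand as JSON, which is roughly the point being argued on both sides of this thread.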
I wonder why. The U.S. Code was in electronic format by the 1980s. The Air Force had FLITE, Justice had Juris, and then there's Lexis and Nexis.
I wonder if 17 years is fast for the government or slow.
We might have to wait even longer for all the common law based on the court rulings to be published in the same format. Not to mention the "secret common law" based on the FISA court rulings.
Not sure what format the files are kept in, but it's been done. Several private companies have these behind a paywall: Thomson West and Lexis/Nexis are the most well known, but there are others. Don't forget the published caselaw of the fifty states as well as state legislation. The U.S. Code is far from the whole picture.
Fast..... sadly
Sounds about right.
Considering typical timescales for this sort of project, I'd say we should be happy it's not in CSV or SGML.
Why? JSON should just be for data transfers. XML is for document formats. While the lines get seriously blurred, that is their general use.
Agreed, for long-running data sets you really want schemas.
There are JSON schema implementations out there.
There are, but XML schemas are well understood, standardized, and have been around for a long time. Even if they did this project today, XML is the right choice due to its maturity and suitability for documents.
I hope you're joking.

XML is a superior format for semantically marking up content from a very detail-oriented dataset like this.

Given the level of detail in the XML, I shudder to think what monstrosity that would create.

Heck, they didn't even bother to include a way to inherit styles; they are just included in every case.

This is surely an artifact of the conversion process. They are not storing the documents internally like this :)
Time to write a JSON wrapper. I nominate you!
I'd prefer Markdown /sarcasm