by elijahwright_ 358 days ago
this article doesn't make any sense. the bill has a lot of em dashes because that's how bills are expressed and it's a large bill. bills in Congress aren't written with em dashes because it can be confusing with the bill syntax and there's not a reason to do it that way
2 comments

The author compares it to the average bill going through Congress, where you'd expect 0.1 em dashes per page, whereas this bill has 10. So 100x the historic average.
well, for one, it's more than 0.1 em dashes per page. the SHARE IT Act has 10 on each page[0]. I don't know exactly how many the 2017 tax cut bill had but it's more than 1,000, and that was over 185 pages[1], and obviously that was before LLMs like ChatGPT. so I don't really know why this is the measure of AI or not, especially because bills have always had a lot of em dashes to start. if you're not analyzing the text of the bill then it's just not going to be accurate

[0] https://www.congress.gov/118/plaws/publ187/PLAW-118publ187.p...

[1] https://www.congress.gov/115/statute/STATUTE-131/STATUTE-131...
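for anyone who wants to sanity-check the arithmetic in this exchange, here is a minimal sketch using only the figures quoted above. nothing in it is measured independently, and since the 1,000 count is stated as "more than", the 2017 result is a floor rather than an exact rate. the function and constant names are made up for illustration.

```python
# Figures quoted in this thread; none of them are independently verified here.
BASELINE_PER_PAGE = 0.1    # the article's claimed historic average
THIS_BILL_PER_PAGE = 10    # the article's figure for the bill under discussion

# 2017 tax cut bill: "more than 1,000" em dashes over 185 pages (a lower bound).
TAX_2017_MIN_DASHES = 1000
TAX_2017_PAGES = 185


def per_page(dashes, pages):
    """Em dashes per page for a single bill."""
    return dashes / pages


def ratio_to_baseline(rate, baseline=BASELINE_PER_PAGE):
    """How many times the baseline rate a given per-page rate is."""
    return rate / baseline


tax_2017_rate = per_page(TAX_2017_MIN_DASHES, TAX_2017_PAGES)

print(f"this bill vs baseline: {ratio_to_baseline(THIS_BILL_PER_PAGE):.0f}x")
print(f"2017 tax bill: at least {tax_2017_rate:.1f} per page, "
      f"at least {ratio_to_baseline(tax_2017_rate):.0f}x the baseline")
```

on these numbers, a pre-LLM bill already sits far above the 0.1 baseline, which is the point being made about the baseline itself.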

I'm the author and updated this post - after looking into this, the larger bills contain entire pages with only headings that contain emdashes - removed the headings from analysis so that the emdashes per page are only from the legislative text itself. For the baseline, over 50% of bills found on congress.gov are 1-2 pgs, after reading a few I decided some rationale could exist to remove them from the baseline - even after all these adjustments, we're still looking at a 30% increase from a decent baseline of similar bill size. It's evident when reading the text below headings (as a human!)
SHARE IT is from 2024, but the 2017 tax cut bill is interesting (lots of emdashes there that deviate from the avg) - you’re correct on the additional need for text analysis in this case. Bills I’d found from earlier in 2024 that are publicly available do not have emdashes outside of the table of contents, which is built into the average - curious how/why they are used so much in this bill from 2017. Now wondering how they got into any potential templates (or not), which adds the confound of how much of this is AI vs. template (or requirements, or something else). Thx!
Not following exactly, so apologies if I'm misinterpreting, but I'm the author and updated this post (transparently) with nuance I'd recently learned about that explains this (somewhat) - the larger bills contain entire pages with only headings that contain emdashes - removed the headings from analysis so that the emdashes per page are only from the legislative text itself. Even on the most conservative / minimal-difference estimate, we're still looking at a 30% increase from a decent baseline.
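The adjustment described in the author's replies (drop heading lines, count em dashes per page in the remaining text, and leave very short bills out of the baseline) could be sketched roughly as below. This is not the author's actual code: the heading heuristic in `looks_like_heading`, the `min_pages` cutoff, and the list-of-page-strings input format are all assumptions made for illustration.

```python
EM_DASH = "\u2014"


def looks_like_heading(line):
    # Hypothetical heuristic, NOT the author's rule: treat table-of-contents
    # style lines ("Sec. 101. ...", "TITLE I ...", "Subtitle A ...") as headings.
    return line.strip().startswith(("Sec.", "SEC.", "TITLE", "Subtitle"))


def em_dashes_per_page(pages, skip_headings=True):
    """Average em dashes per page; `pages` is a list of page strings."""
    if not pages:
        return 0.0
    total = 0
    for page in pages:
        for line in page.splitlines():
            if skip_headings and looks_like_heading(line):
                continue  # heading lines do not count toward the total
            total += line.count(EM_DASH)
    # Pages made up only of headings still count in the denominator here.
    return total / len(pages)


def baseline_rate(bills, min_pages=3):
    """Mean per-page rate over bills with at least `min_pages` pages."""
    rates = [em_dashes_per_page(b) for b in bills if len(b) >= min_pages]
    return sum(rates) / len(rates) if rates else 0.0


# Toy two-page "bill" (invented text, for illustration only).
toy = [
    "Sec. 101. Short title\u2014example\nbody text \u2014 here",
    "(a) In general.\u2014The term",
]
print(em_dashes_per_page(toy))                       # headings skipped
print(em_dashes_per_page(toy, skip_headings=False))  # headings included
```

How headings are detected and whether heading-only pages stay in the page count both move the per-page figure, so any comparison against a baseline depends on applying the same choices to both sides.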