by notaboutdave 3139 days ago
This is pretty great. It looks like it automatically writes to the next logical empty cell.

  curl https://www.bitstamp.net/api/v2/ticker/btcusd \
  | grep 'last": *"\d*\.\d*' -o | grep '\d*\.\d*' -o \
  | tosheets -c A1 --spreadsheet=foo
^ That will append the last BTC price to column A of sheet 'foo'
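
If your grep does not understand `\d` (it is a Perl-style shorthand, not part of POSIX regular expressions), the same extraction can be sketched with bracket classes. This is only a rough, offline illustration: the sample payload and the `extract_last` function name are made up here, and only the `last` field is taken from the command above.

```shell
# Hypothetical sample payload, shaped like the response the pipeline above expects.
sample='{"high": "4400.00", "last": "4321.50", "volume": "1234.5"}'

# Print the numeric value of the "last" field using [0-9] instead of \d.
extract_last() {
  printf '%s\n' "$1" \
    | grep -o '"last": *"[0-9]*\.[0-9]*' \
    | grep -o '[0-9]*\.[0-9]*'
}

extract_last "$sample"   # prints 4321.50
```

The first grep narrows the input to the `"last": "…` fragment, and the second keeps only the number, so the result can be piped onward the same way as before.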

Can I nerd out over how unreasonably effective regexps are?

That's basically a mini JSON parser in 48 characters.

With a domain-specific tool, it's even easier though.

    curl ... | jq -r .last
Sure. But you can't use jq to scrape arbitrary websites, for example. :)
Jeff Atwood has an entertaining post about parsing HTML with regular expressions:

https://blog.codinghorror.com/parsing-html-the-cthulhu-way/

“That's right, if you attempt to parse HTML with regular expressions, you're succumbing to the temptations of the dark god Cthulhu's … er … code.”

Let's not forget about this masterpiece: https://stackoverflow.com/a/1732454/864310
Indeed, its quality cannot be ignored and must be shared; it’s referenced in the Atwood post.
Parsing and scraping are different things though. You don't need to parse a web page to extract specific things from it.
Of course, if it's anything like HTML, the formatting will vary enough over time that you really want a more permissive parser like BeautifulSoup. I haven't found a CLI for it, so I quickly wrote my own ages ago: https://github.com/jldugger/dotfiles/blob/master/bin/select.....
For cases where a website is not a tutorial for websites, regex is a suitable tool for scraping.
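
As a minimal sketch of scraping without parsing, the snippet below pulls link targets out of a fragment of markup with grep and sed. The HTML string and the `extract_hrefs` name are hypothetical, and it assumes double-quoted `href` attributes, which real pages do not guarantee.

```shell
# Hypothetical HTML fragment; a real page would come from curl.
html='<p>See <a href="https://example.com/a">A</a> and <a class="x" href="https://example.com/b">B</a>.</p>'

# List every double-quoted href value without building a document tree.
extract_hrefs() {
  printf '%s\n' "$1" \
    | grep -o 'href="[^"]*"' \
    | sed 's/^href="//; s/"$//'
}

extract_hrefs "$html"
```

This prints one URL per line. It silently misses single-quoted or unquoted attributes, which is the kind of drift in formatting that makes a permissive parser the safer choice for anything long-lived.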
Do NOT use r̔ͩegͩ̾ͪͥͪeͮ͊ͨ̓xͫ͆͆̓ ͤt͊͗o̒̾͋ͬ̾̚ ͆̌͌̄p͠a͟r̐̎͆̽̄ͭ́se̒ͥ ͕̪̻̭̭̺̳ͣͬͪͫͪj̞̲ͤ̚s̡̳̟̤̳̖̤͒̋ͣ͊ͤ͗̿oͬ̀͆n̮̳͚̝̩͙͔̈͋ ̮͕̩̼̔̾̈̄̋ͦ́́̚ͅĤ̷̝̯̝ͯ̈́ͣ̔ͪ̊ͬ͜͡Eͬͧ҉̜̰̲̩̰̝̠̥ ̶̯̯͚̗̪̭̘ͨͩͭ̎ͧͮCͪ҉̖͍͔͚͚̯͕Ơ̰̻̂̅̋̇̓̅͌M̸̭̱̭̥͆̽ͨͦÊ̸̴̢̪͚̮̲̜̙̍ͤ͋̾ͦS̛̗̟͙͍̹͈̳ͣ͑̏̓ͤͦ̽
Please don't do this here.