| HN Mirror

by notaboutdave 3139 days ago

This is pretty great. It looks like it automatically writes to the next logical empty cell.

  curl https://www.bitstamp.net/api/v2/ticker/btcusd \
  | grep 'last": *"\d*\.\d*' -o | grep '\d*\.\d*' -o \
  | tosheets -c A1 --spreadsheet=foo

^ That will append the last BTC price to column A of sheet 'foo'
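One caveat with the pipeline above: `\d` is a Perl-style character class, and not every grep accepts it in its default mode. Here is a more portable sketch of the same extraction, using `[0-9]` with `-E`. It assumes the ticker response contains a field shaped like `"last": "1234.56"`; the function name `extract_last` and the sample JSON are made up for illustration, and `-o` is an extension found in GNU and BSD grep rather than part of POSIX.

```shell
# Hypothetical helper: pull the value of the "last" field out of JSON on stdin.
# Uses [0-9] with extended regex instead of \d, so it does not need PCRE support.
extract_last() {
  grep -oE '"last": *"[0-9]+(\.[0-9]+)?"' | grep -oE '[0-9]+(\.[0-9]+)?'
}

# Made-up sample response, not real ticker data.
sample='{"high": "4400.00", "last": "4321.09", "volume": "12000.5"}'
printf '%s\n' "$sample" | extract_last
```

The first grep isolates the whole `"last": "..."` pair so that the second one cannot pick up numbers from other fields.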

sillysaurus3 3139 days ago

Can I nerd out over how unreasonably effective regexps are?

That's basically a mini JSON parser in 48 characters.

gcr 3139 days ago

With a domain-specific tool, it's even easier though.

    curl ... | jq -r .last

sillysaurus3 3139 days ago

Sure. But you can't use jq to scrape arbitrary websites, for example. :)

dpflan 3138 days ago

Jeff Atwood has an entertaining post about parsing HTML with regular expressions:

https://blog.codinghorror.com/parsing-html-the-cthulhu-way/

“That's right, if you attempt to parse HTML with regular expressions, you're succumbing to the temptations of the dark god Cthulhu's … er … code.”

alexozer 3138 days ago

Let's not forget about this masterpiece: https://stackoverflow.com/a/1732454/864310

dpflan 3138 days ago

Indeed, its quality cannot be ignored and must be shared; it’s referenced in the Atwood post.

ams6110 3138 days ago

Parsing and scraping are different things though. You don't need to parse a web page to extract specific things from it.
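To make the distinction concrete, here is a minimal sketch of extraction without parsing: it grabs the page title with a single sed pattern and never builds a document tree. The helper name `extract_title` and the sample page are hypothetical, and the pattern assumes the `<title>` tag sits on one line with no attributes, which real pages do not guarantee.

```shell
# Hypothetical helper: print the text of the first <title> element on stdin.
# This is pattern extraction, not parsing: it assumes the opening and closing
# tags are on the same line and that the opening tag has no attributes.
extract_title() {
  sed -n 's/.*<title>\(.*\)<\/title>.*/\1/p' | head -n 1
}

# Made-up sample page for illustration.
page='<html><head><title>Example Page</title></head><body><p>hi</p></body></html>'
printf '%s\n' "$page" | extract_title
```

If the markup changes shape (attributes on the tag, a title split across lines), the pattern silently prints nothing, which is the trade-off of scraping this way.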

jldugger 3139 days ago

Of course, if it's anything like HTML, the formatting will vary over time, so you really want a more permissive parser like BeautifulSoup. I haven't found a CLI for it, so I quickly wrote my own ages ago: https://github.com/jldugger/dotfiles/blob/master/bin/select.....

nurettin 3139 days ago

For cases where a website is not a tutorial for websites, regex is a suitable tool for scraping.

heptathorp 3138 days ago

Do NOT use r̔ͩegͩ̾ͪͥͪeͮ͊ͨ̓xͫ͆͆̓ ͤt͊͗o̒̾͋ͬ̾̚ ͆̌͌̄p͠a͟r̐̎͆̽̄ͭ́se̒ͥ ͕̪̻̭̭̺̳ͣͬͪͫͪj̞̲ͤ̚s̡̳̟̤̳̖̤͒̋ͣ͊ͤ͗̿oͬ̀͆n̮̳͚̝̩͙͔̈͋ ̮͕̩̼̔̾̈̄̋ͦ́́̚ͅĤ̷̝̯̝ͯ̈́ͣ̔ͪ̊ͬ͜͡Eͬͧ҉̜̰̲̩̰̝̠̥ ̶̯̯͚̗̪̭̘ͨͩͭ̎ͧͮCͪ҉̖͍͔͚͚̯͕Ơ̰̻̂̅̋̇̓̅͌M̸̭̱̭̥͆̽ͨͦÊ̸̴̢̪͚̮̲̜̙̍ͤ͋̾ͦS̛̗̟͙͍̹͈̳ͣ͑̏̓ͤͦ̽

dang 3138 days ago

Please don't do this here.
