Hacker News
by rtpg 1023 days ago
data = json.load(sys.stdin)

commits = [elt.commit for elt in data if elt.commit.author == "Tom Hudson"]

json.dump(commits, sys.stdout)

Definitely not as straightforward... it would be nice to have a few more affordances for path expressions in Python.


That doesn’t quite work, because JSON objects are parsed to Python dicts, not Python objects with properties, so it would be:

  data = json.load(sys.stdin)
  commits = [
    e["commit"] 
    for e in data 
    if e["commit"]["author"] == "Tom Hudson"
  ]
  json.dump(commits, sys.stdout)
This also won't work since it'll crash on missing fields. e.get("commit", {}).get("author", "") maybe (ignoring the corner case of non-list top level object).
Which is pretty useful - I will get a malformed JSON error as early as possible.

P.S. `some.get("A", {})["B"]` is a bad programming habit because `some["A"]` might be a list.
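
To make the trade-off above concrete, here is a minimal sketch of the tolerant approach, one that checks types instead of chaining `.get()`. The helper names `commit_author` and `commits_by` and the sample data are made up for illustration; the input shape (a list of objects with a `commit.author` field) is assumed from the snippets earlier in the thread.

```python
import json


def commit_author(entry):
    """Return entry["commit"]["author"], or None if the shape is unexpected."""
    if not isinstance(entry, dict):
        return None
    commit = entry.get("commit")
    # A bare .get("commit", {}) would not help if "commit" held a list.
    if not isinstance(commit, dict):
        return None
    return commit.get("author")


def commits_by(data, author):
    """Collect the commits by `author`, skipping malformed entries."""
    if not isinstance(data, list):  # the non-list top-level corner case
        return []
    return [e["commit"] for e in data if commit_author(e) == author]


# Hypothetical input mixing well-formed and malformed entries.
sample = json.loads(
    '[{"commit": {"author": "Tom Hudson", "id": 1}},'
    ' {"commit": ["oops"]}, {"other": 1}, 7]'
)
result = commits_by(sample, "Tom Hudson")
print(json.dumps(result))
```

Whether silently skipping bad entries is better than crashing early is exactly the disagreement above; this only shows what the tolerant version costs in code.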

You can do it like this with Jello (I am the author):

    jello '[e.commit for e in _ if e.commit.author == "Tom Hudson"]'
Jello lets you use Python syntax with dot notation without the stdin/stdout/json.loads boilerplate.

https://github.com/kellyjonbrazil/jello

This is a non-problem solved by the jq example. Clearly nobody sane writes (or consumes) APIs which sometimes produce an array of objects, sometimes produce singular objects of the same shape... Or maybe I'm spoiled from using typed languages and cannot see the ingenuity of the python/javascript/other-untyped-hyped-lang api authors that it solves?
> Clearly nobody sane writes (or consumes) APIs which sometimes produce an array of objects, sometimes produce singular objects of the same shape...

Has nothing to do with arrays, it has to do with the fact that Python dicts with string indexes and Python objects with properties are different things, unlike JS where member and index access are just different ways of accessing object properties.

> Or maybe I'm spoiled from using typed languages and cannot see the ingenuity of the python/javascript/other-untyped-hyped-lang api authors that it solves?

This isn't an untyped thing: JavaScript (and thus JSON) and Python both have type systems (even if they usually don't statically declare them), and those type systems, and thus the syntax around objects, differ between the two.
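
A small illustration of the point, using only the standard library (the sample JSON is made up): `json.loads` returns a plain `dict`, which supports index access but not attribute access.

```python
import json

# Hypothetical one-entry payload, shaped like the data discussed above.
entry = json.loads('{"commit": {"author": "Tom Hudson"}}')

# JSON objects become dicts, so string indexing works...
author = entry["commit"]["author"]

# ...but attribute access does not, unlike member access in JavaScript.
try:
    entry.commit
    error = None
except AttributeError as exc:
    error = str(exc)

print(type(entry).__name__, author, error)
```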

I see. I am spoiled, I think. :)
Oops, yep totally. Even more futzy! Think if I was doing this a lot I'd totally pull out one of those "dict wrappers that allow for attr-based access" that lots of projects end up writing for whatever reason
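
For what it's worth, one way to get attr-based access without writing a wrapper class is the standard library's `object_hook` parameter combined with `types.SimpleNamespace`. This is only a sketch: the helper name `load_with_attrs`, the `sha` field, and the sample data are invented for illustration.

```python
import json
from types import SimpleNamespace


def load_with_attrs(text):
    """Parse JSON so that every object becomes a SimpleNamespace."""
    return json.loads(text, object_hook=lambda d: SimpleNamespace(**d))


# Hypothetical input; real data would come from sys.stdin.
raw = (
    '[{"commit": {"author": "Tom Hudson", "sha": "abc"}},'
    ' {"commit": {"author": "Someone Else", "sha": "def"}}]'
)
data = load_with_attrs(raw)

# The dot-notation comprehension from the top of the thread now works.
commits = [e.commit for e in data if e.commit.author == "Tom Hudson"]

# SimpleNamespace is not JSON-serializable, so convert back with vars().
# This is enough here because the selected commits contain no nested objects.
print(json.dumps([vars(c) for c in commits]))
```

Missing fields still raise `AttributeError` with this approach, and keys that are not valid Python identifiers are only reachable through `getattr`.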
jmespath is your friend for this

    import jmespath
    import json
    import sys

    doc = json.load(sys.stdin)
    print(jmespath.search("[?commit.author == `Tom Hudson`].commit", doc))
I wish it had won over jq because JMESPath is a spec with multiple implementations and a test suite where jq is... well jq and languages have bindings not independent implementations.
`import jmespath` is a lot like importing jq...

> I wish it had won over jq because JMESPath is a spec with multiple implementations and a test suite where jq is... well jq and languages have bindings not independent implementations.

jq has multiple implementations too! In Go, Rust, Java, and... in jq itself.

So just picking Java https://github.com/eiiches/jackson-jq

> jackson-jq aims to be a compatible jq implementation. However, not every feature is available; some are intentionally omitted because they are not relevant as a Java library; some may be incomplete, have bugs or are yet to be implemented.

Whereas JMESPath has fully compliant 1st party implementations in Python, Go, Lua, JS, PHP, Ruby, and Rust and fully compliant 3rd party implementations in C++, Java, .NET, Elixir, and TS.

Having a spec and a test suite means that all valid JMESPath programs will work, and work the same, anywhere you use them. I think jq could get there but it doesn't seem to be the project's priority.

Repeating an identifier like this is inelegant, it should be (untested)

  [].commit | [?author == `Tom Hudson`]
jmespath does look like an interesting thing. Wish it weren't stringly-typed but that is a bit unavoidable.