Show HN: Need feedback for my Python library to query dicts (github.com)
21 points by cyberlis 2996 days ago
11 comments

Recently I got interested in parsing. Real parsing, with grammars: BNF, LL, LR... :)

As an exercise, I implemented a library to query dicts.

This is my first Python library, and I'm looking for feedback and criticism.

I especially need advice about the public API.

https://github.com/cyberlis/dictquery

You can install it with:

pip install dictquery

To write it I read:

Python Cookbook third edition. 2.18 Tokenizing Text, 2.19 Writing a Simple Recursive Descent Parser

All parsing related articles from https://eli.thegreenplace.net/2010/01/02/top-down-operator-p...

https://docs.python.org/3/library/re.html#writing-a-tokenize...

I really don't like the stringly API.

It defeats anything my editor can predict, which makes typos more likely. It also makes it harder to reason about types, as I have to trust the library to make the right assumption. [0]

Instead of:

    dq.match(data, "`friends.age` <= 26")
Why not:

    import operator # from the standard library

    dq.match(data, "friends.age", operator.le, 26)
It'll simplify your parsing, works with my editor, and makes it easier to reason about types. You'll still need to parse, as you're basically handling a variadic function, but the user can hand you an AST.
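A minimal sketch of what such a call could look like. The `match` and `get_path` functions below are hypothetical and are not part of dictquery; the sketch assumes plain nested dicts and a dot-separated path.

```python
import operator
from functools import reduce


def get_path(data, path, sep="."):
    """Walk nested dicts along a separated path; raises KeyError if a key is missing."""
    return reduce(lambda node, key: node[key], path.split(sep), data)


def match(data, path, op, expected):
    """Hypothetical API: apply `op` to the value found at `path` and `expected`."""
    return op(get_path(data, path), expected)


data = {"user": {"name": "john", "age": 23}}
print(match(data, "user.age", operator.le, 26))      # True
print(match(data, "user.name", operator.eq, "bob"))  # False
```

Because `26` is passed as a Python object, its type is decided by Python rather than by a query parser.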

---

[0] For example, what type would occur if:

    dq.match(data, "`friends.age` <= 26.")
Does 26 become a float? Or an integer? Or raise a ParseError?

I have to consider that with a string, less so with Python's own objects.
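For reference, Python's own literal rules read `26.` as a float. Whether dictquery's grammar makes the same choice is not stated in this thread, so the snippet below only shows what Python itself does.

```python
import ast

# ast.literal_eval parses a string using Python's literal syntax.
for text in ("26", "26.", "26.0"):
    value = ast.literal_eval(text)
    print(repr(text), "->", type(value).__name__)
```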

Actually, why not overload operators to do something like:

    dq.match(data, Field("friends.age") <= 26)

Or, in the App Engine style, if data is some class (say Friend):

    dq.match(data, Friend__age <= 26)
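A rough sketch of the operator-overloading idea. The `Field` class and this two-argument `match` are hypothetical, not dictquery's API; each comparison returns a predicate that is later applied to the dict.

```python
import operator


class Field:
    """Hypothetical query builder: comparisons return a predicate over a dict."""

    def __init__(self, path, sep="."):
        self.keys = path.split(sep)

    def _value(self, data):
        # Follow the path through nested dicts.
        for key in self.keys:
            data = data[key]
        return data

    def _compare(self, op, other):
        return lambda data: op(self._value(data), other)

    def __lt__(self, other):
        return self._compare(operator.lt, other)

    def __le__(self, other):
        return self._compare(operator.le, other)

    def __gt__(self, other):
        return self._compare(operator.gt, other)

    def __ge__(self, other):
        return self._compare(operator.ge, other)

    def __eq__(self, other):
        return self._compare(operator.eq, other)

    def __ne__(self, other):
        return self._compare(operator.ne, other)


def match(data, predicate):
    """Apply a predicate built from Field comparisons to the data."""
    return predicate(data)


data = {"user": {"age": 23}}
print(match(data, Field("user.age") <= 26))  # True
```

One trade-off: overriding `__eq__` this way means `Field` objects can no longer be compared or hashed in the usual sense, which is the price of the shorter syntax.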

If you need a nested key value, you can use this function:

    def get_dict_value(query_dict, dict_key, use_nested_keys=True,
                       key_separator='.', raise_keyerror=False):
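Only the signature is shown here. As an illustration of what a lookup with these parameters might do, here is a hypothetical `lookup` function; it is not dictquery's implementation, and returning `None` for a missing key is an assumption.

```python
def lookup(query_dict, dict_key, use_nested_keys=True,
           key_separator=".", raise_keyerror=False):
    """Hypothetical nested lookup; not dictquery's actual code."""
    # Split the key into path segments only when nested keys are enabled.
    keys = dict_key.split(key_separator) if use_nested_keys else [dict_key]
    node = query_dict
    for key in keys:
        if isinstance(node, dict) and key in node:
            node = node[key]
        elif raise_keyerror:
            raise KeyError(dict_key)
        else:
            return None  # assumption: a missing key yields None
    return node


data = {"user": {"address": {"city": "Oslo"}}, "a.b": 1}
print(lookup(data, "user.address.city"))            # Oslo
print(lookup(data, "a.b", use_nested_keys=False))   # 1
print(lookup(data, "user.phone"))                   # None
```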

Theoretically, I can add an alternative syntax.

The advantage of a "stringly API" is that you can save your queries in a database, or receive them through a web interface.

> The plus of "stringly API" is that you can save your queries in database, or get it through web interface

You can do the same with a function call, there's no difference, apart from already having a list of arguments to pass around. (And of course, the type uncertainty you get with a value stored in a string.)
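One way the argument-list form could be stored, as a sketch: serialize the path, an operator name, and the value as JSON, then map the name back to a function through an allow-list. The `dump_query` and `load_query` helpers and the JSON layout are assumptions made for illustration.

```python
import json
import operator

# Allow-list of operator names that may be loaded from storage.
ALLOWED_OPS = {
    "lt": operator.lt, "le": operator.le, "eq": operator.eq,
    "ne": operator.ne, "ge": operator.ge, "gt": operator.gt,
}


def dump_query(path, op_name, value):
    """Serialize a query to a JSON string suitable for a text column."""
    if op_name not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op_name}")
    return json.dumps({"path": path, "op": op_name, "value": value})


def load_query(text):
    """Rebuild (path, operator function, value) from the stored JSON."""
    query = json.loads(text)
    return query["path"], ALLOWED_OPS[query["op"]], query["value"]


stored = dump_query("friends.age", "le", 26)
path, op, value = load_query(stored)
print(stored)         # {"path": "friends.age", "op": "le", "value": 26}
print(op(25, value))  # True
```

JSON keeps `26` and `26.0` distinct, so the value's type survives the round trip.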

> if you need a nested key value, you can use function:

Not sure why you're telling me that. What part of my comment are you responding to?

In your README.md you have:

    >>> import dictquery as dq
    >>> dq.match(data, "`friends.age` <= 26")
    True
But this is meaningless because you haven't defined data. What is it: A dict? A list of dicts? Something else?

I suggest you use a full example.

The data is given at the bottom of the page.
Also it would be nice to know how it compares to things like https://github.com/adriank/ObjectPath which seem similar.
With `ObjectPath`, you need to learn its query language.

In `dictquery` I use a simple, easy-to-understand language.

But my project is an exercise, and it is obviously not a competitor to mature, community-supported libraries.

That is why I do not want to compare `dictquery` to other libraries.

I think I would prefer a dot prefix instead of backticks for keys.
The backticks used to quote the keys are off-putting. It feels like writing arcane SQL.

I personally would try and align with jsonpath for the convention, richer query-language and expressions.

Please, developers, don't use back ticks. They're not as accessible in other keyboard layouts.
I can use a symbol other than the backtick. But which symbol fits better here? Which symbol would you use?
I'd prefer no quotes, but that would restrict the dictionary to keys without spaces. Maybe square brackets, since anyone using python already types those to access dictionary keys.
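A small sketch of how a tokenizer pattern might accept both quoting styles. The `KEY` pattern and `extract_keys` helper are hypothetical, and the sketch ignores the case where a bracket or backtick appears inside a string literal.

```python
import re

# Hypothetical key token: either `backticked` or [bracketed].
KEY = re.compile(r"`(?P<tick>[^`]+)`|\[(?P<bracket>[^\]]+)\]")


def extract_keys(query):
    """Return the key names found in a query string, in order."""
    return [m.group("tick") or m.group("bracket") for m in KEY.finditer(query)]


print(extract_keys("`username` == 'john' and [home address.city] == 'Oslo'"))
# ['username', 'home address.city']
```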
Yes, I wanted an SQL-like syntax, and I'm happy it feels like this.
Since you're using strings, I guess this same query language could be implemented in other languages also. I like this idea.
Not intended as criticism, but isn't it easier to just express your query in Python using filter?
Yes, you are right. But I wanted a way to store my queries in any major database. And I wanted simple, natural queries.

If I need to check my dictionary I want to use something similar to natural language.

Lambdas and filters are great, but not obvious at a glance.

If I want to check my dict: "Username is 'john' and age is greater than 23"

I will make this query: "`username` == 'john' and `age` > 23"

This query is obvious (I hope) to anyone at first glance.
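For comparison, the same check written in plain Python with `filter`. The `users` list is made-up sample data.

```python
# Made-up sample data for illustration.
users = [
    {"username": "john", "age": 30},
    {"username": "john", "age": 20},
    {"username": "anna", "age": 41},
]


def is_john_over_23(user):
    """Username is 'john' and age is greater than 23."""
    return user["username"] == "john" and user["age"] > 23


matches = list(filter(is_john_over_23, users))
print(matches)  # [{'username': 'john', 'age': 30}]
```

The predicate is ordinary code, so it cannot be stored in a database column as-is, which is the trade-off discussed above.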

Maybe even simpler than reinventing a language: use lambda functions applied to the path/value pairs of a dict through an iterator (which can be consumed by filter/any/map/reduce): http://vectordict.readthedocs.io/en/latest/finding.html
It would be nice to know how performant your method is, but overall it looks quite neat!
This seems like a really nice way to query json without dumping it into a DB
small README typo?? `user.frinds.age`
fixed by `mattcaldwell`
If it meets the guidelines, this might make a good 'Show HN'. Show HN guidelines: https://news.ycombinator.com/showhn.html
OK, we've added that.