by barosl 1457 days ago
The addition of a TOML parser to the standard library is really welcome. I've always wanted to use TOML in my Python scripts but having to import an external library made me use JSON instead, which was awkward due to the lack of comments.

But why tomllib rather than toml? That’s just weird. I thought they’d long stopped using the -lib suffix as pointless.

(Judging by https://docs.python.org/3/library/, there are 17 of them already, disregarding zlib, mostly niche and of ancient lineage. Comparing with https://docs.python.org/2/library/, it looks like three have been added in the py3k era. reprlib in 3.0 (with unpythonic naming conventions, dunno what’s up with that), pathlib in 3.4, and graphlib in 3.9 (though the version it was added in is missing from the page; I invite someone else to fix this or file a bug report)—so I guess -lib suffixes aren’t quite as dead as I’d thought.)

This is what you get when you miss the chance to namespace the standard library many years ago. Should have been std.toml.
If ever there was a bikeshed to bikeshed all other bikesheds it’s people creating hierarchies where none is needed. Toml is fine, it’s perfect, it’s exactly what people would expect and remember. But instead you’d have them need to remember that it’s hiding in std. “but if everything’s in std it’s easy to remember!” Yes but then also it’s not needed. And if everything isn’t in std you leave people having to remember if toml was in std, sys, or was it found in tools? Utils? Or did they put it in parsertools? And for what reason? The user knows they want toml. And it’s not like you need to have multiple separate tomls in different places in the hierarchy.

Leaving users to guess where in the hierarchy you decided to hide something that they know they’re looking for doesn’t add value. No, just keep it flat and simple.

> creating hierarchies where none is needed. Toml is fine, it’s perfect, it’s exactly what people would expect and remember. [...] And it’s not like you need to have multiple separate tomls in different places in the hierarchy.

Yes, "toml" would be perfect, but we're getting "tomllib" instead, due to the lack of namespacing that the parent comment is lamenting.

The problem with the current situation is that (1) every project has to keep all the standard library modules in mind and make sure to never name a module "io", "site" or "email", etc. and (2) once a non-conflicting name has been picked (e.g. "toml") it will break if the standard library later introduces a module with the same name.

(I'm not advocating for Java's endless chain of single-child directories though.)
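
A minimal sketch of problem (1) above, assuming a hypothetical project that ships its own `email.py` next to a `main.py` script. The file names and the helper function are made up for illustration. It runs a fresh interpreter and reports which file `import email` actually resolved to:

```python
import os
import subprocess
import sys
import tempfile


def imported_email_path():
    """Return (project_dir, file that `import email` resolved to)."""
    with tempfile.TemporaryDirectory() as project_dir:
        # Hypothetical project module that happens to share a stdlib name.
        with open(os.path.join(project_dir, "email.py"), "w") as f:
            f.write("# project-local helper, not the stdlib package\n")
        script = os.path.join(project_dir, "main.py")
        with open(script, "w") as f:
            f.write("import email\nprint(email.__file__)\n")
        # Drop PYTHONSAFEPATH so the script directory is put on sys.path as usual.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
        # Running a script puts its directory first on sys.path.
        result = subprocess.run(
            [sys.executable, script],
            capture_output=True, text=True, check=True, env=env,
        )
        resolved = os.path.realpath(result.stdout.strip())
        return os.path.realpath(project_dir), resolved


project_dir, resolved = imported_email_path()
print(os.path.dirname(resolved) == project_dir)
```

With default settings the local file wins over the standard library package, which is why a flat top-level namespace forces every project to avoid those names.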

They reorg'd in 3.0 but fell short of mandating namespace guidelines, so it was inevitable that the chaos would come back in at some point.

Who knows, it could be a good feature to earn the 4.0 moniker.

To avoid breaking existing code that imports "toml", which is more common than "tomllib".
toml was already taken and no one could contact the author.
.ini configparser worked fine for decades, especially as a simple project config.
PEP 518 lists some reasons not to use configparser and I agree with them wholeheartedly.[1] Personally I don't want to use ill-defined formats, even for simple projects.

[1] https://peps.python.org/pep-0518/#configparser

The only reason offered is that there are (or might be) differences between versions of Python, mainly 2 and 3, as far as I can see.

Python 2 is EOL, so that's no longer a concern.

As for differences between Python 3 releases, isn't there a fairly large difference in TOML support as well, since in versions before 3.11 it doesn't work at all? Wouldn't specifying the behaviour of the INI parser as whatever 3.11 is doing (and raising an exception on earlier versions) amount to essentially the same thing?

Their reason was obsolete shortly after it was written; Python 2 was EOL years before TOML was included.

Toml is ill-defined as well. ini files work fine for these trivial uses.

https://hitchdev.com/strictyaml/why-not/toml/

So now you have two flawed ways to do it, congrats.

Newer packaging-related PEPs require TOML, and many developer tools use TOML by choice, but a fresh install of Python couldn't actually read it. That required workarounds from tool and library maintainers, making it more complicated to support as many users' preferences and ways of working as possible.

However you prefer to work, your tools may now be easier to test and maintain. This change is good for everyone.

Rather unnecessary. See my comment above.
These files are already in toml, so it's only "unnecessary" if you ask for them to switch the file format.

Otherwise, adding tools to the standard library to read file formats required by the ecosystem is a good idea, regardless of whether you agree with the particular format.

They fixed a problem that they created.

History has shown that worrying about an incompatibility with the moribund Python 2 that never affected trivial packaging config was a waste of time.

Meanwhile we still have a setup.cfg on a work project; it has worked without issue for 15+ years.

Ini files aren’t a standard though, so every language can handle them differently. As a result they often bite people who use them across multiple languages.
These files are written by developers and read by Python 3, whose parser hasn't changed in a decade. Everyone knows how to write them. The parser is also lenient and handles both kinds of comment. That's enough for simple metadata.
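
For reference, a small sketch of the "both kinds of comment" behaviour using the standard library `configparser` with its default settings. The section and key names are invented for the example:

```python
import configparser

# Hypothetical setup.cfg-style content using both comment styles.
SAMPLE = """\
[metadata]
# hash-style comment
; semicolon-style comment
name = example
version = 1.0
"""


def read_metadata(text):
    """Parse INI text and return the [metadata] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["metadata"])


print(read_metadata(SAMPLE))
```

Both comment lines are skipped. Note that every value comes back as a string (`version` is `"1.0"`, not a number), so any typing is left to the caller.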

We have a 15 year old setup.cfg in one project at work, many other .ini's... has never been an issue.

It seems to me that HCL is a better fit than TOML. I use TOML because it is okay and seems better suited than YAML in some situations, but if I had a choice HCL would be the de facto standard.
HCL is also oodles more complex than TOML since it has expression evaluation, whereas TOML is just a static format. It really depends on your use case and I think there's a lot of cases where TOML makes a lot more sense than HCL (e.g. metadata formats).
HCL is one properly nested format and much more readable like YAML. TOML ends up getting confusing once you nest a few levels.
HCL also has expression evaluation, however: TOML parsing is guaranteed linear in the size of the file, but HCL isn't necessarily, because you can trivially do things like dynamically create an n*n-sized list from an n-sized input list, which may be an undesirable property!
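
To make the n*n point concrete, here is a rough Python stand-in (not HCL syntax; the function name is made up) for the kind of expression a config language with evaluation allows. An input of n items expands into n*n results:

```python
def cross_pairs(items):
    """Pair every item with every item, so n inputs give n*n outputs."""
    return [(a, b) for a in items for b in items]


hosts = ["a", "b", "c"]
print(len(hosts), len(cross_pairs(hosts)))
```

A static format has no equivalent of this, so the parsed data cannot grow faster than the file that describes it.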
CUE is much better than HCL. I've been converting my more complex TF to CUE and then exporting as json for TF to consume.
Thanks. I'll check it out.
What's HCL?
Hashicorp (config?) Language

It's what your Terraform files are primarily written in

In Python, I just go with "config.py".
I only do this for configurations that aren't meant to be used by non-programmers. I don't want to have to explain for the billionth time that something like the path "C:\New Folder" isn't going to work as-written in a string literal, or that the name "type" is already taken.
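
A quick sketch of the path pitfall, assuming a hypothetical `config.py` whose contents are passed in as text. In an ordinary string literal `\N` starts a named-character escape, so the path as written does not even compile, while a raw string literal does:

```python
from typing import Optional

# The text a non-programmer might type into a hypothetical config.py.
AS_WRITTEN = r'path = "C:\New Folder"'
# Same setting with a raw string literal, so backslashes are kept verbatim.
RAW_STRING = r'path = r"C:\New Folder"'


def load_config(source: str) -> Optional[dict]:
    """Execute config source and return its names, or None on a syntax error."""
    namespace: dict = {}
    try:
        exec(compile(source, "config.py", "exec"), namespace)
    except SyntaxError:
        return None
    # exec() adds __builtins__ to the namespace; it is not a setting.
    namespace.pop("__builtins__", None)
    return namespace


print(load_config(AS_WRITTEN))
print(load_config(RAW_STRING))
```

The first call fails to compile and the second yields the intended path, which is the sort of thing that is hard to explain to someone who only wanted to edit a setting.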
Yeah. It's easy to write, easy to read. It takes zero effort to parse. Serializing as "python file" might be a bit awkward.
That is also really important as pyproject.toml is becoming the de facto way of declaring build requirements.