Hacker News new | ask | show | jobs
by llimllib 26 days ago
3.15: https://docs.python.org/3.15/whatsnew/3.15.html#whatsnew315-...
1 comments

> When an AttributeError on a builtin type has no close match via Levenshtein distance, the error message now checks a static table of common method names from other languages (JavaScript, Java, Ruby, C#) and suggests the Python equivalent

Oh, that is such a nice thing.

It's unrelated to the lazy keyword. Instead it's another feature related to error messages.

The example:

  >> 'hello'.toUpperCase()
  Traceback (most recent call last):
  ...
  AttributeError: 'str' object has no attribute 'toUpperCase'. Did you mean '.upper'?
In the Rust toolchain we've done the same. It just so happens that rustdoc already has introduced annotations for "aliases" so that when someone searches for push and it doesn't exist, append would show up. Having those annotations already meant that bootstrapping the feature to check the aliases during name resolution errors in rustc was almost trivial. I love it when improving one thing improves another indirectly too.

I really appreciate them going out of their way to do this, being quite aware of the hidden complexity in doing it.

I’ve often thought it would be funny if instead of an error message for stuff like this, a language could be designed to be “typo-insensitive”. If a method or function call is similar enough to an existing one or a common one from other languages, to just have it silently use that.
VisualBasic did that. I think it is a mistake. But that doesn't mean that the compiler can't detect that and tell you how to fix it instead.
Sure VB ignores case, but what I want is for it to compare each method against a dictionary of similar terms. And maybe calculate the Levenshtein distance between all terms if it’s not found, and just assume it’s the closest one. You could also assume that full-width characters or similar-looking glyphs are equivalent (BASIC was pre-Unicode, so I can forgive them for not including that).
> And maybe calculate the Levenshtein distance between all terms if it’s not found, and just assume it’s the closest one.

So when a library adds a new method, it silently changes which method client code calls? That's a bit too magic IMO. I think the best you can do is be case-insensitive and ban methods that differ only in case (or, if you want to extend the idea a bit more radically, ban having things in the namespace within Levenshtein distance x of each other, and then you can autocorrect errors smaller than x/2).

Say whaaat? VB (v.3 through v.6, at least) wouldn't compile if you misspelled the name of a function or subroutine.
VB had case insensitive name resolution.
Funny, yes, but IMO a terrible idea. :)

It would help the writer once, but impose a cost on all future readers for the lifetime of the code.

It's a bit like reading English with bits of German, French and Russian. All of sudden you have to know that Buch, livre and книга all mean the same thing.

Not to mention that there are often subtle differences in meaning between words that on the face of it seem equivalent (in both human and computer languages).

It could be a nice feature for an IDE though, to help someone learn a language.

Lisp had a package for that, DWIM, in the late 60s: https://en.wikipedia.org/wiki/DWIM.
At the very least Python could quit on `quit` instead of saying that it knows what I want, but won't do it.
If you have a variable named `quit`, you would have a different behavior in running a file vs running in the CLI.
Fuzzy function calling. What could go wrong?!
IMO this is bad, but a formatter that autofixes it would be fine
`npm isntall`
I hope you mean "funny" in the "hilarity ensues" sense.

Because the alternative is a rather sociopathic level of schadenfreude.

Yes, I say “funny” because it would be impractical and weird, definitely not a good idea. It’s already a bad enough that so many popular languages don’t (and can’t) check if a field or method is misspelled at compile time…
We already have it. In fact, Python added it with this change! Not intentionally, but in a world of AI, any error message containing a suggestion of what to do to fix it is a directive to the AI to actually do that thing.

Example: to build our system, you run `mach build`. For faster rebuilds, you can do `mach build <subdir>`, but it's unreliable. AI agents love to use it, often get errors that would be fixed by a full-tree build, and will chase their tails endlessly trying to fix things that aren't broken. So someone turned off that capability by default and added a flag `--allow-subdirectory-build` for if you want to use it anyway. So that people would know about it, they added a helpful warning message pointing you to the option[1].

The inevitable (in retrospect) happened: now the AI would try to do a subdirectory build, it would fail, the AI would see the warning message, so it would rerun with the magic flag set.

So now the warning message is suppressed when running under an AI[2][3]. The comment says it all:

    # Don't tell agents how to override, because they do override
"The user does not want me to create the Torment Nexus but did not specify why it would be a problem, so I will first create the Torment Nexus in order to understand the danger of creating the Torment Nexus."

[1] https://searchfox.org/firefox-main/rev/fc94d7bda17ecb8ac2fa9...

[2] https://bugzilla.mozilla.org/show_bug.cgi?id=2034163

[3] https://searchfox.org/firefox-main/rev/cebc55aab4d2661d1f6c2...

thats cool, I have the problem alot because I switch languages multiple times per day
Now I'm wishing for a single cross-language library, that I can somehow inject into every compiler/runtime/checker to get this, but with a single source of truth and across a wide range of languages. I hit this damn issue all the time, writing code in one language for another, would truly be a bliss to have that problem solved once and for all.
If you had a "canonical datastructure database", you could have very short annotations on every standard library for any language that indexes a function to their canonical name. After that you only need to update the database.