by zanie 413 days ago
:wave:

Looks like you found the not-so-secret repository we're using to prepare for a broader announcement :)

Please be aware this is pre-alpha software. The current version is 0.0.0a6 and the releases so far are all in service of validating our release process. We're excited to get this in people's hands, but want to set the expectation that we still have a lot of work left to do before this is production ready.

Stay tuned for more news in the near future!

(... I work at Astral)

7 comments

For pre-alpha software it's working fantastic for my project. I thought I type annotated it well, but Ty had quite a lot of feedback for me. Great job and I can't wait until this is released.
Had you checked it with Pyright previously?
I've only used Pylance standard type checking in vscode. I have not used pyright as a stand alone package.
Use Pyright or basedpyright for now.
pylance is pyright
pylance is pyright + proprietary stuff on top, see https://docs.basedpyright.com/latest/ for a fork with some of pylance's features added.
If you can say - are there any thoughts about implementing plugins / extension capabilities to keep type checking working even with libraries that aren't otherwise typecheckable?

(where "not otherwise typecheckable" means types that can't be expressed with stubs - e.g., Django, dataclasses pre-PEP-681, pytest fixtures, etc.)

At least for the moment, we aren't planning on a plugin architecture. We do recognize that there are some popular libraries and code patterns that aren't easily (or at all) typeable with the current state of the typing spec. We feel it would be more useful to help drive changes to the typing spec where we can, so that other type checkers can also benefit; and/or implement workarounds for the most popular libraries directly in ty, so that a library author doesn't have to rely on their downstream consumers installing a particular set of plugins to get good type-checker results.

(It's also more difficult to support plugins effectively in a type checker like ty, than in a linter like ruff, since any non-trivial use case would likely require deep changes to how we represent types and how we implement type inference. That's not something that lends itself to a couple of simple hook extension points.)

Helping improve the spec and all is great, but being 100% honest, as a user, I would rather have a type checker I can bend to my needs. As you said, some code patterns in a dynamic language like Python are difficult, or even impossible, to type-check without custom code. Type checkers are becoming more popular than ever, and this implicitly means that these code patterns are going to be discouraged. On one hand, I believe the dynamism of Python is core to the language. On the other, I would never want to write any collaborative piece of software without a type checker anymore. Therefore, to get the benefits of a type checker, I am occasionally forced to write worse code just to please it.

Considering how fast uv and ruff took off, I am sure you are aware of the impact your project could have. I understand that supporting plugins is hard. However, if you are considering adding support for some popular libraries, IMHO, it would be really beneficial for the community if you could evaluate the feasibility of implementing things in a somewhat generic way, which could be then maybe leveraged by third-party authors.

In any case, thanks for all the amazing work.

Out of curiosity, do you have experience with other languages that have type system plugins that you’d hope could be used as inspiration for something in Python?

I don’t have any such experience (short of a macro system, which requires code generation or runtime support) and it always makes me curious when people ask for type system plugins whether this is a standard feature in a type system I’ve never used.

To add to the complexity, you have to worry about not just which language you're analyzing, but also which language the type-checker is implemented in.

So if we were to do this for ty, we would have to carefully design the internal data types and algorithms that we use to model Python code, so that they're extensible in a robust way.

But we would also have to decide what kind of Rust plugin architecture to use. (Embed a Lua interpreter? dlopen plugins at runtime? Sidecar process communication over stdin/stdout?)

Solvable problems, to be sure, but it adds to the amount of work that's needed to support this well — which in turn affects our decisions about whether/when to prioritize this relative to other features.

Isn't mypy extensible with plugins?
Can you either give some additional details on the code patterns you’re talking about, or link to some ‘typical’ examples? I do appreciate the flexibility of being able to just write code and not particularly be overly sensitive to jumping through typing hoops, but I can’t think of any place I’ve actually used algorithms or specific code patterns that rely on untyped-ness to actually work at run time. I’d be very interested in trying to work through what is actually required to consider these code patterns as well-typed.
IMO creating custom rules is problematic - when projects import external code, rule conflicts become inevitable. C++'s type system might be complex, but at least there's consistency across header files within a project.

Regarding type checkers: while I don't love optimizing code just to make them run faster, most Python patterns can be implemented in statically checkable ways without much compromise. The benefits typically outweigh the costs. Python's dynamic features are powerful but rarely essential for everyday tasks.

https://pypi.org/project/django-types/ is compatible with pyright without plugins, so it should theoretically work with ty. It might not check as much as mypy though especially with values() querysets.
What might, possibly, redeem Python in my eyes as a potential language for making production applications (something that today, it is most certainly not) would be if the type checker worked across the broader ecosystem of common Python packages.

For example, as my recent struggles showed, SQLAlchemy breaks `pyright` in all kinds of ways. Compared with how other 'dynamic' ORMs like Prisma interact with types, it's just a disaster and makes type checking applications that use it almost pointless.

How does Ty play with SQLAlchemy?

This is a weakness of the Python typing system and not necessarily of individual typecheckers. Pyright has a policy of only implementing what's standardized, and the Python type system is simply inadequate to annotate most real world Python code out there. It's been years now and something as basic as properly typing kwargs is still not supported.

Ty could solve this if they rebel and decide to ignore the Python typing standards, which I honestly would appreciate, but if they take the sensible approach and follow the standards, it won't change anything.

> properly typing kwargs is still not supported.

I've been typing them with TypedDict for a while now and it's been fine. What can't you do?

Python code feels like back in the day when JavaScript was typed using JSDoc comments, and libraries would use all kinds of fantastical object shapes for their option parameters, so users could pass "just about anything" and it would work. You would never know how to configure an Express app without digging through the documentation, for example.

I loathe the Python convention of just using kwargs instead of clearly annotated parameters; most libraries don't even have doc comments in the code, so you're really required to look up the documentation, hope that it actually describes the method you're interested in and contains more than stuff like "foo: the foo to use"—or fall back to rummaging in the library intestines to figure out how it works.

It's pathetic.

I'm not sure what kind of industry you're in, but having most functions as (*args, **kwargs) is not the way I deal with most of my code and the libraries I work with at all (backend development). Everything is typed fully.

Maybe you're in a niche spot, or using scientist-based code. I've seen plenty of trainwrecks in 'conda-only' ""libraries"" done by scientists. Maybe that's the niche you're at?

Sometimes, though, you may get lucky, and find some tests for the code you want to use!

On a more serious note, I can't even blame library devs as long as they try. Type "hints" often are anything but _just_ hints. Some are expected to be statically checked; some may alter runtime behavior (e.g. the @overload decorator). It's like the anti-pattern of TypeScript's enums laid out here and there, and it's even harder to notice such side-effects in Python.
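
To make the @overload point concrete, here is a small sketch (the `double` function is a made-up example). The decorated signatures are what a type checker reads; at runtime each `def` rebinds the same name, so only the last, undecorated definition is ever called.

```python
from typing import Union, overload


@overload
def double(value: int) -> int: ...
@overload
def double(value: str) -> str: ...
def double(value: Union[int, str]) -> Union[int, str]:
    # The only body that actually runs; the stubs above are for type checkers.
    return value * 2


number = double(21)
text = double("ab")
```

Nothing in the implementation enforces the declared signatures, so a call like `double([1])` still executes at runtime even though a type checker would reject it.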

My experience is this is nearly impossible, the solution is new packages written after typing was introduced.

I don’t know about SQLAlchemy, but for libraries like pandas I just don’t see how it can be done, and so people are actively replacing them with modern typed alternatives

Ha. I just finished a huge rewrite at work from sync SQLAlchemy to async SQLAlchemy, because the async version uses a totally different API (core queries) to sync. So this implies if I want type checking I need to use a different ORM and start again?

I love how Python makes me so much faster due to its dynamic nature! Move fast, break things!

I don't agree that dynamic nature makes things necessarily faster, if you compare Python to C or Java it is true, but if you compare to Typescript it is not. With a decent typing system and a good editor that makes use of it (and AI-assistants nowadays) the prototyping can actually be both faster and more stable.
Yes, I 100% agree. My career has been Java/C++ -> php/JavaScript -> typescript/python. Types are a godsend
I think davedx was being sarcastic. Python's dynamic nature cost them time.
What version of SQLAlchemy? SQLAlchemy v2 is built with type-hinting support, I didn't have any issues with it when I used it a few months ago
Only tangentially related but does anyone else here get very bothered when looking at the SQLAlchemy documentation? It seems so hard to find what kind of magic incantation you need to do in which order when trying to do a somewhat non-trivial query and I often just write the SQL I want and then tell ChatGPT to rewrite it to SQLAlchemy operations but that's not really a sustainable solution.
Have you sat down and read the SQLAlchemy docs properly? It made a lot more sense to me once I'd set aside an hour or two to work through the Unified Tutorial.[0] I feel like these days people just want quick answers to do very specific things but that's a very inefficient way to learn something like SQLAlchemy.

If you know the SQL you want it's just a matter of writing it in SQLAlchemy's query language which is quite close to SQL. Should just be a matter of practice to become fluent in it. "Complex queries" usually turn up when you're doing something like rendering a table or report or something. You don't need the ORM for this kind of thing, just write a query.

An ORM is useful when you want to write domain logic to do read/write operations against domain entities and persist them back to a database. IMO people get hung up on ORMs and think if they're using one then they have to use it for everything then do the most horrible contortions that should have just been db queries. SQLAlchemy allows you to use the ORM judiciously.

[0] https://docs.sqlalchemy.org/en/20/tutorial/index.html

> I feel like these days people just want quick answers to do very specific things but that's a very inefficient way to learn something like SQLAlchemy.

Good documentation should absolutely provide a usable reference to quickly look up common ways to solve common problems. Even the PHP docs got that right twenty years ago.

Also, I disagree: A library should be as self-evident and incrementally understandable as possible, not require reading a full tome and grow a grey beard before being accessible.

> "Complex queries" usually turn up when you're doing something like rendering a table or report or something. You don't need the ORM for this kind of thing, just write a query.

Or, when building generic filtering/sorting/pagination logic for a bog-standard CRUD app. Or to do full-text search. Or when doing lateral joins to minimize queries. Or to iterate over a huge table. There's lots of cases where I want the ergonomics and malleability of ORM query instances even when working with complex queries.

> I feel like these days people just want quick answers to do very specific things but that's a very inefficient way to learn something like SQLAlchemy.

In defense of OP, a particular frustration I have with SQLAlchemy is that I understand SQL just fine, but the ways in which I translate my SQL knowledge into SQLAlchemy incantations is often pretty obscure. I think I deserve "quick answers to do very specific things" because I already have the exact form of the SQL solution in my head. That it then takes 20 minutes of digging through docs or ChatGPT is annoying.

Exactly the reason I stay away from it. I prefer just SQL and something like aiosql to load it.
Yes. It’s super opaque. SqlAlchemy is one of those libs I want to like but the docs just make it too difficult.
Have you tried SQLModel?
I've tried using it but it's still so immature and poorly documented. I wish it were different because I love the idea of it.
You mean Python is not a language for production applications ?
Only when performance doesn't matter, then it becomes a DSL for C and C++ libraries.
For me, until it gets a production-quality JIT, or PyPy and GraalPy get more community love, it remains a scripting language for learning how to program and automating OS and application tasks.
Instagram I think would like a word with you on its viability for production.
Maybe you should first investigate all the gimmicks they had to do, between amount of servers they had to ramp up burning needless budget, rewriting code into C and C++ libraries, Go or whatever else they ended up adding, before doing such statements.

https://stackshare.io/instagram/instagram

Any other link to share about that? The stackshare url does not even mention anything related.
It surely does, it is quite simple to correlate how many of those technologies are actually implemented in Python.

Pure Python that is.

Cool! Out of curiosity, what's the bedrock that's used to determine what the fundamental python AST objects are? I'm wondering what the "single source of truth" is, if you will.

Is this all based off a spec that python provides? If so, what does that look like?

Or do you "recode" the python language in rust, then use rust features to parse the python files?

Regardless of how it's done - This is a really fascinating project, and I'm really glad you guys are doing it!

There is a formal grammar defined in the CPython repo, implemented in a language called ASDL:

https://github.com/python/cpython/blob/main/Parser/Python.as...

ty uses the same AST and parser as ruff. We don't use the ASDL grammar directly, because we store a few syntax nodes differently internally than how they're represented upstream. Our parser is hand-written in Rust. At first, our AST was also entirely hand-written, though we're moving in the direction of auto-generating more of it from a declarative grammar.

https://github.com/astral-sh/ruff/issues/15655

https://github.com/astral-sh/ruff/tree/main/crates/ruff_pyth...

https://github.com/astral-sh/ruff/blob/main/crates/ruff_pyth...
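
If you want to poke at the upstream representation yourself, CPython's standard-library `ast` module exposes the node classes described by that ASDL file. A minimal sketch follows; it shows CPython's AST, not ruff's internal one, which (as noted above) stores a few nodes differently.

```python
import ast

source = "total = price * 2"
tree = ast.parse(source)

# The module body holds one statement: an assignment.
assign = tree.body[0]
print(type(assign).__name__)         # Assign
print(type(assign.value).__name__)   # BinOp
print(type(assign.value.op).__name__)  # Mult

# ast.dump renders the whole subtree, including field names.
print(ast.dump(assign.value))
```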

ditto! but we gave impressively non-overlapping answers
As in, how are we parsing the Python code into an AST?

CPython uses a generated parser. The grammar is defined in https://github.com/python/cpython/blob/main/Grammar/python.g... which is used to generate the specification at https://docs.python.org/3/reference/grammar.html#full-gramma...

We use a hand-written parser, in Rust, based on the specification. We've written about that previously at https://astral.sh/blog/ruff-v0.4.0#a-hand-written-parser

Curious if this means it'll be released as a separate binary from ruff? I personally feel like having it within ruff is much nicer for ensuring that we have a consistent set of dependencies that play nicely with each other. Though I guess because a type checker doesn't mutate the files maybe that's not a real concern (vs formatting/linting with --fix).
It'll be separate (at least to start) — we want to be able to iterate on it rapidly. Long-term, a consistent toolchain is definitely important and something we're thinking about.
+1 for (eventually) baking ty into ruff. In my mind static type checking is a form of linting.

For years I pushed black for formatting code. Once formatting was baked into ruff I ditched black. Having fewer dependencies to track and update simplifies my life and shortens my dependabot queue.

Finally the missing puzzle piece from the astral toolchain is here! <3
Pointlessly anal, but 0.0.0a6 is very strongly indicative of the sixth alpha release. Pre-alpha are much better as .dev releases.
> Pre-alpha are much better as .dev releases.

No, they are correctly using semantic versioning to indicate pre-alpha releases. https://github.com/astral-sh/ty/releases https://semver.org/

Python doesn’t use plain semver: https://peps.python.org/pep-0440/
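
A rough sketch of how the two spellings read under PEP 440, using a deliberately simplified regex that handles only release digits plus an optional a/b/rc or .dev suffix. This is an illustration, not a full PEP 440 parser (the `packaging` library is the usual tool for that), and `describe` is a made-up helper name.

```python
import re

# Simplified subset of PEP 440: "N(.N)*" followed by an optional
# pre-release segment (a/b/rc + number) or a ".devN" segment.
_PATTERN = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"
    r"(?:(?P<pre>a|b|rc)(?P<pre_n>\d+)|\.dev(?P<dev_n>\d+))?$"
)

_PRE_NAMES = {"a": "alpha", "b": "beta", "rc": "release candidate"}


def describe(version: str) -> str:
    """Return a human-readable description of a simple version string."""
    match = _PATTERN.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    if match["pre"]:
        return f"{_PRE_NAMES[match['pre']]} {match['pre_n']} of {match['release']}"
    if match["dev_n"] is not None:
        return f"development release {match['dev_n']} of {match['release']}"
    return f"final release {match['release']}"


print(describe("0.0.0a6"))     # alpha 6 of 0.0.0
print(describe("0.0.0.dev6"))  # development release 6 of 0.0.0
```

Under PEP 440's ordering rules, a .devN release of a given version sorts before its alpha releases, which is the distinction being argued about here.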
The reference Python implementation, written in C, doesn't use semver. But other projects in the Python ecosystem are generally assumed to unless stated otherwise. For example, Setuptools does (but not pip: https://pip.pypa.io/en/stable/development/release-process/).
It is the sixth alpha release. They haven't yet released a stable version – this is their sixth alpha release before that.

What am I missing here?

I was replying to a post that said "Please be aware this is pre-alpha software." presumably trying to make a distinction from "alpha software".