by Scene_Cast2 100 days ago
Are there any good static (i.e. not runtime) type checkers for arrays and tensors? E.g. "16x64x256 fp16" in numpy, pytorch, jax, cupy, or whatever framework. Would be pretty useful for ML work.
5 comments

We're working on statically checking Jaxtyping annotations in Pyrefly, but it's incomplete and not ready to use yet :)

This would be an insta-switch feature for me! Jaxtyping is a great idea, but the runtime-only aspect kills it for me - I just resort to shape assertions + comments, but it's a pretty poor solution.

A follow-up question: Google's old `tensor_annotations` library (RIP) could statically analyse operations - e.g. `reduce_sum(Tensor[Time, Batch], axis=0) -> Tensor[Batch]`. I guess that wouldn't come with static analysis for jaxtyping?
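
To make the `reduce_sum` example concrete, here's a rough stdlib-only sketch of how axis-aware signatures can be written with plain generics and overloads. This is not the `tensor_annotations` API: `Tensor1`, `Tensor2`, `Time`, `Batch` and this `reduce_sum` are all hypothetical names, and nested lists stand in for real arrays.

```python
from typing import Generic, Literal, TypeVar, overload

A = TypeVar("A")
B = TypeVar("B")


# Marker classes naming the axes (hypothetical, for illustration only).
class Time: ...
class Batch: ...


class Tensor1(Generic[A]):
    """Rank-1 tensor tagged with one axis type."""
    def __init__(self, data: list[float]) -> None:
        self.data = data


class Tensor2(Generic[A, B]):
    """Rank-2 tensor tagged with two axis types (rows, columns)."""
    def __init__(self, data: list[list[float]]) -> None:
        self.data = data


# One overload per literal axis, so a checker can tell which axis survives.
@overload
def reduce_sum(x: Tensor2[A, B], axis: Literal[0]) -> Tensor1[B]: ...
@overload
def reduce_sum(x: Tensor2[A, B], axis: Literal[1]) -> Tensor1[A]: ...
def reduce_sum(x, axis):
    if axis == 0:
        # Sum down each column; the second axis remains.
        return Tensor1([sum(col) for col in zip(*x.data)])
    # Sum along each row; the first axis remains.
    return Tensor1([sum(row) for row in x.data])


x: Tensor2[Time, Batch] = Tensor2([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y: Tensor1[Batch] = reduce_sum(x, axis=0)
# A static checker should reject the next line, since axis=0 leaves Batch:
# z: Tensor1[Time] = reduce_sum(x, axis=0)
```

Note this only tracks axis names, not sizes, and needs one class per rank plus one overload per axis, which gets unwieldy quickly.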

- /?hnlog pycontract icontract https://westurner.github.io/hnlog/ :

From https://news.ycombinator.com/item?id=14246095 (2017) :

> PyContracts supports runtime type-checking and value constraints/assertions (as @contract decorators, annotations, and docstrings).

> Unfortunately, there's yet no unifying syntax between PyContracts and the newer python type annotations which MyPy checks at compile-time.

Or beartype.

PyContracts has: https://andreacensi.github.io/contracts/ :

  from contracts import contract

  @contract
  def my_function(a : 'int,>0', b : 'list[N],N>0') -> 'list[N]':
      ...

  @contract(image='array[HxWx3](uint8),H>10,W>10')
  def recolor(image):
      ...

For icontract, there's icontract-hypothesis.

parquery/icontract: https://github.com/Parquery/icontract :

> There exist a couple of contract libraries. However, at the time of this writing (September 2018), they all required the programmer either to learn a new syntax (PyContracts) or to write redundant condition descriptions ( e.g., contracts, covenant, deal, dpcontracts, pyadbc and pcd).

  import icontract
  @icontract.require(lambda x: x > 3, "x must not be small")
  def some_func(x: int, y: int = 5) -> None:
      ...

icontract with numpy array types:

  import icontract
  import numpy as np

  @icontract.require(lambda arr: isinstance(arr, np.ndarray))
  @icontract.require(lambda arr: arr.shape == (3, 3))
  @icontract.require(lambda arr: np.all(arr >= 0), "All elements must be non-negative")
  def process_matrix(arr: np.ndarray):
      return np.sum(arr)

  invalid_matrix = np.array([[1, -2, 3], [4, 5, 6], [7, 8, 9]])
  process_matrix(invalid_matrix)
  # Raises icontract.ViolationError

Parquery/icontract: https://github.com/Parquery/icontract

mristin/icontract-hypothesis: https://github.com/mristin/icontract-hypothesis :

> The result is a powerful combination that allows you to automatically test your code. Instead of writing manually the Hypothesis search strategies for a function, icontract-hypothesis infers them based on the function's precondition. This makes automatic testing as effortless as it goes.

pschanely/CrossHair: An analysis tool for Python that blurs the line between testing and type systems https://github.com/pschanely/CrossHair :

> If you have a function with type annotations and add a contract in a supported syntax, CrossHair will attempt to find counterexamples for you: [gif]

> CrossHair works by repeatedly calling your functions with symbolic inputs. It uses an SMT solver (a kind of theorem prover) to explore viable execution paths and find counterexamples for you

Check out optype (specifically the optype.numpy namespace). If you use scipy, scipy-stubs is compatible and the developer of both is very active and responsive. There's also a new standalone stubs library for numpy called numtype, but it's still in alpha.

Jaxtyping is the best option currently - despite the name it also works for Torch and other libs. That said, I think it still leaves a lot to be desired. It's runtime-only, so unless you wire it into a typechecker it's only a hint. And, for me, the hints aren't parsed by IntelliSense, so you don't see shape hints when calling a function - only when directly reading the function definition.

Personally, I also think the syntax is a little verbose: for a generic shape hint you need something like `Shaped[Array, "m n"]`. But 95% of the time I only really care about the shape "m n". It doesn't sound like much, but I recently tried hinting a codebase with jaxtyping and gave up because it was adding so much visual clutter, without clear benefits.
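
For what it's worth, the shape-only case can be approximated at runtime with a small helper that takes nothing but the "m n" style string. A minimal sketch, assuming plain shape tuples (e.g. whatever `.shape` returns); `check_shapes` is a made-up name, not part of jaxtyping or any other library:

```python
def check_shapes(spec: str, *shapes: tuple[int, ...]) -> dict[str, int]:
    """Check shapes against a spec like "m k, k n" and return the bound sizes.

    Each comma-separated group describes one shape. A name must bind to the
    same size everywhere it appears; a numeric token must match exactly.
    """
    patterns = [group.split() for group in spec.split(",")]
    if len(patterns) != len(shapes):
        raise ValueError(f"expected {len(patterns)} shapes, got {len(shapes)}")

    sizes: dict[str, int] = {}
    for names, shape in zip(patterns, shapes):
        if len(names) != len(shape):
            raise ValueError(f"rank mismatch: {names} vs {shape}")
        for name, size in zip(names, shape):
            if name.isdigit():
                # Literal dimension such as "3" in "h w 3".
                if int(name) != size:
                    raise ValueError(f"expected size {name}, got {size}")
            elif sizes.setdefault(name, size) != size:
                # The name was already bound to a different size.
                raise ValueError(f"{name!r} is {sizes[name]} but got {size}")
    return sizes


# Matmul-style check: the inner dimension "k" must agree.
sizes = check_shapes("m k, k n", (16, 64), (64, 256))
```

It's still runtime-only, so it doesn't address the static checking question, but it keeps the annotation noise down to the shape string.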

There have been some early proposals to add something like that, but none of them have made it very far yet. As you might imagine, it's a hard problem!