Hacker News
by j_w 23 days ago
Isn't an array of arrays by definition the sequential implementation?

Otherwise you would have an array of pointers to arrays. The usage (syntax) for them would be the same but the performance would not be.

They also have different uses. You would expect an array of arrays to hold arrays that all share the same length. With an array of pointers to arrays, you would expect the inner arrays to have dynamic, possibly different lengths.

Even in c++ could you not just define some int [1000][1000]foo? I've never really used C++, but my assumption from C is that this is 1000000 contiguous elements.
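For what it's worth, that assumption can be checked. Below is a minimal sketch with small dimensions (`M`, `N`, and the helper `rows_are_adjacent` are names made up for illustration); the layout argument is the same for 1000 x 1000.

```cpp
#include <cstddef>

// Hypothetical small dimensions, chosen only to keep the example cheap.
constexpr std::size_t M = 4;
constexpr std::size_t N = 3;

// An array of arrays has no padding between rows:
// its size is exactly M * N elements.
static_assert(sizeof(int[M][N]) == M * N * sizeof(int),
              "int[M][N] occupies exactly M * N ints");

// Returns true when each row starts exactly N ints after the previous one,
// i.e. the rows sit back to back in memory.
bool rows_are_adjacent(int (&a)[M][N]) {
    for (std::size_t i = 0; i + 1 < M; ++i) {
        if (&a[i][0] + N != &a[i + 1][0]) {
            return false;
        }
    }
    return true;
}
```

An array of pointers to separately allocated rows would give no such guarantee, since each row could land anywhere.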

The C++ way to do it currently would be:

    std::array<std::array<T, N>, M> data;
Which is contiguous

    int data[M][N]; 
also works fine and is contiguous in C++

Edit:

For the stack at least. On the heap, you'd need to use a single std::vector<int> and do the indices manually, or use mdspan
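A rough sketch of the "single std::vector<int> with manual indices" approach, assuming row-major order. `Grid2D` is a hypothetical helper written for this comment, not a standard library type; std::mdspan would give you a similar view without hand-rolling it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: a contiguous, heap-allocated 2D grid whose
// dimensions are chosen at run time.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Row-major indexing: element (i, j) lives at offset i * cols + j.
    int& operator()(std::size_t i, std::size_t j) {
        return data_[i * cols_ + j];
    }
    const int& operator()(std::size_t i, std::size_t j) const {
        return data_[i * cols_ + j];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    const int* data() const { return data_.data(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;  // one allocation, elements start at 0
};
```

Usage would look like `Grid2D g(rows, cols); g(i, j) = 42;`, with `rows` and `cols` read at run time. No bounds checking is done here.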

It does not work fine in C++ when N and M are not compile-time constants, which is basically always the case in any interesting numerical algorithm. Also not in Rust.

It works fine in C though, or FORTRAN, or Ada, or ALGOL 60, ...

Which is why std::mdspan exists, and std::linalg.

NVidia pivoted to designing CUDA hardware with a focus on C++ back in , and it seems to be working out quite well for them.

CppCon 2017: "Designing (New) C++ Hardware"

https://www.youtube.com/watch?v=86seb-iZCnI

They were also the ones sponsoring the ISO work on mdspan, while HPC research labs are pushing for linalg on top.

I would rather be using Ada today, but that isn't how the world moves.

I see that they spend time making their hardware run general software, but I can't see anything in GPU hardware that is specific to std::mdspan.

I respect Ada but I would not want to use it. But if I have a choice between C++ with mdspan and C99's arrays, I choose C99 every time.

Why is that? I find Ada much nicer than the C-languages when it comes to arrays: A'Range, A'Length, A'First, and A'Last are super-useful, as is the unconstrained array.

You can even use unconstrained arrays to provide the same functionality that Optional does in functional programming, provided the element type can be an element of an array:

    -- Here we define an index-type with one value.
    Subtype Boolean_Index is Boolean range True..True;
    -- And here we define an array indexed by it, but can also have length 0.
    Type Optional is Array (Boolean_Index range <>) of Element;
And there you have the mechanism for Optional; just use "For Object of Optional_Array Loop" to enclose your operations and bam, it works perfectly.

I guess you aren't their target customer anyway, NVidia isn't that fond of pure C code, with first class tooling for C, C++, Fortran, Python JIT, Ada and most recently Rust.

The std::mdspan proposal came from NVidia employees, alongside AMD and HPC research labs.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p00...

Yeah, I remember discussions on comp.lang.c calling programming with Ada, or even Modula-2, programming with a straitjacket.

Meanwhile, governments and national security bodies have taken another point of view.

You mean the C99 arrays that Google paid to have cleaned out of the Linux kernel?

The thing is that I rewrite high performance numerical code on GPUs and the CUDA part is what sucks most. And the moment one uses templates, the compilation times make it insufferable. I really do not understand why people put up with this garbage. I am really looking forward to the day where I can remove CUDA from my projects and replace it with compiler-supported offloading.

The kernel removed VLAs; I am talking more about variably modified (VM) types. But even for VLAs - while I had a small role in that undertaking myself - I think it was a stupid mistake from a security point of view to remove them from the kernel. Google pays for a lot of nonsense...

> Even in c++ could you not just define some int [1000][1000]foo?

If it fits on the stack, yes.

Typical code using MD-arrays is scientific code, and the data they manipulate generally do not fit there.

Would the compiler not allocate the memory contiguously on the heap in that case then? Seems like a reasonable thing to do.

Nope. The C++ memory model is designed around no hidden/non-deterministic memory allocation.

If you try to allocate 10MB on the stack, that's the dev's problem if the program fails; it's not the compiler's job to guesstimate whether something will fit there or not (and it's impossible anyway, the compiler can't know all the stack sizes a program will ever run on).
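So if the dimensions are compile-time constants and the array is too big for the stack, you ask for the heap explicitly. A minimal sketch, assuming 1000 x 1000 ints (`kRows`, `kCols`, `Matrix`, and `make_matrix` are names invented for this example):

```cpp
#include <array>
#include <cstddef>
#include <memory>

constexpr std::size_t kRows = 1000;
constexpr std::size_t kCols = 1000;

// Same nested std::array layout as a local variable would have.
using Matrix = std::array<std::array<int, kCols>, kRows>;

// `Matrix m;` as a local would need roughly 4 MB of stack with 4-byte ints.
// Here the programmer, not the compiler, decides to put it on the heap.
std::unique_ptr<Matrix> make_matrix() {
    return std::make_unique<Matrix>();  // value-initialized: all zeros
}
```

Elements are then accessed as `(*m)[i][j]`, and the storage is released when the `unique_ptr` goes out of scope.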