So from these resources it seems like they are developing a vector processor that uses the Semidynamics out-of-order Atrevido core as the scalar core, paired with their Vitruvius VPU.
In the more recent report they have a vector length of 16,384 bits with 16 lanes (8 on the FPGA, 16 in the diagram; the final version could have more), so assuming 64-bit lanes that is a total of 16 × 64 = 1024 bits of ALU width.
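To make the arithmetic above concrete, here is a rough Python sketch. The 64-bit lane width is the assumption behind the 16 × 64 figure, and the function names are made up for illustration. The "passes" number is only a back-of-the-envelope estimate of how many trips through the lanes one full-length vector operation needs; it ignores pipelining, chaining, and memory effects.

```python
# Figures quoted from the report; LANE_WIDTH_BITS is an assumption (64-bit ALU per lane).
VLEN_BITS = 16_384       # vector register length in bits
LANES = 16               # lanes in the diagram (8 on the FPGA)
LANE_WIDTH_BITS = 64     # assumed lane width


def datapath_bits(lanes, lane_width_bits=LANE_WIDTH_BITS):
    """Total ALU width across all lanes, in bits."""
    return lanes * lane_width_bits


def passes_per_vector(vlen_bits, lanes, lane_width_bits=LANE_WIDTH_BITS):
    """Passes through the lanes needed to cover one full vector (ceiling division)."""
    width = datapath_bits(lanes, lane_width_bits)
    return -(-vlen_bits // width)


def elements_per_vector(vlen_bits, element_bits=64):
    """Number of elements of the given width that fit in one vector register."""
    return vlen_bits // element_bits


if __name__ == "__main__":
    print("ALU width, 16 lanes:", datapath_bits(LANES))                 # 1024
    print("64-bit elements per vector:", elements_per_vector(VLEN_BITS))  # 256
    print("passes, 16 lanes:", passes_per_vector(VLEN_BITS, LANES))     # 16
    print("passes, 8 lanes:", passes_per_vector(VLEN_BITS, 8))          # 32
```

So under these assumptions one full 16,384-bit vector holds 256 double-precision elements and takes about 16 passes through the 16-lane configuration, or 32 on the 8-lane FPGA build.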
Slide 15 seems to indicate that they want to create a chip with 32 of those cores, a shared L3 cache, and access to HBM.