|
|
|
|
|
by petrohi
1555 days ago
|
|
Great questions! With Tensil, all computations are performed on the FPGA. In addition to matrix multiplication Tensil supports SIMD instruction with various operations. The ML activations, average and maximum pooling, normalization, and image resizing use SIMD instruction. Some ML operations, such as padding, are achieved by changing the memory layout. Tensil uses DRAM0 and DRAM1 memory pools (usually in DDR memory) to interact with the host to read model inputs and weights and write outputs. It also uses these pools to offload intermediate results between layers and between tiles within a layer when FPGA does not have sufficient BRAM, which is common on lower-end devices. Tensil compiler takes care of finding the most efficient memory scheduling for given on-FPGA memory size. |
|
Edit: Sorry I think you already clarified that the DRAM0 & DRAM1 memory pools are located on the host