| Custom / semi-custom is probably an old Altera FPGA kinda thing, which means Verilog (or a similar hardware description language), which is much more difficult to use in my experience. I generally see the "compute" hierarchy as: 1. General Purpose CPUs -- easiest. 2. CPU acceleration -- AVX512, NEON, SVE, etc. Specialized CPU instructions that only exist on certain versions of hardware. 3. General Purpose GPU -- OpenCL, DirectCompute, ROCm, CUDA. General purpose GPU instructions that work on a wide variety of hardware. 4. GPU accelerated units -- Raytracing, matrix-multiplication, bfloat16 support, wave intrinsics. I guess DirectX 12 Ultimate has these, but you definitely need to check whether your hardware supports them before using them. 5. FPGAs -- Over here baby, but much much harder than #4 in practice. ------- I know Intel has FPGA + Intel core chips, and that Microsoft has used them before for Bing search and other such projects. But no one ever told me it was easy. |
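
To make the "only exists on certain versions of hardware" point from tier 2 concrete, here is a rough sketch of checking for a specialized instruction set before picking a code path. It assumes Linux, where `/proc/cpuinfo` lists `avx512f` on x86 and `sve` / `asimd` (the kernel's name for NEON) on AArch64. The function names and the preference order are made up for illustration, not taken from any library.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical preference order, most specialized first. The strings are the
# Linux /proc/cpuinfo spellings: "avx512f" (x86), "sve" and "asimd" (AArch64).
SIMD_PREFERENCE = ["avx512f", "sve", "asimd"]


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature names from the 'flags' (x86) or 'Features' (ARM) lines."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def pick_code_path(flags: set[str]) -> str:
    """Return the first preferred extension that is present, else 'generic'."""
    for ext in SIMD_PREFERENCE:
        if ext in flags:
            return ext
    return "generic"  # tier 1: plain general-purpose CPU code


def detect_code_path(cpuinfo_path: str = "/proc/cpuinfo") -> str:
    """Read the flags from disk; fall back to 'generic' if the file is missing."""
    path = Path(cpuinfo_path)
    if not path.is_file():
        return "generic"
    return pick_code_path(parse_cpu_flags(path.read_text()))
```

Calling `detect_code_path()` returns the name of the path to use on the current machine. The same check-then-fall-back shape applies to tier 4, except there you would ask the graphics API for its feature flags instead of reading a file.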