by Gregaros
356 days ago
Some further questions:

1. For tasks like autocomplete, keyword routing, or voice transcription, what would the latency and power savings look like on an ASIC vs. even a megakernel GPU setup? Would that justify a fixed-function approach in edge devices or embedded systems?

2. ASICs obviously kill retraining, but could we envision a hybrid setup where a base model is hardwired and a small, soft, learnable module (e.g., LoRA-style residual layers) runs on a general-purpose co-processor?

3. Would the transformer’s fixed topology lend itself to spatial reuse in ASIC design, or is the model’s size (e.g. GPT-3-class) still prohibitive without aggressive weight pruning or quantization?
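To make question 2 a bit more concrete, here is a rough sketch of the split I have in mind, for a single layer only. The frozen matrix stands in for the hardwired weights, and the two small matrices are the LoRA-style part that would live on the co-processor. All names (`hybrid_forward`, `param_counts`, `w_fixed`, `lora_a`, `lora_b`, `scale`) are made up for illustration, and this says nothing about how a real ASIC would schedule the two paths.

```python
# Hypothetical sketch: frozen "hardwired" base weights plus a small
# LoRA-style low-rank correction computed on a separate, updatable path.
# Every name here is illustrative, not taken from any real toolchain.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def hybrid_forward(w_fixed, lora_a, lora_b, x, scale=1.0):
    """Return W x + scale * B (A x) for one layer."""
    base = matvec(w_fixed, x)          # fixed-function path (never retrained)
    low_rank = matvec(lora_a, x)       # project down to rank r, with r << d
    delta = matvec(lora_b, low_rank)   # project back up to the output size
    return [b + scale * d for b, d in zip(base, delta)]

def param_counts(d_out, d_in, rank):
    """Return (frozen parameter count, learnable parameter count)."""
    return d_out * d_in, rank * (d_in + d_out)

# Tiny worked example with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weights
A = [[1.0, 1.0]]               # 1x2 down-projection
B = [[0.5], [0.0]]             # 2x1 up-projection
y = hybrid_forward(W, A, B, [2.0, 3.0])
print(y)  # [4.5, 3.0]

# Assumed sizes, chosen only to show the ratio between the two parts.
print(param_counts(4096, 4096, 8))  # (16777216, 65536)
```

With those assumed sizes the learnable part is 256 times smaller than the frozen matrix, which is the property that would make running it on a general-purpose co-processor plausible.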