by dceddia 1371 days ago
I noticed the size seems to correspond to the model. With the large model, the error is tensor<1x1280x3000xf16>. With tiny, it's tensor<1x384x3000xf16>, and with medium it's tensor<1x1024x3000xf16>. It also seems bad that those are f16 while the "expected" data is f32.
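To make the pattern concrete, here is a rough sketch that rebuilds the tensor type string from the model name. The three widths are just the ones from the errors above; `ENCODER_WIDTH` and `error_tensor_type` are names made up for illustration, not anything from the whisper codebase.

```python
# Widths taken from the error messages quoted above (illustrative table only).
ENCODER_WIDTH = {"tiny": 384, "medium": 1024, "large": 1280}

def error_tensor_type(model_name: str, dtype: str = "f16") -> str:
    """Format the tensor type seen in the error for a given model size."""
    width = ENCODER_WIDTH[model_name]
    # Batch of 1, model-dependent width, 3000 along the last axis.
    return f"tensor<1x{width}x3000x{dtype}>"

for name in ENCODER_WIDTH:
    print(name, error_tensor_type(name))
```

Only the middle dimension moves with the model; the f16 element type is the same in all three, which is the part that clashes with the f32 "expected" data.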
1 comments

I'm giving up for the night, but https://github.com/Smaug123/whisper/pull/1/files at least contains the setup instructions that may help others get to this point. Got it working on the GPU, but it's… much, much slower than the CPU? Presumably due to the CPU fallback for 'aten::repeat_interleave.self_int'.
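For anyone trying to reproduce this, a minimal sketch of the device setup, assuming a PyTorch build with the MPS backend. `PYTORCH_ENABLE_MPS_FALLBACK` is the environment variable PyTorch uses to allow unsupported ops to run on the CPU; `pick_device` is a hypothetical helper, and this is not necessarily what the linked PR does.

```python
import os

# Allow ops without an MPS kernel to fall back to the CPU.
# Set it before importing torch to be safe.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

try:
    import torch
except ImportError:  # lets the sketch run on machines without PyTorch
    torch = None

def pick_device() -> str:
    """Return "mps" when PyTorch reports the Metal backend as available, else "cpu"."""
    if torch is None:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())
```

Each fallback op means copying tensors between GPU and CPU, so a fallback inside the decoding loop could plausibly account for the slowdown.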

Also hitting a nice little PyTorch bug:

> File "/Users/patrick/Documents/GitHub/whisper/whisper/decoding.py", line 388, in apply
>     logits[:, self.tokenizer.encode(" ") + [self.tokenizer.eot]] = -np.inf

> RuntimeError: dst_.nbytes() >= dst_byte_offset INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/Copy.mm":200, please report a bug to PyTorch.
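For context, the line in that traceback is an indexed assignment that sets the logits for a few token ids to negative infinity, so they can never be sampled. A plain-Python sketch of the same operation, with no PyTorch or MPS involved; `suppress_tokens` and the token ids are made up for illustration:

```python
import math

def suppress_tokens(logits, token_ids):
    """Return a copy of a [batch][vocab] logits table with the given columns set to -inf."""
    banned = set(token_ids)
    return [
        [-math.inf if i in banned else value for i, value in enumerate(row)]
        for row in logits
    ]

logits = [[0.1, 0.2, 0.3, 0.4], [1.0, 2.0, 3.0, 4.0]]
blank_and_eot = [1, 3]  # hypothetical ids standing in for the blank and EOT tokens
masked = suppress_tokens(logits, blank_and_eot)
print(masked)
```

This only shows what the assignment is meant to do; it says nothing about why the MPS copy asserts.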