I still haven't managed to use the GPU, but it runs decently fast on just the CPU with the base model (the large model is incredibly slow). I'm going to have to check what magic stable-diffusion is doing to enable the GPU :(
There's a --device flag you can pass. I've been trying to get `--device cuda` to work on my Windows machine and it's saying that torch wasn't compiled with CUDA. Trying to figure out what's going on there.
And on the M1, supposedly PyTorch has support for hardware acceleration using MPS (Metal Performance Shaders, announced here https://pytorch.org/blog/introducing-accelerated-pytorch-tra...) but when I tried `--device mps` it blew up with an error "input types 'tensor<1x1280x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible".
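For anyone trying to work out which value of `--device` their install can actually use, here is a rough sketch that asks PyTorch directly. `pick_device` is a made-up helper name, not part of whisper, and it assumes a PyTorch recent enough to have `torch.backends.mps` (older builds just fall through to CPU).

```python
def pick_device():
    """Return "cuda", "mps", or "cpu", whichever this PyTorch build reports as usable."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed at all

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps only exists on newer PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


print(pick_device())
```

Note that a device being reported as available doesn't mean every op whisper needs works on it, as the MPS error above shows.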
> I've been trying to get `--device cuda` to work on my Windows machine and it's saying that torch wasn't compiled with CUDA.
I struggled with the same. Here's what worked for me:
Use pip to uninstall PyTorch first; it should be "pip uninstall torch" or similar.
Find the CUDA version you have installed[1]. Go to the PyTorch get started page[2] and use their guide/wizard to generate the pip string, and run that. I had to change pip3 to pip FWIW, and with CUDA 11.6 installed I ended up with "pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116".
After that I could use --device cuda, and the difference was immense. On my 2080Ti it went from roughly an hour per minute of audio with the large model to 10-20 seconds.
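If you want to confirm whether your torch is a CPU-only build before and after reinstalling, something like this should do it. Both helper names are hypothetical. The URL pattern is just generalized from the cu116 example above, so treat it as an assumption and check it against what the PyTorch wizard gives you (not every CUDA version has a matching wheel index).

```python
def cuda_wheel_index(cuda_version):
    """Build the extra index URL for a CUDA version string, e.g. "11.6" -> .../whl/cu116."""
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


def torch_cuda_report():
    """Summarize whether the installed torch was built with CUDA and can see a GPU."""
    try:
        import torch
    except ImportError:
        return {"installed": False, "built_with_cuda": None, "cuda_available": False}
    return {
        "installed": True,
        # torch.version.cuda is None on CPU-only builds, which is the
        # "torch wasn't compiled with CUDA" situation.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


print(cuda_wheel_index("11.6"))
print(torch_cuda_report())
```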
Yep, same for me: on the M1, after enabling MPS (with `model.to("mps")`), it just either SIGSEGVs or SIGABRTs every time with that line. The extremely unclean nature of the abort is making it hard to debug :(
I noticed the size seems to correspond to the model. With the large model, the error is tensor<1x1280x3000xf16>. With tiny, it's tensor<1x384x3000xf16>, and with medium it's tensor<1x1024x3000xf16>. It also seems like a bad thing that those are f16s but the "expected" data is f32.
I'm giving up for the night, but https://github.com/Smaug123/whisper/pull/1/files at least contains the setup instructions that may help others get to this point. Got it working on the GPU, but it's… much much slower than the CPU? Presumably due to the 'aten::repeat_interleave.self_int' CPU fallback.
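For others getting to this point: as far as I know, the CPU fallback for ops that MPS doesn't implement is opt-in through the `PYTORCH_ENABLE_MPS_FALLBACK` environment variable, and it has to be set before torch is imported. A minimal sketch (the helper name is made up):

```python
import os

# Set this before any `import torch`; as far as I can tell, PyTorch reads it
# when the library loads, so setting it later has no effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"


def mps_fallback_enabled():
    """Report whether the fallback variable is set for this process."""
    return os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK") == "1"


print(mps_fallback_enabled())
```

Every op that falls back gets copied to the CPU and back, which would fit with the GPU run ending up slower than plain CPU.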
Also hitting a nice little PyTorch bug:
> File "/Users/patrick/Documents/GitHub/whisper/whisper/decoding.py", line 388, in apply
> logits[:, self.tokenizer.encode(" ") + [self.tokenizer.eot]] = -np.inf
> RuntimeError: dst_.nbytes() >= dst_byte_offset INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/Copy.mm":200, please report a bug to PyTorch.
I got it working inside a docker container on my M1 MBP. FWIW, I'm having my $180 tinyminimicro PC run a translation task while my M1 MBP runs a transcription task with the same audio input. So far, the PC is actually outputting results a lot faster than the MBP. Interesting results.
You probably need to pass some kind of option when initializing. The command itself works fine; it just shows a warning: `warnings.warn("FP16 is not supported on CPU; using FP32 instead")`
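From Python, the option in question looks to be `fp16`, which `transcribe` passes through to the decoder. Here is a sketch of how I'd wire it up. `decode_options` and the wrapper are hypothetical helpers, and only enabling FP16 on CUDA is my own assumption, based on the f16 trouble with MPS reported in this thread.

```python
def decode_options(device):
    """Request FP16 only on CUDA; on CPU (and, to be safe, MPS) ask for FP32."""
    return {"fp16": device == "cuda"}


def transcribe(path, model_name="base", device="cpu"):
    """Load a whisper model on `device` and transcribe the audio file at `path`."""
    import whisper  # imported lazily so decode_options works without whisper installed

    model = whisper.load_model(model_name, device=device)
    return model.transcribe(path, **decode_options(device))


print(decode_options("cpu"))
```

Calling `transcribe("audio.mp3")` (any audio file you have) on CPU should then run without the FP16 warning, since FP32 is requested up front.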
(after running the command for setuptools)
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pip in /Users/xxx/Library/Python/3.9/lib/python/site-packages (22.2.2)
Requirement already satisfied: setuptools in /Users/xxx/Library/Python/3.9/lib/python/site-packages (65.3.0)
----
after trying whisper installation:
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
Traceback (most recent call last):
File "/Users/xxx/Library/Python/3.9/lib/python/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
main()
File "/Users/xxx/Library/Python/3.9/lib/python/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
json_out['return_val'] = hook(*hook_input['kwargs'])
File "/Users/xxx/Library/Python/3.9/lib/python/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 130, in get_requires_for_build_wheel
return hook(config_settings)
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/site-packages/setuptools/build_meta.py", line 154, in get_requires_for_build_wheel
return self._get_build_requires(
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/site-packages/setuptools/build_meta.py", line 135, in _get_build_requires
self.run_setup()
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/site-packages/setuptools/build_meta.py", line 150, in run_setup
exec(compile(code, __file__, 'exec'), locals())
File "setup.py", line 2, in <module>
from setuptools_rust import Binding, RustExtension
File "/private/var/folders/lj/7x6d3dxd3cbdtt484k6xsmyh0000gn/T/pip-build-env-ieaydl8r/overlay/lib/python3.9/site-packages/setuptools_rust/__init__.py", line 1, in <module>
from .build import build_rust
File "/private/var/folders/lj/7x6d3dxd3cbdtt484k6xsmyh0000gn/T/pip-build-env-ieaydl8r/overlay/lib/python3.9/site-packages/setuptools_rust/build.py", line 23, in <module>
from setuptools.command.build import build as CommandBuild # type: ignore[import]
ModuleNotFoundError: No module named 'setuptools.command.build'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
Not quite sure if this is related, but since there's a bunch of statements in there referencing Rust: I had to install the Rust compiler on my Mac (`brew install rust` if you use Homebrew). This is not mentioned in the installation instructions.
Nope, that doesn't look good! I honestly just googled the error and installing setuptools fixed it for me, but I barely know anything about the Python ecosystem so I'm really just fumbling around here.
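One thing that stands out in the log: pip reports setuptools 65.3.0 under the user's ~/Library/Python/3.9, but the import that fails comes from the setuptools under /Library/Developer/CommandLineTools. So it may be worth checking which setuptools your interpreter actually picks up. A rough diagnostic (the function name is made up), run with the same python3 that pip uses:

```python
import importlib.util


def setuptools_report():
    """Show which setuptools is imported and whether it ships setuptools.command.build."""
    try:
        import setuptools
    except ImportError:
        return {"version": None, "location": None, "has_command_build": False}

    try:
        has_build = importlib.util.find_spec("setuptools.command.build") is not None
    except ModuleNotFoundError:
        has_build = False

    return {
        "version": setuptools.__version__,
        "location": setuptools.__file__,  # tells you which site-packages won
        "has_command_build": has_build,
    }


print(setuptools_report())
```

If `has_command_build` comes back False, the setuptools being imported is too old for setuptools_rust, whatever pip says is installed elsewhere. Keep in mind pip builds in an isolated environment, so this only approximates what the build sees.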