SAM+CUDA+Diffusion on Windows
Getting the Segment Anything Model (SAM), transformers, and CUDA to work together on Windows is, as usual, tricky.
Typical failures include missing CUTLASS dependencies and DLL load errors such as the one below:
OSError: [WinError 127] The specified procedure could not be found. Error loading "lib\site-packages\torch\lib\nvfuser_codegen.dll" or one of its dependencies.
The steps below produce a working installation.
Create a new conda environment with Python 3.8
conda create -n myenv python=3.8
Install PyTorch 2.x with CUDA support from the official channels
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
Confirm that PyTorch is installed and was built against the expected CUDA version
>>> import torch
>>> torch.cuda.is_available()
True
>>> print(torch.version.cuda)
11.7
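The same check can be wrapped in a small script so it is easy to rerun after each later install step. This is a minimal sketch: the function name describe_torch_install is hypothetical, and the import is guarded on the assumption that a broken install fails at import time with an ImportError or an OSError like the one shown at the top.

```python
def describe_torch_install():
    """Return a dict summarising the PyTorch/CUDA setup of this environment."""
    info = {"torch_importable": False}
    try:
        # On Windows a DLL that fails to load surfaces here as an OSError.
        import torch
    except (ImportError, OSError) as exc:
        info["error"] = f"{type(exc).__name__}: {exc}"
        return info

    info["torch_importable"] = True
    info["torch_version"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        # Name of the first visible GPU.
        info["device_name"] = torch.cuda.get_device_name(0)
    return info
```

Printing the returned dict gives a one-glance summary; a cuda_build of None would suggest that a CPU-only build was installed instead of the CUDA one.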
For reference, pip list at this stage shows the following:
Package Version
------- -------
aiofiles 23.1.0
aiohttp 3.8.4
aiosignal 1.3.1
altair 5.0.1
anyio 3.7.0
async-timeout 4.0.2
attrs 23.1.0
brotlipy 0.7.0
certifi 2023.5.7
cffi 1.15.1
charset-normalizer 2.0.4
click 8.1.3
colorama 0.4.6
contourpy 1.1.0
cryptography 39.0.1
cycler 0.11.0
exceptiongroup 1.1.1
fastapi 0.98.0
ffmpy 0.3.0
filelock 3.9.0
fonttools 4.40.0
frozenlist 1.3.3
fsspec 2023.6.0
gradio 3.35.2
gradio_client 0.2.7
h11 0.14.0
httpcore 0.17.2
httpx 0.24.1
huggingface-hub 0.15.1
idna 3.4
importlib-resources 5.12.0
Jinja2 3.1.2
jsonschema 4.17.3
kiwisolver 1.4.4
linkify-it-py 2.0.2
markdown-it-py 2.2.0
MarkupSafe 2.1.1
matplotlib 3.7.1
mdit-py-plugins 0.3.3
mdurl 0.1.2
mkl-fft 1.3.6
mkl-random 1.2.2
mkl-service 2.4.0
mpmath 1.2.1
multidict 6.0.4
networkx 2.8.4
numpy 1.24.3
orjson 3.9.1
packaging 23.1
pandas 2.0.2
Pillow 9.4.0
pip 23.1.2
pkgutil_resolve_name 1.3.10
pycparser 2.21
pydantic 1.10.9
pydub 0.25.1
Pygments 2.15.1
pyOpenSSL 23.0.0
pyparsing 3.1.0
pyrsistent 0.19.3
PySocks 1.7.1
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.3
PyYAML 6.0
requests 2.29.0
semantic-version 2.10.0
setuptools 67.8.0
six 1.16.0
sniffio 1.3.0
starlette 0.27.0
sympy 1.11.1
toolz 0.12.0
torch 2.0.1
torchaudio 2.0.2
torchvision 0.15.2
tqdm 4.65.0
typing_extensions 4.6.3
tzdata 2023.3
uc-micro-py 1.0.2
urllib3 1.26.16
uvicorn 0.22.0
websockets 11.0.3
wheel 0.38.4
win-inet-pton 1.1.0
yarl 1.9.2
zipp 3.15.0
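Rather than comparing the list by eye, the key versions can be checked programmatically. The sketch below uses only the standard library; the helper names and the choice of which packages to pin are illustrative assumptions, and the expected versions are copied from the list above.

```python
from importlib import metadata

# Versions copied from the pip list above; adjust to your own target set.
EXPECTED = {"torch": "2.0.1", "torchvision": "0.15.2", "torchaudio": "2.0.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Some builds append a local tag such as "+cu117"; ignore it here.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report
```

Calling compare_versions(EXPECTED) returns one tuple per package, so a mismatch or a missing package shows up as False in the last position.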
Now install Segment Anything from the official Git repository.
pip install git+https://github.com/facebookresearch/segment-anything.git
This should result in a successful installation, as below
Successfully installed segment-anything-1.0
Now, CLIPImageProcessor, CLIPTextModel, and CLIPTokenizer are all provided by the transformers library, so let us install it via pip
pip install transformers
resulting in,
Successfully installed safetensors-0.3.1 tokenizers-0.13.3 transformers-4.30.2
The environment is now ready to use with SAM and CUDA.
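A quick smoke test can confirm that everything imports cleanly in the finished environment. This is a sketch under stated assumptions: the smoke_test helper and the REQUIRED table are hypothetical, and the listed names are the ones used above plus sam_model_registry and SamPredictor, which the segment-anything package exports.

```python
import importlib

# Module name -> attributes expected to be available after installation.
REQUIRED = {
    "torch": ["cuda"],
    "segment_anything": ["sam_model_registry", "SamPredictor"],
    "transformers": ["CLIPImageProcessor", "CLIPTextModel", "CLIPTokenizer"],
}


def smoke_test(required):
    """Return {module_name: None if OK, else a short problem description}."""
    results = {}
    for module_name, attributes in required.items():
        try:
            module = importlib.import_module(module_name)
            missing = [a for a in attributes if not hasattr(module, a)]
        except (ImportError, OSError, RuntimeError) as exc:
            # DLL load failures on Windows typically arrive as OSError.
            results[module_name] = f"{type(exc).__name__}: {exc}"
            continue
        results[module_name] = f"missing: {missing}" if missing else None
    return results
```

Running print(smoke_test(REQUIRED)) should show None for all three modules; any other value names the module that failed and why, which is usually enough to tell a missing package from a DLL problem.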