Get started with GPU-accelerated machine learning in WSL

Machine learning (ML) is becoming a key part of many development workflows. Whether you're a data scientist, an ML engineer, or just starting your learning journey with ML, the Windows Subsystem for Linux (WSL) offers a great environment for running the most common and popular GPU-accelerated ML tools.

There are lots of different ways to set up these tools. For example, NVIDIA CUDA in WSL, TensorFlow-DirectML and PyTorch-DirectML all offer different ways you can use your GPU for ML with WSL. To learn more about the reasons for choosing one versus another, see GPU accelerated ML training.

This guide will show how to set up:

NVIDIA CUDA, if you have an NVIDIA graphics card, along with a sample ML framework container
TensorFlow-DirectML and PyTorch-DirectML on your AMD, Intel, or NVIDIA graphics card

Prerequisites

Ensure you are running Windows 11 or Windows 10, version 21H2 or higher.
Install WSL and set up a username and password for your Linux distribution.
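
Before continuing, it can help to confirm that your shell is actually running inside WSL. The sketch below is one rough way to check, assuming that Microsoft-built WSL kernels include "microsoft" in the kernel release string reported by `uname -r` (custom kernels may not). The `looks_like_wsl_kernel` helper is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: succeed if a kernel release string looks like a WSL kernel.
# Assumption: Microsoft-built WSL kernels include "microsoft" in the release string.
looks_like_wsl_kernel() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *microsoft*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the kernel of the current shell and report the result.
if looks_like_wsl_kernel "$(uname -r)"; then
    echo "This shell appears to be running inside WSL."
else
    echo "This shell does not appear to be running inside WSL."
fi
```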

Setting up NVIDIA CUDA with Docker

Download and install the latest driver for your NVIDIA GPU.
Install Docker Desktop, or install the Docker engine directly in WSL by running the following commands:
```
curl https://get.docker.com | sh
```
```
sudo service docker start
```

If you installed the Docker engine directly, install the NVIDIA Container Toolkit by following the steps below.

Set up the stable repository for the NVIDIA Container Toolkit by running the following commands:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-docker-keyring.gpg

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-docker-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
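
To see what these commands work with, the sketch below reproduces two of their steps on stand-in data: it builds the `distribution` value from a temporary file that mimics `/etc/os-release`, and it runs the same `sed` expression over a sample repository line. The Ubuntu 20.04 values and the `example.invalid` repository line are assumptions chosen for illustration, not what your system or NVIDIA's repository will return.

```shell
# Stand-in for /etc/os-release with hypothetical values (assumption for illustration).
sample_os_release=$(mktemp)
cat > "$sample_os_release" <<'EOF'
ID=ubuntu
VERSION_ID="20.04"
EOF

# Same technique as the guide: source the file in a subshell, then join ID and VERSION_ID.
distribution=$(. "$sample_os_release"; echo "$ID$VERSION_ID")
echo "distribution=$distribution"

# Illustrative repository line run through the guide's sed expression, which
# inserts a signed-by option pointing at the keyring file written by gpg --dearmor.
sample_line='deb https://example.invalid/repo /'
rewritten=$(echo "$sample_line" | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-docker-keyring.gpg] https://#g')
echo "$rewritten"

# Clean up the temporary stand-in file.
rm -f "$sample_os_release"
```

With these stand-in values, `distribution` becomes `ubuntu20.04`, which is the path segment the second `curl` command requests.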

Install the NVIDIA runtime packages and dependencies by running the commands:

sudo apt-get update

sudo apt-get install -y nvidia-docker2

Run a machine learning framework container and sample.

To run a machine learning framework container and start using your GPU with this NVIDIA NGC TensorFlow container, enter the command:
```
docker run --gpus all -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3
```
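
The resource flags in this command are standard Docker options: `--shm-size=1g` sizes the container's shared memory, `--ulimit memlock=-1` removes the locked-memory limit, and `--ulimit stack=67108864` sets the stack limit. As a quick check of the arithmetic, and assuming the stack value is given in bytes, the sketch below converts it to MiB.

```shell
# Stack limit passed to docker run, assumed to be in bytes.
stack_bytes=67108864

# Convert bytes to MiB with integer shell arithmetic.
stack_mib=$((stack_bytes / 1024 / 1024))
echo "stack limit: ${stack_mib} MiB"
```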
You can run a pre-trained model sample that is built into this container by running the commands:
```
cd nvidia-examples/cnn/
```
```
python resnet.py --batch_size=64
```

Additional ways to get set up and use NVIDIA CUDA can be found in the NVIDIA CUDA on WSL User Guide.

Setting up TensorFlow-DirectML or PyTorch-DirectML

Download and install the latest driver from your GPU vendor's website: AMD, Intel, or NVIDIA.
Set up a Python environment.

We recommend setting up a virtual Python environment. There are many tools you can use to set one up; these instructions use Anaconda's Miniconda.
```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
```
```
bash Miniconda3-latest-Linux-x86_64.sh
```
```
conda create --name directml python=3.7 -y
```
```
conda activate directml
```
Install the machine learning framework backed by DirectML of your choice.

TensorFlow-DirectML:
```
pip install tensorflow-directml
```
PyTorch-DirectML:
```
sudo apt install libblas3 libomp5 liblapack3
```
```
pip install pytorch-directml
```
Run a quick addition sample in an interactive Python session for TensorFlow-DirectML or PyTorch-DirectML to make sure everything is working.

If you have questions or run into issues, visit the DirectML repo on GitHub.

Multiple GPUs

If you have multiple GPUs on your machine, you can also access them inside WSL. However, you can only access one at a time. To choose a specific GPU, set the environment variable below to the name of your GPU as it appears in Device Manager:

export MESA_D3D12_DEFAULT_ADAPTER_NAME="<NameFromDeviceManager>"

This will do a string match, so if you set it to "NVIDIA" it will match the first GPU that starts with "NVIDIA".
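
The prefix matching described above can be sketched in shell. This illustrates the matching rule as the text states it, not the actual adapter-selection code; the adapter names and the `first_matching_adapter` helper are hypothetical.

```shell
# Hypothetical helper: print the first adapter name that starts with the given prefix.
first_matching_adapter() {
    prefix=$1
    shift
    for name in "$@"; do
        case "$name" in
            "$prefix"*) printf '%s\n' "$name"; return 0 ;;
        esac
    done
    return 1
}

# Hypothetical adapter names, in the order the system might list them.
selected=$(first_matching_adapter "NVIDIA" \
    "Intel(R) UHD Graphics 630" \
    "NVIDIA GeForce RTX 3060" \
    "NVIDIA GeForce GTX 1080")
echo "$selected"
```

With two NVIDIA adapters in the list, the short prefix selects the first one, so use a longer, more specific name when you need the second.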
