[MBJC] Windows 11 WSL2: Setting Up the Linux Subsystem, Installing Ubuntu 20.04, and Configuring an NVIDIA Deep Reinforcement Learning Environment

The WSL2 plan ultimately failed, and I ended up using a dual-boot setup instead. Reference: https://www.bilibili.com/video/BV1Cc41127B9

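I need Isaac Gym for deep reinforcement learning simulation, and the only machine with a GPU I have on hand is a Windows laptop, so I tried setting up the environment on WSL2.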

Setting up WSL2 itself is fairly simple, so I won't repeat it here. See the official guide: https://learn.microsoft.com/en-us/windows/wsl/install

Install the CUDA Toolkit for WSL2. Link: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local

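The whole process only requires one NVIDIA (CUDA) driver, installed on the Windows 11 side. Do not install a driver inside WSL; WSL uses the Windows 11 driver directly.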

# Pin the CUDA repository so apt prefers its packages
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
# Download and register the local repository for CUDA 12.5
wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda-repo-wsl-ubuntu-12-5-local_12.5.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-5-local_12.5.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-5-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-5

# Make the CUDA binaries and libraries visible in the current shell
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
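
The two export lines only affect the current shell session. One way to make them persistent is to append them to a shell startup file, without adding duplicates when the step is repeated. The following is a minimal sketch: the helper name append_once is hypothetical, and using ~/.bashrc as the target is an assumption (zsh users would pick ~/.zshrc).

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line is already there.
# (append_once is a hypothetical helper name, not part of CUDA or WSL.)
append_once() {
  local file="$1" line="$2"
  touch "$file"
  # -q quiet, -x whole-line match, -F fixed string (no regex)
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls (assumption: bash is the login shell).
# Single quotes keep $PATH unexpanded so it is evaluated at shell startup:
#   append_once ~/.bashrc 'export PATH=$PATH:/usr/local/cuda/bin'
#   append_once ~/.bashrc 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu'
```

Running the same call twice leaves a single copy of the line in the file, so the step is safe to repeat.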

Install Miniconda

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh

~/miniconda3/bin/conda init bash
~/miniconda3/bin/conda init zsh

Create the conda environment. For PyTorch, download the wheel files directly and install them. Link: https://download.pytorch.org/whl/

conda create -n wlg python=3.8
# Activate the new environment so pip installs into it
conda activate wlg
pip install torch-2.3.0+cu121-cp38-cp38-linux_x86_64.whl
pip install torchvision-0.18.0+cu121-cp38-cp38-linux_x86_64.whl
pip install triton-2.3.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
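
The wheel file names encode the interpreter they were built for: the cp38 tag means CPython 3.8, which is why the environment is created with python=3.8. A rough sketch of checking this before installing is shown here. The helper names wheel_py_tag and current_py_tag are hypothetical, and the parsing assumes the standard wheel naming scheme name-version-pytag-abitag-platform.whl.

```shell
# wheel_py_tag FILE: print the Python tag (e.g. cp38) taken from a wheel file name.
# The tag is the third-from-last dash-separated field once ".whl" is stripped.
wheel_py_tag() {
  basename "$1" .whl | awk -F- '{ print $(NF-2) }'
}

# current_py_tag: print the tag of the active interpreter, e.g. cp38 for CPython 3.8.
# Assumes "python" on PATH is the interpreter of the activated conda environment.
current_py_tag() {
  python -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}
```

A possible use: compare the output of wheel_py_tag for each downloaded file with current_py_tag, and only run pip install when the two match.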

Install Isaac Gym. Reference: https://github.com/clearlab-sustech/Wheel-Legged-Gym

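tar -xf IsaacGym_Preview_4_Package.tar.gz
cd isaacgym/python && pip install -e .
# Smoke test: run one of the bundled examples
cd examples && python 1080_balls_of_solitude.py
# Run this from the directory where Wheel-Legged-Gym was cloned
cd Wheel-Legged-Gym && pip install -e .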

Running the example (cd examples && python 1080_balls_of_solitude.py) fails with the following error:

~/isaacgym/python/examples$ python 1080_balls_of_solitude.py

*** Warning: failed to preload CUDA lib
*** Warning: failed to preload PhysX libs
Importing module 'gym_38' (/home/mbjc/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/mbjc/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
WARNING: Forcing CPU pipeline.
Not connected to PVD
/buildAgent/work/99bede84aa0a52c2/source/physx/src/gpu/PxPhysXGpuModuleLoader.cpp (148) : internal error : libcuda.so!

[Warning] [carb.gym.plugin] Failed to create a PhysX CUDA Context Manager. Falling back to CPU.
Physics Engine: PhysX
Physics Device: cpu
GPU Pipeline: disabled
Segmentation fault

Workaround: Isaac Gym does not support WSL2, so try installing it with Docker instead. Reference: https://forums.developer.nvidia.com/t/failed-to-acquire-interface/178379/13

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/wsl/lib/
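
The libcuda.so error in the log means PhysX could not load the CUDA driver library. Under WSL2 that library is provided by the Windows driver and exposed in /usr/lib/wsl/lib, which is why the directory is appended to LD_LIBRARY_PATH. A small sketch for checking which directories on a search path actually contain the library follows. The helper name find_libcuda is hypothetical.

```shell
# find_libcuda PATHLIST: print every directory in a colon-separated list
# that contains a file matching libcuda.so*.
find_libcuda() {
  local dir
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
    # Skip empty entries produced by leading, trailing, or doubled colons
    [ -n "$dir" ] || continue
    if ls "$dir"/libcuda.so* >/dev/null 2>&1; then
      printf '%s\n' "$dir"
    fi
  done
}
```

Calling find_libcuda "$LD_LIBRARY_PATH" after the export should list /usr/lib/wsl/lib/ if the Windows driver is visible inside WSL. Empty output suggests the library is not on the search path.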

Solved: Isaac Gym starts successfully under Docker. References: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html and https://github.com/clearlab-sustech/Wheel-Legged-Gym/isaacgym/docs/install.html

Importing module 'gym_38' (/opt/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /opt/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
WARNING: Forcing CPU pipeline.
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: disabled
WARNING: lavapipe is not a conformant vulkan implementation, testing use only.
[Error] [carb.windowing-glfw.plugin] GLFW initialization failed.
[Error] [carb.windowing-glfw.plugin] GLFW window creation failed!
[Error] [carb.gym.plugin] Failed to create Window in CreateGymViewerInternal
*** Failed to create viewer

Install a GUI

# Reset the root user's password:
passwd

# Commands to install the GUI:
sudo apt update && sudo apt -y upgrade
sudo apt-get purge xrdp
sudo apt install -y xrdp
sudo apt install -y xfce4
sudo apt install -y xfce4-goodies

sudo cp /etc/xrdp/xrdp.ini /etc/xrdp/xrdp.ini.bak
sudo sed -i 's/3389/3390/g' /etc/xrdp/xrdp.ini
sudo sed -i 's/max_bpp=32/#max_bpp=32\nmax_bpp=128/g' /etc/xrdp/xrdp.ini
sudo sed -i 's/xserverbpp=24/#xserverbpp=24\nxserverbpp=128/g' /etc/xrdp/xrdp.ini
echo xfce4-session > ~/.xsession

sudo nano /etc/xrdp/startwm.sh
# Comment out the following two lines:
#test -x /etc/X11/Xsession && exec /etc/X11/Xsession
#exec /bin/sh /etc/X11/Xsession

# Add this line:
# xfce
startxfce4

sudo /etc/init.d/xrdp start

# Connect from Windows with Remote Desktop (the sed command above changed the port to 3390):
localhost:3390
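
Because the sed command rewrites every 3389 in xrdp.ini to 3390, the port to enter in the Remote Desktop client is whatever the file contains after the edit. A quick sketch for reading it back is shown here. The helper name xrdp_port is hypothetical, and it assumes the listening port appears on a line of the form port=NNNN in xrdp.ini.

```shell
# xrdp_port FILE: print the first purely numeric port= value in an xrdp.ini-style file.
# Entries such as port=-1 in the session sections are skipped by the pattern.
xrdp_port() {
  awk -F= '/^port=[0-9]+$/ { print $2; exit }' "$1"
}
```

For example, xrdp_port /etc/xrdp/xrdp.ini prints the port that xrdp is configured to listen on, and comparing it with the output for /etc/xrdp/xrdp.ini.bak shows the value before the change.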

Install Docker

sudo apt update
sudo apt install -y curl gnupg2 software-properties-common apt-transport-https ca-certificates
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update

# List all available docker-ce versions
apt list -a docker-ce
sudo apt install -y containerd.io docker-ce=5:20.10.16~3-0~ubuntu-focal docker-ce-cli=5:20.10.16~3-0~ubuntu-focal

# Create required directories
sudo mkdir -p /etc/systemd/system/docker.service.d

# Create daemon json config file
sudo vim /etc/docker/daemon.json

{
  "registry-mirrors": [
    "https://2efk2pit.mirror.aliyuncs.com"
  ],
  "dns": [
    "223.5.5.5",
    "119.29.29.29"
  ],
  "exec-opts": [
    "native.cgroupdriver=systemd"
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}

sudo usermod -aG docker mbjc

# Start and enable services
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl enable docker
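
A syntax error in /etc/docker/daemon.json prevents the Docker daemon from starting, so it can be worth validating the file before restarting the service. The following is a minimal sketch, assuming python3 is available (Ubuntu 20.04 ships it). The helper name json_ok is hypothetical.

```shell
# json_ok FILE: succeed if FILE parses as JSON, fail otherwise.
# Uses Python's standard json.tool module purely as a syntax checker.
json_ok() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}
```

One way to use it is json_ok /etc/docker/daemon.json && sudo systemctl restart docker, so the restart only happens when the file parses.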

Install nvidia-container-toolkit

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Optional: enable the experimental packages (the list file is root-owned, so sudo is needed)
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker

sudo systemctl restart docker
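
The nvidia-ctk runtime configure step registers an nvidia runtime in /etc/docker/daemon.json. A rough way to confirm the entry is present is sketched here. The helper name has_nvidia_runtime is hypothetical, and the check is a plain text match rather than a real JSON query.

```shell
# has_nvidia_runtime FILE: succeed if FILE mentions the nvidia-container-runtime binary.
has_nvidia_runtime() {
  grep -q 'nvidia-container-runtime' "$1" 2>/dev/null
}
```

If has_nvidia_runtime /etc/docker/daemon.json succeeds, the sample workload from NVIDIA's install guide, sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi, is a reasonable next check that containers can see the GPU.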

Install Isaac Gym and launch it via Docker

tar -xf IsaacGym_Preview_4_Package.tar.gz
cd isaacgym/docker
bash build.sh
bash run.sh

Debugging commands

# List all containers
docker ps -a

# Remove the container
docker rm isaacgym_container
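
When rebuilding repeatedly, docker rm reports an error if the container does not exist. A small sketch that removes a container only when docker ps -a lists it is shown here. The helper names container_exists and remove_if_exists are hypothetical.

```shell
# container_exists NAME: succeed if a container with exactly this name exists
# (running or stopped).
container_exists() {
  docker ps -a --format '{{.Names}}' | grep -qxF -- "$1"
}

# remove_if_exists NAME: remove the container when present, otherwise just say so.
# Note: docker rm refuses to remove a running container; stop it first.
remove_if_exists() {
  if container_exists "$1"; then
    docker rm "$1"
  else
    echo "no container named $1"
  fi
}
```

For this setup the call would be remove_if_exists isaacgym_container.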