Training



7. Training

  • Note: The following operations are performed inside the container
  • Note: You need to navigate to the lerobot directory to execute the following code

7.1 Parameter Description

The following parameters can be adjusted according to actual needs.

--policy.type: The policy architecture to train; start with act (Action Chunking with Transformers) for the basics.
--local_files_only: Whether to train on a local dataset; the default is true, meaning local data is used.
--wandb.enable: Whether to upload training logs to the Weights & Biases (wandb) platform; the default is false, meaning logging is disabled.
--resume: Whether to resume a previous training run; true means resume.
--config_path: When resuming, the path to the training parameter file of the run being resumed, such as "/home/ws/lerobot/outputs/train/act_roarm_m3_test/checkpoints/last/pretrained_model/train_config.json".
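Before passing --resume=true, it can help to confirm that the checkpoint's train_config.json actually exists and parses as valid JSON. A minimal sketch (not part of the official tutorial; it makes no assumptions about the file's schema):

```python
import json
import pathlib

def check_train_config(path):
    """Parse the checkpoint's train_config.json, raising if it is missing or corrupt."""
    p = pathlib.Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"no train_config.json at {p}")
    with p.open() as f:
        return json.load(f)
```

For example, check_train_config("/home/ws/lerobot/outputs/train/act_roarm_m3_test/checkpoints/last/pretrained_model/train_config.json") should return the parsed config if the checkpoint is intact.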

If you uploaded your dataset to the Hugging Face Hub with --control.push_to_hub=true, you can pull it from the Hub and train on other devices.

7.2 Training Dataset Preparation

7.2.1 Compression

If the data has not been uploaded to the Hub, you need to package the dataset for subsequent training. The dataset is saved under, for example, "/root/.cache/huggingface/lerobot".

tar czvf roarm_m3_datasets.tar.gz /root/.cache/huggingface/lerobot
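If tar is not convenient, the same packaging step can be done from Python with only the standard library. A sketch, not part of the official tutorial; the source and archive paths are the ones from the command above:

```python
import tarfile
from pathlib import Path

def pack_datasets(src_dir, archive_path):
    """Package a dataset directory into a .tar.gz, mirroring `tar czvf`."""
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores a relative path, like tar stripping the leading "/"
        tar.add(src, arcname=src.name)
    return Path(archive_path)
```

For example, pack_datasets("/root/.cache/huggingface/lerobot", "roarm_m3_datasets.tar.gz") produces an archive equivalent to the tar command above.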

Next, select the file and download it to your local device.

[Screenshot: downloading the dataset archive]

7.2.2 Decompression

Select the file and upload it to the root directory of the training environment.

[Screenshot: uploading the dataset archive]

Place the packaged dataset in the training container, then change to the root directory and decompress it; the files are restored to /root/.cache/huggingface/lerobot.

cd / && tar xzvf /root/roarm_m3_datasets.tar.gz
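After decompression, you can quickly confirm which dataset repos landed in the cache directory. A minimal sketch (not from the official tutorial; the default path is the cache location used above):

```python
from pathlib import Path

def list_datasets(cache_dir="/root/.cache/huggingface/lerobot"):
    """Return the dataset repo folders found under the lerobot cache directory."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

After a successful extraction, the list should include the dataset you packaged (e.g. your roarm_m3_test repo folder).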

7.3 Windows System Local GPU Training

  • Note: Local GPU training on Windows requires an NVIDIA GPU.

7.3.1 Training Environment Setup

On the Windows system, use Docker Desktop with the pre-configured Ubuntu 22.04 image RoArm-M3-AI-Kit-GPU-x86_64; you do not need to install and configure the environment manually.

docker load -i RoArm-M3-AI-Kit-GPU-x86_64.tar

Create a container from the image. The shared-memory size "--shm-size=4g" can be adjusted according to the memory available on the system.

docker run --gpus all -it \
  -p 2200:22 \
  -p 9090:9090 \
  --shm-size=4g \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  --env NVIDIA_DRIVER_CAPABILITIES=all \
  -w /home/ws/lerobot \
  --name lerobot \
  dudulrx0601/lerobot_gpu

Then, after entering the container, start the SSH service to enable remote login.

service ssh start

Use MobaXterm to remotely log in to your local container.

[Screenshot: MobaXterm remote connection to localhost]

Go to the lerobot directory.

cd /home/ws/lerobot

If you did not upload the dataset earlier, you can open http://127.0.0.1:9090 in your browser to visualize the dataset locally and confirm that it is the one you want to train on:

python lerobot/scripts/visualize_dataset_html.py \
  --repo-id ${HF_USER}/roarm_m3_test \
  --host 0.0.0.0

7.3.2 Training

Use Nvidia GPU for local dataset training:

python lerobot/scripts/train.py \
  --dataset.repo_id=${HF_USER}/roarm_m3_test \
  --policy.type=act \
  --output_dir=outputs/train/act_roarm_m3_test \
  --job_name=act_roarm_m3_test \
  --local_files_only=true \
  --wandb.enable=false
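Before launching the run above, it can help to confirm that PyTorch inside the container actually sees the GPU. A minimal sketch (not part of the official tutorial; it assumes the torch installed in the lerobot environment, and returns None if torch is missing rather than crashing):

```python
def cuda_available():
    """Report whether PyTorch can see a CUDA GPU; None if torch isn't installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()
```

If this returns False inside the container, check that the container was started with --gpus all and that the host NVIDIA driver is working.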

7.4 Jetson Orin System Local GPU Training

On the Ubuntu system, use the Ubuntu 22.04 image [RoArm-M3-AI-Kit-GPU-arm64] that comes pre-configured with the LeRobot environment; you do not need to install and configure the environment yourself.

Use Jetson Orin GPU for local dataset training:

python lerobot/scripts/train.py \
  --dataset.repo_id=${HF_USER}/roarm_m3_test \
  --policy.type=act \
  --output_dir=outputs/train/act_roarm_m3_test \
  --job_name=act_roarm_m3_test \
  --local_files_only=true \
  --wandb.enable=false