Description
An AI Asset enables accelerated deployment of deep learning models to resource-constrained, low-power embedded systems (the Deep Edge). The provided workflows deliver powerful, easy-to-deploy building blocks for creating complex AI models that can run on cyber-physical systems.
By taking care of many of the end-to-end tooling dependencies and providing standardized interfaces, Bonseyes AI Assets let users focus on producing optimal solutions while enabling faster feedback during the implementation of end-user requirements. The goal is to facilitate easier deployment to the Deep Edge through the Bonseyes AI Marketplace.
Requirements
Hardware requirements
To utilize the full potential of AI Assets, especially for training, an NVIDIA graphics card (GTX 1060 or newer) with CUDA support is required on x86_64 environments. Nonetheless, AI Assets can also be run on Intel/AMD CPUs.
We also provide support for NVIDIA Jetson devices as well as for platforms with arm64v8 architectures. Supporting these devices allows the user to evaluate any given AI Asset on them and obtain embedded-oriented benchmarks for a faster design process.
Software requirements
The following requirements need to be installed on the platform where the AI Asset will run:
Docker
To install Docker, follow the instructions here.
By default, Docker is not accessible to normal users. To allow the current user to access Docker, run the following commands:
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
Verify that you can run docker commands without sudo:
docker run hello-world
Git and Git LFS
Install git and git LFS by executing:
sudo apt-get install git git-lfs
Git LFS is not active by default. To make sure Git LFS is active, run the following command:
git lfs install
The command will print some errors that can be safely ignored.
Due to a bug in Ubuntu 18.04 LTS, the binaries installed by pip are not on the PATH by default. To make sure that they are available, run the following command:
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
Unfortunately, due to limitations in Ubuntu 18.04 LTS, it is not possible to ensure that the new user groups are taken into account after a simple logout/login. To complete the setup, restart the machine so that the changes to PATH and groups take effect.
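After rebooting, the following sketch can be used to check that both changes took effect. The OK/missing messages are illustrative and not produced by any Bonseyes tool:

```shell
# Check that the current user is in the docker group and that
# ~/.local/bin is on the PATH; prints OK or missing for each.
if id -nG | grep -qw docker; then
  echo "docker group: OK"
else
  echo "docker group: missing (rerun the usermod command and reboot)"
fi

case ":$PATH:" in
  *":$HOME/.local/bin:"*) echo "local bin on PATH: OK" ;;
  *) echo "local bin on PATH: missing (check ~/.bashrc)" ;;
esac
```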
Packages
The remaining packages can be installed by executing the following command:
sudo apt-get install python3 python3-pip python3-wheel python3-setuptools
To benchmark your AI Asset on your hardware platform, install the following packages, depending on the target platform:
x86 (CPU)
pip3 install psutil
x86 (Cuda)
pip3 install psutil nvidia-smi nvidia-ml-py3
arm64v8
pip3 install psutil
Nvidia Jetsons
pip3 install psutil jetson-stats
NVIDIA Drivers
If you are working on an x86 platform with an NVIDIA GPU, ensure that you have installed the appropriate NVIDIA drivers. On Ubuntu, the easiest way to get the right driver version is to install a version of CUDA, at least as new as the one in the image you intend to use, via the official NVIDIA CUDA download page. For example, if you intend to use CUDA 10.2, ensure that you have the matching graphics drivers, as described here.
The following command can be used to verify your system for x86 platforms:
docker run --gpus all nvidia/cuda:10.2-base nvidia-smi
If you are using an NVIDIA Jetson device, it is sufficient to set up the device following the DPE workflow described in DPE.
Nvidia docker
You will also need to install the NVIDIA Container Toolkit to enable GPU device access within Docker containers. Installation instructions can be found here.
Setup
AI Asset CLI
AI Asset CLI is a command-line interface that lets end users interact with AI Assets, providing functionality for a variety of tasks such as export, processing (video, image, camera), and evaluation.
Install the Bonseyes AI Asset CLI on the intended device from the remote repository:
pip3 install git+https://gitlab.com/bonseyes-opensource/aiassets_cli.git
Add the user path to the system path:
export PATH=$PATH:/home/${USER}/.local/bin
For detailed AI Asset CLI usage, please refer to the official documentation.
Board setup
For board setup, please first follow the DPE workflow explained in DPE.
Usage
Currently available AI Assets:
- 3D Face Landmark detection (68 keypoints)
  Backbones: mobilenetv1, mobilenetv0.5
  Input sizes: 120x120
  Datasets: aflw, aflw2000-3d
  Access token:
    Username: gitlab+deploy-token-483452
    Password: zsEqp4321jiCzWS-TUaG
- Whole Body Pose estimation (133 keypoints)
  Backbones: resnet22, shufflenetv2k30, shufflenetv2k16
  Input sizes: 128x96, 128x128, 256x256, 384x216, 512x384
  Datasets: wholebody
  Access token:
    Username: gitlab+deploy-token-557315
    Password: AskgZQwcDRRYv3Da7BNB
Currently available platforms and environments:
- x86_64 machines
cpu
cuda10.2_tensorrt7.0
cuda11.2_tensorrt7.2_rtx3070
cuda11.4_tensorrt8.0
- Nvidia Jetson Devices
jetpack4.4
jetpack4.6
- Arm CPUs
arm64v8
If you face issues during the workflow, you can export the DEBUG flag in your terminal to obtain more information about the issue:
export DEBUG=True
Installation
Download and initialize the specified demo AI Asset locally:
bonseyes_aiassets_cli init
--task {3dface_landmarks, whole_body_pose}
--platform {x86_64, jetson, rpi}
--environment {cpu,cuda10.2_tensorrt7.0,cuda11.2_tensorrt7.2_rtx3070,cuda11.4_tensorrt8.0,jetpack4.4,jetpack4.6,arm64v8}
--version {v1.0, v2.0, ...}
--user gitlab+deploy-token-USERNAME
--password PASSWORD
[--camera-id CAMERA_ID]
Check supported options by running:
bonseyes_aiassets_cli init --help
Example:
bonseyes_aiassets_cli init \
--task whole_body_pose \
--platform x86_64 \
--environment cuda10.2_tensorrt7.0 \
--version v1.0 \
--camera-id 0 \
--user <username> \
--password <password>
Check if the container is running and on what port by executing:
docker ps
If you want to stop a running AI Asset:
docker kill <task_name>
Switch between AI Assets
Use specific AI Asset locally:
bonseyes_aiassets_cli use --task {3dface_landmarks, whole_body_pose}
Check supported tasks by running:
bonseyes_aiassets_cli use --help
Train
Train a network and produce a model based on the available configuration files:
bonseyes_aiassets_cli train start --config <config_name>
Check for available configs by running:
bonseyes_aiassets_cli train start --help
Example:
bonseyes_aiassets_cli train start --config v1.0_shufflenetv2k30_default_641x641_fp32_config
Check training status by running:
bonseyes_aiassets_cli train status
Stop training process by running:
bonseyes_aiassets_cli train stop
Export
Export pretrained models from PyTorch format to ONNX and/or TensorRT format(s):
usage: bonseyes_aiassets_cli export [-h]
--export-input-sizes EXPORT_INPUT_SIZES [EXPORT_INPUT_SIZES ...]
--engine {all, onnxruntime, tensorrt}
--precisions {fp32, fp16}
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
[--workspace-unit {MB, GB}]
[--workspace-size WORKSPACE_SIZE]
[--enable-dla]
Note: When exporting models to TensorRT format on devices with less RAM (<4 GB), it is recommended to specify a lower workspace size in MB.
Example:
bonseyes_aiassets_cli export \
--export-input-sizes 120x120 320x320 \
--engine all \
--backbone shufflenetv2k30 \
--precisions fp32 fp16
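If you need to export several backbones, a simple shell loop over the same command works. The sketch below only prints each command (via echo) so it can be reviewed first; remove the echo to actually run the exports:

```shell
# Print one export command per backbone; drop the leading `echo`
# to execute them for real.
for backbone in shufflenetv2k30 shufflenetv2k16; do
  echo bonseyes_aiassets_cli export \
    --export-input-sizes 120x120 320x320 \
    --engine all \
    --backbone "$backbone" \
    --precisions fp32 fp16
done
```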
Optimize
Optimize exported models by performing PTQ (post-training quantization):
usage: bonseyes_aiassets_cli optimize [-h]
--optimize-input-sizes OPTIMIZE_INPUT_SIZES [OPTIMIZE_INPUT_SIZES ...]
--engine {all, onnxruntime, tensorrt}
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
[--workspace-unit {MB, GB}]
[--workspace-size WORKSPACE_SIZE]
[--enable-dla]
Note: When optimizing models for TensorRT format on devices with less RAM (<4 GB), it is recommended to specify a lower workspace size in MB.
Example:
bonseyes_aiassets_cli optimize \
--optimize-input-sizes 120x120 320x320 \
--engine tensorrt \
--backbone shufflenetv2k30
Process
Note: if you are using a VM in Virtual Box, you can share a camera (or a USB device) by selecting “Devices” > “Webcams” (or USB) and ticking the device you want to share with the VM.
Image
Currently, the only supported format is .jpg
bonseyes_aiassets_cli demo image
[--input-size INPUT_SIZE]
[--engine {pytorch, onnxruntime, tensorrt}]
[--precision {fp32, fp16, int8}]
[--device {gpu, cpu}]
[--cpu-num CPU_NUM]
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
--image-input <image_absolute_path>
3d face landmark specific:
[--render {2d_sparse, 2d_dense, 3d, pose, axis}]
[--thickness THICKNESS]
[--single-face-track]
Example:
# CPU
bonseyes_aiassets_cli demo image \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device cpu \
--image-input </path/to/img.jpg>
# GPU
bonseyes_aiassets_cli demo image \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device gpu \
--image-input </path/to/img.jpg>
Video
Currently, the only supported format is .mp4
bonseyes_aiassets_cli demo video
[--input-size INPUT_SIZE]
[--engine {pytorch, onnxruntime, tensorrt}]
[--precision {fp32, fp16, int8}]
[--device {gpu, cpu}]
[--cpu-num CPU_NUM]
[--color COLOR]
[--rotate {90, -90, 180}]
--video-input <video_absolute_path>
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
3d face landmark specific:
[--render {2d_sparse, 2d_dense, 3d, pose, axis}]
[--thickness THICKNESS]
[--single-face-track]
Example:
# CPU
bonseyes_aiassets_cli demo video \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device cpu \
--video-input </path/to/video.mp4>
# GPU
bonseyes_aiassets_cli demo video \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device gpu \
--video-input </path/to/video.mp4>
Camera
bonseyes_aiassets_cli demo camera
[--input-size INPUT_SIZE]
[--engine {pytorch, onnxruntime, tensorrt}]
[--precision {fp32, fp16, int8}]
[--device {gpu, cpu}]
[--cpu-num CPU_NUM]
[--color COLOR]
[--rotate {90, -90, 180}]
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
3d face landmark specific:
[--render {2d_sparse, 2d_dense, 3d, pose, axis}]
[--thickness THICKNESS]
[--single-face-track]
Example:
# CPU
bonseyes_aiassets_cli demo camera \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device cpu \
--camera-id 0
# GPU
bonseyes_aiassets_cli demo camera \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device gpu \
--camera-id 0
Server
bonseyes_aiassets_cli server start
[--input-size INPUT_SIZE]
[--engine {pytorch, onnxruntime, tensorrt}]
[--precision {fp32, fp16, int8}]
[--device {gpu, cpu}]
[--cpu-num CPU_NUM]
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
3d face landmark specific:
[--render {2d_sparse, 2d_dense, 3d, pose, axis}]
[--thickness THICKNESS]
[--single-face-track]
Example:
# CPU
bonseyes_aiassets_cli server start \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device cpu
# GPU
bonseyes_aiassets_cli server start \
--input-size 320x320 \
--engine pytorch \
--precision fp32 \
--backbone shufflenetv2k30 \
--device gpu
You can test whether the server is running correctly by calling:
curl --request POST --data-binary @/path/to/image.jpg http://localhost:<PORT>/inference
Use the <PORT> of the AI Asset you want to use. Each time you start the server, the port is printed to standard output; you can either save it or check
docker ps
and find out which port the AI Asset container is exposing, e.g.
CONTAINER ID 63bc638d1243
IMAGE registry.gitlab.com/bonseyes/assets/bonseyes_openpifpaf_wholebody/x86_64:v1.0_cuda10.2_tensorrt7.0
COMMAND "/usr/local/bin/nvid…"
CREATED 58 minutes ago
STATUS Up 58 minutes
PORTS 0.0.0.0:59838->59838/tcp, :::59838->59838/tcp
NAMES whole_body_pose
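The host port can also be extracted from the PORTS column programmatically. The helper below (aiasset_port, a hypothetical name, not part of the CLI) pulls the port out of a PORTS string like the one above using sed:

```shell
# Sketch: extract the host port from the PORTS column of `docker ps`.
aiasset_port() {
  # $1: the PORTS string, e.g. "0.0.0.0:59838->59838/tcp, :::59838->59838/tcp"
  printf '%s\n' "$1" | sed -E 's/.*:([0-9]+)->.*/\1/'
}

# With a running container you could combine it with docker ps and curl:
#   PORT=$(aiasset_port "$(docker ps --filter name=whole_body_pose --format '{{.Ports}}')")
#   curl --request POST --data-binary @/path/to/image.jpg "http://localhost:${PORT}/inference"
aiasset_port '0.0.0.0:59838->59838/tcp, :::59838->59838/tcp'   # prints 59838
```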
To stop the server execute:
bonseyes_aiassets_cli server stop
Benchmark
Evaluate exported and pretrained models:
usage: bonseyes_aiassets_cli benchmark [-h]
--benchmark-input-sizes INPUT_SIZES
--engine {all, pytorch, onnxruntime, tensorrt}
--backbone {mobilenetv1, mobilenetv0.5, resnet22, shufflenetv2k30, shufflenetv2k16}
--device {gpu, cpu}
3d face landmark specific:
[--datasets {all, aflw, aflw2000-3d}]
Example:
# CPU
bonseyes_aiassets_cli benchmark \
--benchmark-input-sizes 120x120 320x320 \
--device cpu \
--backbone shufflenetv2k30 \
--engine pytorch onnxruntime \
--datasets all
# GPU
bonseyes_aiassets_cli benchmark \
--benchmark-input-sizes 120x120 320x320 \
--device gpu \
--backbone shufflenetv2k30 \
--engine pytorch onnxruntime tensorrt \
--datasets all