Snapdragon Neural Processing Engine SDK
Reference Guide
This tutorial describes the steps needed to run a UDO with the D32 data format on the DSP and to execute the Alexnet model using the resulting package. The Convolution operation is used in this tutorial to demonstrate the implementation of a UDO. D32, an acronym for Depth-32, is a vector-friendly data format created internally for efficient DSP operations.
The SNPE SDK provides the resources for this example under
Information on UDO in general is available at UDO Overview.
Information on running the Alexnet network without UDO is available at Alexnet Tutorial.
Information on creating a UDO package and executing the model using the package is available at UDO Tutorial.
The following tutorial assumes that the general SNPE setup has been followed to support the SDK environment, the Caffe environment, and the desired platform dependencies. For details on acquiring the Alexnet model, visit Tutorials Setup. For the CPU-related resources required by this tutorial, it is recommended to follow the UDO Tutorial With Weights.
Here are the steps to develop and run a UDO:
1.) Package Generation
2.) Framework Model Conversion to a DLC
3.) Package Implementations
4.) Package Compilation
5.) Model Execution
Steps 1-4 are run offline on the x86 host and are necessary for execution in step 5. Step 5 provides information on execution using the SNPE command-line executable snpe-net-run.
Generating the Conv2DPackage package requires the snpe-udo-package-generator tool and the provided UDO plugin, Conv2DQuant.json. Conv2DQuant.json produces skeleton code for a DSP (uint8) implementation. The plugin is located under $SNPE_ROOT/examples/NativeCpp/UdoExample/Conv2D/config. More information about creating a UDO plugin can be found here.
Generate the Conv2DPackage UDO package using the following:
export SNPE_UDO_ROOT=$SNPE_ROOT/share/SnpeUdo
mkdir $SNPE_ROOT/models/alexnet/ConvD32UdoDsp
snpe-udo-package-generator -p $SNPE_ROOT/examples/NativeCpp/UdoExample/Conv2D/config/Conv2DQuant.json \
                           -o $SNPE_ROOT/models/alexnet/ConvD32UdoDsp
This command creates the Convolution-based package at $SNPE_ROOT/models/alexnet/ConvD32UdoDsp/Conv2DPackage
For more information on the snpe-udo-package-generator tool visit here.
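Before moving on, it can help to confirm that the generator produced the directories that later steps of this tutorial copy files into (jni/src/DSP and include). The following is a minimal sketch only: check_udo_package_layout is a hypothetical helper, not part of the SNPE SDK, and it checks nothing beyond those two directories.

```shell
# Hypothetical helper (not part of the SNPE SDK): verifies that a generated
# UDO package contains the directories this tutorial later copies files into.
check_udo_package_layout() {
    pkg_dir="$1"
    status=0
    for sub in jni/src/DSP include; do
        if [ ! -d "$pkg_dir/$sub" ]; then
            echo "missing directory: $pkg_dir/$sub" >&2
            status=1
        fi
    done
    # Returns 0 only when every expected directory exists.
    return "$status"
}
```

A possible invocation, assuming the paths used in this tutorial, is: check_udo_package_layout $SNPE_ROOT/models/alexnet/ConvD32UdoDsp/Conv2DPackage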
Converting the Caffe Alexnet model to DLC requires the snpe-caffe-to-dlc tool. The snpe-caffe-to-dlc tool consumes the Conv2D.json plugin, located in the same config directory as the plugin used for package generation, via the --udo command line option. In this step, <ALEXNET_PATH> refers to the path to the deploy.prototxt and bvlc_alexnet.caffemodel files. For example, after running the setup_alexnet.py script, <ALEXNET_PATH> is $SNPE_ROOT/models/alexnet/caffe.
Convert Alexnet with the following:
snpe-caffe-to-dlc --input_network <ALEXNET_PATH>/deploy_batch_1.prototxt \
                  --caffe_bin <ALEXNET_PATH>/bvlc_alexnet.caffemodel \
                  --output_path $SNPE_ROOT/models/alexnet/dlc/alexnet_udo.dlc \
                  --udo $SNPE_ROOT/examples/NativeCpp/UdoExample/Conv2D/config/Conv2D.json
This will generate a DLC named alexnet_udo.dlc, containing the Convolution as a UDO, at $SNPE_ROOT/models/alexnet/dlc.
The generated package contains the skeleton of the operation implementation, which must be filled in by the user to create a functional UDO. Additionally, the skeleton for a user-implemented validation function can be populated to validate information about the UDO passed from the SNPE runtime. The rest of the code scaffolding for compatibility with SNPE is provided by the snpe-udo-package-generator.
The UDO implementations and validation function for this tutorial are provided under $SNPE_ROOT/examples/NativeCpp/UdoExample/Conv2D/src.
DSP Implementations (Android)
The files in the package that need to be implemented for DSP are
The provided example implementations for these files are at the locations
Copy the provided implementations to the package:
cp -f $SNPE_ROOT/examples/NativeCpp/UdoExample/Conv2D/src/DSP/Conv2DInt8D32Impl/ConvolutionImplLibDsp.c \
      $SNPE_ROOT/models/alexnet/ConvD32UdoDsp/Conv2DPackage/jni/src/DSP/
cp -f $SNPE_ROOT/examples/NativeCpp/UdoExample/Conv2D/src/DSP/Conv2DInt8D32Impl/ConvolutionImplLibDsp.h \
      $SNPE_ROOT/models/alexnet/ConvD32UdoDsp/Conv2DPackage/include/
Optionally, the user can provide their own implementations in the package.
Hexagon DSP Runtime Compilation
Compilation for the DSP runtime uses the make system. To build the DSP implementation libraries, the Hexagon-SDK must be installed and set up. For details, follow the setup instructions on the $HEXAGON_SDK_ROOT/docs/readme.html page, where HEXAGON_SDK_ROOT is the location of your Hexagon-SDK installation.
Note: This SNPE release supports building UDO DSP implementation libraries using Hexagon-SDK 3.5.1/3.5.2.
Make sure that HEXAGON_SDK_ROOT, HEXAGON_TOOLS_ROOT and SDK_SETUP_ENV=Done are set.
export HEXAGON_SDK_ROOT=<path to hexagon sdk installation>
export HEXAGON_TOOLS_ROOT=$HEXAGON_SDK_ROOT/tools/HEXAGON_Tools/8.3.07
export SDK_SETUP_ENV=Done
Compilation for the DSP runtime on Android uses the Android NDK. The ANDROID_NDK_ROOT environment variable must be set to the directory containing ndk-build in order to compile the package.
export ANDROID_NDK_ROOT=<path_to_android_ndk>
It is suggested to add ANDROID_NDK_ROOT to the PATH environment variable to access ndk-build.
export PATH=$ANDROID_NDK_ROOT:$PATH
Target architecture must also be specified when compiling the package.
export UDO_APP_ABI=<target_architecture>
This tutorial uses the arm64-v8a architecture. It is recommended, but not required, to use arm64-v8a as the target architecture for the remainder of the tutorial. If no target architecture is supplied, both arm64-v8a and armeabi-v7a are targeted.
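Because a missing variable tends to surface only as a late build failure, the environment described above can be checked up front. The sketch below is illustrative only: check_udo_build_env is a hypothetical helper, not an SNPE or Hexagon-SDK tool. It covers only the variables named in this tutorial and deliberately skips UDO_APP_ABI, since that variable is optional.

```shell
# Hypothetical pre-flight check (not an SNPE tool): reports which of the
# environment variables named in this tutorial are unset or empty.
check_udo_build_env() {
    missing=0
    for name in SNPE_ROOT HEXAGON_SDK_ROOT HEXAGON_TOOLS_ROOT SDK_SETUP_ENV ANDROID_NDK_ROOT; do
        # Indirect lookup of the variable called $name (names are fixed above).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name" >&2
            missing=1
        fi
    done
    # SDK_SETUP_ENV is expected to hold the literal value "Done".
    if [ "${SDK_SETUP_ENV:-}" != "Done" ]; then
        echo "SDK_SETUP_ENV should be set to Done" >&2
        missing=1
    fi
    return "$missing"
}
```

The function returns a non-zero status when anything is missing, so it can guard the build step, for example: check_udo_build_env && make dsp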
With the environment set up, compile for DSP with the following:
cd $SNPE_ROOT/models/alexnet/ConvD32UdoDsp/Conv2DPackage
make dsp PLATFORM=$UDO_APP_ABI
The expected artifacts after compiling for Hexagon DSP are
Note: For DSP, PLATFORM will only determine the ABI of the registration library.
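The later push commands in this tutorial read from libs/dsp/*.so and libs/arm64-v8a/libUdoConv2DPackageReg.so inside the package, so a quick check for those files after the build can catch a failed compile early. This is a minimal sketch: check_dsp_artifacts is a hypothetical helper, and it only tests for the files that this tutorial pushes to the device, not for every artifact the build may produce.

```shell
# Hypothetical helper (not part of the SNPE SDK): checks for the build outputs
# that this tutorial later pushes to the device.
check_dsp_artifacts() {
    pkg_dir="$1"
    abi="${2:-arm64-v8a}"   # ABI of the registration library
    status=0
    found=0
    # At least one DSP implementation library is expected under libs/dsp.
    for lib in "$pkg_dir"/libs/dsp/*.so; do
        if [ -e "$lib" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no .so files under $pkg_dir/libs/dsp" >&2
        status=1
    fi
    # The registration library is built for the Android ABI.
    if [ ! -f "$pkg_dir/libs/$abi/libUdoConv2DPackageReg.so" ]; then
        echo "missing $pkg_dir/libs/$abi/libUdoConv2DPackageReg.so" >&2
        status=1
    fi
    return "$status"
}
```

For the package built above, a possible call is: check_dsp_artifacts $SNPE_ROOT/models/alexnet/ConvD32UdoDsp/Conv2DPackage arm64-v8a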
Execution using snpe-net-run
Executing Alexnet with UDO is largely the same as using snpe-net-run without UDO.
The SNPE SDK provides Linux and Android binaries of snpe-net-run under
For UDO, snpe-net-run consumes the registration library through the --udo_package_path option. LD_LIBRARY_PATH must also be updated to include the runtime-specific artifacts generated from package compilation.
Android Target Execution
The tutorial for execution on Android targets uses the arm64-v8a architecture. This portion of the tutorial applies to all runtimes, whether DSP or CPU.
# architecture: arm64-v8a - compiler: clang - STL: libc++
export SNPE_TARGET_ARCH=aarch64-android-clang6.0
export SNPE_TARGET_STL=libc++_shared.so
Then, push SNPE binaries and libraries to the target device:
adb shell "mkdir -p /data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin"
adb shell "mkdir -p /data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib"
adb push $SNPE_ROOT/lib/$SNPE_TARGET_ARCH/$SNPE_TARGET_STL \
/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
adb push $SNPE_ROOT/lib/$SNPE_TARGET_ARCH/*.so \
/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
adb push $SNPE_ROOT/bin/$SNPE_TARGET_ARCH/snpe-net-run \
/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin
Next, update environment variables on the target device to include the SNPE libraries and binaries:
adb shell
export SNPE_TARGET_ARCH=aarch64-android-clang6.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
export PATH=$PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin
Lastly, push the Alexnet UDO model and input data to the device:
cd $SNPE_ROOT/models/alexnet
mkdir data/rawfiles && cp data/cropped/*.raw data/rawfiles/
adb shell "mkdir -p /data/local/tmp/alexnet_udo"
adb push data/rawfiles /data/local/tmp/alexnet_udo/cropped
adb push data/target_raw_list.txt /data/local/tmp/alexnet_udo
adb push dlc/alexnet_udo.dlc /data/local/tmp/alexnet_udo
rm -rf data/rawfiles
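A mismatch between the input list and the pushed files is easier to diagnose on the host than on the device. The sketch below checks that every entry of an input list resolves to a file under a base directory. It rests on an assumption not stated in this tutorial: that each non-empty line of target_raw_list.txt is a single path relative to the directory snpe-net-run is started from (for example cropped/<name>.raw). check_input_list is a hypothetical helper, not an SNPE tool.

```shell
# Hypothetical helper (not an SNPE tool). Assumes each non-empty line of the
# list is one file path relative to base_dir; lines starting with '#' and
# blank lines are skipped.
check_input_list() {
    list_file="$1"
    base_dir="$2"
    status=0
    # The "|| [ -n ... ]" part also handles a final line without a newline.
    while IFS= read -r entry || [ -n "$entry" ]; do
        case "$entry" in
            ''|'#'*) continue ;;
        esac
        if [ ! -f "$base_dir/$entry" ]; then
            echo "missing input: $base_dir/$entry" >&2
            status=1
        fi
    done < "$list_file"
    return "$status"
}
```

Under the assumption above, the host copy of the data could be checked from $SNPE_ROOT/models/alexnet with: check_input_list data/target_raw_list.txt data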
Hexagon DSP Execution
The procedure for execution on device for DSP is largely the same as for CPU and GPU. However, the DSP runtime operates on quantized network parameters. Although the DSP runtime accepts unquantized DLCs, it is generally recommended to quantize DLCs for improved performance. This tutorial uses a quantized DLC as an illustrative example. Quantizing the DLC requires the snpe-dlc-quantize tool.
To quantize the DLC for use on DSP:
cd $SNPE_ROOT/models/alexnet/
snpe-dlc-quantize --input_dlc dlc/alexnet_udo.dlc \
                  --input_list data/cropped/raw_list.txt \
                  --udo_package_path ConvUdoCpu/Conv2DPackage/libs/x86-64_linux_clang/libUdoConv2DPackageReg.so \
                  --output_dlc dlc/alexnet_udo_quantized.dlc
Here, the ConvUdoCpu folder contains the CPU implementation of the Conv2D UDO. For more details, refer to UDO Tutorial With Weights.
For more information on snpe-dlc-quantize visit quantization. For information on UDO-specific quantization visit Quantizing a DLC with UDO. For information on DSP runtime visit DSP Runtime.
Now push the quantized model to device:
adb push dlc/alexnet_udo_quantized.dlc /data/local/tmp/alexnet_udo
Before executing on the DSP, push the SNPE libraries for DSP to device:
adb shell "mkdir -p /data/local/tmp/snpeexample/dsp/lib"
adb push $SNPE_ROOT/lib/dsp/*.so \
/data/local/tmp/snpeexample/dsp/lib
Now push DSP-specific UDO libraries to device:
cd $SNPE_ROOT/models/alexnet/ConvD32UdoDsp
adb shell "mkdir -p /data/local/tmp/alexnet_udo/dsp"
adb push Conv2DPackage/libs/dsp/*.so /data/local/tmp/alexnet_udo/dsp
adb push Conv2DPackage/libs/arm64-v8a/libUdoConv2DPackageReg.so /data/local/tmp/alexnet_udo/dsp # Pushes reg lib
Then set required environment variables and run snpe-net-run on device:
adb shell
cd /data/local/tmp/alexnet_udo/
export LD_LIBRARY_PATH=/data/local/tmp/alexnet_udo/dsp/:$LD_LIBRARY_PATH
export ADSP_LIBRARY_PATH="/data/local/tmp/alexnet_udo/dsp/;/data/local/tmp/snpeexample/dsp/lib;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp"
snpe-net-run --container alexnet_udo_quantized.dlc --input_list target_raw_list.txt --udo_package_path dsp/libUdoConv2DPackageReg.so --use_dsp
To verify the classification results, run the following on the host machine:
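Note that the ADSP_LIBRARY_PATH value in the run commands separates its entries with semicolons, unlike LD_LIBRARY_PATH, which uses colons, and that the value is quoted so the shell does not treat the semicolons as command separators. When the list of directories changes, building the string programmatically avoids separator mistakes. The following is a small sketch only; join_adsp_path is a hypothetical helper name.

```shell
# Hypothetical helper: joins its arguments with ';', the separator used by
# the ADSP_LIBRARY_PATH value in this tutorial.
join_adsp_path() {
    result=""
    for dir in "$@"; do
        if [ -z "$result" ]; then
            result="$dir"
        else
            result="$result;$dir"
        fi
    done
    printf '%s\n' "$result"
}

# Rebuilds the value used in this tutorial; the quotes keep ';' literal.
ADSP_LIBRARY_PATH="$(join_adsp_path \
    /data/local/tmp/alexnet_udo/dsp/ \
    /data/local/tmp/snpeexample/dsp/lib \
    /system/lib/rfsa/adsp \
    /system/vendor/lib/rfsa/adsp \
    /dsp)"
export ADSP_LIBRARY_PATH
```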
cd $SNPE_ROOT/models/alexnet/
adb pull /data/local/tmp/alexnet_udo/output .
python3 $SNPE_ROOT/models/alexnet/scripts/show_alexnet_classifications.py -i data/cropped/raw_list.txt \
-o output/ \
-l data/ilsvrc_2012_labels.txt
The output should look like the following, showing classification results for all the images.
Classification results
<input_files_dir>/trash_bin.raw 0.949348 412 ashcan, trash can, garbage can,
wastebin, ash bin, ash-bin, ashbin,
dustbin, trash barrel, trash bin
<input_files_dir>/plastic_cup.raw 0.749104 647 measuring cup
<input_files_dir>/chairs.raw 0.365685 831 studio couch, day bed
<input_files_dir>/notice_sign.raw 0.722708 458 brass, memorial tablet, plaque
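When comparing a UDO run against a reference run, it can be convenient to list only the images whose top score falls below a chosen threshold. The sketch below assumes the output layout shown above: lines that start with a path ending in .raw carry the score in the second field, and wrapped label lines do not start with such a path. low_confidence and the threshold value are illustrative and not part of the SNPE scripts.

```shell
# Hypothetical filter: reads classification output on stdin and prints the
# file name and score of entries whose score is below the given threshold.
low_confidence() {
    threshold="$1"
    # Only lines whose first field ends in ".raw" are result lines;
    # wrapped label continuation lines are ignored.
    awk -v t="$threshold" '$1 ~ /\.raw$/ && ($2 + 0) < (t + 0) { print $1, $2 }'
}
```

For instance, if the script output has been saved to a file such as results.txt (an illustrative name), the entries below a score of 0.5 can be listed with: low_confidence 0.5 < results.txt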