Automatically generate high-performance TensorRT plugins for unsupported operators or to replace inefficient kernels.
TPAT is an end-to-end command-line tool that requires no CUDA programming knowledge. Users only need to provide the ONNX model and specify the node names or types to auto-generate a TensorRT plugin.
The performance of auto-generated TensorRT plugins in real cases:
mkdir build && cp cmake/config.cmake build
# Edit build/config.cmake to customize the compilation options:
set(USE_LLVM /usr/local/llvm/bin/llvm-config)
set(USE_CUDA ON)
# A gcc compiler that supports C++14 is required
cd build && cmake ..
make -j
#TVM Python package
export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
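The two exports above can also be mirrored at runtime before importing TVM; a minimal sketch (the `/path/to/tvm` placeholder stands for your checkout, as in the export lines):

```python
import os
import sys

# Runtime equivalent of the TVM_HOME / PYTHONPATH exports above (illustrative).
tvm_home = os.environ.get("TVM_HOME", "/path/to/tvm")
sys.path.insert(0, os.path.join(tvm_home, "python"))
# After this, `import tvm` resolves against the freshly built package.
```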
4. Plugin Compiler Env
Modify python/trt_plugin/Makefile according to your environment setup.
CUDA_PATH: local CUDA installation path
TRT_LIB_PATH: local TensorRT installation path
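For example, with a default CUDA install and TensorRT unpacked under `/usr/local`, the two variables might look like this (illustrative paths only; substitute your own):

```makefile
# Illustrative values only; point these at your actual installations
CUDA_PATH    = /usr/local/cuda
TRT_LIB_PATH = /usr/local/TensorRT
```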
Usage
TPAT provides a Python function and command line for usage.
Python function
onnx2plugin(
input_model_path,
output_model_path,
node_names=None,
node_types=None,
plugin_name_dict=None,
dynamic_bs=False, # if True, the generated plugin supports dynamic batch size
min_bs=1,
max_bs=256,
opt_bs=128
)
input_model_path[required] : input ONNX model containing the nodes that require a TRT plugin
output_model_path[required] : output ONNX model in which the corresponding node types are replaced by the plugin names. The output model can be converted to TRT directly with the ONNX parser and the built plugin dynamic library.
node_names : list of node names for autogen
node_types : list of node types for autogen
plugin_name_dict : dict of {plugin_name: node_name} for autogen
dynamic_bs : if True, TPAT generates a plugin that supports dynamic batch size; if False, the generated plugin only supports fixed shapes but delivers better performance.
min_bs : the minimum batch size in the dynamic-batch range.
max_bs : the maximum batch size in the dynamic-batch range.
opt_bs : the optimal batch size in the dynamic-batch range.
NOTE: At least one of node_names, node_types, and plugin_name_dict must be provided.
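A usage sketch of the Python entry point follows. The model paths, node name, and plugin name are hypothetical, and the import path for `onnx2plugin` may differ depending on how TPAT is installed, so the call itself is left commented:

```python
# Hypothetical model paths and node/plugin names, for illustration only.
kwargs = dict(
    input_model_path="model.onnx",
    output_model_path="model_tpat.onnx",
    plugin_name_dict={"tpat_onehot": "OneHot_0"},  # {plugin_name: node_name}
    dynamic_bs=True,   # generate a plugin that supports dynamic batch size
    min_bs=1,
    max_bs=256,
    opt_bs=128,
)

# At least one of node_names / node_types / plugin_name_dict must be provided:
assert any(kwargs.get(k) for k in ("node_names", "node_types", "plugin_name_dict"))

# from onnx2plugin import onnx2plugin   # import path may differ per install
# onnx2plugin(**kwargs)
```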
TPAT - TensorRT Plugin Autogen Tool
Introduction
Support Matrix
Runtime Env : dockerfile
1. Build image
2. Run container
3. Execute container
4. Modify CUDA_PATH and TRT_PATH in python/trt_plugin/Makefile
5. Plugin auto generated
Runtime Env : Build
1. Prerequisites
System Packages
PyPI packages
Optional packages
2. Clone the TPAT repository
3. Build BlazerML-TVM
4. Plugin Compiler Env
Usage
Python function
Command line
Output
1. Assign nodes and plugin names through plugin_name_dict
2. Assign node names or node types
Example && UnitTest
Release notes
Changelog
Known issues
TODO