build_wheel.py
The build_wheel.py file is a Python script that automates the build process for the TensorRT-LLM project, including building the C++ library, generating Python bindings, and creating a wheel package for distribution.
Shebang and license header
The script starts with the shebang #!/usr/bin/env python3, indicating that it should be executed using Python 3. The license header contains the copyright information and the Apache License 2.0 notice.
Imports
The script imports various Python modules required for its functionality, such as os, platform, sys, argparse, contextlib, functools, multiprocessing, pathlib, shutil, subprocess, textwrap, and typing. These modules provide functions for file and directory handling, platform detection, command-line argument parsing, context management, partial function application, multiprocessing, path manipulation, file copying and deletion, subprocess execution, text wrapping, and type hinting.
working_directory context manager
This is a custom context manager that changes the working directory to the specified path and returns to the previous directory when exiting the context.
It uses os.chdir() to change the directory and a try-finally block to ensure the previous directory is restored.
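A minimal sketch of such a context manager, assuming it is built with contextlib.contextmanager. The body is illustrative and may differ from the script's actual code:

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to `path`."""
    previous = Path.cwd()       # remember where we started
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)      # restored even if the body raises
```

Wrapping a block in "with working_directory(some_dir):" runs it inside some_dir and returns to the original directory afterwards, whether or not the block raised an exception.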
main function
The main function is the entry point of the script and contains the main logic for building the wheel package. Its parameters correspond to the command-line arguments: build_type, build_dir, dist_dir, cuda_architectures, job_count, extra_cmake_vars, extra_make_targets, trt_root, nccl_root, clean, use_ccache, cpp_only, install, skip_building_wheel, python_bindings, benchmarks, and nvtx. These arguments control different aspects of the build process, such as the build type, build directories, CUDA architectures, parallel job count, CMake variables, make targets, dependencies, and various build options.
Project directory and build command
The script determines the project directory using Path(__file__).parent.resolve().parent. It changes the current working directory to the project directory using os.chdir(). The build_run function is created using partial() to simplify running shell commands with error checking.
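One way such a helper can be put together is sketched below. The choice of subprocess.run and the shell=True and check=True options are assumptions made for illustration, not a quote from the script:

```python
from functools import partial
from subprocess import run

# Assumed options: shell=True lets a command be passed as a single string,
# check=True makes a non-zero exit status raise CalledProcessError.
build_run = partial(run, shell=True, check=True)
```

With this in place, every call such as build_run("some command") fails loudly on error, so the build stops at the first broken step instead of continuing with stale outputs.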
Submodule initialisation
The script checks if the 3rdparty/cutlass/.git directory exists, and if not, it initialises the Git submodules using git submodule update --init --recursive.
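The check can be sketched as a small function. The helper name submodule_init_command is hypothetical, and it only returns the command instead of running it so that the logic is easy to follow:

```python
from pathlib import Path


def submodule_init_command(project_dir):
    """Return the git command to run, or None if cutlass is already present.

    Hypothetical helper for illustration only.
    """
    marker = Path(project_dir) / "3rdparty" / "cutlass" / ".git"
    if marker.exists():
        return None  # submodule already initialised, nothing to do
    return "git submodule update --init --recursive"
```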
Platform detection and requirements installation
The script detects the platform (Windows or non-Windows) using platform.system(). It determines the appropriate requirements file based on the platform (requirements-dev-windows.txt for Windows, requirements-dev.txt otherwise). It installs the required dependencies using pip install with the specified requirements file.
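A rough sketch of the platform switch follows. The function name, the optional system parameter, and the exact pip invocation are assumptions; the real script may pass additional pip options:

```python
import platform
import sys


def requirements_install_command(system=None):
    """Build the pip command for the current (or a given) platform.

    Hypothetical helper; `system` exists only to make the choice testable.
    """
    system = system or platform.system()
    if system == "Windows":
        requirements = "requirements-dev-windows.txt"
    else:
        requirements = "requirements-dev.txt"
    # Use the running interpreter so packages land in the same environment.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]
```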
TensorRT installation check
The script checks if TensorRT is installed by running pip freeze and checking if tensorrt is in the list of installed packages. If TensorRT is not installed, it raises a RuntimeError with an appropriate error message based on the platform.
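The idea can be illustrated as follows. The function takes the text printed by pip freeze as a string so that it runs without touching the environment; the helper name, the exact-name match, and the message wording are all assumptions rather than the script's own:

```python
import re


def require_tensorrt(pip_freeze_output, on_windows):
    """Raise RuntimeError unless a `tensorrt` package appears in the output.

    Hypothetical helper; the error messages are placeholders.
    """
    names = {
        # Take the package name before "==", " @ " or similar separators.
        re.split(r"[=@ ]", line.strip(), maxsplit=1)[0].lower()
        for line in pip_freeze_output.splitlines()
    }
    if "tensorrt" not in names:
        where = "Windows" if on_windows else "this platform"
        raise RuntimeError(
            f"TensorRT was not found; install it for {where} before building."
        )
```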
CMake and build configuration
The script constructs the CMake command-line arguments based on the provided command-line options.
It sets the CMake generator to "Ninja" on Windows.
It determines the number of parallel jobs for the build using cpu_count() if not specified. It handles extra CMake variables, TensorRT and NCCL paths, and ccache usage.
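The sketch below shows how such options might be turned into a job count and a list of CMake flags. The helper name is hypothetical, extra variables are assumed to arrive as NAME=VALUE strings, and CMAKE_CXX_COMPILER_LAUNCHER is a standard CMake variable used here as a stand-in for however the script actually enables ccache. TensorRT and NCCL paths are left out because the project-specific variable names are not given in this document:

```python
from multiprocessing import cpu_count


def cmake_settings(job_count=None, extra_cmake_vars=(), use_ccache=False,
                   on_windows=False):
    """Return (jobs, flags) for a CMake configure step. Illustrative only."""
    # Fall back to the number of CPUs when no job count was requested.
    jobs = job_count if job_count is not None else cpu_count()
    flags = ["-G", "Ninja"] if on_windows else []
    # Each "NAME=VALUE" entry becomes a -DNAME=VALUE definition.
    flags += [f"-D{var}" for var in extra_cmake_vars]
    if use_ccache:
        flags.append("-DCMAKE_CXX_COMPILER_LAUNCHER=ccache")
    return jobs, flags
```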
Build directory setup
If a build directory is not specified, it is determined based on the build type (build for "Release", build_<build_type> otherwise). If the clean flag is set and the build directory exists, it is removed using rmtree(). The build directory is created using build_dir.mkdir().
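These three steps can be sketched as one function. The helper name and the source_dir parameter (the directory under which the default build folder is placed) are assumptions for illustration:

```python
from pathlib import Path
from shutil import rmtree


def prepare_build_dir(source_dir, build_type="Release", build_dir=None,
                      clean=False):
    """Pick, optionally wipe, and create the build directory."""
    if build_dir is None:
        name = "build" if build_type == "Release" else f"build_{build_type}"
        build_dir = Path(source_dir) / name
    build_dir = Path(build_dir)
    if clean and build_dir.exists():
        rmtree(build_dir)  # start from an empty directory
    build_dir.mkdir(parents=True, exist_ok=True)
    return build_dir
```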
Wheel building
If skip_building_wheel is not set, the script runs the python3 -m build command to build the wheel package. It specifies the project directory, skips dependency checks, disables isolation, and outputs the wheel to the specified dist_dir.
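For illustration, the command described above could be assembled like this. The helper name is hypothetical; the options are those of the build package's command line (--skip-dependency-check, --no-isolation, --wheel, --outdir), and the exact combination used by the script is an assumption:

```python
def wheel_build_command(project_dir, dist_dir):
    """Return the argument list for building a wheel with `python3 -m build`."""
    return [
        "python3", "-m", "build", str(project_dir),
        "--skip-dependency-check",   # do not verify build requirements
        "--no-isolation",            # build in the current environment
        "--wheel",                   # produce a wheel only
        "--outdir", str(dist_dir),   # where the .whl file is written
    ]
```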
Installation
If the install flag is set, the script installs the package in editable mode using pip install -e .[devel]
Command-line argument parsing
The script uses argparse.ArgumentParser to define and parse the command-line arguments. It sets default values, choices, and help messages for each argument.
The parsed arguments are then passed to the main function using vars(args).
The script can be executed directly from the command line.
The available arguments include:
--build_type: Specifies the build type (Release, RelWithDebInfo, Debug).
--cuda_architectures: Specifies the CUDA architectures to build for.
--install: Installs the project after building.
--clean: Cleans the build artifacts.
--use_ccache: Uses ccache as the compiler driver.
--job_count: Specifies the number of parallel jobs for the build.
--cpp_only: Builds only the C++ library without Python dependencies.
--extra-cmake-vars: Specifies extra CMake variables.
--extra-make-targets: Specifies additional make targets.
--trt_root: Specifies the directory to find TensorRT headers/libs.
--nccl_root: Specifies the directory to find NCCL headers/libs.
--build_dir: Specifies the directory where C++ sources are built.
--dist_dir: Specifies the directory where Python wheels are built.
--skip_building_wheel: Skips building the wheel files.
--python_bindings: Builds the Python bindings for the C++ runtime (deprecated).
--benchmarks: Builds the benchmarks for the C++ runtime.
--nvtx: Enables NVTX features.
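The pattern of defining arguments and forwarding them with vars(args) can be sketched with a reduced parser. Only four of the arguments are shown, the defaults and help texts are assumptions, and main here is a stand-in that simply echoes what it received:

```python
import argparse


def main(build_type="Release", job_count=None, clean=False,
         extra_cmake_vars=()):
    """Stand-in for the real build logic: returns the options it was given."""
    return {"build_type": build_type, "job_count": job_count,
            "clean": clean, "extra_cmake_vars": list(extra_cmake_vars)}


def build_parser():
    parser = argparse.ArgumentParser(description="Reduced example parser")
    parser.add_argument("--build_type", default="Release",
                        choices=["Release", "RelWithDebInfo", "Debug"])
    parser.add_argument("--job_count", type=int, default=None,
                        help="Number of parallel build jobs")
    parser.add_argument("--clean", action="store_true",
                        help="Remove the build directory first")
    # argparse maps the hyphens to underscores: args.extra_cmake_vars
    parser.add_argument("--extra-cmake-vars", action="append", default=[],
                        help="Extra NAME=VALUE definitions for CMake")
    return parser
```

Because vars(args) turns the parsed namespace into a dictionary keyed by argument name, calling main(**vars(args)) passes each option to the parameter of the same name.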
Compilation of C++ Code
Compilation is the process of converting source code written in a programming language (like C++) into machine code that can be executed by a computer.
In the build_wheel.py script, the C++ code for the TensorRT-LLM project needs to be compiled before it can be used. The script uses CMake, a build system generator, to configure and generate the necessary build files for compiling the C++ code.
It sets up various CMake configuration options, such as the build type (e.g., Release or Debug), CUDA architectures, and paths to dependencies like TensorRT and NCCL.
The script then runs the generated build, which invokes the C++ compiler (e.g., MSVC on Windows or GCC on Linux) to compile the C++ source files into object files and link them together to create libraries or executables.
The compiled C++ code is then linked with the Python bindings to create a shared library that can be used in Python.