This document provides instructions for building TensorRT-LLM from source code on Linux.
Fetching the Sources
1. Install git-lfs
git lfs install
This command configures your local Git environment to handle large files. Rather than storing them directly in the repository, Git LFS replaces them with small text pointers inside Git, while the file contents are stored on a remote server such as GitHub's LFS storage.
This setup only needs to be run once per user account and ensures that your Git configuration is ready for LFS operations.
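For reference, the pointer that Git actually stores in place of a large file looks roughly like this (the hash and size below are placeholders, not real values):

```
version https://git-lfs.github.com/spec/v1
oid sha256:4d7a2146...
size 12345
```

The real file content lives on the LFS server and is fetched on demand.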
2. Clone the TensorRT-LLM repository
To start working with TensorRT-LLM, you first need to clone the repository to your local machine.
This can be done by executing the following command in your terminal:
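Assuming the official NVIDIA repository on GitHub:

```shell
git clone https://github.com/NVIDIA/TensorRT-LLM.git
```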
This command clones the entire repository from GitHub to your local directory, allowing you to work with the files, including large files that are handled efficiently through Git Large File Storage (LFS).
3. Move into the directory
cd TensorRT-LLM
4. Initialise and update the submodules
git submodule update --init --recursive
This command performs several actions:
--init initialises your local configuration file to include the submodules defined in the .gitmodules file of the repository.
update fetches all the data from the project and checks out the appropriate commit as specified in your project (note that update is the subcommand itself, not a flag).
--recursive ensures that this command is run not only in the current module but also in any nested submodules, effectively updating all the submodules within the project.
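The single command above is roughly equivalent to running the two steps separately, which makes it clearer what each part contributes:

```shell
# Register each submodule listed in .gitmodules in your local config
git submodule init

# Fetch submodule data and check out the recorded commits,
# descending into any nested submodules as well
git submodule update --recursive
```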
5. Pulling Large Files with Git LFS
After initialising and updating your repository's submodules, you'll need to handle large files managed with Git Large File Storage (LFS). This is where git lfs pull comes into play.
Running this command will download the large files associated with the current branch from the remote repository, based on the tracking configurations established by Git LFS.
git lfs pull
This step ensures all the necessary assets, which are too large to be efficiently managed by standard Git operations, are properly downloaded and available for use.
It is a required step before proceeding with operations that depend on these large files, such as building Docker images or executing large-scale data processing tasks.
Building TensorRT-LLM in One Step
Once Git LFS is set up and the necessary files are pulled, you can proceed to build the TensorRT-LLM Docker image.
This can be done with a single command:
make -C docker release_build
This command builds a Docker image that contains everything you need to run TensorRT-LLM, simplifying the setup process and ensuring consistency across environments.
Optionally, you can restrict the build to specific GPU architectures with the CUDA_ARCHS variable.
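For example, to build only for Ampere and Hopper GPUs (the architecture list below is illustrative; check the repository's Makefile for the exact semantics of the variable):

```shell
make -C docker release_build CUDA_ARCHS="80-real;90-real"
```

Limiting the architecture list can significantly reduce build time.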
Dockerfile Analysis
Base Image
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch
ARG BASE_TAG=23.08-py3
FROM ${BASE_IMAGE}:${BASE_TAG} as base
Defines two arguments BASE_IMAGE and BASE_TAG, which are used to specify the base image. In this case, it's using NVIDIA's PyTorch image.
The FROM instruction initialises a new build stage and sets the base image for subsequent instructions. The as base clause names this stage base.
Copies the wheel file built in the wheel stage and installs it using pip.
Removes the wheel file after installation.
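A sketch of what this part of the Dockerfile might look like (the stage name and wheel path below are illustrative, not the exact contents of the real Dockerfile):

```dockerfile
# Copy the wheel produced in the earlier 'wheel' build stage
COPY --from=wheel /src/tensorrt_llm/build/tensorrt_llm-*.whl /tmp/

# Install it, then delete the wheel to keep the final image small
RUN pip install /tmp/tensorrt_llm-*.whl && rm /tmp/tensorrt_llm-*.whl
```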
COPY README.md ./
COPY examples examples
Copies README.md and the examples directory to the container.
The build process will take some time.
Fire up the Docker Container
Once the image is built, start a container from it:
make -C docker release_run
To run the container as your local user instead of root, pass LOCAL_USER=1.
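For example, assuming the Makefile's LOCAL_USER switch behaves as described:

```shell
make -C docker release_run LOCAL_USER=1
```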
Analysis of Dockerfile process
The following breaks down the docker run command executed when launching a container for NVIDIA's TensorRT-LLM.
Docker Run Command Breakdown
The docker run command is used to create and start a container. The command and its options are as follows:
--rm: Automatically remove the container when it exits.
-it: Run the container in interactive mode (i.e., attached to the terminal) and allocate a pseudo-TTY.
--ipc=host: Use the host's IPC namespace, which allows the container to share memory with the host.
--ulimit memlock=-1 --ulimit stack=67108864: Set certain limits on system resources. Here, memlock=-1 removes the memory lock limit, and stack=67108864 sets the stack size.
--gpus=all: Allocate all available GPUs to the container. This is important for machine learning tasks that require GPU acceleration.
--volume /home/jack/TensorRT-LLM:/code/tensorrt_llm: Mount the host directory /home/jack/TensorRT-LLM to the container directory /code/tensorrt_llm.
--workdir /code/tensorrt_llm: Set the working directory inside the container to /code/tensorrt_llm.
--hostname laptop-release: Set the hostname of the container.
--name tensorrt_llm-release-jack: Assign a name to the container for easy reference.
--tmpfs /tmp:exec: Mount a temporary file system (tmpfs) at /tmp with execution permissions. This can be used for temporary storage that's faster than writing to disk.
tensorrt_llm/release:latest: The Docker image to use, where tensorrt_llm/release is the image name and latest is the tag.
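Assembled from the options above, the full command looks roughly like this (the volume path, hostname, and container name are from this particular setup and will differ on your machine):

```shell
docker run --rm -it \
  --ipc=host \
  --ulimit memlock=-1 --ulimit stack=67108864 \
  --gpus=all \
  --volume /home/jack/TensorRT-LLM:/code/tensorrt_llm \
  --workdir /code/tensorrt_llm \
  --hostname laptop-release \
  --name tensorrt_llm-release-jack \
  --tmpfs /tmp:exec \
  tensorrt_llm/release:latest
```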
What is GNU? What does the make command do?
make and GNU are fundamental concepts in software development, particularly in the context of building and compiling code.
What is make?
Function: make is a utility that automatically builds executable programs and libraries from source code by reading files called Makefiles, which specify how to derive the target program.
Automation: It helps in automating the compilation process, reducing the complexity and potential errors in building software, especially large projects with multiple components and dependencies.
Efficiency: make determines which portions of a program need to be recompiled and issues commands to recompile them. This is efficient because only those parts of a program that have been modified are recompiled, saving time.
Platform: It is widely used in Unix and Unix-like systems but is available for many other operating systems.
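The incremental-build behaviour described above can be sketched with a throwaway directory and a hypothetical two-file project (the file names here are made up for illustration):

```shell
# Create a scratch directory for the demo
mkdir -p /tmp/make-demo
cd /tmp/make-demo

# A minimal Makefile: hello.txt depends on name.txt and is rebuilt
# only when name.txt changes. (Recipe lines must start with a tab,
# hence printf rather than echo.)
printf 'hello.txt: name.txt\n\tcp name.txt hello.txt\n' > Makefile

echo "world" > name.txt
make   # first run: builds hello.txt from name.txt
make   # second run: nothing to do, make reports the target is up to date
```

Touching name.txt again (e.g. editing it) would make the next `make` rebuild hello.txt, which is exactly the dependency-driven recompilation described above.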
What is GNU?
GNU Project: GNU stands for "GNU's Not Unix!" It's a recursive acronym and is part of the GNU Project, which was launched in 1983 by Richard Stallman to create a complete, free operating system.
Free Software: The GNU Project has developed a comprehensive collection of free software. When people refer to “GNU software”, they are usually referring to software released under the GNU General Public License (GPL), which is known for its commitment to free software principles.
GNU Tools: The project has produced a number of tools widely used in software development, including the GNU Compiler Collection (GCC), GNU Debugger (GDB), and GNU Make (a version of the make utility).
GNU/Linux: The combination of GNU tools and the Linux kernel resulted in the GNU/Linux operating system, commonly referred to as just “Linux”, which is used in systems around the world.
GNU Make
GNU Make: GNU Make is the GNU Project's version of the make utility. It is an enhanced version of the original make and is more feature-rich and portable.
Usage in TensorRT-LLM: The make commands in the TensorRT-LLM build process use GNU Make. This tool simplifies the build by reading the specified Makefile to automate the compilation and linking of the TensorRT-LLM software.
In summary, make is a tool for automating the build process in software development, and GNU is an organization that provides a variety of free software tools, including GNU Make.