# TensorRT-LLM Dockerfile

This is a multi-stage Dockerfile that builds and packages the TensorRT-LLM library.

Let's break it down section by section:

#### <mark style="color:green;">Base image and environment setup</mark>

```docker
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch
ARG BASE_TAG=24.02-py3
ARG DEVEL_IMAGE=devel

FROM ${BASE_IMAGE}:${BASE_TAG} as base

ENV BASH_ENV=${BASH_ENV:-/etc/bash.bashrc}
ENV ENV=${ENV:-/etc/shinit_v2}
SHELL ["/bin/bash", "-c"]
```

This section <mark style="color:yellow;">**sets up the base image using the NVIDIA PyTorch image**</mark> and pins its tag. The `BASE_IMAGE`, `BASE_TAG`, and `DEVEL_IMAGE` build arguments can all be overridden at build time; `DEVEL_IMAGE` selects the image that the later wheel and release stages build upon.

It also sets environment variables for bash configuration and shell initialization. The <mark style="color:yellow;">**`SHELL`**</mark> instruction makes Bash the default shell for subsequent `RUN` instructions.
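Because these values are declared with `ARG`, they can be overridden on the command line. The invocation below is an illustrative sketch — the Dockerfile path and image tag are assumptions, not values stated on this page:

```shell
# Override the base tag and stop at a named stage.
# Path and tag are illustrative; check the repository's docker/ directory for the real ones.
docker build \
  --build-arg BASE_TAG=24.02-py3 \
  --target devel \
  -t tensorrt_llm:devel \
  -f docker/Dockerfile.multi .
```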

#### <mark style="color:green;">Development stage</mark>

```docker
FROM base as devel

COPY docker/common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh

COPY docker/common/install_cmake.sh install_cmake.sh
RUN bash ./install_cmake.sh && rm install_cmake.sh

COPY docker/common/install_ccache.sh install_ccache.sh
RUN bash ./install_ccache.sh && rm install_ccache.sh
```

This stage builds upon the base image and installs core build dependencies, CMake, and ccache.

The installation scripts are copied into the image and executed with `RUN` instructions. After each script runs, it is deleted so it does not linger in the image filesystem.

#### <mark style="color:green;">TensorRT installation</mark>

```docker
ARG TRT_VER
ARG CUDA_VER
ARG CUDNN_VER
ARG NCCL_VER
ARG CUBLAS_VER

COPY docker/common/install_tensorrt.sh install_tensorrt.sh
RUN bash ./install_tensorrt.sh \
    --TRT_VER=${TRT_VER} \
    --CUDA_VER=${CUDA_VER} \
    --CUDNN_VER=${CUDNN_VER} \
    --NCCL_VER=${NCCL_VER} \
    --CUBLAS_VER=${CUBLAS_VER} && \
    rm install_tensorrt.sh
```

This section installs TensorRT using the provided installation script. The script takes the TensorRT, CUDA, cuDNN, NCCL, and cuBLAS versions, which are <mark style="color:yellow;">**passed as build arguments**</mark> to the Dockerfile.
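Concretely, these versions would be supplied at build time with `--build-arg`. The version strings below are placeholders, not values taken from this page:

```shell
# Placeholder versions -- consult the matching TensorRT-LLM release notes for the real set.
docker build \
  --build-arg TRT_VER=9.2.0.5 \
  --build-arg CUDA_VER=12.3 \
  --build-arg CUDNN_VER=8.9 \
  --build-arg NCCL_VER=2.19 \
  --build-arg CUBLAS_VER=12.3 \
  -f docker/Dockerfile.multi .
```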

#### <mark style="color:green;">Additional dependencies</mark>

```docker
COPY docker/common/install_polygraphy.sh install_polygraphy.sh
RUN bash ./install_polygraphy.sh && rm install_polygraphy.sh

COPY docker/common/install_mpi4py.sh install_mpi4py.sh
RUN bash ./install_mpi4py.sh && rm install_mpi4py.sh
```

This section installs additional dependencies, namely Polygraphy and mpi4py, using their respective installation scripts.

#### <mark style="color:green;">PyTorch installation</mark>

```docker
ARG TORCH_INSTALL_TYPE="skip"
COPY docker/common/install_pytorch.sh install_pytorch.sh
RUN bash ./install_pytorch.sh $TORCH_INSTALL_TYPE && rm install_pytorch.sh
```

This section installs PyTorch using the provided installation script. The <mark style="color:yellow;">**`TORCH_INSTALL_TYPE`**</mark> argument specifies the type of installation (default is "skip").
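The install type can likewise be overridden at build time. The accepted values are defined by `install_pytorch.sh`, so the value below is a guess used only for illustration:

```shell
# "src_release" is an illustrative value; check install_pytorch.sh for the options it accepts.
docker build --build-arg TORCH_INSTALL_TYPE=src_release -f docker/Dockerfile.multi .
```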

#### <mark style="color:green;">Wheel building stage</mark>

```docker
FROM ${DEVEL_IMAGE} as wheel

WORKDIR /src/tensorrt_llm

COPY benchmarks benchmarks
COPY cpp cpp
COPY scripts scripts
COPY tensorrt_llm tensorrt_llm
COPY 3rdparty 3rdparty
COPY setup.py requirements.txt requirements-dev.txt ./

ARG BUILD_WHEEL_ARGS="--clean --trt_root /usr/local/tensorrt --python_bindings --benchmarks"
RUN python3 scripts/build_wheel.py ${BUILD_WHEEL_ARGS}
```

This stage builds the TensorRT-LLM wheel package.

It starts from the development image and sets the working directory to <mark style="color:yellow;">**`/src/tensorrt_llm`**</mark>. The necessary source files and directories are copied into the image, and the <mark style="color:yellow;">**`build_wheel.py`**</mark> script is run with the specified build arguments to create the wheel package.
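The same step can be reproduced outside Docker on a machine that already has TensorRT and the CUDA toolkit installed. This sketch assumes you are in the repository root:

```shell
# Run the wheel build directly; the flags mirror BUILD_WHEEL_ARGS above.
python3 scripts/build_wheel.py --clean \
  --trt_root /usr/local/tensorrt \
  --python_bindings --benchmarks
# The wheel lands under build/, as the release stage's COPY --from=wheel path implies.
```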

#### <mark style="color:green;">Release stage</mark>

```docker
FROM ${DEVEL_IMAGE} as release

WORKDIR /app/tensorrt_llm

COPY --from=wheel /src/tensorrt_llm/build/tensorrt_llm*.whl .
RUN pip install tensorrt_llm*.whl --extra-index-url https://pypi.nvidia.com && \
    rm tensorrt_llm*.whl

COPY README.md ./
COPY docs docs
COPY cpp/include include

RUN ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/libs")') lib && \
    test -f lib/libnvinfer_plugin_tensorrt_llm.so && \
    ln -sv lib/libnvinfer_plugin_tensorrt_llm.so lib/libnvinfer_plugin_tensorrt_llm.so.9 && \
    echo "/app/tensorrt_llm/lib" > /etc/ld.so.conf.d/tensorrt_llm.conf && \
    ldconfig

ARG SRC_DIR=/src/tensorrt_llm
COPY --from=wheel ${SRC_DIR}/benchmarks benchmarks

ARG CPP_BUILD_DIR=${SRC_DIR}/cpp/build
COPY --from=wheel \
    ${CPP_BUILD_DIR}/benchmarks/bertBenchmark \
    ${CPP_BUILD_DIR}/benchmarks/gptManagerBenchmark \
    ${CPP_BUILD_DIR}/benchmarks/gptSessionBenchmark \
    benchmarks/cpp/

COPY examples examples
RUN chmod -R a+w examples && \
    rm -v \
    benchmarks/cpp/bertBenchmark.cpp \
    benchmarks/cpp/gptManagerBenchmark.cpp \
    benchmarks/cpp/gptSessionBenchmark.cpp \
    benchmarks/cpp/CMakeLists.txt

ARG GIT_COMMIT
ARG TRT_LLM_VER
ENV TRT_LLM_GIT_COMMIT=${GIT_COMMIT} \
    TRT_LLM_VERSION=${TRT_LLM_VER}
```

The release stage starts from the development image and sets the working directory to <mark style="color:yellow;">**`/app/tensorrt_llm`**</mark>. It copies the built wheel package from the wheel stage, installs it with <mark style="color:yellow;">**`pip`**</mark>, and deletes the wheel file afterwards.

The README, documentation, and include files are copied into the image.

The <mark style="color:yellow;">**`RUN`**</mark> instruction creates a `lib` symlink to the installed wheel's bundled shared libraries, checks that the TensorRT-LLM plugin library exists, adds a versioned alias for it, and registers the directory with `ldconfig` so the dynamic linker can resolve it.
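The effect of the symlink step can be sketched in isolation. The toy paths below stand in for the real site-packages layout; only the pattern matters:

```shell
# Minimal sketch of the "lib -> <site-packages>/tensorrt_llm/libs" link (toy paths).
demo=$(mktemp -d)
mkdir -p "$demo/site-packages/tensorrt_llm/libs"
touch "$demo/site-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so"
cd "$demo"
# Same idea as the ln -sv in the Dockerfile: expose the libs directory as ./lib.
ln -s site-packages/tensorrt_llm/libs lib
test -f lib/libnvinfer_plugin_tensorrt_llm.so && echo "plugin reachable via lib/"
```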

The benchmark binaries and examples are copied from the wheel stage into the release image. The C++ benchmark sources and their `CMakeLists.txt` are then removed, since only the prebuilt binaries are needed, and the examples directory is made writable.

Finally, the <mark style="color:yellow;">**`GIT_COMMIT`**</mark> and <mark style="color:yellow;">**`TRT_LLM_VER`**</mark> build arguments are used to set environment variables in the image.
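Once the image is built, this metadata can be inspected in a running container; the image tag here is a placeholder:

```shell
# Print the version metadata baked into the release image (tag is illustrative).
docker run --rm tensorrt_llm:release \
  bash -c 'echo "$TRT_LLM_VERSION @ $TRT_LLM_GIT_COMMIT"'
```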

This multi-stage Dockerfile allows for efficient building and packaging of the TensorRT-LLM library, separating the development dependencies from the final release image.

