# trt-llm build command

The <mark style="color:yellow;">**`trtllm-build`**</mark> <mark style="color:blue;">**command-line tool**</mark> provides a convenient way to build TensorRT engines for large language models using the TensorRT-LLM library.

It encapsulates the necessary configuration options and build process into a single command that can be executed from the command line.

### <mark style="color:purple;">The tool is structured in two main parts</mark>

The <mark style="color:yellow;">**`build.py`**</mark> file contains the actual implementation of the <mark style="color:yellow;">**`trtllm-build`**</mark> command and defines the <mark style="color:yellow;">**`main`**</mark> function, which serves as the entry point for the command-line tool.

The <mark style="color:yellow;">**`main`**</mark> function parses the command-line arguments, creates the necessary configurations, and invokes the <mark style="color:yellow;">**`parallel_build`**</mark> function to build the engines.

The <mark style="color:yellow;">**`setup.py`**</mark> file creates the <mark style="color:yellow;">**`trtllm-build`**</mark> command during the package installation process. It includes an <mark style="color:yellow;">**`entry_points`**</mark> parameter that maps the <mark style="color:yellow;">**`trtllm-build`**</mark> command to the `main` function in <mark style="color:yellow;">**`build.py`**</mark>.

When the TensorRT-LLM package is installed using <mark style="color:yellow;">**`python setup.py install`**</mark> or <mark style="color:yellow;">**`pip install`**</mark>, the <mark style="color:yellow;">**`entry_points`**</mark> are processed, and the <mark style="color:yellow;">**`trtllm-build`**</mark> command is created and <mark style="color:yellow;">**installed in the system's executable path**</mark> or the virtual environment's <mark style="color:yellow;">**`bin`**</mark> directory.
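As a sketch of what that mapping looks like, a `setup.py` with a console-script entry point follows the pattern below. The exact module path (`tensorrt_llm.commands.build:main`) is an assumption for illustration, not copied from the TensorRT-LLM source tree; check the package's actual `setup.py` for the real target.

```python
# Illustrative setup.py fragment -- the module path below is an assumed
# example, not taken verbatim from the TensorRT-LLM repository.
from setuptools import setup, find_packages

setup(
    name="tensorrt_llm",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # Installing the package turns this mapping into a
            # `trtllm-build` launcher script placed in the environment's
            # bin directory, which calls main() from build.py.
            "trtllm-build=tensorrt_llm.commands.build:main",
        ],
    },
)
```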

To use the <mark style="color:yellow;">**`trtllm-build`**</mark> command-line tool, you need to have the TensorRT-LLM package installed.

Once installed, you can execute the <mark style="color:yellow;">**`trtllm-build`**</mark> command from the command line, passing the necessary arguments to configure the build process.

The available arguments include specifying the checkpoint directory, model configuration, build configuration, maximum batch size, maximum input length, and various other options.

For example, you can run the command like this:

{% code overflow="wrap" %}

```bash
trtllm-build --checkpoint_dir /path/to/checkpoint --model_config /path/to/model_config.json --max_batch_size 8 --max_input_len 512 --output_dir /path/to/output
```

{% endcode %}

This command will build the TensorRT engines using the specified checkpoint directory, model configuration, maximum batch size of 8, maximum input length of 512, and save the generated engines in the specified output directory.

### <mark style="color:blue;">Building a Front End</mark>

Because <mark style="color:yellow;">**`trtllm-build`**</mark> is a command-line tool, you can create a graphical user interface (GUI) or a web-based front end that lets users enter the configuration options and generates the corresponding command-line arguments.

For example, you can create a simple web form that prompts the user to enter the checkpoint directory, model configuration file, and other relevant options.

When the user submits the form, your front-end can generate the appropriate command-line arguments and execute the <mark style="color:yellow;">**`trtllm-build`**</mark> command behind the scenes.

Here's a simple example using Python and the Flask web framework:

```python
from flask import Flask, render_template, request
import subprocess

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def build_form():
    if request.method == 'POST':
        checkpoint_dir = request.form['checkpoint_dir']
        model_config = request.form['model_config']
        max_batch_size = request.form['max_batch_size']
        max_input_len = request.form['max_input_len']
        output_dir = request.form['output_dir']

        # Build the argument list explicitly and avoid shell=True so that
        # user-supplied values cannot be interpreted as shell syntax.
        command = [
            "trtllm-build",
            "--checkpoint_dir", checkpoint_dir,
            "--model_config", model_config,
            "--max_batch_size", max_batch_size,
            "--max_input_len", max_input_len,
            "--output_dir", output_dir,
        ]
        subprocess.run(command, check=True)

        return "Build completed!"

    return render_template('build_form.html')

if __name__ == '__main__':
    app.run()
```

In this example, the Flask app renders an HTML form (<mark style="color:yellow;">**`build_form.html`**</mark>) that allows the user to input the necessary configuration options.

When the form is submitted, the app retrieves the user input, generates the corresponding <mark style="color:yellow;">**`trtllm-build`**</mark> command, and executes it using the <mark style="color:yellow;">**`subprocess`**</mark> module.

This is just a simple example to illustrate the concept. You can further enhance the front-end by adding more options, validation, error handling, and a more user-friendly interface.
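As one illustration of the validation step, the form handling above could be factored into a small helper that checks the inputs and assembles the argument list before anything is executed. The helper name and the specific checks below are hypothetical, not part of TensorRT-LLM:

```python
from pathlib import Path

def build_trtllm_args(checkpoint_dir, model_config, max_batch_size,
                      max_input_len, output_dir):
    """Validate form inputs and return a trtllm-build argument list.

    Hypothetical helper: raises ValueError on bad input instead of
    silently assembling a broken (or unsafe) command line.
    """
    if not Path(checkpoint_dir).is_dir():
        raise ValueError(f"checkpoint_dir does not exist: {checkpoint_dir}")
    if not Path(model_config).is_file():
        raise ValueError(f"model_config does not exist: {model_config}")

    # Numeric fields arrive from the web form as strings; int() both
    # validates and normalizes them.
    batch_size = int(max_batch_size)
    input_len = int(max_input_len)
    if batch_size <= 0 or input_len <= 0:
        raise ValueError("batch size and input length must be positive")

    return [
        "trtllm-build",
        "--checkpoint_dir", str(checkpoint_dir),
        "--model_config", str(model_config),
        "--max_batch_size", str(batch_size),
        "--max_input_len", str(input_len),
        "--output_dir", str(output_dir),
    ]
```

The returned list can be passed directly to `subprocess.run(...)` without `shell=True`, which keeps user input from being interpreted by the shell.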

By creating a front-end, you can provide a more intuitive and user-friendly way for users to interact with the <mark style="color:yellow;">**`trtllm-build`**</mark> command-line tool, abstracting away the complexities of the command-line arguments.

