Copyright Continuum Labs - 2023


run_convert_checkpoint.py script

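This script runs the convert_checkpoint.py script

run_convert_checkpoint.py is a thin wrapper: it reads the conversion settings from a YAML file and passes them to convert_checkpoint.py as command-line arguments. Run it with:

python3 run_convert_checkpoint.py

The script works as follows:
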
  1. It uses argparse to parse the --config command-line argument, which gives the path to the YAML configuration file (default: config.yaml).

  2. It loads the configuration from that file with yaml.safe_load().

  3. It reads model_dir, output_dir, and dtype from the model section of the YAML data, and tp_size, pp_size, vocab_size, n_positions, n_layer, n_head, n_embd, and inter_size from the checkpoint section. All of these keys are required; a missing key raises KeyError.

  4. It builds the cmd_args list for convert_checkpoint.py, adding each of these values explicitly as a --name value pair.

  5. It iterates over the remaining keys in the checkpoint section and appends each one to cmd_args as --key value if its value is not None. The eight checkpoint keys already added in step 4 (tp_size, pp_size, vocab_size, n_positions, n_layer, n_head, n_embd, inter_size) are skipped.

  6. Finally, it calls subprocess.run() to execute convert_checkpoint.py with the constructed arguments.
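
Because the required keys are read by direct indexing, a missing key surfaces as a bare KeyError. A minimal sketch of a pre-check that reports every missing key at once is shown here. The helper name find_missing_keys and the sample values are hypothetical and not part of the original script.

```python
REQUIRED_MODEL_KEYS = ("model_dir", "output_dir", "dtype")
REQUIRED_CHECKPOINT_KEYS = (
    "tp_size", "pp_size", "vocab_size", "n_positions",
    "n_layer", "n_head", "n_embd", "inter_size",
)


def find_missing_keys(config):
    """Return dotted names of required keys absent from the loaded config."""
    missing = []
    for section, keys in (("model", REQUIRED_MODEL_KEYS),
                          ("checkpoint", REQUIRED_CHECKPOINT_KEYS)):
        # A section that is absent or left empty in the YAML is treated as {}.
        values = config.get(section) or {}
        missing.extend(f"{section}.{key}" for key in keys if key not in values)
    return missing


# Hypothetical config (illustrative values) with two checkpoint keys left out.
sample = {
    "model": {"model_dir": "./llama-2-7b", "output_dir": "./ckpt", "dtype": "float16"},
    "checkpoint": {"tp_size": 1, "pp_size": 1, "vocab_size": 32000,
                   "n_positions": 2048, "n_layer": 32, "n_head": 32},
}
print(find_missing_keys(sample))
```

Calling such a check right after yaml.safe_load() would let the wrapper stop with one readable message before any command is built.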

run_convert_checkpoint.py script

import argparse
import subprocess
import sys

import yaml

def main():
    parser = argparse.ArgumentParser(description='Run convert_checkpoint.py with configurations from config.yaml')
    parser.add_argument('--config', default='config.yaml', help='Path to the YAML configuration file')
    args = parser.parse_args()

    # Load configurations from the YAML file
    with open(args.config, 'r') as f:
        config = yaml.safe_load(f)

    # Extract the configuration values
    model_dir = config['model']['model_dir']
    output_dir = config['model']['output_dir']
    dtype = config['model']['dtype']
    tp_size = config['checkpoint']['tp_size']
    pp_size = config['checkpoint']['pp_size']
    vocab_size = config['checkpoint']['vocab_size']
    n_positions = config['checkpoint']['n_positions']
    n_layer = config['checkpoint']['n_layer']
    n_head = config['checkpoint']['n_head']
    n_embd = config['checkpoint']['n_embd']
    inter_size = config['checkpoint']['inter_size']

    # Construct the command-line arguments for convert_checkpoint.py
    cmd_args = [
        # sys.executable is the interpreter running this script, so the
        # conversion runs in the same Python environment
        sys.executable, 'convert_checkpoint.py',
        '--model_dir', model_dir,
        '--output_dir', output_dir,
        '--dtype', dtype,
        '--tp_size', str(tp_size),
        '--pp_size', str(pp_size),
        '--vocab_size', str(vocab_size),
        '--n_positions', str(n_positions),
        '--n_layer', str(n_layer),
        '--n_head', str(n_head),
        '--n_embd', str(n_embd),
        '--inter_size', str(inter_size)
    ]

    # Add additional checkpoint arguments if specified in the YAML file
    for key, value in config['checkpoint'].items():
        if key not in ['tp_size', 'pp_size', 'vocab_size', 'n_positions', 'n_layer', 'n_head', 'n_embd', 'inter_size']:
            if value is not None:
                cmd_args.extend([f'--{key}', str(value)])

    # Run the convert_checkpoint.py script with the specified arguments
    subprocess.run(cmd_args, check=True)

if __name__ == '__main__':
    main()
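
To make the argument construction concrete, the sketch below applies the same logic to an in-memory config and prints the resulting command. It is an illustration, not the script itself: the function name build_cmd_args, the sample values, and the keys extra_option and unset_option are hypothetical.

```python
import shlex
import sys

EXPLICIT_KEYS = ("tp_size", "pp_size", "vocab_size", "n_positions",
                 "n_layer", "n_head", "n_embd", "inter_size")


def build_cmd_args(config):
    """Build the convert_checkpoint.py command the way main() does."""
    model, checkpoint = config["model"], config["checkpoint"]
    cmd_args = [
        sys.executable, "convert_checkpoint.py",
        "--model_dir", model["model_dir"],
        "--output_dir", model["output_dir"],
        "--dtype", model["dtype"],
    ]
    # Required checkpoint values are always passed, in a fixed order.
    for key in EXPLICIT_KEYS:
        cmd_args.extend([f"--{key}", str(checkpoint[key])])
    # Any other checkpoint key is forwarded only when it has a value.
    for key, value in checkpoint.items():
        if key not in EXPLICIT_KEYS and value is not None:
            cmd_args.extend([f"--{key}", str(value)])
    return cmd_args


# Hypothetical configuration; extra_option and unset_option are made-up keys.
sample_config = {
    "model": {"model_dir": "./llama-2-7b", "output_dir": "./ckpt", "dtype": "float16"},
    "checkpoint": {
        "tp_size": 1, "pp_size": 1, "vocab_size": 32000, "n_positions": 2048,
        "n_layer": 32, "n_head": 32, "n_embd": 4096, "inter_size": 11008,
        "extra_option": 4,     # forwarded as: --extra_option 4
        "unset_option": None,  # skipped because the value is None
    },
}

# Print the command without the interpreter path, quoted for a shell.
print(shlex.join(build_cmd_args(sample_config)[1:]))
```

Printing the command before running it is a cheap way to confirm that the YAML file produces the intended arguments.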

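Any additional key in the checkpoint section of config.yaml is forwarded to convert_checkpoint.py as --key value, provided its value is not None. A key that is left empty or set to null in the YAML file is skipped.

Every value is converted with str(), so a YAML boolean such as true is passed as --key True. That works for options that take a value; an option declared as a bare on/off flag (for example with argparse's store_true action) would reject the extra value.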
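
If some of the extra options are bare on/off flags, the forwarding loop could treat booleans separately. The following is a minimal sketch of that idea, assuming the target script declares such options as store_true flags. The helper to_cli_args and the key names are hypothetical and do not appear in the original script.

```python
def to_cli_args(options):
    """Convert a dict of optional settings into command-line arguments."""
    args = []
    for key, value in options.items():
        if value is None or value is False:
            continue  # unset or disabled: emit nothing
        if value is True:
            args.append(f"--{key}")  # enabled flag: emit the name only
        else:
            args.extend([f"--{key}", str(value)])  # ordinary --key value pair
    return args


# Made-up option names, used only to show each branch.
print(to_cli_args({"some_flag": True, "other_flag": False,
                   "some_value": 8, "unset": None}))
```

The identity checks (is True, is False) keep the integers 1 and 0 on the ordinary --key value path.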

Because subprocess.run() is called with check=True, the script raises subprocess.CalledProcessError if convert_checkpoint.py exits with a non-zero status code, so a failed conversion stops the wrapper instead of passing silently.
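
The effect of check=True can be seen in isolation with a child process that exits with a chosen status. This is a small sketch: the helper name run_step is hypothetical, and the child commands stand in for convert_checkpoint.py.

```python
import subprocess
import sys


def run_step(cmd):
    """Run cmd and return its exit status, reporting failures on stderr."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as err:
        # err.returncode holds the child's non-zero exit status.
        print(f"command failed with exit code {err.returncode}", file=sys.stderr)
        return err.returncode
    return 0


# Stand-in child processes: one exits with status 3, one succeeds.
failed = run_step([sys.executable, "-c", "import sys; sys.exit(3)"])
succeeded = run_step([sys.executable, "-c", "pass"])
print(failed, succeeded)
```

The original script does not catch the exception, so a failed conversion ends the wrapper with a traceback.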

The command uses the relative path convert_checkpoint.py, and --config defaults to config.yaml, so both are resolved against the current working directory. Run the script from the directory that contains convert_checkpoint.py and config.yaml, or pass --config with the path to another configuration file:

python3 run_convert_checkpoint.py
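
The working-directory requirement could be relaxed by resolving paths relative to the wrapper's own location. The sketch below assumes convert_checkpoint.py sits next to the wrapper. The helper resolve_paths and the example path are hypothetical.

```python
from pathlib import Path


def resolve_paths(wrapper_file, config_arg="config.yaml"):
    """Locate convert_checkpoint.py and the config next to the wrapper script."""
    script_dir = Path(wrapper_file).resolve().parent
    converter = script_dir / "convert_checkpoint.py"
    config_path = Path(config_arg)
    # Design choice in this sketch: a relative config path is taken relative
    # to the wrapper's directory rather than the current working directory.
    if not config_path.is_absolute():
        config_path = script_dir / config_path
    return converter, config_path


# Inside the wrapper this would be called as resolve_paths(__file__, args.config);
# the path below is only an example.
converter, config_path = resolve_paths("/opt/example/run_convert_checkpoint.py")
print(converter)
print(config_path)
```

With these paths passed to open() and placed in cmd_args, the wrapper could be started from any directory.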