Dual ABI issues

The Dual ABI, introduced in the GCC (GNU Compiler Collection) 5.1 release for its libstdc++ library, is a mechanism for supporting two Application Binary Interfaces (ABIs) simultaneously.

This page unpacks the concepts and technical terms behind the Dual ABI, how it is implemented, and its impact on C++ development.

What is ABI?

An Application Binary Interface (ABI) is a low-level, binary interface between two program modules, one of which is typically a library or an operating system.

It defines details such as how functions are called and how data is formatted in memory.

An ABI ensures that a program compiled by one compiler can call and use functions compiled by another compiler, provided both compilers adhere to the same ABI.

Background and Need for Dual ABI

With the release of the C++11 standard, certain changes were mandated to standard library components, such as std::string and std::list, to enhance performance and conformance to the new standard. For example:

  • Copy-On-Write (COW) Strings: The C++11 standard prohibits the use of COW for std::string due to thread safety and performance issues.

  • List Size Tracking: std::list must now track its size so that size() runs in constant time, which changes its internal layout.

To comply with these changes without breaking existing binaries linked against older versions of libstdc++, GCC introduced a Dual ABI mechanism.

Implementation of Dual ABI

  • Inline Namespaces: The new implementations of affected components (like std::string and std::list) were placed in an inline namespace (std::__cxx11), allowing both old and new versions to coexist in the same library. This means, for example, that the new std::list is actually std::__cxx11::list.

  • _GLIBCXX_USE_CXX11_ABI Macro: This macro controls which ABI (old or new) is used by the source file being compiled. Setting it to 1 (the default on GCC 5 and later) selects the new ABI, while setting it to 0 selects the old ABI, as shown in the sketch below.
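
To make this concrete, here is a minimal sketch (the file name and compile commands are illustrative) that reports which ABI a translation unit is being built against. Including any libstdc++ header defines the macro, so it can be inspected at compile time:

```cpp
// abi_check.cpp -- report which libstdc++ ABI this translation unit uses.
//
// Default build (new ABI on GCC 5.1+):   g++ abi_check.cpp -o abi_check
// Force the old, pre-C++11 ABI:          g++ -D_GLIBCXX_USE_CXX11_ABI=0 abi_check.cpp -o abi_check
#include <iostream>
#include <string>

int main() {
#if _GLIBCXX_USE_CXX11_ABI
    // std::string here is std::__cxx11::basic_string<char>
    std::cout << "Built against the new (C++11) ABI\n";
#else
    // std::string here is the old copy-on-write basic_string<char>
    std::cout << "Built against the old (pre-C++11) ABI\n";
#endif
    return 0;
}
```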

Choosing the ABI

The decision of which ABI to use is made at compile time, independently of the C++ standard version being targeted (-std=c++11, -std=c++03, etc.). This design allows code built against different C++ standard versions to be linked together, provided both sides use the same ABI.
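
As a small sketch of this independence (the function and file name are illustrative), the same translation unit can be built under any -std level with either ABI; only the ABI macro changes how the std::string parameter is mangled into the resulting object file:

```cpp
// greet.cpp -- the -std flag and the ABI macro are independent knobs.
//
//   g++ -std=c++03 -c greet.cpp                              // old standard, new ABI (default)
//   g++ -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -c greet.cpp   // new standard, old ABI
//
// Under the new ABI the std::string parameter mangles as
// std::__cxx11::basic_string<...>; under the old ABI it mangles as plain
// std::basic_string<...>. Two object files link cleanly only if they agree
// on the ABI, whatever -std level each was compiled with.
#include <string>

// The mangled symbol for this function encodes the ABI choice.
std::string greet(const std::string& name) {
    return "hello, " + name;
}
```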

Impact on Code

  • Extensive Use of std::string: Because std::string is a fundamental part of the C++ standard library, the Dual ABI affects many other types, including I/O streams and locale facets.

  • Exception Handling: Most standard exception types are unchanged by the ABI, so exceptions thrown in one part of a program can be caught in another regardless of which ABI was used. The one exception is std::ios_base::failure, whose base class changed in C++11 (see the sketch after this list).
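
The following sketch illustrates why std::ios_base::failure is the odd one out. Under the new ABI it derives from std::system_error, so whether a handler matches depends on the thrower (here, libstdc++ itself) and the catcher agreeing on which definition of the type is in use; across an ABI mismatch the catch does not match and the exception escapes.

```cpp
// failure_demo.cpp -- std::ios_base::failure changes shape with the ABI.
#include <fstream>
#include <iostream>

int main() {
    std::ifstream in;
    in.exceptions(std::ifstream::failbit);           // make stream errors throw
    try {
        in.open("/no/such/file");                    // sets failbit -> throws from libstdc++
    } catch (const std::ios_base::failure& e) {      // matches only when this handler and the
        std::cout << "caught: " << e.what() << '\n'; // throwing library code share one ABI's
    }                                                // definition of std::ios_base::failure
    return 0;
}
```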

Troubleshooting and Compatibility

  • Linker Errors: When attempting to link object files compiled with different ABIs, you may encounter linker errors related to std::__cxx11 namespace symbols. This usually indicates ABI incompatibility.

  • Third-party Libraries: If a third-party library was compiled with the old ABI, you may need to compile your own code with the old ABI as well for compatibility (see the sketch below).
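
A sketch of the third-party scenario (the library name legacylib and the legacy::describe function are hypothetical): if the vendor shipped binaries built with the old ABI, compiling the consuming code with _GLIBCXX_USE_CXX11_ABI=0 makes the std::string argument below mangle to the symbol the vendor's library actually exports.

```cpp
// my_app.cpp -- building against a (hypothetical) old-ABI third-party library.
//
//   g++ -D_GLIBCXX_USE_CXX11_ABI=0 my_app.cpp -llegacylib -o my_app
//
// Without the macro, the call below would reference a std::__cxx11::basic_string
// overload that the old-ABI library never exported, and linking would fail with
// an undefined reference to a std::__cxx11 symbol.
#include <iostream>
#include <string>

namespace legacy {
// Declaration normally pulled in from the vendor's header; the vendor's binary
// was built with the old ABI, so its exported symbol uses the old
// std::basic_string mangling.
std::string describe(const std::string& item);
}

int main() {
    std::cout << legacy::describe("TensorRT-LLM engine") << '\n';
    return 0;
}
```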
