
pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

View on GitHub
2,959 stars
385 forks
341 issues
Python · Jupyter Notebook · C++

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing pytorch/TensorRT in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

Source files are only loaded when you start an analysis to optimize performance.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/pytorch/TensorRT)

Repository Overview (README excerpt)

Crawler view

# Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.

Torch-TensorRT brings the power of TensorRT to PyTorch. Accelerate inference latency by up to 5x compared to eager execution in just one line of code.

## Installation

Stable versions of Torch-TensorRT are published on PyPI; nightly versions are published on the PyTorch package index. Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container, which includes all dependencies at the proper versions along with example notebooks. For more advanced installation methods, please see here.

## Quickstart

### Option 1: torch.compile

You can use Torch-TensorRT anywhere you use `torch.compile`.

### Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e., without a Python dependency).
#### Step 1: Optimize + serialize

#### Step 2: Deploy

Deployment in PyTorch:

Deployment in C++:

## Further resources

- Double PyTorch Inference Speed for Diffusion Models Using Torch-TensorRT
- Up to 50% faster Stable Diffusion inference with one line of code
- Optimize LLMs from Hugging Face with Torch-TensorRT
- Run your model in FP8 with Torch-TensorRT
- Accelerated Inference in PyTorch 2.X with Torch-TensorRT
- Tools to resolve graph breaks and boost performance \[coming soon\]
- Tech Talk (GTC '23)
- Documentation

## Platform Support

| Platform | Support |
| ------------------- | ------------------------------------------------ |
| Linux AMD64 / GPU | **Supported** |
| Linux SBSA / GPU | **Supported** |
| Windows / GPU | **Supported (Dynamo only)** |
| Linux Jetson / GPU | **Source Compilation Supported on JetPack-4.4+** |
| Linux Jetson / DLA | **Source Compilation Supported on JetPack-4.4+** |
| Linux ppc64le / GPU | Not supported |

> Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.

## Dependencies

These are the dependencies used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

- Bazel 8.1.1
- Libtorch 2.12.0.dev (latest nightly)
- CUDA 13.0 (CUDA 12.6 on Jetson)
- TensorRT 10.15.1.29 (TensorRT 10.3 on Jetson)

## Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

- Deprecation notices are communicated in the Release Notes.
- Deprecated API functions carry a statement in the source documenting when they were deprecated.
- Deprecated methods and classes issue deprecation warnings at runtime, if they are used.
- Torch-TensorRT provides a 6-month migration period after a deprecation; APIs and tools continue to work during the migration period.
After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.

## Contributing

Take a look at CONTRIBUTING.md.

## License

The Torch-TensorRT license can be found in the LICENSE file. It is a BSD-style license.
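The runtime deprecation warnings described in the policy above can be illustrated with Python's standard `warnings` module. Here `old_api` and `new_api` are hypothetical names for this sketch, not actual Torch-TensorRT functions:

```python
import warnings

def new_api():
    # replacement API (hypothetical name for illustration)
    return "result"

def old_api():
    # deprecated API (hypothetical): warns at runtime, as the policy describes
    warnings.warn(
        "old_api was deprecated in v2.3 and will be removed after the "
        "6-month migration period; use new_api instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_api()  # still functional during the migration period

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # DeprecationWarning is hidden by default
    result = old_api()
```

Note that `DeprecationWarning` is filtered out by default outside of tests and `__main__`, which is why callers who want to see these notices typically enable it explicitly, as above.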