HPC-AI Tech
We are a global team helping you train and deploy your AI models.
Repositories
Showing 10 of 31 repositories
- TensorRT-LLM (Public, forked from NVIDIA/TensorRT-LLM)
  TensorRT-LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. It also contains components to create Python and C++ runtimes that orchestrate inference execution in a performant way. A usage sketch follows the repository list.
- TensorRT-Model-Optimizer (Public, forked from NVIDIA/TensorRT-Model-Optimizer)
  A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed. A quantization sketch follows the repository list.
- Open-Sora-Demo (Public)
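As a rough illustration of the Python API mentioned in the TensorRT-LLM description above, here is a minimal sketch. It assumes the high-level `LLM` and `SamplingParams` interface found in recent TensorRT-LLM releases; the checkpoint name, sampling parameters, and output fields are illustrative and may differ between versions.

```python
# Sketch only: assumes the high-level LLM API in recent TensorRT-LLM releases.
from tensorrt_llm import LLM, SamplingParams

# Build or load a TensorRT engine for a Hugging Face checkpoint
# (the checkpoint name here is an arbitrary example).
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Sampling settings; exact parameter names may vary by version.
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Run batched inference on the GPU and print the generated continuations.
for output in llm.generate(["Deep learning is", "CUDA kernels are"], params):
    print(output.outputs[0].text)
```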
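Similarly, here is a hedged post-training quantization sketch for TensorRT-Model-Optimizer, assuming the `modelopt.torch.quantization` entry point described in the library's documentation. The placeholder model, the synthetic calibration data, and the `FP8_DEFAULT_CFG` recipe name are assumptions for illustration only.

```python
# Sketch only: post-training quantization with NVIDIA TensorRT Model Optimizer.
import torch
import torch.nn as nn
import modelopt.torch.quantization as mtq  # assumed module path from the docs

# A small placeholder model standing in for a real network.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).cuda()

def forward_loop(m):
    # Calibration: run a few representative batches so activation ranges
    # can be observed before quantized layers are inserted. Random data is
    # used here only to keep the sketch self-contained.
    for _ in range(8):
        m(torch.randn(32, 128, device="cuda"))

# Quantize with the library's FP8 default recipe (config name assumed;
# may differ by version), then deploy via TensorRT-LLM or TensorRT downstream.
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop)
```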