tenstorrent/tt-inference-server

TT-Inference-Server

Tenstorrent Inference Server (tt-inference-server) is the repository of model APIs available for deployment on Tenstorrent hardware.

Official Repository

https://github.com/tenstorrent/tt-inference-server

Getting Started

Follow the setup instructions for the model you want to serve. Each Model Name in the tables below links to the corresponding implementation.

Note: models with Status [🔍 preview] are under active development. If you encounter setup or stability problems, please file an issue and our team will address it.

LLMs

For an automated, pre-configured vLLM inference server using Docker, see the Model Readiness Workflows User Guide. The table below lists the default model implementations supported.

| Model Weights | Hardware | Status | tt-metal commit | vLLM commit | Docker Image |
|---|---|---|---|---|---|
| Qwen3-32B | WH-QuietBox/WH-LoudBox (T3K) | 🛠️ Experimental | v0.59.0-rc39 | 3accc8d | 0.0.5-v0.59.0-rc39-3accc8d |
| Mistral-7B-Instruct-v0.3 | WH-QuietBox/WH-LoudBox (T3K), n150, n300 | 🟡 Functional | v0.59.0-rc39 | f028da1 | 0.0.5-v0.59.0-rc39-f028da1 |
| QwQ-32B | WH-QuietBox/WH-LoudBox (T3K) | 🛠️ Experimental | v0.57.0-rc71 | 2a8debd | 0.0.5-v0.57.0-rc71-2a8debd |
| Qwen2.5-72B, Qwen2.5-72B-Instruct | WH-QuietBox/WH-LoudBox (T3K) | 🛠️ Experimental | v0.61.0-rc1 | 3dc6c31 | 0.0.5-v0.61.0-rc1-3dc6c31 |
| Qwen2.5-7B, Qwen2.5-7B-Instruct | WH-QuietBox/WH-LoudBox (T3K), n300 | 🛠️ Experimental | v0.61.0-rc1 | 3dc6c31 | 0.0.5-v0.61.0-rc1-3dc6c31 |
| Llama-3.3-70B, Llama-3.3-70B-Instruct, Llama-3.1-70B, Llama-3.1-70B-Instruct, DeepSeek-R1-Distill-Llama-70B | Galaxy | 🟡 Functional | f8c933739eee | f028da1 | 0.0.5-f8c933739eee-f028da1 |
| Llama-3.3-70B, Llama-3.3-70B-Instruct, Llama-3.1-70B, Llama-3.1-70B-Instruct, DeepSeek-R1-Distill-Llama-70B | WH-QuietBox/WH-LoudBox (T3K) | 🟡 Functional | v0.59.0-rc14 | a869e5d | 0.0.5-v0.59.0-rc14-a869e5d |
| Llama-3.3-70B, Llama-3.3-70B-Instruct, Llama-3.1-70B, Llama-3.1-70B-Instruct, DeepSeek-R1-Distill-Llama-70B | BH-QuietBox (P150X4) | 🟡 Functional | v0.59.0-rc51 | b35fe70 | 0.0.5-v0.59.0-rc51-b35fe70 |
| Llama-3.2-11B-Vision, Llama-3.2-11B-Vision-Instruct | WH-QuietBox/WH-LoudBox (T3K), n300 | 🟡 Functional | v0.60.0-rc11 | d5a9203 | 0.0.5-v0.60.0-rc11-d5a9203 |
| Llama-3.2-90B-Vision, Llama-3.2-90B-Vision-Instruct | WH-QuietBox/WH-LoudBox (T3K) | 🛠️ Experimental | v0.61.1-rc1 | 5cbc982 | 0.0.5-v0.61.1-rc1-5cbc982 |
| Llama-3.2-1B, Llama-3.2-1B-Instruct | WH-QuietBox/WH-LoudBox (T3K), n150, n300 | 🟡 Functional | v0.60.1 | 5cbc982 | 0.0.5-v0.60.1-5cbc982 |
| Llama-3.2-3B, Llama-3.2-3B-Instruct | WH-QuietBox/WH-LoudBox (T3K), n150, n300 | 🟡 Functional | v0.57.0-rc71 | 2a8debd | 0.0.5-v0.57.0-rc71-2a8debd |
| Llama-3.1-8B, Llama-3.1-8B-Instruct | WH-QuietBox/WH-LoudBox (T3K), n150, n300 | 🟡 Functional | v0.57.0-rc71 | 2a8debd | 0.0.5-v0.57.0-rc71-2a8debd |
| Llama-3.1-8B, Llama-3.1-8B-Instruct | p100, p150 | 🛠️ Experimental | v0.59.0-rc3 | 8a43c88 | 0.0.5-v0.59.0-rc3-8a43c88 |
| Llama-3.1-8B, Llama-3.1-8B-Instruct | Galaxy | 🟡 Functional | v0.62.0-rc10 | c348d08 | 0.0.5-v0.62.0-rc10-c348d08 |
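
In every row of the LLM table above, the Docker Image value appears to be composed as `0.0.5`, then the tt-metal commit, then the vLLM commit, joined by hyphens. The Python sketch below illustrates that pattern so a tag can be checked against the two commit columns. The helper name `docker_image_tag` is hypothetical and not part of this repository, and reading the leading `0.0.5` as a release version is an assumption drawn only from the table.

```python
def docker_image_tag(tt_metal_commit: str, vllm_commit: str, release: str = "0.0.5") -> str:
    """Compose a tag in the pattern observed in the Docker Image column.

    Assumption: `release` is the leading component shared by every row
    of the table ("0.0.5"); the table does not name this field.
    """
    return f"{release}-{tt_metal_commit}-{vllm_commit}"


# Values taken from the Qwen3-32B row of the table.
print(docker_image_tag("v0.59.0-rc39", "3accc8d"))
```

If a tag built this way does not match the Docker Image column for a given row, treat the table as the source of truth.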

CNNs

| Model Name | Model URL | Hardware | Status | Minimum Release Version |
|---|---|---|---|---|
| YOLOv4 | GH Repo | n150 | 🔍 preview | v0.0.1 |
