ICYMI: 👀 Attendees at #SIGGRAPH2025 got a sneak peek of the GB10 Grace Blackwell Superchip powering the NVIDIA DGX Spark ... and were surprised at how small, and how powerful, it is for running AI workloads. #SparkSomethingBig 💫
About us
Explore the latest breakthroughs made possible with AI. From deep learning model training and large-scale inference to enhancing operational efficiencies and customer experience, discover how AI is driving innovation and redefining the way organizations operate across industries.
- Website
-
http://nvda.ws/2nfcPK3
- Industry
- Computer Hardware Manufacturing
- Company size
- 10,001+ employees
- Headquarters
- Santa Clara, CA
Updates
-
Take a video of your favorite trail and turn it into a coherent 3D world with LongSplat 🤳⛰️ Our model reconstructs scenes from any casual long video without camera calibration and renders high-quality novel views from any point along your path. Try it for yourself ➡️ https://nvda.ws/4n0a4qS
-
AstraZeneca, Ericsson, Saab, SEB, and Wallenberg Investments AB have launched Sferical AI, a new company to operate Sweden’s next AI supercomputer. Sferical AI will provide Sweden’s leading industries with access to NVIDIA-powered AI compute on secure, sovereign infrastructure. Work will also begin on establishing an NVIDIA AI Technology Center in Sweden, focused on upskilling talent, supporting researchers, and advancing AI adoption in the country. 👉 https://nvda.ws/4lDK8Qw
-
🏎️ Accelerate your AI workflow with open source models, frameworks, and datasets, all optimized for our full-stack accelerated platform. Whether you’re scaling LLMs, robotics, or optimization pipelines, our ecosystem is open and ready for you to build, iterate, and deploy faster, all powered by #opensource innovation. Explore our open source projects on GitHub, access hundreds of models and datasets on Hugging Face, or dive deeper into our open source project catalog. ➡️ https://lnkd.in/gRtu7EUn 🤗 https://lnkd.in/gzg-pG3X #HotChips
-
Explore TensorRT-LLM's CI infrastructure with our experts. Topics include a CI overview, conditional test triggers, adding new tests, and model accuracy testing methodology.
Making TensorRT-LLM Developers More Efficient with Baseline Work
-
Ready to scale your agent evaluation workflows? Watch this step-by-step video and learn how you can scale LLM-as-a-Judge with NVIDIA NeMo Evaluator:
✅ Install the NVIDIA NIM Operator
✅ Set up LLM-as-a-Judge with NeMo Evaluator using Docker Compose
✅ Scale your workflow to leverage multiple GPUs
Get hands-on and power up your agent evaluations in just a few clicks. Watch the video 👇 or on YouTube ➡️ https://nvda.ws/4oRuLXK
-
NVIDIA AI reposted this
As an AI Engineer, you must understand how these NVIDIA frameworks/libraries work 👌

1️⃣ 𝗖𝗨𝗗𝗔
A parallel computing platform and API for accelerating computation on NVIDIA GPUs. Key points:
↳ Kernel - a C/C++ function that runs on the GPU.
↳ Thread - executes the kernel instructions.
↳ Block - a group of threads.
↳ Grid - a collection of blocks.
↳ Streaming Multiprocessor (SM) - a processor unit that executes thread blocks.
When a CUDA program launches a kernel grid, its thread blocks are distributed to the SMs. CUDA follows the SIMT (Single Instruction, Multiple Threads) execution model and uses barriers to gather and synchronize threads.

2️⃣ 𝗰𝘂𝗗𝗡𝗡
A library of highly tuned implementations of standard routines used across neural network architectures, such as:
↳ forward and backward convolution
↳ attention
↳ matmul, pooling, and normalization

3️⃣ 𝗧𝗲𝗻𝘀𝗼𝗿𝗥𝗧
If we unpack a model architecture, we have multiple layer types, operations, layer connections, activations, etc. Think of an NN architecture as a complex graph of operations. TensorRT can:
↳ scan that graph
↳ identify bottlenecks
↳ remove and merge layers
↳ reduce layer precision
↳ apply many other optimizations

4️⃣ 𝗧𝗲𝗻𝘀𝗼𝗿𝗥𝗧-𝗟𝗟𝗠
An inference engine that builds on TensorRT compiler optimizations for Transformer-based models. It covers advanced, LLM-specific requirements such as:
↳ KV caching
↳ in-flight batching
↳ optimized attention kernels
↳ tensor parallelism
↳ pipeline parallelism

5️⃣ 𝗧𝗿𝗶𝘁𝗼𝗻 𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗦𝗲𝗿𝘃𝗲𝗿
An open-source, high-performance, secure serving system for AI workloads. Developers optimize their models, define serving configurations in Protobuf text files, and deploy. It supports multiple framework backends, including:
↳ native PyTorch and TensorFlow
↳ TensorRT and TensorRT-LLM
↳ custom BLS (Business Logic Scripting) with Python backends

6️⃣ 𝗡𝗩𝗜𝗗𝗜𝗔 𝗡𝗜𝗠
A set of plug-and-play inference microservices that package multiple NVIDIA libraries and frameworks, highly tuned for serving LLMs to production clusters and data centers at scale. Baked in:
↳ CUDA, cuDNN
↳ TensorRT
↳ Triton Inference Server
↳ many other libraries
NIM provides the optimal serving configuration for an LLM.

7️⃣ 𝗗𝘆𝗻𝗮𝗺𝗼 𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗙𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸
The successor to Triton Inference Server, built for large-scale GenAI workloads. Composed of modular blocks, robust and scalable. It implements:
↳ elastic compute with a GPU Planner
↳ KV routing, sharing, and caching
↳ disaggregated serving of prefill and decode

-----
♻️ Share, and help others in your network! Check the first comment and follow me for practical content on AI/ML!
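The grid/block/thread hierarchy in the CUDA section above boils down to an index calculation: each thread derives a unique global index from its block and thread coordinates, exactly as the canonical CUDA expression `blockIdx.x * blockDim.x + threadIdx.x` does. A minimal Python sketch of that index math (plain Python standing in for a GPU launch, so the scheme is easy to test; the `vector_add` name is mine):

```python
# Simulate CUDA's 1-D thread-indexing scheme in plain Python.
# Each "thread" computes its global index as block_idx * block_dim + thread_idx,
# mirroring blockIdx.x * blockDim.x + threadIdx.x in a real kernel.

def vector_add(a, b, grid_dim, block_dim):
    """Run a 'kernel' over a grid of blocks, one element per thread."""
    n = len(a)
    out = [0.0] * n
    for block_idx in range(grid_dim):            # blocks in the grid
        for thread_idx in range(block_dim):      # threads in each block
            i = block_idx * block_dim + thread_idx  # global thread index
            if i < n:                            # standard bounds guard
                out[i] = a[i] + b[i]
    return out

# Launch enough blocks to cover every element (ceiling division),
# just as a CUDA host program computes (n + block_dim - 1) // block_dim.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
block_dim = 2
grid_dim = (len(a) + block_dim - 1) // block_dim
print(vector_add(a, b, grid_dim, block_dim))  # → [11.0, 22.0, 33.0, 44.0, 55.0]
```

On real hardware the two loops run in parallel across SMs rather than sequentially; the bounds guard is needed because the last block may have more threads than remaining elements.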
-
AI is everywhere — and every interaction depends on inference. Today’s reasoning models generate far more tokens, creating new pressure on infrastructure. That’s why we built the Think SMART framework to help enterprises navigate the tradeoffs between accuracy, latency, and ROI, so AI factories scale efficiently and deliver meaningful business outcomes.
🔹 𝗦cale and complexity
⚖️ 𝗠ulti-dimensional performance
🏗️ 𝗔rchitecture and software
💸 𝗥OI driven by performance
🌐 𝗧echnology ecosystem and install base
AI factories that Think SMART stay ahead and deliver real business impact. 👉 Learn how to apply the framework: https://nvda.ws/4mjlPZk
-
✨ Just released: NVIDIA Nemotron post-training multilingual dataset ✨
We expanded our permissive post-training dataset with the addition of synthetically translated reasoning traces.
✋ Five new languages
💪 World-class reasoning traces
🤗 Learn more: https://nvda.ws/4fKTp85
📥 Download: https://nvda.ws/4lD3ZQ0