[NVIDIA] Refactor Family Blackwell Support codegen #156176
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/156176
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures)
As of commit 2f1e36e with merge base 7d87e35:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"
I see the need to extend select_compute_arch.cmake even after the forked CUDA modules have been removed. Can you wait until I have fixed all build failures and merged #154595, then we can discuss how to extend?
@johnnynunez Could you try modifying
I think I have to wait, right?
@tinglvv @malfet @atalman could you merge it? 10.1 no longer exists in CUDA 13. Thor is 11.0, and it's building for me: https://pypi.jetson-ai-lab.io/sbsa/cu130/torch/2.9.0
@pytorchbot merge
This PR needs to be approved by an authorized maintainer before merge.
@ptrblck could you review and merge?
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Refactor Blackwell #27537
In CUDA 13:
- 10.0 is B100/B200, same for aarch64 (GB200)
- 10.3 is GB300
- 11.0 is Thor with new OpenRM driver (moves to SBSA)
- 12.0 is RTX/RTX PRO
- 12.1 is Spark GB10
Thor was moved from 10.1 to 11.0 and Spark is 12.1.
Related patch: pytorch/pytorch#156176
With the legacy driver (nvgpu) used for CUDA 12.9, Thor was operating with SM 10.1. This changes to SM 11.0 when the newer driver model (OpenRM), which is intended for CUDA 13.0, is introduced.
Thor 10.1 --> 11.0
Spark 12.1
Pull Request resolved: pytorch#156176
Approved by: https://github.com/ezyang
With the legacy driver (nvgpu) used for CUDA 12.9, Thor was operating with SM 10.1.
This changes to SM 11.0 when the newer driver model (OpenRM), which is intended for CUDA 13.0, is introduced.
Thor 10.1 --> 11.0
Spark 12.1
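
For reference, the renumbering described above can be sketched as a small lookup. This is only an illustration built from the capability list quoted in this thread; the names `CUDA13_BLACKWELL_FAMILY`, `LEGACY_TO_CUDA13`, `normalize_capability`, and `family_name` are hypothetical and are not part of PyTorch or of this patch.

```python
# Hypothetical lookup built only from the capability list quoted in this
# thread (CUDA 13 numbering). Not an official NVIDIA or PyTorch API.
CUDA13_BLACKWELL_FAMILY = {
    (10, 0): "B100/B200 (also GB200 on aarch64)",
    (10, 3): "GB300",
    (11, 0): "Thor",
    (12, 0): "RTX / RTX PRO",
    (12, 1): "Spark (GB10)",
}

# Thor reported SM 10.1 with the legacy nvgpu driver (CUDA 12.9) and
# reports SM 11.0 with the OpenRM driver (CUDA 13.0).
LEGACY_TO_CUDA13 = {(10, 1): (11, 0)}


def normalize_capability(capability, cuda_major):
    """Map a legacy (major, minor) pair to its CUDA 13 number.

    Pairs without a renumbering entry, and any pair under CUDA < 13,
    are returned unchanged.
    """
    if cuda_major >= 13:
        return LEGACY_TO_CUDA13.get(capability, capability)
    return capability


def family_name(capability):
    """Return the product name for a capability under CUDA 13 numbering,
    or None if the pair is not in the table above."""
    return CUDA13_BLACKWELL_FAMILY.get(normalize_capability(capability, 13))
```

On a machine with a GPU, the `(major, minor)` tuple returned by `torch.cuda.get_device_capability()` could be passed straight to `family_name`.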