GH-135379: Top of stack caching for the JIT. #135465
This is really cool. I'll do a full review soon enough.
78489ea to 2850d72
Performance is in the noise, but we would need a really big speedup of jitted code for it to be more than noise overall. The nbody benchmark, which spends a lot of time in the JIT, shows a 13-18% speedup, except on Mac, where it shows no speedup.
Nice. We use Apple's compiler for the interpreter, though the JIT uses stock LLVM. Thomas previously showed that the version of the Apple compiler we use is subject to huge fluctuations in performance due to a PGO bug.
Misc/NEWS.d/next/Core_and_Builtins/2025-06-20-16-03-59.gh-issue-135379.eDg89T.rst
I need to review the cases generator later.
Misc/NEWS.d/next/Core_and_Builtins/2025-06-13-13-32-16.gh-issue-135379.pAxZgy.rst
I kept track of some of the new changes and it LGTM. No rush, but when do you plan to merge this?
When you're done making the requested changes, leave the comment:
One comment below. The _LOAD_ATTR changes look fine to me.
Latest benchmarking results: https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20250813-3.15.0a0-6a85f95-JIT
2.6% faster on Windows. No change on Linux. It looks like coverage is slower on Linux, which is presumably some sort of artifact, as the coverage benchmark does lots of instrumentation, which prevents the JIT from running. (Plus, the coverage benchmark uses an old version of coverage; the latest version is much faster.)
The tail-calling CI seems to be failing because Homebrew changed where they install clang (yet again). Will put up a separate PR to fix that.
OK, I fixed the macOS CI on main. Please pull the changes in.
I thought that caching through side exits would speed things up, but it looks like it slows things down a bit, if anything. So I've reverted that change. Will rerun the benchmarks to confirm...
The stats need fixing and the generated tables could be more compact, but it works.