Efficiency upgrade on frame limiting #4385
…acing in ApplyQueuedFrameRateLimit
TL;DR: Main menu runs with arbitrary FPS limit now
- EnsureFrameRateLimitApplied no longer calls ApplyFrameRateLimit() from the Present path, to avoid skewed pacing; it now only queues the target rate for the timer hook.
- ApplyFrameRateLimit is simplified to just queue the requested rate.
- ApplyQueuedFrameRateLimit is now the single enforcement point: it uses high-precision timing, adaptive sleep/yield calibration, and minimal spinning to achieve stable pacing without wasting CPU.
- Added detailed comments clarifying responsibilities and caveats (e.g. the main menu not using the timer hook).
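The split of responsibilities listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `FrameLimiterSketch`, its members, and the injected `waitMs` callback are hypothetical, and only the three function names are taken from the description.

```cpp
#include <functional>
#include <utility>

// Hypothetical sketch: the class, its members and the injected wait callback
// are assumptions made for illustration, not the PR's actual implementation.
class FrameLimiterSketch
{
public:
    explicit FrameLimiterSketch(std::function<void(double)> waitMs) : m_waitMs(std::move(waitMs)) {}

    // Queue-only: remembers the requested rate and never waits.
    void ApplyFrameRateLimit(unsigned int fps) { m_queuedFps = fps; }

    // Present path: also only queues, so it cannot skew pacing.
    void EnsureFrameRateLimitApplied(unsigned int fps) { ApplyFrameRateLimit(fps); }

    // Single enforcement point, meant to be called once per frame from the timer hook.
    // elapsedMs is how long the current frame has taken so far.
    void ApplyQueuedFrameRateLimit(double elapsedMs)
    {
        if (m_queuedFps == 0)
            return;  // 0 is treated as "unlimited" in this sketch

        const double targetMs = 1000.0 / m_queuedFps;
        if (elapsedMs < targetMs)
            m_waitMs(targetMs - elapsedMs);
    }

private:
    std::function<void(double)> m_waitMs;
    unsigned int                m_queuedFps = 0;
};
```

With this shape, any code path that never reaches the enforcement call (such as the main menu, per the caveat above) ends up with no effective limit, because queuing alone does not wait.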
Tests on 5800X + 3080Ti
Your chart got my interest, so I took some readings of my own before and with your PR and... holy shit, with FPS limited, CPU usage was approximately halved.
So for me it didn't make any difference when using unlimited FPS, but there was an amazing reduction in CPU and GPU usage when limited. Is this CPU usage now freed up, so that players with low-performance systems could get a much better FPS because that CPU time is now available to actually compute things, or no? I dunno, it just seems too good to be true, because so many MTA players don't have good PCs, so this change would be monumental for them.
I've done some further testing with some high CPU load:

    crun setTimer(function() for i=1, 10000 do getElementData(localPlayer, "a") end end, 50, 0)

CPU and FPS were the same with and without this PR when under heavy load, when FPS was limited. However, when FPS was set to a max of 60, it remained at 60 FPS with the PR but dropped to 50 FPS on the standard build. Though is it possible that, even if the FPS counter reads higher, it isn't actually better, in the sense that the 10 extra frames may have tiny uneven gaps between them and so end up feeling less smooth?
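One way to check the smoothness question is to look at the spread of individual frame times instead of the averaged FPS counter. The following is a minimal sketch, assuming you can capture per-frame timestamps in milliseconds with some external tool; `FramePacingStats` and `AnalyzeFrameTimes` are hypothetical names, not part of MTA.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration only.
struct FramePacingStats
{
    double averageFps = 0.0;     // what an FPS counter would roughly show
    double meanFrameMs = 0.0;    // average gap between frames
    double stdDevFrameMs = 0.0;  // spread of the gaps: higher means less even pacing
    double worstFrameMs = 0.0;   // longest single gap
};

// timestampsMs holds the presentation time of each frame, in ascending order.
FramePacingStats AnalyzeFrameTimes(const std::vector<double>& timestampsMs)
{
    FramePacingStats stats;
    if (timestampsMs.size() < 2)
        return stats;

    const std::size_t intervals = timestampsMs.size() - 1;
    double            sum = 0.0;
    for (std::size_t i = 1; i < timestampsMs.size(); ++i)
    {
        const double delta = timestampsMs[i] - timestampsMs[i - 1];
        sum += delta;
        stats.worstFrameMs = std::max(stats.worstFrameMs, delta);
    }
    stats.meanFrameMs = sum / static_cast<double>(intervals);

    double squaredError = 0.0;
    for (std::size_t i = 1; i < timestampsMs.size(); ++i)
    {
        const double diff = (timestampsMs[i] - timestampsMs[i - 1]) - stats.meanFrameMs;
        squaredError += diff * diff;
    }
    stats.stdDevFrameMs = std::sqrt(squaredError / static_cast<double>(intervals));
    stats.averageFps = stats.meanFrameMs > 0.0 ? 1000.0 / stats.meanFrameMs : 0.0;
    return stats;
}
```

Two captures can report the same average FPS while one has a much larger standard deviation or worst frame time, and that one would be expected to feel less smooth.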
Yes. The old limiter wasted CPU cycles in a busy-wait loop after finishing a frame. That didn’t steal time from MTA's or SA’s own logic/rendering, but it did keep one core artificially busy doing nothing. TL;DR: I replaced a busy-wait loop (a tight loop that only occasionally calls Sleep) with a mixed approach so it doesn't burn so many CPU cycles.
It'd be nice to know your CPU + GPU combo.
i7-10700K + GTX 1070. My readings were based on a rough average of what Task Manager said.
Short answer: kinda, but it's more about not wasting CPU cycles than giving you free performance.

Long answer: The old frame limiter was genuinely awful (sorry, OG code dev). Look at this mess:

    while (true) {
        double dSpare = dTargetTimeToUse - m_FrameRateTimer.Get();
        if (dSpare <= 0.0) break;
        if (dSpare >= 10.0)
            Sleep(1); // Only sleep if >= 10ms left, otherwise just spin
    }

So let's walk through what happened on a typical frame. You want 30fps (33.3ms) and the game renders in, let's say, 4ms. Now you need to kill 29ms somehow to hit the frame target. First it burns through about 19ms calling Sleep(1) over and over, then it busy-spins through the last 10ms.

Also, the lower the FPS target, the more wasted cycles, because you're waiting longer. 30fps wastes way more CPU than 144fps. It's true that on lower-end systems the render time can be longer, so there's less time to fill, but the same waste applies to whatever time remains.

The new approach is actually intelligent:

    // Learn the sleep overhead, then sleep for most of the wait in ONE call
    if (remaining_ms > sleep_overhead + 0.5) {
        Sleep(static_cast<DWORD>(remaining_ms - sleep_overhead));
    }
    // For medium waits, yield instead of spinning
    else if (remaining_ms > yield_overhead + 0.05) {
        Sleep(0);
    }
    // Only spin for the final microseconds where precision matters
    if (still_need_to_wait) {
        if (remaining_ms > 0.02) {
            for (int i = 0; i < 10; i++) _mm_pause(); // Batched pauses
        } else {
            _mm_pause(); // Ultra-tight final loop
        }
    }

So: ONE bulk Sleep, then Sleep(0) for medium waits, and finally _mm_pause() for the last few microseconds. One of the key insights is that it measures how long Sleep() actually takes on your machine and adapts, instead of guessing. Plus it uses high-precision timing.

Will this give you +90 FPS? Nah. But it stops your CPU from pointlessly burning cycles just to wait around. The freed-up CPU time goes back to the OS for other stuff: better multitasking, less heat, more headroom for background processes. If you were already CPU-bottlenecked somewhere else, this won't magically fix that. But if frame limiting was eating cycles for no reason, yeah, those cycles are available now.
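To make the three phases concrete, here is a portable sketch of the same idea. It is not the PR's code: it swaps the Win32 `Sleep()`, `Sleep(0)` and `_mm_pause()` calls for `std::this_thread::sleep_for`, `std::this_thread::yield` and a plain spin, and the class name `HybridWaiter`, the 0.5 ms / 0.05 ms thresholds as used here, and the 0.9/0.1 smoothing factors are all assumptions.

```cpp
#include <algorithm>
#include <chrono>
#include <thread>

// Hypothetical sketch: names, thresholds and smoothing factors are assumptions.
class HybridWaiter
{
public:
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;

    // Blocks until `deadline` and returns how late we woke up, in ms (>= 0).
    double WaitUntil(Clock::time_point deadline)
    {
        // Phase 1: one bulk sleep, shortened by the overshoot learned so far.
        const double remaining = Ms(deadline - Clock::now()).count();
        if (remaining > m_sleepOvershootMs + 0.5)
        {
            const double request = remaining - m_sleepOvershootMs;
            const auto   before = Clock::now();
            std::this_thread::sleep_for(Ms(request));
            const double actual = Ms(Clock::now() - before).count();

            // Moving average of how much longer the sleep took than requested.
            const double overshoot = std::max(0.0, actual - request);
            m_sleepOvershootMs = 0.9 * m_sleepOvershootMs + 0.1 * overshoot;
        }

        // Phase 2: give up the time slice while a medium-sized gap remains.
        while (Ms(deadline - Clock::now()).count() > 0.05)
            std::this_thread::yield();

        // Phase 3: short spin for the last few microseconds
        // (a real implementation would likely issue a pause instruction here).
        while (Clock::now() < deadline)
        {
        }

        return Ms(Clock::now() - deadline).count();
    }

    double LearnedOvershootMs() const { return m_sleepOvershootMs; }

private:
    double m_sleepOvershootMs = 1.0;  // initial guess, refined after each bulk sleep
};
```

A caller would compute the next frame deadline from the target frame time and pass it to `WaitUntil` once per frame. The learned overshoot is what lets the bulk sleep cover most of the wait without running past the deadline.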
I get the skepticism, but honestly this was just really bad code that got fixed. Think about it: if you're running 30fps on a potato PC, the old limiter was literally making dozens of unnecessary system calls per frame just to wait around, then spinning the CPU for whatever was left. That's hundreds of wasted Sleep() calls every second, plus constant busy-waiting. On weak hardware it actually matters more, because every wasted cycle hurts. The improvement scales with how low your FPS cap is... So yeah, players stuck at 30fps because their PC sucks will see the biggest benefit. Not necessarily higher FPS, but way less pointless CPU waste that was competing with the actual game.
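For a rough sense of scale, the old loop can be modelled with simple arithmetic. This sketch assumes every `Sleep(1)` call takes exactly `sleepCostMs` milliseconds, which is a simplification because real durations depend on the system timer resolution; `OldLimiterCost` and `EstimateOldLimiterCost` are hypothetical names.

```cpp
// Hypothetical model of the old limiter loop, for illustration only.
struct OldLimiterCost
{
    int    sleepCalls = 0;  // Sleep(1) calls issued per frame
    double spinMs = 0.0;    // time spent busy-waiting per frame
};

// Assumes each Sleep(1) takes exactly sleepCostMs milliseconds (an assumption;
// the real duration depends on the system timer resolution).
OldLimiterCost EstimateOldLimiterCost(double targetFps, double renderMs, double sleepCostMs = 1.0)
{
    OldLimiterCost cost;
    double         spare = 1000.0 / targetFps - renderMs;

    // Mirrors the old loop: sleep while at least 10 ms are left.
    while (spare >= 10.0)
    {
        spare -= sleepCostMs;
        ++cost.sleepCalls;
    }

    // Whatever remains below the 10 ms threshold was spent spinning.
    if (spare > 0.0)
        cost.spinMs = spare;
    return cost;
}
```

Under these assumptions, `EstimateOldLimiterCost(30.0, 4.0)` gives 20 sleep calls and about 9.3 ms of spinning per frame, which works out to roughly 600 `Sleep(1)` calls and 280 ms of busy-waiting per second at 30 FPS.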
On the other hand, there is an issue with how MTA tried to frame-limit: it had two code paths, and I disabled one of them. That had no real impact except on the main menu's frame limit, because the main menu doesn't go through CTimer (the game is "technically" not using the timer until it's connected). I need to find a proper way to fix the other code path...
Queue-only ApplyFrameRateLimit, centralized pacing in ApplyQueuedFrameRateLimit
Caveats: Main menu runs with arbitrary FPS limit now (until we figure out how to FPS-limit that code path)