Efficiency upgrade on frame limiting #4385

Draft: 1 commit to be merged into base: master

PerikiyoXD

@PerikiyoXD PerikiyoXD commented Aug 23, 2025

Queue-only ApplyFrameRateLimit, centralized pacing in ApplyQueuedFrameRateLimit

Caveat: the main menu now runs with an arbitrary FPS limit (until we figure out how to FPS-limit that codepath).

  • EnsureFrameRateLimitApplied no longer calls ApplyFrameRateLimit() from Present path to avoid skewed pacing; now only queues the target rate for the timer hook.
  • ApplyFrameRateLimit simplified to just queue the requested rate.
  • ApplyQueuedFrameRateLimit is now the single enforcement point: uses high-precision timing, adaptive sleep/yield calibration, and minimal spin to achieve stable pacing without wasting CPU.
  • Added detailed comments clarifying responsibilities and caveats (e.g. Main Menu not using timer hook).
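
The split described in the bullets above can be pictured roughly like this. It is a minimal sketch, not the PR's code: the class name `FrameLimiterSketch`, its members, the injected `waitMs` callback, and the choice to consume the queued rate once per frame are all hypothetical. Only the idea (callers queue a rate, one function enforces it) comes from the description.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical illustration of a "queue-only" limiter: any code path may
// request a rate, but only one function ever waits.
class FrameLimiterSketch {
public:
    // waitMs is injected so the sketch stays testable; a real limiter would
    // block here using its own timing code.
    explicit FrameLimiterSketch(std::function<void(double)> waitMs)
        : m_waitMs(std::move(waitMs)) {
        assert(static_cast<bool>(m_waitMs));
    }

    // Counterpart of "just queue the requested rate": no waiting happens here.
    // A value of 0 means "nothing queued / unlimited" in this sketch.
    void QueueRate(unsigned fps) { m_queuedFps = fps; }

    // Counterpart of the single enforcement point. elapsedMs is how long the
    // current frame has taken so far. Returns the wait that was requested.
    double ApplyQueued(double elapsedMs) {
        if (m_queuedFps == 0) return 0.0;
        const double targetMs = 1000.0 / m_queuedFps;
        m_queuedFps = 0;  // consume the request (assumed behaviour)
        const double remainingMs = targetMs - elapsedMs;
        if (remainingMs <= 0.0) return 0.0;  // frame is already over budget
        m_waitMs(remainingMs);
        return remainingMs;
    }

private:
    std::function<void(double)> m_waitMs;
    unsigned m_queuedFps = 0;
};
```

Because the wait lives in one place, queuing a rate from a second code path (for example the Present path) overwrites the pending request instead of adding a second wait on top of it.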

@PerikiyoXD
Author

Tests on 5800X + 3080Ti

| FPS cap | 1.6 release build (CPU) | #4385 changes, debug build (CPU) |
|---|---|---|
| 144 | 7.2% ± 0.3% | 2.0% ± 0.2% |
| 60 | 5.0% ± 0.2% | 1.0% ± 0.2% |
| 30 | 2.8% ± 0.2% | 0.9% ± 0.2% |
| Uncapped | 9.3% @ 450 fps | 7.8% @ 922 fps |

@ArranTuna
Collaborator

ArranTuna commented Aug 23, 2025

Your chart got my interest, so I took some readings of my own before and with your PR and... holy shit with FPS limited, CPU usage was approximately halved.

| FPS setting | CPU | GPU |
|---|---|---|
| Normal unlimited (197) | 11% | 34% |
| PR unlimited (197) | 11% | 29% |
| Normal limited 60 | 9.2% | 25% |
| PR limited 60 | 4.7% | 18% |

So for me it didn't make any difference when using unlimited FPS. But there's an amazing reduction in CPU and GPU usage when limited.

Is this CPU usage now freed up so that players with low performance systems are now potentially going to get a much better FPS because that CPU time is now available to actually compute things, or no? I dunno, it just seems too good to be true, because so many MTA players don't have good PCs, so this change would be monumental for them.

@ArranTuna
Collaborator

I've done some further testing, with some high CPU load:

crun setTimer(function() for i=1, 10000 do getElementData(localPlayer, "a") end end, 50, 0)

CPU and FPS were the same with and without this PR when under heavy load, when FPS was limited. However when FPS was set to a max of 60, it remained 60 FPS with the PR, but was 50 FPS on standard. Though is it possible that, although the FPS counter reads higher, it isn't actually better, in the sense that the 10 extra frames may be unevenly spaced and so might not feel any smoother?
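
One way to check that concern is to look at the spread of frame times rather than the FPS counter alone. The sketch below is illustrative only: `FrameStats`, `SummarizeFrameTimes`, and the sample numbers in the note after it are made up for this example, not taken from the PR or from MTA.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct FrameStats {
    double meanMs;    // average frame time
    double stddevMs;  // spread; larger means less even pacing
    double maxMs;     // worst single frame
};

// Summarize a list of per-frame durations given in milliseconds.
FrameStats SummarizeFrameTimes(const std::vector<double>& frameMs) {
    assert(!frameMs.empty());
    double sum = 0.0;
    double maxMs = frameMs[0];
    for (double ms : frameMs) {
        sum += ms;
        if (ms > maxMs) maxMs = ms;
    }
    const double mean = sum / static_cast<double>(frameMs.size());

    double squared = 0.0;
    for (double ms : frameMs) squared += (ms - mean) * (ms - mean);
    const double stddev = std::sqrt(squared / static_cast<double>(frameMs.size()));

    return FrameStats{mean, stddev, maxMs};
}
```

Two captures can report the same average and still feel different: frame times of 16.7, 16.7, 16.7, 16.7 ms and of 8, 25.4, 8, 25.4 ms both average 16.7 ms, but the second has a standard deviation of about 8.7 ms.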

@PerikiyoXD
Author

PerikiyoXD commented Aug 23, 2025

> Is this CPU usage now freed up so that players with low performance systems are now potentially going to get a much better FPS because that CPU time is now available to actually compute things, or no?

Yes. The old limiter wasted CPU cycles in a busy-wait loop after finishing a frame. That didn’t steal time from MTA's or SA’s own logic/rendering, but it did keep one core artificially busy doing nothing.

TL;DR: I replaced a busy-wait (a tight loop that polls the timer and only calls Sleep when plenty of time is left) with a mixed approach that doesn't burn so many CPU cycles.

> Your chart got my interest, so I took some readings of my own before and with your PR and... holy shit with FPS limited, CPU usage was approximately halved.
>
> | FPS setting | CPU | GPU |
> |---|---|---|
> | Normal unlimited (197) | 11% | 34% |
> | PR unlimited (197) | 11% | 29% |
> | Normal limited 60 | 9.2% | 25% |
> | PR limited 60 | 4.7% | 18% |

It'd be nice to know your CPU + GPU combo.

@ArranTuna
Collaborator

> It'd be nice to know your CPU + GPU combo.

i7-10700K + GTX 1070

My readings were based on a rough average of what task manager said.

@PerikiyoXD
Author

> Is this CPU usage now freed up so that players with low performance systems are now potentially going to get a much better FPS because that CPU time is now available to actually compute things, or no?

Short answer: kinda, but it's more about not wasting CPU cycles than giving you free performance.

Long answer:

The old frame limiter was genuinely awful (Sorry OG code dev). Look at this mess:

while (true) {
    double dSpare = dTargetTimeToUse - m_FrameRateTimer.Get();
    if (dSpare <= 0.0) break;
    if (dSpare >= 10.0)
        Sleep(1);  // Only sleep if >= 10ms left, otherwise just spin
}

So let's walk through what happened on a typical frame. You want 30 fps (33.3 ms per frame) and the game renders in, let's say, 4 ms. Now you need to kill about 29 ms somehow to hit the frame target.

First it burns through about 20 ms calling Sleep(1) over and over... that's around 20 separate system calls, assuming each Sleep(1) really takes 1 ms. Then when you're down to the last 9 ms or so? It just sits there spinning the CPU, constantly checking m_FrameRateTimer.Get() in a tight loop until time runs out. This is the CPU burning I'm fixing.

Also, the lower the FPS target, the more cycles are wasted per frame, because you're waiting longer. A 30 fps frame wastes way more CPU than a 144 fps frame.

It's true that on lower-end systems the render time is longer, so there's less time to fill, but the same waste applies to whatever time remains.
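
The walk-through above can be reproduced with a small simulation of the old loop. This is a sketch under a loud assumption: every Sleep(1) is modelled as taking exactly `sleepCostMs` (1 ms by default), which Windows does not guarantee; depending on the system timer resolution a Sleep(1) can last considerably longer. `SimulateOldLimiter` and `OldLimiterCost` are names invented for this example.

```cpp
#include <cassert>

struct OldLimiterCost {
    int sleepCalls;  // how many times Sleep(1) would be called
    double spinMs;   // time spent busy-waiting after the last sleep
};

// Model one frame of the old loop: sleep while >= 10 ms remain, then spin.
// ASSUMPTION: each Sleep(1) consumes exactly sleepCostMs of the spare time.
OldLimiterCost SimulateOldLimiter(double targetMs, double renderMs,
                                  double sleepCostMs = 1.0) {
    assert(sleepCostMs > 0.0);
    OldLimiterCost cost{0, 0.0};
    double spareMs = targetMs - renderMs;
    while (spareMs >= 10.0) {  // same threshold as the old code
        spareMs -= sleepCostMs;
        ++cost.sleepCalls;
    }
    if (spareMs > 0.0) cost.spinMs = spareMs;  // the rest is a tight spin
    return cost;
}
```

For the 30 fps example (33.3 ms target, 4 ms render) this model gives 20 Sleep(1) calls and about 9.3 ms of spinning per frame, which works out to roughly 600 Sleep calls and 279 ms of spinning per second.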

The new approach is actually intelligent:

// Learn the sleep overhead, then sleep for most of the wait in ONE call
if (remaining_ms > sleep_overhead + 0.5) {
    Sleep(remaining_ms - sleep_overhead);
}
// For medium waits, yield instead of spinning
else if (remaining_ms > yield_overhead + 0.05) {
    Sleep(0);
}
// Only spin for the final microseconds where precision matters
if (still_need_to_wait) {
    if (remaining_ms > 0.02) {  // more than 20 microseconds left
        for (int i = 0; i < 10; i++) _mm_pause();  // Batched pauses
    } else {
        _mm_pause();  // Ultra-tight final loop
    }
}

So, ONE bulk Sleep, then Sleep(0) for medium waits, and finally _mm_pause() for the last few microseconds.
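
For anyone who wants to try the idea outside MTA, here is a portable sketch of the same three-stage wait using only the C++ standard library. It is not the PR's implementation: `WaitUntil`, the `sleepOverhead` parameter, and the thresholds are assumptions, `std::this_thread::sleep_for` and `std::this_thread::yield` stand in for `Sleep(ms)` and `Sleep(0)`, and the final spin just re-reads the clock instead of calling `_mm_pause()` so that it stays portable.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Block until `deadline`: sleep for the bulk of the wait, yield for medium
// waits, and spin only for the last few microseconds.
// sleepOverhead is how much a sleep is expected to overshoot (assumed here,
// not measured).
void WaitUntil(Clock::time_point deadline,
               std::chrono::microseconds sleepOverhead =
                   std::chrono::microseconds(1500)) {
    assert(sleepOverhead.count() >= 0);
    for (;;) {
        const auto remaining = deadline - Clock::now();
        if (remaining <= Clock::duration::zero()) return;  // deadline reached

        if (remaining > sleepOverhead + std::chrono::microseconds(500)) {
            // Stage 1: bulk sleep, shortened by the expected overshoot.
            std::this_thread::sleep_for(remaining - sleepOverhead);
        } else if (remaining > std::chrono::microseconds(50)) {
            // Stage 2: give up the time slice instead of spinning.
            std::this_thread::yield();
        }
        // Stage 3: otherwise loop straight back and re-check the clock.
    }
}
```

A caller would compute the deadline once per frame, for example `frameStart + std::chrono::microseconds(16667)` for a 60 fps cap, and pass it in after rendering.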

One of the key insights is that it measures how long Sleep() actually takes on your machine and it adapts. Instead of guessing "Sleep(1) takes 1ms", it tracks the real overhead and somewhat compensates.
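
One simple way to picture "learning" the overhead is an exponential moving average of how much each sleep overshoots. This is an assumed technique for illustration; the comment above does not say which estimator the PR uses, and `SleepOverheadEstimator`, its starting value, and its smoothing factor are hypothetical.

```cpp
#include <cassert>

// Hypothetical estimator: tracks how far sleeps overshoot their request.
class SleepOverheadEstimator {
public:
    explicit SleepOverheadEstimator(double initialMs = 1.0, double alpha = 0.2)
        : m_overheadMs(initialMs), m_alpha(alpha) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Feed one observation: what was requested and how long it really took.
    void Observe(double requestedMs, double actualMs) {
        double overshootMs = actualMs - requestedMs;
        if (overshootMs < 0.0) overshootMs = 0.0;  // ignore early wake-ups
        // Exponential moving average: recent samples weigh more.
        m_overheadMs += m_alpha * (overshootMs - m_overheadMs);
    }

    double OverheadMs() const { return m_overheadMs; }

    // How long to ask for so the wake-up lands close to remainingMs.
    double SleepRequestMs(double remainingMs) const {
        const double requestMs = remainingMs - m_overheadMs;
        return requestMs > 0.0 ? requestMs : 0.0;
    }

private:
    double m_overheadMs;
    double m_alpha;
};
```

With a starting estimate of 1 ms and a smoothing factor of 0.5, a sleep that was asked for 10 ms and took 12 ms moves the estimate to 1.5 ms, so a 29.3 ms wait would then be requested as 27.8 ms.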

Plus it uses QueryPerformanceCounter for microsecond precision instead of that janky millisecond timer.

Will this give you +90 FPS? Nah. But it stops your CPU from pointlessly burning cycles just to wait around. The freed up CPU time goes back to the OS for other stuff like better multitasking, less heat, more headroom for background processes.

If you were already CPU bottlenecked somewhere else, this won't magically fix that. But if frame limiting was eating cycles for no reason, yeah those cycles are available now.

> I dunno, it just seems too good to be true, because so many MTA players don't have good PCs, so this change would be monumental for them.

I get the skepticism but honestly this was just really bad code that got fixed.

Think about it: if you're running 30fps on a potato PC, the old limiter was doing up to around 20 unnecessary system calls per frame just to wait around (going by the 4 ms render example, and assuming each Sleep(1) really takes 1 ms), then spinning the CPU for whatever was left.

That's on the order of 600 wasted Sleep() calls per second, plus constant busy-waiting.

On weak hardware it actually matters more because every wasted cycle hurts.

The improvement scales with how low your FPS cap is...

So yeah, players stuck at 30fps because their PC sucks will see the biggest benefit. Not necessarily higher FPS, but way less pointless CPU waste that was competing with the actual game.

@PerikiyoXD
Author

PerikiyoXD commented Aug 23, 2025

On the other hand, there is this issue with how MTA tried to frame-limit... It had two codepaths and I disabled one, with no real impact except on the main menu frame limit, which doesn't go through CTimer because the game is "technically" not using the timer until it's connected.

I need to find a proper way to fix the other codepath...
