GPUs "Fake Frames" Tested | DLSS 4.0, MFG 4X, & NVIDIA's Misleading Review Guide February 12, 2025 Last Updated: 2025-02-12 We take a look at NVIDIA’s DLSS 4.0 multi-frame generation, NVIDIA's weird decisions on testing tools, differences between transformer vs. CNN, and more The Highlights DLSS 4.0 has added Multi Frame Generation 3X and 4X and has changed the underlying model for the other DLSS featuresFor its FrameView tool, NVIDIA incorporated some outdated PresentMon code while not meaningfully contributing to the open-source softwareThe ideal scenario for frame gen is to make already acceptable framerates into extra-smooth framerates on high refresh displays Table of Contents AutoTOC Grab a GN15 Large Anti-Static Modmat to celebrate our 15th Anniversary and for a high-quality PC building work surface. The Modmat features useful PC building diagrams and is anti-static conductive. Purchases directly fund our work! (or consider a direct donation or a Patreon contribution!) Intro [Update: as of 2/4/2025, we've confirmed that NVIDIA has reached out to Intel regarding PresentMon contributions.] As simply as possible, NVIDIA is being a dick. NVIDIA has inducted open-source software into its FrameView tool, made its own "improvements" without, as far as we can tell, contributing those same changes to the original source code base, and then distributed outdated information to the press in a 100+ page review guide talking about how to properly test its devices. While NVIDIA has some valid points in how to measure generated frames, the biggest counterpoint is that their statements are based on year-old software that already made a major shift away from the msBetweenPresents metric that NVIDIA is now contesting. The license allows NVIDIA to act this way, but it’s just not productive and can actually be detrimental to testing efforts. Intel has told us that NVIDIA is welcome to contribute to the PresentMon tool and play with everyone else, but NVIDIA is doing NVIDIA things. Editor's note: This was originally published on February 1, 2025 as a video. This content has been adapted to written format for this article and is unchanged from the original publication. Credits Test Lead, Host, Writing Steve Burke Testing, Writing Patrick Lathan Video Editing Vitalii Makhnovets Writing, Web Editing Jimmy Thang As for why this is happening: It’s all relevant because of frame generation and multi-frame generation, which is the topic we’re digging into today. Whatever you want to call them -- generated frames, artificial frames, or “fake frames” -- they are very difficult to test and measure accurately. We’re only scratching the surface of this complicated subject today. This article will first define what DLSS 4.0 actually is and what it’s inclusive of, we’ll cover benchmarking methodology concerns and our issues with NVIDIA’s framing of it (plus the things they get right), and then we’ll present some performance data evaluating NVIDIA’s new models. There is something interesting here, and we absolutely think frame generation is worth talking about as a smoothing method but it’s going to take a couple of stories to get through as it’s complicated. Overview Despite the problems we have with NVIDIA’s PresentMon framing, DLSS 4.0 is worth diving into. NVIDIA seems aware of how confusing the rollout of the new DLSS features is going to be, and its mix of DLSS version numbering and GPU generation support make it complicated to follow. 
You may have seen this chart before that details the feature set and what does and doesn't work across various RTX series GPUs. Some of this is segmentation, some of it is hardware, and some of it is potentially temporary segmentation that could be changed.

DLSS Overview

Taking a step back, DLSS stands for Deep Learning Super Sampling. That name was already strange seven years ago when it was introduced, because supersampling usually means rendering above the native resolution of a display and downscaling. For example, rendering your game at 4K on a 1080p screen would result in an expensive but crisp image with excellent anti-aliasing. DLSS is upscaling, which is the opposite. For DLSS, NVIDIA trained an algorithm on sets of extremely high resolution screenshots on a per-game basis, and then used that algorithm for real-time upscaling: you could (for example) render your game at 1080p on a 4K screen and get a cheap but mostly okay-looking image. The use of high resolution screenshots for training is what made it kind-of-sort-of supersampling; DLSS is no longer trained per-game, but the name has stuck around. All of this was relatively straightforward until DLSS 3.0 came out in 2022, which introduced Optical Multi-Frame Generation. OMFG continues NVIDIA's propensity for initialisms like MFAA; it was later referred to as Frame Generation 2X or just plain Frame Generation, because now DLSS 4 has Multi Frame Generation. With this, the original functionality of DLSS was renamed DLSS Super Resolution. Now we're at a point where NVIDIA could recommend, for example, enabling Deep Learning Super Sampling Super Resolution Ultra Performance. As for Frame Generation, it falls under the DLSS umbrella even though it's not directly tied to supersampling or upscaling. DLSS 3.5 added ray reconstruction to the DLSS family, and now DLSS 4.0 has added Multi Frame Generation 3X and 4X and has changed the underlying model for the other DLSS features. So, DLSS 4, DLSS 3.5, DLSS 3, et al. are groups of features and updates. If a game has "DLSS 4," that means it has some of the DLSS 4 features, most likely MFG, but not necessarily all of them. The claims that, with the 50 series, AI handles "15 out of every 16 pixels" to "multipl[y] frame rates by up to 8X" are based on using the maximum 4X MFG setting with DLSS Super Resolution set to Performance, which was formerly the minimum quality setting. That would mean that at 4K, one 1080p frame would be rendered for every four frames displayed: a 1080p frame contains one quarter of the pixels of a 4K frame, and only one of every four displayed frames is rendered, which is where the 1-in-16 rendered pixels figure comes from.

Multi Frame Generation

MFG is available for the 50-series cards; as of this writing, no other cards support Multi Frame Generation. Regular frame generation can be supported elsewhere, but MFG is unique to the 50 series right now. It's also the centerpiece of NVIDIA's misleading marketing slides claiming 2x performance gains of the 5090 over the 4090. That chart is misleading because MFG includes 4X (up to three AI frames for every one rendered) and 3X (up to two AI frames for every one rendered) modes. The existing Frame Generation (2X) mode has been improved, but remains exclusive to 40-series and newer, and FG, in general, still only works in games that implement it. To paraphrase NVIDIA, the original implementation of Frame Generation combined four elements: a previous frame A, a current frame B, an "optical flow field generated by Ada's Optical Flow Accelerator," and data provided directly from the game engine about motion and other elements.
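As a rough mental model only (this is our own toy illustration, not NVIDIA's implementation), the sketch below shows where generated frames sit in time between the two rendered frames. Real Frame Generation warps pixels using the optical flow field and engine-supplied motion data rather than blending, so treat this purely as a timing illustration.

```python
import numpy as np

def naive_interpolate(frame_a: np.ndarray, frame_b: np.ndarray, n_generated: int) -> list[np.ndarray]:
    """Toy stand-in for frame generation: produce n_generated frames between A and B.

    Real DLSS Frame Generation uses the optical flow field and engine motion
    vectors to move pixels along their motion paths; a plain cross-fade like
    this would smear anything that moves. It only illustrates how many frames
    are synthesized and where they sit in time (1 for FG 2X, 3 for MFG 4X).
    """
    generated = []
    for i in range(1, n_generated + 1):
        t = i / (n_generated + 1)  # 0.5 for 2X; 0.25, 0.5, 0.75 for 4X
        generated.append((1 - t) * frame_a + t * frame_b)
    return generated
```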
Using these inputs, the card could throw together a so-called "fake" frame, or a generated one, between A and B at the cost of queueing up frame B for a little while. Now of course, it'd be ridiculous and a fundamental misrepresentation of the nature of frame generation if you were to get on stage and say something untrue like "it can predict the future." It doesn't extrapolate into the future; it interpolates (intelligently) into the past. This gets confusing if you think about it too much. This creates the feeling of a smoother framerate with some added latency, which is why NVIDIA Reflex has always been a mandatory setting when enabling Frame Generation, a feature that we've covered in detail. Reflex clears the render queue so that frames are delivered with as little latency as possible, which helps mitigate the latency inherent to frame generation. Fundamentally, it seems that Frame Generation, in general, is not for turning low framerates into playable framerates: it's for turning playable framerates into high framerates. This is still true for MFG; as NVIDIA stated to Digital Foundry, "the acceptable input framerate is still about the same for 3X or 4X as it was for 2X." Higher input rates mean fresher data for Frame Generation, smaller gaps to bridge, and less time to notice flaws. That's the theory. MFG is still bridging the same gap between the newest rendered frame and the previous one; it's just doing it with up to three AI frames rather than one. On the plus side, this theoretically means that latency shouldn't be worse with MFG than with regular FG. If you have a 60Hz monitor, it appears that you are not the target audience. There's no point in using Frame Generation to boost past your monitor's refresh rate. That's not just our opinion: rendering at rates higher than the display refresh rate can reduce latency, but generating frames at a higher rate is just wasting compute power on frames you won't see. This is why NVIDIA's DLSS 4 demos have topped out around 240 FPS. As for the Optical Flow Accelerator, with DLSS 4, Ada's hardware Optical Flow Accelerator is out of the picture. DLSS 4 Frame Generation is solely handled by regular Tensor cores, which is why Digital Foundry asked that question about Frame Generation (not MFG) potentially coming to more cards. NVIDIA claims that the new model is faster and uses less VRAM, which makes it feel like the only point of the OFA was to make frame gen exclusive to the 40 series. MFG is not possible using the old model since the old approach "would be required for every new generated frame, and the performance cost would throttle the GPU, resulting in lower input frame rates." NVIDIA did confirm to us that Blackwell, which is the new architecture for the 50 series, does have a hardware OFA unit, and we presume that it'll be used for backwards compatibility with games that don't implement the new framegen model. Time to get into flip metering and some issues we've had with NVIDIA's flawed understanding of PresentMon, plus NVIDIA's stated reason for restricting MFG to the 50 series.

Flip Metering

NVIDIA has given two reasons for locking MFG to the 50 series: first, the 5th generation Tensor cores are significantly improved, and second, Blackwell has hardware support for Flip Metering. Flip Metering shifts responsibility for frame pacing from the CPU to the GPU driver in order to effectively pace out fake frames (or generated ones), which has become more important with MFG. This takes place immediately prior to scanout.
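To illustrate what even pacing looks like in practice, here's a minimal sketch (our own simplification, not NVIDIA's scheduler) of the present timestamps you'd want for generated frames within one rendered-frame interval. Flip metering's job is essentially to hit these evenly spaced targets very late in the pipeline instead of leaving the spacing to the CPU.

```python
def ideal_present_times(t_render_a_ms: float, t_render_b_ms: float, mfg_factor: int) -> list[float]:
    """Evenly spaced display times for one rendered-frame interval.

    mfg_factor is the displayed-to-rendered ratio (2 for FG 2X, 4 for MFG 4X).
    Returns the timestamps (ms) at which the generated frames and frame B
    would ideally be flipped to the display, assuming perfectly even pacing
    and ignoring the queueing delay on frame B.
    """
    interval = (t_render_b_ms - t_render_a_ms) / mfg_factor
    return [t_render_a_ms + i * interval for i in range(1, mfg_factor + 1)]

# Example: frames rendered at 0 ms and 33.3 ms (~30 FPS input) with MFG 4X
# -> flips at ~8.3, 16.7, 25.0, and 33.3 ms, i.e. a ~120 FPS output cadence.
print(ideal_present_times(0.0, 33.3, 4))
```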
Again, Blackwell's implementation of Flip Metering is hardware based, so it can't be ported to 40-series as-is. This is where NVIDIA's flawed understanding of PresentMon comes in. PresentMon is a fully open-source utility that is maintained by Intel, but can be contributed to by anybody. Other companies like AMD have worked with Intel on maintaining and advancing parts of PresentMon. PresentMon measures performance. It captures data. It's able to work at a lower level with Windows and helps to analyze frametimes, framerate performance, animation error, and more. It is the backbone of most of the performance analysis software that's out there. Even if you don't use PresentMon, if you're an enthusiast who tries to measure your performance, you might use something else that does. CapFrameX is a reskin of PresentMon, and RTSS also appears to use PresentMon. We just use the core command line version of it, but recently helped JayzTwoCents start using the user interface version of PresentMon. Intel's Tom Petersen has joined us several times to explain advancements to it and how it works. It is one of the most trusted testing utilities for measuring component performance, and, again, its open source nature is critical to that, especially as it's maintained by Intel. Unfortunately, NVIDIA has decided not to contribute to and maintain PresentMon in a significant way, and that positions us like a marriage counselor talking to Intel on the PresentMon side and NVIDIA on the "they-don't-understand-how-it-works-but-they-sure-have-some-opinions" side. NVIDIA has instead taken an ancient version of PresentMon core and inducted it into its FrameView application. It then proceeded to make adjustments to that code, but as far as we can tell, NVIDIA has elected not to contribute to the open-source project it took from. This has resulted in a bifurcation of major versions between applications. This is a big problem and NVIDIA is sort of at the heart of it. Related to Flip Metering, NVIDIA has pushed hard for a switch from PresentMon's MsBetweenPresents metric to the MsBetweenDisplayChange metric. While we understand why they're trying to do this, we think this is a flawed premise. That's because PresentMon already switched off of MsBetweenPresents, and in fact, began doing so around a year ago when Tom Petersen joined us to talk about the change. NVIDIA is pushing for a change that everyone except NVIDIA already made, as NVIDIA was running an older version of PresentMon in its FrameView tool. That'd be like saying "Y'know, I think it's about time we consider electricity instead of candles." We think NVIDIA's FrameView software is most likely based on PresentMon 1.10.0 from February of 2024, judging by the results output. The current version of PresentMon (2.3.0) no longer supports either of the metrics NVIDIA is talking about under those names, switching instead to a FrameTime metric based on CPUBusy and CPUWait (which NVIDIA says is also inadequate). MsBetweenPresents is an ancient metric inherited from FRAPS, which was basically the only frametime tracking tool around when PresentMon began development. It tracks the time between Present calls, which tell the graphics engine to finish up and put the frame on the screen, but it doesn't reflect animation time or time on screen. MsBetweenDisplayChange ALSO doesn't track animation time.
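For readers following along with their own captures, here's a minimal sketch of how we think about pulling a frametime series out of these logs. The column names are assumptions based on the PresentMon 2.x and 1.x CSV output described above, so check the headers your version actually writes.

```python
import csv

def load_frametimes_ms(csv_path: str) -> list[float]:
    """Pull one frametime value per row from a PresentMon-style CSV.

    Assumed columns: "FrameTime" (or "CPUBusy" + "CPUWait") in 2.x-era logs;
    "MsBetweenDisplayChange" in 1.x-era and FrameView logs. Adjust these to
    match whatever headers (and capitalization) your capture tool writes.
    """
    frametimes = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("FrameTime"):
                frametimes.append(float(row["FrameTime"]))
            elif row.get("CPUBusy") and row.get("CPUWait"):
                # PresentMon 2.x derives FrameTime from these two components.
                frametimes.append(float(row["CPUBusy"]) + float(row["CPUWait"]))
            elif row.get("MsBetweenDisplayChange"):
                frametimes.append(float(row["MsBetweenDisplayChange"]))
    return frametimes

def average_fps(frametimes_ms: list[float]) -> float:
    """Average FPS over the capture: total frames divided by total time."""
    return 1000 * len(frametimes_ms) / sum(frametimes_ms)
```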
Intel defines MsBetweenDisplayChange as "How long the previous frame was displayed before this Present() was displayed, in milliseconds," versus msBetweenPresents, which was "the time between this Present() call and the previous one, in milliseconds." In contrast, the current PresentMon FrameTime metric is the best estimate of the time between the game taking snapshots of its physics engine. Despite NVIDIA's decision to not participate in supporting the tool everyone else uses and instead maintain its own out-of-date toys, it does have one good point in all of this. The valid part is that situations like MFG have a problem, where there's external frame metering very late in the pipeline that may make the on-screen framerate smoother than MsBetweenPresents would imply. Using MsBetweenDisplayChange does allow NVIDIA's flip metering to be taken into account, so it makes sense to use it specifically for tracking NVIDIA's frame generation. But we don't think MsBetweenDisplayChange is sufficient, either, and that's what NVIDIA is telling everyone to use. We mentioned animation error, which we've talked about with Tom Petersen in the past: just because frames are delivered at an even pace doesn't mean that the things being shown in those frames have been animated at an even pace. Animation times and flip times must be consistent. This was a problem with SLI and CrossFire, where frames could be delivered at a constant rate, but could appear stuttery anyway because animation timing didn't line up with the flip timing. Ideally, in the long term, we'd move towards using FPS as a metric for raw horsepower and animation error as a metric for smoothness, so to speak, but that's outside the scope of this piece. A more specific issue, frankly put, is that NVIDIA is being a dick about PresentMon, an open-source tool, claiming that "FrameView has solved certain bugs to make MsBetweenDisplayChange accurate in more scenarios" without communicating those changes to the developers. That leaves us with a situation where we're forced to use NVIDIA's version of an open-source tool that NVIDIA has closed off, to then test NVIDIA's hardware because of unspecified updates that NVIDIA has made with supposed bug fixes that NVIDIA has not explicitly disclosed. You can see why there are a lot of problems measuring an NVIDIA product. This also calls the entirety of the premise into question, as there are no external means to inspect NVIDIA's tools. This isn't illegal under PresentMon's current license, but it is rude. AMD is even currently collaborating with Intel to support tracking Fluid Motion Frames, but NVIDIA is not. Anyway, let's get back to MFG and DLSS. We'll talk about image enhancements next.

Image Enhancements

NVIDIA is switching away from Convolutional Neural Networks to a transformer model for DLSS (rolling out as a beta). This change affects all RTX cards. The gist is that the new model should have a better understanding of which areas of the screen are important, and therefore, according to the reviewer's guide, has "greater stability, reduced ghosting, higher detail in motion, and smoother edges in a scene." This is an image quality improvement, not a performance improvement, and may actually worsen performance a bit. NVIDIA notes that "While Ada and Blackwell both benefit from this new model, the Transformer model is often more performant on Blackwell." We'll show our own results later in this piece.
Driver Overrides

For each of the DLSS 4 features we've discussed (the new frame generation model, multi frame generation, and the transformer model for DLSS SR and RR), there are now driver-level overrides. For example, with a game that supported the original FG and a card that supports MFG, you could use the driver to force MFG. This is a whitelist system, so not every game will support overriding, and the ones that do won't necessarily have every DLSS 4 feature available. We don't expect this to get a ton of use; the interface is (currently) confusing, and we'd assume that most developers willing to go through the hassle of getting whitelisted for overrides will just update their games.

Reflex 2.0

There are plenty of other new features that were announced alongside the 50 series, but we're just talking about DLSS here. NVIDIA Reflex isn't technically DLSS, but it is a dependency, so we're throwing it in. Reflex's new feature is "Frame Warp," which is similar to the reprojection tech that's been used in VR. It checks mouse input at the last possible moment, and if it detects an update, it shifts the camera over to approximate the correct movement. The unrendered pixels are then painted in. Whether or not this represents a literal reduction in latency is arguable, but it should feel like it.

Performance Testing

Visit our Patreon page to contribute a few dollars toward this website's operation (or consider a direct donation or buying something from our GN Store!) Additionally, when you purchase through links to retailers on our site, we may earn a small affiliate commission.

Time to get into performance testing. We have a couple of comparison groups here. The primary objective is to compare the impact of the old and new models that NVIDIA is using, not to compare the RTX 4090 to the RTX 5090. There are a lot of challenges with analyzing so-called "fake" frames though, one of which is that it isn't clear that the number of generated frames remains consistent. Another challenge is that, although we will have a control on the charts for "pure" frames, generated frames aren't a like-for-like comparison by the very nature of them not being the same as standard rendered frames. Our test bench information is detailed on the website, which is free of third-party ads, so check that out if you want to know what hardware we're using. Another important thing: We're not covering image quality in this piece -- that'll come later -- but we will go over performance numbers. Image quality is an entirely different deep-dive. NVIDIA provided six games to the press with the new DLSS 4 features available. We dug through them to find some way to isolate features for testing. Four of the games had native feature implementations and two were whitelisted for driver overrides. As mentioned earlier, we were forced to use FrameView for all benchmarks involving frame generation. For the purposes of these tests, we're including generated frames in our averages.

Old Frame Generation VS Enhanced Frame Generation

Our first task was to compare performance of the old framegen model to the new one on both a 4090 and a 5090. The company has not claimed that image quality will change or improve with the new method. The only two games available to us with BOTH models were Marvel Rivals and Dragon Age: The Veilguard, both through driver overrides, but we were also able to make a usable comparison with Hogwarts Legacy. We're using the games that work with it.
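For anyone reproducing these comparisons from their own logs, here's the basic arithmetic we're describing. The real-frame estimate assumes a clean N-for-1 generation ratio, which, as noted above, isn't guaranteed to hold, so treat it as an approximation.

```python
def percent_uplift(new_fps: float, old_fps: float) -> float:
    """Percent change from old to new, as used in the chart comparisons below."""
    return (new_fps / old_fps - 1) * 100

def estimated_real_fps(displayed_fps: float, fg_ratio: int) -> float:
    """Rough rendered ("real") framerate behind a frame-generated average.

    fg_ratio is the displayed-to-rendered ratio (2 for FG 2X, 3 for MFG 3X,
    4 for MFG 4X). This idealizes generation as exactly N displayed frames
    per rendered frame, so it is only an estimate.
    """
    return displayed_fps / fg_ratio

# Example with the Veilguard 4090 numbers from the next section:
print(round(percent_uplift(107.0, 100.7), 2))  # ~6.26% in as-captured frames
```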
For this set of charts, we're focusing on the red and orange bars to compare the impact of the different models. The blue bar is all real frames, meaning the comparison is not like-for-like with the generated frames. We're just going to present the numbers and talk about them. It's a very tricky subject, which benefits NVIDIA because it makes it hard to evaluate, but that is the nature of artificially generated data.

Dragon Age: The Veilguard

Here's Dragon Age: The Veilguard at 4K with maximum settings; we manually increased settings above the highest preset here. DLSS SR and AA were not used for these tests and Reflex was enabled even for the control runs without frame generation. Each card should be compared against itself. We'll start with the 4090. The new model with FG 2X on the 4090 ran at 107 FPS AVG against the old model's result of 100.7 FPS AVG, an increase in fake+real frame perceived performance of 6.26%. If we were to filter out the fake frames to only real frames to evaluate the overhead, then that percentage change comes down. The RTX 5090 ran at 143.4 FPS AVG on the new model and 136.4 on the old model, a 5.1% uplift in the as-captured (perceived) result. Again, there's more nuance if you filter out the artificial frames.

Hogwarts Legacy

Hogwarts Legacy is up next. This one didn't have a driver-level override for the frame generation model, so we had to control that variable by switching back and forth between the public and press builds; otherwise, the results would be invalid. But, complicating it more, RT features were updated in the press build, so we disabled all RT effects for these tests. We first compared the public and press versions of the game with frame generation off to ensure that the comparison is even valid. Disabling RT effectively equalized performance between the versions (107.3 to 106.9 is within error, as is 151.9 to 152.6), making them valid for comparison.

Hogwarts Legacy FG Comparison

That brings us to this chart: In Hogwarts Legacy, the 4090 had a 14% uplift in (perceived) performance from 153 FPS up to 175 FPS average with the new model, while the 5090 had a 10% uplift from 224 to 247 FPS average. Working backwards based on the output framerates, we can deduce that the new frame generation model is easier on the GPU. This allows more resources to be dedicated to rendering, enables a higher input framerate, and that then comes back around to benefit frame generation. The other piece of the puzzle will be determining whether subjective image quality has changed in a later story, but from a performance perspective, switching away from the OFA is better.

DLSS Super Resolution and Ray Reconstruction: CNN versus Transformer

The second feature we tested in isolation was the old Convolutional Neural Network model versus the new Transformer model. This affects both DLSS Super Resolution and DLSS Ray Reconstruction; we were able to test SR in isolation, but RR (ray reconstruction) requires SR to be enabled.

Dragon Age: The Veilguard

First up is Veilguard, again using the driver override feature for A:B testing. Ray reconstruction isn't an available feature in this game, so this is purely a super resolution test. We used the DLSS SR Quality preset which, given the 4K output resolution, means a 2560x1440 render resolution. On the 4090, performance suffered using the new Transformer model versus the older CNN model, with the older model retaining a 5% performance advantage.
The "No DLSS" control result is included for context, but that's not the comparison we're focusing on here. On the 5090, the older CNN model had a similar performance advantage of 4%, averaging 105 FPS versus 101 with the Transformer model. The difference between performance degradation on the two cards is miniscule, but these results would technically align with NVIDIA's claim about the Transformer model being "more performant on Blackwell." The differences are too small here to make that a firm conclusion. What we can say with certainty is that the new model is a little lower performance than the old one in Veilguard. Cyberpunk 2077 Phantom Liberty Cyberpunk 2077: “Crashing Liberty” is up next. When it works, it has ray reconstruction as a feature, so we were able to test both DLSS SR alone and DLSS SR and RR in combination…when it worked. As noted during our DLSS 3.5 coverage, it's possible for ray reconstruction to raise or lower performance depending on the systems that it replaces in a given game; however, it continues to have no significant performance impact in Cyberpunk. We recorded results without RR, but we cut them from these charts because it made no difference. The losses aren't huge: the 4090 with DLSS and ray reconstruction dropped from 71 FPS with CNN to 68 FPS with the transformer model, a 5% advantage for the old model in terms of performance. On the 5090, switching to the new model dropped average FPS from 92 to 86, a 7% performance advantage for the older model on this card. Anecdotally, we’ve noticed that the newer model reduces ghosting and visibly improves the appearance of animated textures in Cyberpunk, so this level of performance loss is likely an acceptable tradeoff. Because the new DLSS SR/RR model comes with a performance hit (unlike the straight upgrade to FG), we expect many games updated to support the model will take the same approach as Cyberpunk (hopefully without the crashing) and offer a choice between CNN and Transformer in settings. Hopefully those toggles actually work: we always relaunch games between settings changes, but we noticed that relaunching Cyberpunk with our specific settings would switch to the Transformer model invisibly while still claiming to use CNN in the menu. We were forced to toggle CNN off and on from inside the game in order to test it reliably. All of this was observed in the public 2.21 Steam release. This is an extremely subtle bug that wasted a ton of our time. Frame Metering Our final performance comparisons are frame pacing with Frame Generation enabled. This comes with two major caveats: first, we're using FrameView for logging, so the plotted frametimes are MsBetweenDisplayChange. Second, we don't have a way to toggle frame metering on the 5090, so we're forced to make a general comparison between FG 2X behavior on the 4090 (without frame metering) versus the 5090 (with frame metering). Ideally, we'd run FG 4X on the 5090 alone, with versus without frame metering. The new framegen model was used for these comparisons. Dragon Age: The Veilguard Here’s Dragon Age again. As always, making a comparative plot is complicated by the fact that the higher-performance card renders more frames. The 4090's plot shows the kind of up-and-down variation between every frame that might be expected from frame generation, with the majority of the pass oscillating up and down by 1-2ms on each individual frame. 
That may not be enough to cause a visible problem in this title, but it's a good illustration of the problem that NVIDIA is trying to address. The 5090 actually had visibly wider deviations from the average on the whole, but the up-and-down deviations didn't occur between every single consecutive frame. Ideally, we'd want to see much flatter lines: based on prior interviews we've done, most people only notice excursions in the 8-12ms range, but a flatter line is still the goal.

Hogwarts Legacy

Hogwarts Legacy showed results closer to what NVIDIA promised with frame metering. The 4090 still exhibits some up-and-down behavior between individual frames as "real" and "fake" frames alternate, while the 5090's frametimes are clearly more stable. The game also renders at a very high framerate, though, so the actual time delta in milliseconds between frames is minuscule even on the 4090. The 5090 also doesn't completely eliminate the occasional frametime spike, with a particularly large one shown at the end of this pass.

Conclusion

Buy a GN 4-Pack of PC-themed 3D Coasters! These high-quality, durable, flexible coasters ship in a pack of 4, each with a fully custom design made by GN's team. You'll get a motherboard-themed coaster with debug display & reset buttons, a SATA SSD with to-scale connectors, RAM sticks, and a GN logo. These fund our web work! Buy here.

Breaking down DLSS 4 into its individual components: the new frame generation model performs better, broadly speaking, and there shouldn't be a tradeoff in visual fidelity, although we haven't confirmed that. The updated model also theoretically makes single frame generation possible on other Tensor-equipped RTX cards, but NVIDIA has given no indication that it plans to support that. The new Transformer model for DLSS SR and RR performs worse than the older CNN model, but it should also look better, and our anecdotal experience so far suggests that it's worth it. We may follow up with an in-depth visual comparison. Frame pacing with frame generation enabled should be improved versus the 40 series, but this would mostly apply to MFG, and the 40 series can't do MFG. Still, our extremely limited testing does show that frame metering has an effect. We'd also like to really drive home that the ideal scenario for frame gen is to make already acceptable framerates into extra-smooth framerates on high refresh displays, like the 4K 240Hz OLEDs that have been coming out recently. Sometimes NVIDIA seems aware of this, and other times it claims that the 5070 is equivalent to a 4090 because of MFG. If you have bad performance, Frame Generation may disappoint you; DLSS Super Resolution is a more helpful option in that situation.