SAAB340

Update 1 to How FSX works and how performance is affected by different hardware explained


I have an update to my original post, "How FSX works and how performance is affected by different hardware explained":
http://forum.avsim.n...ost__p__1896759

I have looked further into FSX performance regarding DX9/DX10, different graphics drivers, increasing LOD_RADIUS in the .cfg file, and the use of AntiAliasing with Nvidia Inspector.

The original text refers to DX10 mode. There are two things to correct in it.

Nr 1.
FSX does not really respond to changes in memory speed on the GTX470 when the core/shaders run at stock speed. There is a small impact if you overclock the core/shaders by 30% and at the same time lower the memory speed by 30%. But the stock memory bandwidth on the GTX470 is more than sufficient for FSX even at high core/shader overclocks.

Nr 2.
FiberFrameTimeFraction (FFTF) does affect FSX performance even when FPS is set to unlimited. When we are CPU limited there is a reduction in FPS for about a second every 20-35 s. (It actually seems to happen every time we pass over set places on the ground. Imagine the ground as a set of tiles; the reduction comes as we pass the joints between the tiles.) During the reduction the FPS goes down to the same value regardless of how many T&t loaders we have. (In between these reductions, more T&t loaders result in lower FPS.) The FFTF, however, affects what FPS we get during this reduction. A lower value gives a higher FPS, in the same way as when we have locked the FPS. But once again it can also give us worse T&t loading and can result in a loss of autogen and traffic. So a value lower than the standard 0.33 will give us slightly smoother flight (when CPU limited), but if you don't like the risk of blurries or loss of autogen I would not recommend using it in the .cfg. Too low a value can also cause buildings to hover in the sky.

Moving on:
Most of the original text is also valid for DX9, but there are a few differences that I will go through.

DX9 Fibres:
The fibres that can be offloaded from the main thread to another core with DX9 only amount to around 15-20% core load, not 50% core load as with DX10. So fibre offloading is a lot less of an issue in DX9, and DX9 is a lot better if you only have a dual core processor. If you use a quad core CPU (without hyperthreading) with the standard affinity mask of 15, there are normally enough CPU cycles available on the 3 T&t loaders, even during the light refresh, to accommodate the fibre load. So the FPS reduction during the light refresh is minimal without a separate fibre core. If you use hyperthreading, the original affinity mask recommendations still apply, to avoid getting the fibres on the same physical core as the main thread.

DX9 PCIe bus and the GPU:
One thing that is very different in DX9 mode is that it requires a lot more PCIe bandwidth. In fact, most of the time we are limited by the full PCIe 2.0 x16 bus when running FSX in DX9 mode with a decent CPU and GPU! A GTX470 is never the limiting factor in DX9 without additional AntiAliasing. We are either CPU or PCIe bus limited all the time, even when the GTX470 is downclocked to operate at only 75% speed.

Generating a lot of autogen, for example when flying over a forest, seems to take up the most PCIe bandwidth. DX10 can in those cases give twice the FPS compared to DX9. Using a PCIe 2.0 x8 bus is not at all recommended with a decent computer in DX9 mode, as it will be the most limiting factor even in a very CPU taxing environment.

The actual setting up of the autogen takes up a lot of PCIe bandwidth in the first place. That is why benchmarking FSX can be so tricky. If we run a benchmark, end it with the Esc key (instead of closing down FSX), and then repeat several runs the same way, the results might show a lot lower FPS for the first run than for the following runs. That is because FSX only sets up the autogen during the first benchmark. It then stays in the GPU's memory. During the following benchmarks a lot of PCIe bandwidth is released, giving improved FPS where we were PCIe limited during the first benchmark.

If we instead close FSX between the benchmarks, the numbers will be consistent. But the numbers produced by that procedure might also be a lot lower than if we just discard the first benchmark run using the previous procedure. That is because we flush the memory of the GPU. That lower number is unfortunately the real performance we will encounter the first time we fly over an area after starting FSX.

Another thing I have noticed is that overclocking the graphics card only helps initially when we start being PCIe limited. When we are only slightly limited by the PCIe bus, a faster GPU helps a lot. But the more powerful the GPU gets, the smaller the returns in FPS. We simply hit the PCIe barrier. Different GPU architectures, G92 vs GF100, make a notable difference but the clock speeds do not. The GTX470 is already severely starved by the PCIe bus. GPU performance can still make a difference in DX9 if we use AntiAliasing. But we are still limited by the PCIe bus when we use AA.

Thankfully, when we are PCIe limited in DX9 we do not get the same terrible, really noticeable stutters due to quick variations in FPS as we get in DX10. (But that might be because the FPS is a lot lower to start with.)

DX9 FPS:
It is hard to compare the FPS between DX9 and DX10 as we are so PCIe bus limited. DX9 is able to produce a slightly higher FPS when it is not limited by the PCIe bus.

Graphics drivers:
I have many times seen the question of whether the graphics drivers can make any difference, as FSX is such an old game. I have tested all the Nvidia WHQL drivers from the 260 series onward. They all perform the same in FSX apart from one big difference starting with the 266 driver: in DX10, using a quad core CPU with the standard affinitymask=15, there is no longer an FPS reduction during the light refresh. The T&t loaders now let the fibres be offloaded from the main thread even during the light refresh, giving a faster light refresh and better T&t loading. (Remember that T&t loading is tied to both the number of T&t loaders and whether the main thread is restricted.) So the graphics driver can obviously affect the behaviour of the fibres or T&t threads. On a dual core CPU we still get an FPS drop during the light refresh, but affinitymask=1 does not offer any better performance.

Combine that with what I wrote about the DX9 fibres above and you can see that there is not really any need for an affinitymask entry in the .cfg any more, as long as you are not using a CPU with hyperthreading. (I obviously do not know how the AMD graphics drivers, or Bulldozer with its modules, affect FSX. They might work differently, but I will let someone else figure that out.)

LOD_RADIUS:
The maximum value we can get using the sliders in FSX is 4.5, but we can manually edit it to a higher value in the .cfg. How does this affect performance? A higher value puts more load on the main thread, so it reduces the FPS, but you will also see that the T&t loading improves. The load time and the light refresh will also take longer. If you increase the LOD value too much (basically, if the light refresh does not have time to finish before the next one starts) you will end up with blurry textures.

We already know that more T&t threads give lower FPS and better T&t loading, and that a higher LOD value also gives lower FPS and better T&t loading. So which is preferable? Well, it depends.

If you fly slowly, an increased LOD will provide better T&t loading for the same FPS reduction compared to having more T&t threads. But you also have to put up with longer load times.

If you fly fast, more T&t threads give noticeably better T&t loading (but still worse FPS). More T&t threads also enable a higher LOD value, as more T&t loaders give a faster light refresh.

So for the best T&t loading, many T&t threads (hyperthreading helps) and an appropriately high LOD value give the best result. It is unbeatable for keeping the textures nice and crisp when flying fast over photo scenery. It will work your CPU very hard, with very high CPU utilization, so keep a good eye on the temperatures of an overclocked CPU.

The use of AntiAliasing:
First of all, as most of you know, you have to use an Nvidia GPU and DX9 in order to benefit from improved AA modes in FSX. AA also requires more GPU power to keep up the FPS. Overclocking the GPU helps initially, but it does not scale well at all as we are still limited by the PCIe bus.

8xS AntiAliasing
It is not possible to say how much more taxing 8xS AA is on the GPU compared to in-game AA, as we are always limited by the PCIe bus. I hit the PCIe barrier at around 700 MHz core clock on the GTX470 during my test runs. At that point we get around 85% of the FPS we get with only in-game AA.

8xSQ AntiAliasing
8xSQ AA requires even more muscle from the GPU. But, same again, we are PCIe limited so we cannot tell by how much. I hit the PCIe barrier at around 800 MHz core clock. It then gives 87% of the FPS we get using 8xS AA.

What this basically shows is that additional AA requires even more PCIe bandwidth. Everything points towards the current high end 500 series cards still being helpful when using additional AA, due to the additional load on the GPU. But the fastest cards are now so powerful that they are severely bottlenecked by the PCIe 2.0 bus in FSX. In DX9, FSX desperately needs more PCIe bandwidth. PCIe 3.0 hardware should provide that.
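
For reference, the theoretical bandwidth of the bus widths discussed above can be worked out from the published PCIe line rates and encodings. This is only a rough sketch: the function name is made up for illustration, and the figures are theoretical one-way maxima, not what FSX actually moves over the bus.

```python
# Theoretical one-way PCIe bandwidth from line rate and encoding overhead.
# Per generation: (transfer rate in GT/s, payload bits, total bits on the wire)
PCIE_GENERATIONS = {
    "2.0": (5.0, 8, 10),     # 5 GT/s per lane, 8b/10b encoding
    "3.0": (8.0, 128, 130),  # 8 GT/s per lane, 128b/130b encoding
}


def pcie_bandwidth_gb_s(generation, lanes):
    """Theoretical one-way bandwidth in GB/s (1 GB = 1e9 bytes)."""
    rate_gt_s, payload_bits, total_bits = PCIE_GENERATIONS[generation]
    usable_gbit_s = rate_gt_s * payload_bits / total_bits  # per lane
    return usable_gbit_s / 8 * lanes  # bits -> bytes, then all lanes


for gen, lanes in [("2.0", 4), ("2.0", 8), ("2.0", 16), ("3.0", 16)]:
    print(f"PCIe {gen} x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

On paper a PCIe 3.0 x16 link offers roughly twice the bandwidth of PCIe 2.0 x16, which is why it looks promising for the DX9 case described above. How much of that FSX would actually turn into FPS is not something this arithmetic can tell.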


Great post once again Lars. Thanks for posting.

I have a question. How do you go about testing PCIe exactly, and how do you know when you hit the PCIe barrier or when you are PCIe limited, please? I'd like to try and reproduce that if I can.

When you say it's GPU dependent, like here:

Another thing I have noticed is that overclocking the graphics card only helps initially when we start being PCIe limited. When we are only slightly limited by the PCIe bus, a faster GPU helps a lot. But the more powerful the GPU gets, the smaller the returns in FPS. We simply hit the PCIe barrier. Different GPU architectures, G92 vs GF100, make a notable difference but the clock speeds do not.
I'm not sure it's a PCIe limitation if a different GPU or a GPU overclock makes such a difference. The GPU's memory bandwidth comes to mind, but you already said overclocking the card's memory made no impact, so I don't know.

Now for the layman. I just upgraded from a GTX 470 to a GTX 580. In FSX and FS9, no difference whatsoever in performance or quality. Now, I didn't upgrade for FSX reasons: Train Simulator 2012 and Battlefield. Very dramatic difference there. So, as has been well known, don't waste your money on huge GPUs for FSX!

There is more: I'm flying in 3D Vision and was able to test a GTX 480 vs a GTX 580, and if you want to go the 3D path, go for the 580...
Running on a 2600K OC'd to 4.6 GHz, I have 1920x1080 and Nvidia Inspector set on 64/12 AA settings and AF x16, with a steady 30 fps with the iFly 737 and PMDG 737 NGX and all the sliders to the right in 3D.
Bojote's tweak is installed and REX-Extreme HD is also involved...
Just my 2 cents

Great post once again Lars. Thanks for posting. I have a question. How do you go about testing PCIe exactly, and how do you know when you hit the PCIe barrier or when you are PCIe limited, please? I'd like to try and reproduce that if I can. When you say it's GPU dependent, like here:
By taping over certain connectors on the PCIe contact on the GPU you can get the bus to operate at x8 and x4 speeds if you like. Say we have the GPU on a x8 bus, and neither GPU overclocking, CPU overclocking nor memory overclocking makes any difference, but upping the bus to x16 speed gives a huge improvement. (It is not fully double the FPS, but around 80% higher if I remember correctly.) Then you know that it is the PCIe bus that is holding you back. And when you are on the x16 bus and the FPS graph still shows this behaviour, not responding to any over/underclocking of GPU/CPU, you know you are still PCIe limited at those places.
I'm not sure it's a PCIe limitation if a different GPU or a GPU overclock makes such a difference. The GPU's memory bandwidth comes to mind, but you already said overclocking the card's memory made no impact, so I don't know.
I suspect Fermi's L2 cache may have something to do with the improvement going from G92 to GF100.

Running a x8 bus in DX10, you are most of the time not limited by the PCIe bus. But when you are (and have only just got into that limitation), a GPU overclock scales very well initially. It is the same when you use 8xSQ AA and a x16 bus. If you benchmark at several GPU speeds you will see that it scales a lot better at lower GPU frequencies, e.g. going from 455 MHz to 607 MHz (+33%) compared to going from 607 MHz to 810 MHz (also +33%). And when you test at a few more frequencies in between, you will see that the benefit of GPU overclocking fades away the faster the GPU speeds you use.
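
One way to put a number on "scales well initially, then fades away" is to compare the percentage FPS gain with the percentage clock gain for each step. The sketch below is only an illustration of that idea: the function name is my own, the clocks are the ones mentioned above, and the FPS values are invented placeholders, not measurements from this thread.

```python
def scaling_efficiency(clock_lo, fps_lo, clock_hi, fps_hi):
    """Fraction of a clock increase that shows up as an FPS increase.

    1.0 means FPS rose by the same percentage as the clock (GPU limited);
    0.0 means the overclock gave nothing (something else is the limit).
    """
    clock_gain = clock_hi / clock_lo - 1.0
    fps_gain = fps_hi / fps_lo - 1.0
    return fps_gain / clock_gain


# HYPOTHETICAL results: (core clock in MHz, average FPS).
# The FPS numbers are made up purely to show the calculation.
runs = [(455, 20.0), (607, 25.0), (810, 26.0)]

for (c0, f0), (c1, f1) in zip(runs, runs[1:]):
    eff = scaling_efficiency(c0, f0, c1, f1)
    print(f"{c0} -> {c1} MHz: clock +{(c1 / c0 - 1) * 100:.0f}%, "
          f"FPS +{(f1 / f0 - 1) * 100:.0f}%, efficiency {eff:.2f}")
```

An efficiency that drops towards zero as the clock goes up, while the same test still responds strongly to going from x8 to x16, would match the PCIe-limited pattern described above.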

Great stuff, thanks for that. I found the connectors that need to be taped for x8 operation and will be testing this weekend.

I did some testing long ago with a second GPU (no SLI), so that my GTX480 would run at x8, and saw 10% lower FPS. It was a very short test using my benchmark flight at Heathrow.

Do you use a saved flight? Care to share the files so I can reproduce it, please? Or give me some guidelines as to what location, settings, weather, etc. to use?


Most of my testing has been done using FSmark08. I have also used FSmark11. Remember that you need to plot a graph of the FPS to be able to really see what is happening, as we are limited by different hardware at different stages of the benchmark. Just looking at the average FPS doesn't tell you much. I normally do more than one run and plot the average frame rate across all runs. Watch out though, as it takes a huge amount of time to run the benchmarks, change hardware, plot the graphs and analyse them. That's why I want to share my findings. But it is addictive =)
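
As a rough illustration of averaging several runs point by point before graphing them, here is a minimal sketch. It assumes each run has been logged as a list of FPS readings at a fixed interval (how you log them is up to you); the function names and the numbers are invented for the example, and the text bars are only a stand-in for a real graph.

```python
from statistics import mean


def average_trace(runs):
    """Average several FPS traces sample by sample.

    Each run is a list of FPS readings taken at the same interval.
    Runs are cut to the shortest one so every point averages all runs.
    """
    length = min(len(run) for run in runs)
    return [mean(run[i] for run in runs) for i in range(length)]


def text_plot(trace):
    """Crude stand-in for a real graph: one bar of '#' per sample."""
    return "\n".join(f"{t:4d}s {fps:5.1f} " + "#" * round(fps)
                     for t, fps in enumerate(trace))


# HYPOTHETICAL data, invented for illustration only.
runs = [
    [30, 31, 12, 29, 30],
    [32, 29, 14, 31, 30],
]
trace = average_trace(runs)
print(text_plot(trace))
print(f"overall average: {mean(trace):.1f} FPS, lowest point: {min(trace):.1f} FPS")
```

The dip in the middle of the trace is exactly the kind of thing a single average FPS number hides.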


Isn't Affinity Mask = 15 the same as running with no affinity mask setting at all in the .cfg?

I tried the 14 setting and found zero difference so I got rid of it.


The settings I used for the update were the ones specified for FSmark11, apart from using the full resolution of my monitor, DX9/DX10, AA etc. Most of the time I benchmarked in FSmark08, a 630-second run.

Isn't Affinity Mask = 15 the same as running with no affinity mask setting at all in the .cfg?
Correct, if you have a quad core and no HT.
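
The reason 15 equals "no setting" on a plain quad core is that the affinity mask is read as a bitmask, one bit per logical core, with bit 0 being the first core. A small sketch for decoding and building masks follows; the helper names are my own, not anything from FSX.

```python
def cores_in_mask(mask):
    """Return the zero-based logical core numbers whose bits are set."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def mask_for_cores(cores):
    """Build the mask value that enables exactly the given cores."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


for mask in (15, 14, 1):
    print(f"mask {mask:2d} = {mask:04b} -> cores {cores_in_mask(mask)}")
```

So 15 (binary 1111) enables all four cores of a quad core, 14 (binary 1110) leaves out the first core, and 1 (binary 0001) restricts everything to the first core. With hyperthreading there are more logical cores than bits set in 15, which is why the answer above only holds for a quad core without HT.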

I guess you mean FSMark07? OK, off to buy some tape :)


Yes, I meant FSmark07. I attach a few quick screenshots of some of my analysis, where you can see the differences between a x8 and a x16 bus. The actual Excel files I have are several hundred MB in size.

Amazing work. Thanks once again

Lars, did you use the settings that come with FSMark07 & 11, or your own? What addons?

I'm trying to find a test scenario that doesn't scale with CPU or GPU overclock, but I've had no luck so far. Should I try a fresh FSX install?


Not to take away from any of the above posts, because it's a great thread, but isn't this all common knowledge to those of us who have been in this community for a while?

As far as hyperthreading goes, in practice my i7 960 rig saw no benefit at 4.0GHz with HT on, nor did my 2600K, *regardless of affinity mask*.


___________________________________________________________________________________

Zachary Waddell -- Caravan Driver --

Facebook: http://www.facebook.com/zwaddell

Zach, I know FSX can be helped by HT with texture loading. It is especially good if you use photo scenery. But you will not get better FPS.

This testing was done without Reject Threshold or Bufferpools in the .cfg. I think they will help when we are PCIe limited. I have not had time to do proper testing on it yet (and won't over the next few weeks either), but an initial check indicates that is the case.

