Noel

Unbelievably smooth performance--question re cloud/terrain shadow impact

Recommended Posts

With some recent minor changes, ones I knew about but hadn't tried before because performance was already quite decent, it's unreal now. I'm telling you there isn't the slightest hint of a stutter, micro-stutter or what have you. That is with the PMDG 777 in NorCal & SoCal FTX scenery, at major terminals. I'm truly shocked, and this is a 5 y/o system, albeit strong for its day.

 

So really at this point the only thing that dampens perfection in video smoothness is when there are lots of clouds AND moderate levels of terrain/cloud shadowing. It's not much of a practical problem because I normally run clouds out to 100nm and at Maximum density, and in that case there is no need for shadowing, so I turn it off. With low amounts of cloud and cloud/terrain shadowing on, all is still liquid smooth. My only question is: are there specific cloud textures that are better on this issue? I'm using P3D's MSAA at 4x, FXAA is OFF. The three major changes that took me from very good to excellent total performance are: 1, UNLIMITED frames (I have done this off and on for years, but conclude that with the two other changes this is clearly preferred); 2, setting monitor refresh to 30 Hz w/ VSYNC enabled in sim; 3, setting the affinity mask to 4052 per SteveW. With these changes, seemingly only the shadowing comes into play overall. In the most complex of settings for me, KSFO or KLAX in FTX regional scenery, fps has briefly gone down to 20 and we lose perfect smoothness, but it's quite brief overall.

 

So is there anything to be done to improve how the sim copes w/ shadowing? I'm still using P3D 3.0 and prefer to keep it that way for a while.

 

Thanks!


Thanks for the info Noel, that's really encouraging.

 

Funnily enough I've just bought a new, bigger SSD, so I will download and do a full install of the latest P3D version. I usually run with unlimited FPS and 4x SGSS in NVI, but I'll give your tweaks a try before I add any addons. It's been a while since I bothered with an Affinity Mask. I've never fully understood it and there seems to be so much conjecture about whether it works or not. However, isn't the value dependent on what hardware you have?

 

cheers

 

Ian


Yes the AM value of 4052 is for 6-core hyper threaded processors only.  

 

I'm really shocked at this improvement. It's always been decent, but the lack of even micro-stutters to this degree is new for me. Look for SteveW here; maybe he has a suggestion for your specific processor. I had really assumed it was set up as well as possible, but now I'm seeing a really significant increase in total performance, including frame rate in complex areas, terrain texture update rate, and freedom from micro-stutter. There will on occasion be some hitches when turning on the taxiway, but I think these may be caused by something related to the GPU, like shadowing etc. Overall, in the most demanding scenarios, it's very close to perfection. It's hard to say exactly what role each part is playing, but VSYNC w/ a 30 Hz refresh seems to be huge for the smoothness piece. Running at UNLIMITED feeds the most frames possible under the circumstances, and that is important if you are running VSYNC with a 30 Hz screen refresh. Most often I'm seeing the built-in frame rate counter display exactly 30.0 almost always--until it's super demanding, then down to 24-25, and the worst so far was on take off in the PMDG 777 RWY27L KSFO-HD in FTX NCA regional, with scenery sliders close to maxed. The new affinity mask seems to have worked w/ everything else to maximize total performance, and it's palpable, all of these working together.

 

One odd caveat I need to post about in case someone knows.  Despite the best ever performance, I notice now road traffic is more jerky at times than I recall, or perhaps it's just in contrast to everything else.  While airport ground traffic is absolutely smooth now, road traffic gets choppy.

 

Pretty cool to be 5y out from the last build and get this out of it--the last week has been a complete surprise.  And I agree, it's inspiring to know it's possible, even if elusive!

 

Good Luck to you Ian!


Maybe Steve can explain why 4052..

 

4052 = 11 11 11 01 01 00

 

2 logical cores for addons. You probably do not have many addons.

 

I am still on 340, which was the best for my setup with all kinds of addons.
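For anyone who wants to check the binary for themselves, here is a minimal Python sketch. It is not from anyone's post; the function names are made up for illustration, and it assumes a 6-core hyper-threaded CPU where each pair of bits is one core and the rightmost pair is core 0.

```python
def mask_to_pairs(mask: int, cores: int = 6) -> str:
    """Format an affinity mask as one two-bit group per hyper-threaded core.

    Assumption: the rightmost group is core 0, and within a group the
    right-hand bit is that core's first logical processor (LP).
    """
    bits = format(mask, f"0{2 * cores}b")
    return " ".join(bits[i:i + 2] for i in range(0, len(bits), 2))


def pairs_to_mask(pairs: str) -> int:
    """Inverse of mask_to_pairs; accepts spaces or commas between groups."""
    return int(pairs.replace(",", "").replace(" ", ""), 2)


for value in (4052, 4092, 340):
    print(value, "=", mask_to_pairs(value))
```

The same helper works for other core counts by changing `cores`, but whether a given mask suits a given CPU is a separate question.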



I was at 4092 previously, so SteveW mentions this:

 

4092 = 11,11,11,11,11,00. The two zeros on the right show the sim won't be running on those LPs, but it is running on every other LP. Coming in from the right, the next two ones show that the sim will start its two primary jobs on those two LPs, which happen to be on the same core, so mathematically speaking this setup has no chance against one whole core each, as in 4052 = 11,11,11,01,01,00.

There you can see the first two ones each have a core to themselves; they are responsible for rendering. The others can sit on cores together since they are data gathering, not rendering, and those jobs take seconds to complete. You can't muck the renderer about in the same way. The data gatherers can share a core, but they then share that core's bandwidth. Your AM puts the two primary jobs on one shared core.
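The point about the two primary jobs landing on one core can be checked mechanically. This is only a rough sketch with hypothetical function names. It assumes LPs 2n and 2n+1 are siblings on core n, and that the sim starts its two primary jobs on the two lowest enabled LPs, as the explanation above describes.

```python
def allowed_lps(mask: int) -> list[int]:
    """Zero-based LP indices enabled in the mask, lowest first."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def primary_jobs_share_core(mask: int) -> bool:
    """True if the two lowest enabled LPs sit on the same physical core.

    Assumption: LP 2n and LP 2n+1 are the two hyper-threads of core n.
    """
    first, second = allowed_lps(mask)[:2]
    return first // 2 == second // 2


for mask in (4092, 4052, 340):
    print(mask, "primary jobs share a core:", primary_jobs_share_core(mask))
```

On that assumption 4092 puts both primary jobs on core 1, while 4052 and 340 give each of them its own core.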

 

I was not aware sharing a core over two LPs made a difference, and quite clearly it did negatively impact rendering performance.  I manually assign everything else outside of P3D to every LP except 3,4,5,6 which are for exclusive use by the renderer.   I don't think there is a compelling reason to restrict all other processes to the 1st two LPs, but maybe there is.  I have assumed the workload for everything, including terrain texture loading, except the main thread could be shared w/o significant impact, and looking at CPU utilization seems to support that, i.e. I never see anywhere near 100% on all the other LPs.


Thanks Noël.

 

To which cores have you assigned your addons?


How are you limiting the refresh rate to 30 for VSync?



Every assignable process, which includes all add ons, I have set to 12,11,10,9,8,7,_,_,_,_,2,1
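As a rough illustration of what that assignment looks like as a bitmask, here is a small Python sketch. The helper name is hypothetical, and it assumes the LP numbers are counted from 1 (so LP 1 is bit 0) and that LPs 3, 4, 5 and 6 are the ones kept clear for the renderer, as described earlier in the thread.

```python
def lps_to_mask(lps, one_based=True):
    """Build an affinity mask from a list of logical processor numbers."""
    mask = 0
    for lp in lps:
        mask |= 1 << (lp - 1 if one_based else lp)
    return mask


# Assumed layout: everything outside the sim goes on LPs 1-2 and 7-12.
others = lps_to_mask([1, 2, 7, 8, 9, 10, 11, 12])
# LPs 3-6 are the ones kept clear of other processes.
reserved = lps_to_mask([3, 4, 5, 6])

print("other processes:", others, format(others, "012b"))
print("reserved LPs:   ", reserved, format(reserved, "012b"))
print("overlap:        ", others & reserved)
```

On those assumptions the two masks do not overlap and together cover all 12 LPs.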

 

I think the big gain came from giving both LPs per cores 1 & 2 instead of one LP per core for exclusive use by P3D.  At least that is how I understand it.


Did you assign one addon to 2 cores and then divide your addons over all the cores, or did you assign every addon to all those cores?

 

Thanks



I hope this helps:

 

Looking at the binary first, we have at least four LPs assigned to the sim, so the rendering stage is at its leanest, which means we have the least code running per core. We can allocate more LPs, but they only gain a speed advantage in loading the scenario; they cannot increase the rendering speed. More than four LPs, however, gives the first two jobs more to do more often, which in effect reduces the rendering performance.

 

Look on the right in the binary and we have one LP per core on the first two LPs. The first two LPs are the rendering stages and so get maximum performance. The divided-up jobs three and four are allocated across the remaining cores, giving maximum scenario loading speed. We need not allocate as many LPs as we can find, since after a point there will be no gain. First try 6 LPs, giving two to the renderer and four to the loading stages. Check with a stopwatch and repeat the scenario load; six should show an increase in scenario loading speed over four in most cases. Adding more may not yield any benefit, so check.

 

Gerard uses 340 = 00,01,01,01,01,00, which gives four straight cores to the sim and provides maximum rendering performance. So long as addon exe apps are kept away from those cores, the sim should perform great. 340 came up in my testing as the maximum on 6 cores, giving the first and last core to other apps, which suits the job scheduler well. Adding an LP to 340 with, say, 1364 = 01,01,01,01,01,00 gives five straight cores to the sim; the rendering still only gets the first two, but now with the extra core undertaking loading tasks the scenario loads in a few seconds less time. However, the rendering is affected slightly, and there are more sim cores spread out on the CPU to coincide with other processes and lose performance.
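To see how a given mask divides up between the rendering jobs and the loading jobs, here is a minimal Python sketch. The function name is made up, and it simply takes the description above at face value: the first two enabled LPs (counting from the right) go to rendering and the rest go to loading.

```python
def split_roles(mask: int):
    """Split the enabled LPs into (render LPs, loader LPs).

    Assumption, taken from the explanation above: the two lowest enabled
    LPs carry the rendering jobs, all remaining enabled LPs carry loading.
    """
    lps = [i for i in range(mask.bit_length()) if (mask >> i) & 1]
    return lps[:2], lps[2:]


for mask in (340, 1364, 4052):
    render, loaders = split_roles(mask)
    print(mask, "render LPs:", render, "loader LPs:", loaders)
```

This only counts LPs. It says nothing about actual loading times, which still need the stopwatch test.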

 

Any process external to the sim that finds itself running on an LP allocated to the sim will cost that LP switching time. The loss would not be so great if the external process were on the sister LP, and even better on a core unused by the sim. This is one of the reasons we see such differences across systems.

 

6 core + HT suggestions to try:

00,01,01,01,01,00=340

or

00,11,11,01,01,00=980

01,00,00,00,00,01=addons - give exe apps combination of two LPs min

or

01,01,01,01,01,00=1364

10,10,10,00,00,01=addons - give exe apps combination of two LPs min

 

This leaves an LP of core zero free for unexpected system activity.

 

Mixed processes don't harm the sim's background tasks as badly as they do the first two jobs/LPs.
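A quick way to sanity-check a sim mask against an addon mask is to confirm they don't overlap and that one LP of core zero stays free. The sketch below is only an illustration with hypothetical names; it pairs two of the sim suggestions above with the addon patterns listed under them.

```python
def parse(pairs: str) -> int:
    """Turn a comma-separated binary pattern such as '00,01,01' into an int."""
    return int(pairs.replace(",", ""), 2)


# (sim pattern, addon pattern) pairs taken from the suggestions above.
suggestions = [
    ("00,01,01,01,01,00", "01,00,00,00,00,01"),
    ("01,01,01,01,01,00", "10,10,10,00,00,01"),
]

for sim_pairs, addon_pairs in suggestions:
    sim, addons = parse(sim_pairs), parse(addon_pairs)
    overlap = sim & addons
    # Bit 1 is assumed to be the second LP of core zero.
    core0_lp_free = ((sim | addons) & 0b10) == 0
    print(sim, addons, "overlap:", overlap, "core 0 LP free:", core0_lp_free)
```

An overlap of zero means no addon shares an LP with the sim under that pairing.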

 

 

 

With P3D, VSync=On in Display Settings coupled with Unlimited on the fps slider effectively puts an fps "limit" on the frame rate output, at the refresh rate of the monitor. Setting the slider to an fps value other than Unlimited introduces look-ahead frame buffering. When this is active, a slight stutter will use up one to three look-ahead frames from the buffer (the default maximum is three). Even if only one frame is lost from the buffer, it can take a great deal of time to fill the buffer back up, since it was drawn on through lack of performance in the first place. If you can see 40+ fps at all times with Unlimited and VSync=Off, you can try locking at 20fps on the slider.


 

 



 

LoL Steve, could you please explain this part again for dummies... I can't follow... Sorry

 

Thank you

 

McDan out


The GPU can use various ways of holding a frame (the picture) in memory. There's a bit for while the next frame is being painted by the GPU, there's a bit for the display output circuit to be reading in so it can send onto the display, and there's a bit in the middle that can be used to store frames ahead of time.

 

There can be a combination of techniques involved to produce an outcome. We can use the bit in the middle in the technique called Triple Buffering - a page is drawn on one of two buffers while the output buffer is reading in one of them.

 

In the Fixed fps method the buffer is used to store up to three frames, and the physics and positions of moving objects in the sim are computed based on a fixed time between frames. It is important to realise that with, say, 20fps fixed, each frame will be computed such that the moving objects are 1/20s (50ms) further along the timeline, irrespective of how long the frame takes to draw. Having three in reserve at 20fps = 3 x 50 = 150ms of delay. A 150ms stutter depletes the buffer, and the next frame will be made with the objects shown in the wrong position with respect to their timeline.
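The arithmetic above can be written out as a small Python sketch. This is a simplified model for illustration only, with made-up function names. It assumes a stall simply eats one buffered frame per frame interval, and it is not a description of how P3D actually implements its buffer.

```python
import math


def frame_interval_ms(fps: float) -> float:
    """Fixed time step between frames at a locked frame rate."""
    return 1000.0 / fps


def buffer_reserve_ms(fps: float, buffered_frames: int = 3) -> float:
    """How much stall the look-ahead buffer can absorb before it is empty."""
    return buffered_frames * frame_interval_ms(fps)


def frames_consumed(stall_ms: float, fps: float, buffered_frames: int = 3) -> int:
    """Buffered frames used up by a stall, capped at the buffer size."""
    needed = math.ceil(stall_ms / frame_interval_ms(fps))
    return min(needed, buffered_frames)


print(frame_interval_ms(20))      # time step per frame at 20 fps
print(buffer_reserve_ms(20))      # total reserve with three frames buffered
print(frames_consumed(150, 20))   # frames used by a 150 ms stall
```

At a 20 fps lock this gives a 50 ms step and a 150 ms reserve, so in this model a 150 ms stall uses all three buffered frames.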

 

The Unlimited, VSync and Triple Buffer settings go together. However, each next frame of the scene is computed with the objects moved down the timeline by the average time between frames; they are never computed for the time they actually appear, and so never appear at the right place in time unless the frame rate is held constant by the limit of the monitor refresh or by the limit of the sim performance. Sometimes we can get less stutter by reducing the performance of the sim so that it is held against an end-stop in performance and runs with a more consistent fps that way.


Sorry, what is an LP? I have an i7 4820K, which is a quad-core processor. If anyone could recommend an affinity mask for this, I'd be happy to give it a try.

 

cheers

 

Ian



 

 

Right now all other processes, including Windows & the few add ons I have running along w/ P3D, are assigned manually and share w/ P3D on 11,11,11,__,__,00, as well as on the unassigned 1st two LPs.

 

That was my next question: is it best to give P3D entirely exclusive use of a couple of cores for texture loading, beyond the main thread's 2 full cores, and prevent any other exe's from sharing, or is that mainly just theoretical? With two add ons and all the other Windows processes sharing w/ P3D on everything except LPs 3,4,5,6, I notice CPU utilization on those LPs is often quite low.
