9GTurn

P3D 4.3 - 2nd Test required


Just a note to point out that there's another old duffer's tale going around, most likely perpetrated by the 'no AM, HT off' dinosaurs: the incorrect 'fact' that Windows uses core zero, so we should avoid that one. Don't worry - Windows spreads its own work across all cores to avoid contending with resource-hungry apps like P3D. Go ahead and use core zero - Windows won't mind at all.

Where this kind of stuff comes from - I have no idea.


Steve Waite: Engineer at codelegend.com


...in fact the no-AM crowd are actually advocating the use of core zero, because 'no AM' is in effect an AM that covers every core (or LP) of the CPU. So with no AM, HT would not serve some processes well when they are forced to share a core with a process that would otherwise be shoved onto a separate real core.

To make sense of "many cores and HT" we have no option but to start thinking about restricting the number of LPs and being careful not to put time sensitive threads onto the same core - one LP each.

For example - the rendering process of P3D (and FSX) is time sensitive so we use an Affinity Mask to enable the proper set-up in HT enabled systems so that the time sensitive process has a core to itself. That means - keeping other stuff away too (your addon exe apps).

Tools I build for use with P3D and FSX move themselves away from those sensitive LPs if they can be determined. They also have manual overrides, which are usually the best option in a system where we do not want other processes to make things juggle around. Process managers might alter that arrangement, so I don't recommend them in a fixed setup - they are for other types of processes, not P3D or FSX.
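As a rough illustration of keeping an addon away from a time-sensitive core, here is a minimal Python sketch. It is not how the tools mentioned above are implemented. It assumes HT is on and that sibling LPs are numbered as adjacent pairs (LP 0 and 1 on core 0, LP 2 and 3 on core 1, and so on), which should be checked on your own system. The function name is made up for this example.

```python
def mask_avoiding_cores(sensitive_mask: int, total_lps: int) -> int:
    """Return an affinity mask that leaves out every core touched by sensitive_mask.

    Assumption: sibling LPs are adjacent pairs (2k and 2k+1 share core k).
    Bit 0 of a mask is LP 0, bit 1 is LP 1, and so on.
    """
    blocked = 0
    for lp in range(total_lps):
        if (sensitive_mask >> lp) & 1:
            core = lp // 2
            # Block both LPs of the core that hosts a sensitive LP.
            blocked |= 0b11 << (2 * core)
    all_lps = (1 << total_lps) - 1
    return all_lps & ~blocked


# Renderer on LP 0 of a six-core HT CPU (12 LPs): the addon gets LPs 2-11.
print(mask_avoiding_cores(0b01, 12))  # 4092
```

The result is only a number; applying it to a process is a separate step (the sim's own AM setting, a batch file, or a tool's manual override).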

 

 

Edited by SteveW

Steve Waite: Engineer at codelegend.com

14 minutes ago, SteveW said:

Don't use a process manager to manage processes that manage themselves otherwise you have odd contention in there. And that is why P3D comes with its own core Affinity setting - so use it.

7 minutes ago, SteveW said:

Just a note to point out that there's also another old duffers tale going around perpetrated most likely by the 'no AM HT off' dinosaurs. The incorrect 'fact' that Windows uses core zero so we should avoid that one.

Steve, I've been left a bit confused by your comments, so would you mind responding to a couple of questions, please?

I have a Ryzen 1700X CPU with 8 cores and using Process Lasso, I have set P3D to use CPU 0-1, 4-15. The vast majority of other applications on my PC run off CPU 2-15. Is this wrong and should I allow Windows 10 to allocate threads as it sees fit, instead of Process Lasso?

You state that P3D has its own core affinity setting, use it how? Is there an AM that I should be using?

Thanks.


AMD Ryzen 5800X3D; MSI RTX 3080 Ti VENTUS 3X; 32GB Corsair 3200 MHz; ASUS VG35VQ 35" (3440 x 1440)
Fulcrum One yoke; Thrustmaster TCA Captain Pack Airbus edition; MFG Crosswind rudder pedals; CPFlight MCP 737; Logitech FIP x3; TrackIR

MSFS; Fenix A320; A2A PA-24; HPG H145; PMDG 737-600; AIG; RealTraffic; PSXTraffic; FSiPanel; REX AccuSeason Adv; FSDT GSX Pro; FS2Crew RAAS Pro; FS-ATC Chatter

1 minute ago, F737NG said:

Steve, I've been left a bit confused by your comments, so would you mind responding to a couple of questions, please?

I have a Ryzen 1700X CPU with 8 cores and using Process Lasso, I have set P3D to use CPU 0-1, 4-15. The vast majority of other applications on my PC run off CPU 2-15. Is this wrong and should I allow Windows 10 to allocate threads as it sees fit, instead of Process Lasso?

You state that P3D has its own core affinity setting, use it how? Is there an AM that I should be using?

Thanks.

I really don't see why there's so much confusion - it is really simple

P3D comes with a means to set up an Affinity mask for that application - so use that!

If you do not set that P3D (or FSX) AM, then P3D opens up across all your cores or LPs, and your process manager then corrals all that work onto the fewer cores you allocated - this is not the intended result.

Use the AM in an app when it has a setting of its own or you might find odd behaviour. 

 


Steve Waite: Engineer at codelegend.com


This is what you do:

Enable LPs or cores one by one in the AM and see how fast the sim loads - repeat that a few times, since Windows caches stuff and subsequent loads will be faster. We can do that with a watch; we don't need an expensive tool. Each time we add a core to the AM the sim loads quicker, up to the point where adding a core does not make much difference. At that point you don't want to use more cores. Imagine adding ten more cars to a train to distribute the load more - it simply adds weight to the train.

Now enable HT and reproduce that pattern across the LP pairs. If you have few cores then you can start enabling the sister LPs on some cores to gather data faster by intensifying work done on certain cores - but do not share the core with the renderer.
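The sequence of masks to try can be written down ahead of time. The Python sketch below only generates the numbers; the timing is still done by hand. It assumes sibling LPs are adjacent pairs and that the renderer sits on core 0 (the lowest enabled LP). Both function names are hypothetical.

```python
def masks_adding_one_core(core_count: int, ht: bool = False) -> list[int]:
    """Masks that enable 1, 2, ... core_count cores, one LP per core.

    With ht=True each core owns two adjacent LPs (an assumption), so only
    the first LP of every pair is enabled: 01, then 01,01 and so on.
    """
    step = 2 if ht else 1
    masks = []
    mask = 0
    for core in range(core_count):
        mask |= 1 << (core * step)
        masks.append(mask)
    return masks


def add_sister(mask: int, core: int, renderer_core: int = 0) -> int:
    """Enable the second LP of `core`, refusing to touch the renderer's core."""
    if core == renderer_core:
        raise ValueError("do not share the renderer's core")
    return mask | (1 << (2 * core + 1))


print(masks_adding_one_core(4))           # [1, 3, 7, 15]
print(masks_adding_one_core(6, ht=True))  # [1, 5, 21, 85, 341, 1365]
print(add_sister(1365, 5))                # 3413
```

Time several loads with each mask in turn and stop adding cores once the load time stops improving.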

 

 


Steve Waite: Engineer at codelegend.com


Use a batch file to start addon exe apps and see which cores they work best with in conjunction with your sim cores. Make sure you always give a minimum of TWO LPs per app (even if those two LPs are on the same HT core).
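One way to write such a batch file line is the Windows `start` command with its `/affinity` switch, which takes the mask as a hexadecimal number. The Python sketch below only builds the line of text; the helper name and the exe path are placeholders, not a real addon.

```python
def start_line(exe_path: str, lps: list[int]) -> str:
    """Build a batch-file line that starts exe_path on the given LPs.

    Uses the Windows `start "" /affinity <hex mask> <program>` form.
    Refuses fewer than two LPs, following the advice above.
    """
    if len(set(lps)) < 2:
        raise ValueError("give each addon at least two LPs")
    mask = 0
    for lp in lps:
        mask |= 1 << lp
    return f'start "" /affinity {mask:X} "{exe_path}"'


# Placeholder path; LPs 2 and 3 are the two LPs of one HT core (assumed pairing).
print(start_line(r"C:\Addons\Example.exe", [2, 3]))
# start "" /affinity C "C:\Addons\Example.exe"
```

Paste the printed line into a .bat file, try different LP lists, and compare how the addon and the sim behave together.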

After ensuring that's working well we can use fancy apps to move stuff around if we are aware of the pitfalls.


Steve Waite: Engineer at codelegend.com


Now that we understand things better, we can see that a simple adjustment to the PC - enabling HT with no AM - will, for certain processes and without any shadow of doubt, reduce performance and increase heat. Why? Because we didn't specify an AM. Using a process manager that corrals the sim after it has started is wrong too: it further intensifies the work and reduces performance.

After reading all the hype going around I can understand this seems hard to grasp.


Steve Waite: Engineer at codelegend.com


When P3Dv4 came out I showed on my site how this version utilises the CPU slightly better than FSX and P3Dv2/3. Basically what was said was that there are other parts of v4 that can be better served on a core of their own - in other words can do better when we keep other parts of P3D away just like the main rendering process. Now you all have the understanding to find a few more percent on your rigs. Well you have known since way back when.

Edited by SteveW

Steve Waite: Engineer at codelegend.com


I think the main problem of HT is understanding in a simple way how Hyperthreading can achieve more performance, and yet reduce performance with certain processes.

When I say 'certain processes', for example, P3D and FSX have a major process that needs all the cycles it can get, uninterrupted - the rendering of frames. There are also other processes in P3D that take seconds to complete, so these can share cores with HT enabled when we don't have very many cores.

So look at two processes running on one core - these are time-sliced so as to appear to be running concurrently - simultaneously.

At the end of each time slice the core saves away the 'context' of that process - the state of affairs of that process, the CPU registers.

The other process is set up from its saved context and away it goes for its own time slice.

Now let us enable HT and purposely put those processes, one on each Logical Processor (LP).

So we still have one core time slicing the two processes.

The differences being:

1. they appear to be on separate real cores to the software running - this requires a capable Operating System like Windows.

2. their contexts are not lost when the core switches - saves half the switching time.

3. unused cycles within the core are used by the other LP.

And so HT on means an increase in performance across the CPU with hundreds of processes, more than 50 with P3D alone.

However, if that core is shared by two processes and one of them needs the core 100% (P3D's renderer), then by definition it is not getting 100%. The graph of that LP's use will show 100% - but that is only 100% of its share of the processing on that core.
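A quick check of whether a given AM forces the renderer to share its core can be sketched in Python. This assumes the renderer runs on the lowest enabled LP of the mask (the right-most 'one') and that sibling LPs are adjacent pairs; both are assumptions to verify on your own rig, and the function names are made up.

```python
def renderer_lp(mask: int) -> int:
    """Index of the lowest enabled LP, assumed here to host the renderer."""
    if mask <= 0:
        raise ValueError("mask must enable at least one LP")
    # mask & -mask isolates the lowest set bit.
    return (mask & -mask).bit_length() - 1


def renderer_shares_core(mask: int) -> bool:
    """True if the sister LP of the renderer's LP is also enabled in the mask."""
    sibling = renderer_lp(mask) ^ 1  # 0<->1, 2<->3, ... under the pairing assumption
    return bool((mask >> sibling) & 1)


print(renderer_shares_core(0b111111111111))  # True: both LPs of core zero enabled
print(renderer_shares_core(0b111111111101))  # False: renderer has core zero alone
```

This only checks the sim's own mask. Addon apps started on the renderer's sister LP would share the core just the same.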

 

 

Edited by SteveW

Steve Waite: Engineer at codelegend.com


Steve - Thank you for your posts and detailed response; it's very helpful to understand that HT can actually benefit rather than hinder performance, provided you know how to AM it.

You say the most important process to give a dedicated core is the P3D renderer - but how do I know which process is the renderer, and how do I then give it a dedicated core via an AM batch file? Can I do it via the cfg file?

As you see from my first post I have a hexacore with HT on so I've given apps LP 1 plus sister and P3D 2-6 plus sister.


6 hours ago, 9GTurn said:

Steve - Thank you for your posts and detailed response; it's very helpful to understand that HT can actually benefit rather than hinder performance, provided you know how to AM it.

You say the most important process to give a dedicated core is the P3D renderer - but how do I know which process is the renderer, and how do I then give it a dedicated core via an AM batch file? Can I do it via the cfg file?

As you see from my first post I have a hexacore with HT on so I've given apps LP 1 plus sister and P3D 2-6 plus sister.

We can look at the CPU graphs (set to individual cores, or LPs in HT mode) and see which LPs light up the most. Remember that the top-left graph is LP zero - that's the right-most 'one' in your AM. Say your AM looks like this: 11,11,11,11,11,01 (six cores + HT). We can copy and paste that (commas and all) into the binary field of Windows Calc in programmer mode = 4093 DEC, which is the integer we must specify in the P3D AffinityMask= item of the JOBSCHEDULER section.

The proper nomenclature when discussing AMs is to separate the LP pairs with commas to denote HT enabled. With that AM on a six core we are only allowing P3D to use one LP of core zero (there are no 'physical' and 'logical' cores there; they are both equal logical processors with no special ordering).

Once we set that we can see how long the sim takes to load  - that 4093 AM will load the sim as fast as it is possible to do on any six core PC - period.

When we reduce those LPs to 01,01,01,01,01,01 = 1365, that's still SIX CORES. That might not be able to load the sim as fast, maybe, maybe not - it depends on hardware, sim textures and settings.
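The same conversion can be done without the calculator. A minimal Python sketch that turns the comma-separated AM into the integer and prints it in the form of the cfg item named above (the function names are made up for this example):

```python
def affinity_mask_to_int(am: str) -> int:
    """Convert an AM written like '11,11,11,11,11,01' to its decimal value.

    Commas and spaces are only visual separators; the right-most digit is LP 0.
    """
    bits = am.replace(",", "").replace(" ", "")
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError(f"not a binary affinity mask: {am!r}")
    return int(bits, 2)


def cfg_entry(am: str) -> str:
    """Format the value as the AffinityMask= item of the JOBSCHEDULER section."""
    return f"[JOBSCHEDULER]\nAffinityMask={affinity_mask_to_int(am)}"


print(affinity_mask_to_int("11,11,11,11,11,01"))  # 4093
print(affinity_mask_to_int("01,01,01,01,01,01"))  # 1365
print(cfg_entry("11,11,11,11,11,01"))
```

Both values match the figures given above: 4093 for eleven LPs with the renderer's sister LP left out, and 1365 for one LP on each of six cores.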

So what we want is the fewest LPs (or cores) enabled that still load the sim fastest, leaving some cores free for addon exe apps. The right-most 'one' in the AM is the renderer, followed by the others that do more in the background. If we add an LP and don't see a significant increase in performance, it might not be a good idea to use it.
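Once load times have been noted for a handful of masks, the choice can be made mechanically. The Python sketch below picks the mask with the fewest enabled LPs whose load time is within a tolerance of the best one. The 5% default is an arbitrary assumption standing in for "no significant increase", and the function name is hypothetical.

```python
def fewest_lps_mask(load_times: dict[int, float], tolerance: float = 0.05) -> int:
    """Pick the mask with the fewest LPs that loads nearly as fast as the best.

    load_times maps affinity mask -> measured load time in seconds.
    tolerance=0.05 treats anything within 5% of the fastest time as equal
    (an assumed threshold, adjust to taste).
    """
    best = min(load_times.values())
    good_enough = [m for m, t in load_times.items() if t <= best * (1 + tolerance)]
    # Prefer fewer enabled LPs; break ties on the shorter load time.
    return min(good_enough, key=lambda m: (bin(m).count("1"), load_times[m]))
```

Feed it your own stopwatch figures, ideally averaged over several loads per mask so that caching does not skew the comparison.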

Remember that each core with two LPs enabled will run hotter as it does more work, so maximum overclockers turning on HT with no AM might burn up the CPU and blame HT rather than their own poor technique from lack of homework.

 


Steve Waite: Engineer at codelegend.com

On 8/21/2018 at 5:44 PM, SteveW said:

This is what you do:

Enable LPs or cores one by one in the AM and see how fast the sim loads - repeat that a few times since Windows caches stuff and subsequent loading will be faster. We can do that with a watch we don't need an expensive tool. Now each time we add a core to the AM the sim loads up quicker. Up to a point where adding a core does not make much difference. At this point you don't want to use more cores. Imagine adding ten more cars to a train to distribute the load more  - simply adds weight to the train.

Now enable HT and represent that pattern across the pairs. If you have few cores then you can start enabling the sister LPs on some cores to gather data faster by intensifying work done on certain cores - but do not share the core with the renderer.

 

 

Hi Steve,

Thanks for that! Here are my results:

https://www.avsim.com/forums/topic/541073-using-loading-times-to-determine-affinity-mask/

It proved to be an interesting exercise.

Best regards,

Mike

 



Guys, if you are looking for the best performance then go over to Mike's thread above, where some results are coming in that prove what should basically be obvious, at least to the engineering mind - that there is an optimum number of cores for P3D depending on your hardware.

Edited by SteveW

Steve Waite: Engineer at codelegend.com

