ronnay

affinitymask i7 7700k


6 minutes ago, Bert Pieke said:

92 looks potentially problematic..

How come? 92 has four LPs. The first two sim jobs occupy one core, but one per LP, so no problems there, Bert (unless you can think of one). We just need to keep addons off the second core, and it is good on background tasks. With 84=01,01,01,00, on the other hand, only one LP is stuffed with the same jobs, so it can't work as well since it ignores the potential of HT.


Steve Waite: Engineer at codelegend.com


Understanding the potential of hyperthreading (HT):

Let's say a process has two threads (to simplify matters). If we run that process on one logical processor (LP) of an HT core, or turn off HT (the core is then merely a single LP), we get the same result (not precisely, but we are simplifying things). However, if we allow that process to run one thread per LP, the process completes in less time, since the HT LPs can switch without overhead and use otherwise lost CPU cycles.

The sim splits out its sub-processes in a particular way up to four LPs; given more than four, it further splits out the background tasks (loading the scenario and so on). As the documentation states, at four parts the rendering stage is at its leanest. So if we allow it to split into more than three parts and place those on alternate LPs, we get improved performance. 84=01,01,01,00 exploits neither of those two "need to know" snippets of hard-earned know-how, but 92=01,01,11,00 exploits both. :biggrin:
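For anyone who wants to check the notation, here is a minimal Python sketch (not from any FSX or P3D tool; the function names are made up for illustration). It assumes a four core HT CPU with two LPs per core, and that the rightmost pair of bits is core 0. It turns a decimal AM into the pair notation used in this thread and lists which cores have both LPs switched on.

```python
def mask_to_pairs(mask: int, cores: int = 4) -> str:
    """Render an affinity mask as one two-bit group per HT core, highest core first."""
    bits = format(mask, "0{}b".format(2 * cores))
    return ",".join(bits[i:i + 2] for i in range(0, len(bits), 2))


def cores_with_both_lps(mask: int, cores: int = 4) -> list:
    """List the cores (0 = rightmost pair) that have both logical processors enabled."""
    return [c for c in range(cores) if (mask >> (2 * c)) & 0b11 == 0b11]


for am in (84, 92):
    # Prints the decimal mask, its pair notation and the cores using both LPs.
    print(am, mask_to_pairs(am), cores_with_both_lps(am))
```

With these assumptions, 84 has no core with both LPs enabled, while 92 has one (core 1, the second pair from the right).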

 


Steve Waite: Engineer at codelegend.com


My concern is that the main FSX thread likes to use up the entire core it runs on, so adding any work to the second LP on that core has the potential to interfere..

I have never tested this,  so by all means try it,  but I would avoid putting any more work on that core - it is busy enough already..

Not sure what documentation you are referring to, but by observation I know that if there is no scenery to be loaded, or re-lit, the sim runs on one core only and does not split over four cores in any observable way..

Scenery loading indeed gets offloaded to the other cores - that was one of the breakthroughs in SP1 or SP2.


Bert

1 hour ago, Bert Pieke said:

My concern is that the main FSX thread likes to use up the entire core it runs on, so adding any work to the second LP on that core has the potential to interfere..

I have never tested this, so by all means try it, but I would avoid putting any more work on that core - it is busy enough already..

Not sure what documentation you are referring to, but by observation I know that if there is no scenery to be loaded, or re-lit, the sim runs on one core only and does not split over four cores in any observable way..

Scenery loading indeed gets offloaded to the other cores - that was one of the breakthroughs in SP1 or SP2.

 

Bert, you are making it hard work to get the proper facts across. The split process on the two LPs is the same as the unsplit process on one LP. You fail to understand the core is doing the same work, but gets it done faster split over two LPs. Please re-read my posts.


Steve Waite: Engineer at codelegend.com


I'll be patient with you, Bert. Since you've ignored my entire chain of posts and confused the discussion, I'll reiterate:

01,01,01,00=84 = three tasks on three HT cores - "why this is not the best solution"

We know the sim works best with four tasks or more: the manual says that splitting out from three to four parts reduces the work of the first part.

01,01,01,01=85 = four tasks on four HT cores - four cores are better than three

If we REALLY must use only three cores, 116 gives the best overall performance:

01,11,01,00=116 - on my forum there's plenty of information on why this (and related combinations like 10,11,10,00=184) is best overall: we have four tasks, and the two tasks on the third core from the right do not often max out at the same time.

In order to explain how Hyperthreading works in our favour, I examined other possibilities for the point of discussion:

We can try other places to put the extra '1'

11,01,01,00=212 = four parts best performance to the renderer

01,01,11,00=92 = four parts, best performance to the background data loading (Bert took this example out of context, obfuscating the discussion). Even so, those two parts of the rendering stage on the second core work better than they do all on one LP, so it performs better than 84 because of HT.

11,11,11,00=252 - I added this idea since it makes good use of HT and performs well especially in loading performance.
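The masks listed above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: `pairs_to_mask` is a made-up helper name, and it assumes the pair notation is simply the binary form of the decimal AM with commas between cores.

```python
def pairs_to_mask(pairs: str) -> int:
    """Convert pair notation such as '01,11,01,00' to its decimal affinity mask."""
    return int(pairs.replace(",", ""), 2)


# Masks exactly as quoted in the post above.
quoted = {
    "01,01,01,00": 84,
    "01,01,01,01": 85,
    "01,11,01,00": 116,
    "10,11,10,00": 184,
    "11,01,01,00": 212,
    "01,01,11,00": 92,
    "11,11,11,00": 252,
}

for pairs, value in quoted.items():
    mask = pairs_to_mask(pairs)
    status = "ok" if mask == value else "MISMATCH"
    # Also show how many LPs each mask enables.
    print(pairs, mask, "LPs enabled:", bin(mask).count("1"), status)
```

Every pair string converts to the decimal value quoted next to it; 84 enables three LPs, 252 enables six, and the rest enable four.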

The difference between these setups is hard to test unless you have a test harness that guarantees similar repeatable results to make tests with, and understand how to test the capacity of the CPU setup - for example adding a load of anti-aliasing will prevent the accurate testing of AMs because the sim is held back by the GPU.

Bert, I hope that's sorted out your misunderstanding of the discussion.

 

 


Steve Waite: Engineer at codelegend.com


...remember too that no AM on an HT-enabled CPU means the first two of the four main sim jobs occupy only one core. Even so, they occupy an LP each, and so work like 92 and 252 and take advantage of HT (84 does not). So 92 and 252 work no worse than no AM, if you think about it. The remaining two jobs are split over the remaining LPs, making the data load faster (with a small reduction in rendering performance if more than two). That's six LPs on the four core, but then it's also 10 on the six core, 14 on the eight core and so on. That soon becomes more parts than necessary, and an AM becomes vital.


Steve Waite: Engineer at codelegend.com


Steve, I follow your logic.. trust me.

All I said is that 92 does not look to me like a good way to go,  whereas 116 is fine...

Re-read my post. 

BTW, what manual are you referring to?

 

 


Bert


Yes Bert, but I'm not recommending it for three core use (that's 116). Instead I'm actually using it as an *example* of utilising HT and why it's better than 84 (face it, it's better than 84 anyway) - funny you're still not getting it.

Moving on, it is also worth knowing that the best performance from SimConnect addons is when they are given two or more LPs because with only one LP (or non HT core) their interaction with the sim and system resources can be self blocking - they can be held up on that LP.

So looking at the four core with no HT, we have AM=14=1110 in an attempt to leave a core (zero) free for addons. Even so, the processes the addons create will spill onto the other cores, and again we only have the three-way split for the sim, mounting more work on the first LP encountered (rendering).

We can instead use no AM (AM=0, equivalent to 1111), which gives the magic four parts but leaves no cores free for the addons. Well - we can corral them on the last two cores (1100) or last three (1110) with a .bat or an app like Process Lasso, which allows them only to interfere with the loading speed and not the rendering stage.
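As a rough illustration of the .bat route, the sketch below builds the line such a batch file might contain. It relies on the Windows `start /affinity` switch, which takes the mask in hexadecimal. The helper name and the addon path are hypothetical, and the four core non-HT layout (LP indexes 0 to 3) is assumed.

```python
def start_affinity_command(exe_path: str, lps: list) -> str:
    """Build a cmd.exe 'start /affinity' line confining exe_path to the given LP indexes."""
    mask = 0
    for lp in lps:
        mask |= 1 << lp
    # 'start /affinity' expects the mask in hexadecimal; "" is the empty window title.
    return 'start "" /affinity {:X} "{}"'.format(mask, exe_path)


# Last two cores of a four core, non-HT CPU: 1100 binary = C hex.
# The addon path is a placeholder, not a real product.
print(start_affinity_command(r"C:\Addons\example_addon.exe", [2, 3]))

# Last three cores: 1110 binary = E hex.
print(start_affinity_command(r"C:\Addons\example_addon.exe", [1, 2, 3]))
```

The printed lines can be pasted into a .bat, one per addon, so each addon starts already confined to the chosen cores.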

 

You need to know it.

 

 

 


Steve Waite: Engineer at codelegend.com


When looking into performance, remember that this type of simulator does a lot of work besides GPU work. Using frames per second (fps) as a performance metric is unsound - actually it's a complete waste of time. The sim maintains an fps irrespective of how much work the background tasks do; they may take seconds to complete (too long = blurries, much too long = mad ATC etc.). Measuring fps has no bearing on how much capacity those AMs provide our sim background tasks. In fact, if the background tasks are doing well we'll see a drop in fps.

Instead we are looking for stability when things change in the sim. It has to handle a number of SimConnect clients with very varied requirements and any amount of data. We set up the sim AM so the CPU can provide the best bandwidth, then make sure the addons, no matter how tiny, do not interfere in the timely flow of the sim maintained by the first two jobs, as I discussed. Don't look too hard for high fps; look for stability. Remember that other settings or GPU enhancements can flatline the sim, making it look artificially stable, and tiny differences there can interact with the coincidence of frames with the monitor and appear like big changes in performance.

Think about six cores minimum for the next rig.


Steve Waite: Engineer at codelegend.com


You are not answering my question:  "What manual..?"

That is where you have me stumped..

 


Bert


The ESP documentation.


Steve Waite: Engineer at codelegend.com


...you can read all about HT in the Windows document "Windows Platform Design Notes", an excerpt follows:

 

6.2 Improving Application Performance on Hyper-Threading-Enabled Systems

In general, multithreaded Windows applications perform better when running unmodified on an HT processor than they do on a similarly equipped single-threaded processor. To optimize the application performance benefit on HT-enabled systems, the application should ensure that the threads executing on the two logical processors have minimal dependencies on the same shared resources on the physical processor. With an understanding of how the application threads and processes utilize the shared resources on an HT processor, setting processor affinity to minimize competition for these system resources can help application performance.

 


Steve Waite: Engineer at codelegend.com


From an early P3D document (same as FSX in this respect):

"By default, Prepar3D will use all available processor cores. On machines with four or more cores, it will dedicate a core to rendering tasks."

 


Steve Waite: Engineer at codelegend.com


So deducing from that, we should say logical processors (not cores, per the Windows doc), and we know there are two per core. So no AM on a four core without HT is AM=0=1111, the same as derivatives of 85=01,01,01,01 with HT enabled. No AM with HT enabled is 11,11,11,11, an entirely different situation that puts the first four parts onto only two cores. "Four or more" means the renderer is not at its leanest with three or fewer LPs (as per the ESP doc). If there are more than four, it's not hard to imagine that there are more results to handle, shall we say, for a small return in background throughput. Proper testing confirms it all pans out as they (Intel, MS) intended, and if it doesn't, there's something else messing up the results.
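To picture the difference, here is a small Python sketch under a stated assumption: the sim hands its first jobs to the enabled LPs in order, starting from the lowest bit. That ordering is only a model inferred from this thread, not something taken from the ESP or P3D documentation, and the function names are invented for the example.

```python
def place_jobs(mask: int, jobs: int = 4, lps_per_core: int = 2) -> list:
    """Assign jobs to enabled LPs, lowest bit first (an assumption), as (job, lp, core)."""
    enabled = [lp for lp in range(mask.bit_length()) if (mask >> lp) & 1]
    return [(job, lp, lp // lps_per_core) for job, lp in enumerate(enabled[:jobs])]


def cores_used(mask: int, jobs: int = 4, lps_per_core: int = 2) -> int:
    """Count the distinct physical cores touched by the first `jobs` jobs."""
    return len({core for _, _, core in place_jobs(mask, jobs, lps_per_core)})


# HT four core with 85=01,01,01,01: one job per core.
print(cores_used(85))
# HT four core with all LPs on (11,11,11,11 = 255): four jobs share two cores.
print(cores_used(255))
# Non-HT four core with 1111 = 15: one LP per core, so one job per core.
print(cores_used(15, lps_per_core=1))
```

Under that model, 85 and the non-HT 1111 both spread the first four parts over four cores, while 11,11,11,11 packs them onto two.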


Steve Waite: Engineer at codelegend.com



  • Tom Allensworth,
    Founder of AVSIM Online


  • Flight Simulation's Premier Resource!

    AVSIM is a free service to the flight simulation community. AVSIM is staffed completely by volunteers and all funds donated to AVSIM go directly back to supporting the community. Your donation here helps to pay our bandwidth costs, emergency funding, and other general costs that crop up from time to time. Thank you for your support!
