ComSimPilot

P3Dv5 CPU utilizes all cores - an observation

For JobScheduler try 62805 (11.11.01.01.01.01.01.01)
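
The decimal value and the dotted notation above are two spellings of the same number. A minimal Python sketch of the conversion, assuming each dot-separated pair stands for one physical core's two logical processors with LP0 as the rightmost bit (the helper names are made up for illustration):

```python
def dotted_to_decimal(mask: str) -> int:
    """Convert a dotted binary mask such as '01.01' to its decimal value."""
    return int(mask.replace(".", ""), 2)


def decimal_to_dotted(value: int, logical_processors: int = 16) -> str:
    """Render a decimal mask as dot-separated bit pairs, highest LP first."""
    bits = format(value, f"0{logical_processors}b")
    return ".".join(bits[i:i + 2] for i in range(0, len(bits), 2))


print(dotted_to_decimal("11.11.01.01.01.01.01.01"))  # 62805
print(decimal_to_dotted(62805))                      # 11.11.01.01.01.01.01.01
```

The same helpers work for any mask quoted in this thread, as long as the LP count passed in matches the CPU.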

Share this post


Link to post
Posted (edited)
32 minutes ago, w6kd said:

HT on without an appropriate Affinity Mask is a big no-no...sharing LP0's main workload with anything else really gets in the way. 

mikeymike said he was running HT off, hence my recommendation to leave affinities alone.  Running with HT on is a whole different ball game...and per-core HT is likely to be yet another one still.

Exactly. And btw I don't really find much sense in turning HT On at all if you are going to use 01.01.01.01.01.01.01.01 Affinity.

 

Cheers, 

Edited by Dirk98

Share this post


Link to post
Posted (edited)
11 minutes ago, Dirk98 said:

Exactly. And btw I don't really find much sense in turning HT On at all if you are going to use 01.01.01.01.01.01.01.01 Affinity.

 

Cheers, 

Hi Dirk,

Oh, that’s easy! P3D is a special case whereas all other apps and games can bask in the performance enhancing glory, including Windows itself, of having all LPs available. Why have a HT capable system and not allow it to be exploited as and when appropriate?

Best of all worlds.

Regards,

Mike

Edited by Cruachan
  • Like 1

My rig: ASUS ROG Rampage V Extreme, i7-5960X (Dynamic OC 4.6 GHz - all cores, HT=ON, AM=21845), Corsair Hydro Series H110i GT Cooler with 2xNoctua NF-A14 PWM 140mm fans, G.SKILL Ripjaws 4 series 16GB (4 x 4GB) DDR4 3000, ASUS GTX 1080Ti ROG STRIX 11GB, GDDR5X (Driver versions: 441.66 (Win7), 452.06 (Win10)), Samsung 850 EVO 1TB SSD x4, Samsung 970 EVO 2TB V-NAND M.2, LG BH16NS40 16x SATA Internal BDRW, EVGA 1200 P2 Watt PSU, Cooler Master HAF X, ASUS ROG Swift PG278Q (G-Sync) monitor at 120Hz. Oculus Rift. Dual Boot: Windows 10 Pro 64bit (2004) / Prepar3D v5.0.31.35253, Windows 7 Pro 64bit / Prepar3D v4.5.12.30293.

Share this post


Link to post
Posted (edited)
3 minutes ago, Cruachan said:

Hi Dirk,

Oh, that’s easy! P3D is a special case whereas all other apps and games can bask in the performance enhancing glory, including Windows itself, of having all LPs available. Why have an HT capable system and not allow it to be exploited as and when appropriate?

Best of all worlds.

Regards,

Mike

True, unless you o/c your system to 5.2GHz. But I don't worry: in my case the block is directly on the die, no IHS.

But good point anyways, I forgot about Windows program!! ))

 

 

Edited by Dirk98

Share this post


Link to post
23 minutes ago, Dirk98 said:

For JobScheduler try 62805 (11.11.01.01.01.01.01.01)

Will try this 

thanks

mike

Share this post


Link to post
21 minutes ago, Cruachan said:

Why have a HT capable system and not allow it to be exploited as and when appropriate?

A HT-enabled CPU comes with 2MB per physical core of L3 cache--that's a good reason.  And HT off usually gives you enough additional thermal headroom for another tick or two on the clock multiplier when you're overclocking.

I tend to think of having hyperthreading as analogous to having a four-wheel drive vehicle...it's good to have if you need it, but 99% of the time you don't.  Now if you live in Alaska or do logging in the back country, then that's different.  If you do video rendering, CAD/CAM etc, then the answers are going to be different. 

With ten physical cores available to P3D on the 10900K, I suspect we're close to the point of diminishing returns for adding LPs beyond the number of physical cores, anyway.  I remember Rob Ainscough saying that 10 cores was around where you reach that point when he was experimenting with the X-suffix HEDT processors some time back.  When I get some time away from prepping two machines for the arrival of Simzilla, I'm going to do some experimenting with my 10900K to see how the theory holds.


Bob Scott | AVSIM Forums Administrator | AVSIM Board of Directors

ATP Gulfstream II-III-IV-V

System: i9-10900K @ 5.2GHz on custom water loop, ASUS Maximus XII Hero, 32GB GSkill 3600MHz CAS15, eVGA 2080Ti XC Ultra, Samsung 55" JS8500 4K TV@30Hz, 5xSamsung SSD, eVGA 1KW PSU, 1Gbps internet

SB XFi Titanium, optical link to Yamaha RX-V467, Polk/Klipsch 6" bookshelf spkrs, Polk 12" subwoofer, 12.9" iPad Pro, PFC yoke/throttle quad/pedals with custom Hall sensors, Coolermaster HAF932 case, Stream Deck XL button box

Share this post


Link to post

Sorry, got called away.

Looks like you guys have the right idea. Use "01"s. Make sure to use LP zero with HT enabled (or core zero with HT disabled). Put add-ons on the last cores: use the leftmost "10"s so that the add-ons run on the sister LPs of the data-loading cores.

Make sure all add-ons use a minimum of two LPs; with only one LP they can end up waiting on themselves.

Remember you can only gain fps if you reduce activity on the sister LPs of the first cores (rightmost in the AM binary). You can only improve smoothness by using the optimum amount of background tasks, too few or too many will cause issues.
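
As a rough illustration of that layout, here is a small Python sketch that builds both masks for an HT-enabled CPU. It assumes the pair-per-core bit ordering used in this thread (LP0 is the rightmost bit); the function names are invented for the example and are not part of any tool.

```python
def sim_mask(cores: int) -> int:
    """'01' on every core: the first LP of each physical core, LP0 included."""
    return sum(1 << (2 * core) for core in range(cores))


def addon_mask(cores: int, addon_cores: int = 2) -> int:
    """'10' on the last `addon_cores` cores, i.e. the leftmost pairs."""
    if addon_cores < 2:
        # Per the advice above, add-ons should get at least two LPs.
        raise ValueError("give add-ons at least two LPs")
    first = cores - addon_cores
    return sum(1 << (2 * core + 1) for core in range(first, cores))


sim = sim_mask(8)        # 01.01.01.01.01.01.01.01
addons = addon_mask(8)   # 10.10.00.00.00.00.00.00
print(sim, addons)       # 21845 40960
print(sim & addons)      # 0, the two masks share no LP
```

On an 8-core CPU this reproduces the 21845 mask and the 10.10.00.00.00.00.00.00 add-on mask mentioned elsewhere in the thread.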

Have fun and thanks for the support on the forums as always.

Regards

Steve

 

  • Like 3

Steve Waite: Engineer at codelegend.com

Share this post


Link to post
1 hour ago, Dirk98 said:

For JobScheduler try 62805 (11.11.01.01.01.01.01.01)

I assume you get all the addons off those LPs being used for P3d with that affinity mask?

and spreading them accordingly to the LPs

that are free?

thanks 

mike

Share this post


Link to post
3 minutes ago, mikeymike said:

I assume you get all the addons off those LPs being used for P3d with that affinity mask?

and spreading them accordingly to the LPs

that are free?

thanks 

mike

mike, I put add ons on:

10.10.00.00.00.00.00.00 and some on 00.10.10.00.00.00.00.00

But my ASP3D and P2A run on networked PCs. If all your add-ons are on the main sim PC, then you may want to try 01.01.01.01.01.01.01.01 AM for P3D and compare. In my case 62805 works best as I wrote.
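
To see which LPs each of these masks selects, and where they overlap with the P3D masks, a quick Python check can help. This is only a sketch: it assumes LP0 is the rightmost bit, and `lps` is a made-up helper name.

```python
def lps(mask: int) -> list[int]:
    """Return the logical processor numbers whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if mask >> bit & 1]


P3D_62805 = 0b1111010101010101   # 11.11.01.01.01.01.01.01
P3D_01S = 0b0101010101010101     # 01.01.01.01.01.01.01.01
ADDONS_A = 0b1010000000000000    # 10.10.00.00.00.00.00.00
ADDONS_B = 0b0010100000000000    # 00.10.10.00.00.00.00.00

print(lps(ADDONS_A))               # [13, 15]
print(lps(ADDONS_B))               # [11, 13]
print(lps(P3D_62805 & ADDONS_A))   # [13, 15]
print(lps(P3D_62805 & ADDONS_B))   # [13]
print(lps(P3D_01S & ADDONS_A))     # []
```

Under that reading, 62805 shares LP13 and LP15 with the first add-on mask, while the plain 01 mask shares no LP with either add-on mask.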

Share this post


Link to post
Posted (edited)

My add-on EXEs that run alongside P3D are: VoiceAttack, GFDevP3dv4.exe, FFTF Dynamic, Littlenavconnect.exe, NavigraphSimlink.exe, SimObjectDisplayEngine.exe, couatl.exe, EZCA.exe, ActiveSkyUtils.exe  all set on LPs as mentioned in my previous post.

Edited by Dirk98

Share this post


Link to post
4 minutes ago, Dirk98 said:

My add-on EXEs that run alongside P3D are: VoiceAttack, GFDevP3dv4.exe, FFTF Dynamic, Littlenavconnect.exe, NavigraphSimlink.exe, SimObjectDisplayEngine.exe, couatl.exe, EZCA.exe, ActiveSkyUtils.exe  all set on LPs as mentioned in my previous post.

Ok thank you 

is it possible for you to post some pics of your Task Manager?

so I get a visual idea of it?

thanks 

mike

Share this post


Link to post
3 hours ago, SteveW said:

Let's look at how a single core handles two identical threads:

SwitchingImproves.jpg

On the right the core is HT disabled - no Hyperthreading, each thread (one orange and one purple) gets a time slice and between each slice the CPU has to save the situation to be loaded later.

Now in a simple way of showing it, on the left is the same core with HT enabled, and with the same two threads arranged so that each thread occupies an LP to itself. The two threads are still time sliced, since there is only one core. But these two threads finish sooner with HT enabled because their context is saved.

Heya Steve, 

I think this picture is a bit misleading, and does not acknowledge why hyper-threading was invented in the first place.

Way back in the Aughts (2000-2010) Intel, IBM, and SPARC all came out with a version of hyper-threading. The impetus was that when a processor data request missed all caches and had to get it from memory the CPU was waiting hundreds of CPU cycles for the information before resuming processing. The clever monkeys at Intel, IBM, et al figured that "hey, since the processor units were sitting around doing nothing, why don't we run another thread while we wait?".

The various approaches by the respective vendors yielded various results, but the main benefit was that a single processor core could now support 1.x of work (i.e., more than 1.0), with "x" varying by vendor depending on how they implemented hyper-threading and on their underlying memory architectures. For example, IBM Power processors are RISC (Reduced Instruction Set Computing) based and had somewhat weak memory sub-systems, so "x" was actually pretty high, around 5, sometimes 6, meaning that a system based on IBM Power processors would produce 1.5-1.6 times the throughput for a given workload. Intel processors are CISC (Complex Instruction Set Computing), so hyper-threading was a bigger challenge to begin with, and coupled with the fact that Intel processors had a slightly stronger memory sub-system, there was less benefit, i.e. "x" is around 4. SPARC took a unique approach: instead of waiting for a cache miss they ran the threads like a zipper, and when one side of the zipper had to go to memory the other side kept on running, sans zipper.

This was a really big advancement in processor throughput, except when two threads from a COMMON application shared a hyper-threaded core. For example, when hyper-threading was first introduced on Intel processors, Microsoft cautioned the industry NOT to enable hyper-threading when running the SQL Server application. This was because sometimes a SQL thread on a shared core would set a resource lock that might be required by another SQL thread. The other thread would "wake up" (i.e., have its memory request serviced so it could resume processing), only to find it could not resume processing because a resource it required was locked. So, for early implementations of SQL Server running on hyper-threaded Intel processors, the throughput dropped to less than 1.0 due to lock contention. It only took Microsoft 4-5 months to make SQL Server "thread aware" so it would no longer dispatch threads from a common work unit to a shared core.

Going back to your picture, the left side should show overlap between the threads. Assuming the IBM 1.5x throughput benefit, the amount of overlap would be 50 percent of a given thread, and the amount of overall time consumed would be less (i.e., a shorter graphical stack) than in the non-threaded example, demonstrating the benefit of more throughput, i.e., more work gets done in less time.
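
The "more work in less time" point can be put into numbers with a deliberately idealized Python sketch. It assumes the throughput factor applies evenly to the whole run, which real workloads will not match exactly; `total_time` is a hypothetical helper, and the 10-second thread length is an arbitrary example value.

```python
def total_time(thread_time: float, threads: int, throughput: float) -> float:
    """Wall-clock time for `threads` equal threads on one core.

    `throughput` is the 1.x factor from the post above (1.0 means HT off).
    """
    return threads * thread_time / throughput


ht_off = total_time(10.0, threads=2, throughput=1.0)
ht_on = total_time(10.0, threads=2, throughput=1.5)
print(round(ht_off, 2))  # 20.0
print(round(ht_on, 2))   # 13.33
```

With two 10-second threads, the same core finishes both in roughly two thirds of the time at a 1.5x factor, which is the shorter stack the paragraph above describes.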

Now, the above comments totally ignore why turning hyper-threading on or off makes Prepar3D run smoother, stronger, faster (or not). Sometimes I think there is a bit of "SQL Server" behavior, or, as others have pointed out in this thread (sorry), cache thrashing and/or dilution. That whole subject is a demonstrable rat-hole, and my comments above are only intended to show why hyper-threading was invented in the first place and why it theoretically, and in practice, produced higher throughput for a given workload.

  • Like 2
  • Upvote 3

John Howell

Prepar3D V5, Windows 10 Pro, I7-9700K @ 4.6GHz, EVGA GTX1080, 32GB Corsair Dominator 3200MHz, SanDisk Ultimate Pro 480GB SSD (OS), 2x Samsung 1TB 970 EVO M.2 (P3D), Corsair H80i V2 AIO Cooler 

Share this post


Link to post

As it says, the image is intended to show in a simplified way that another thread can run while one is waiting. It only shows time slicing; otherwise it would be a complex diagram. But I wouldn't say misleading, no.

 


Steve Waite: Engineer at codelegend.com

Share this post


Link to post

...the intention of the image is to show in a simplified way why FSX and P3D, with fewer cores available, can load up faster with HT enabled. That can be measured with a stopwatch. If they can load up faster, then they can load and process data faster during the flight, since these simulators can't hold a complete set of data for the entire trip and must load scenery as we fly. To call it misleading is itself misleading, because the image does not attempt to show, in a simple diagram, the nuances of hyperthreading designs across different CPUs and was obviously never meant to.

  • Upvote 1

Steve Waite: Engineer at codelegend.com

Share this post


Link to post
On 8/13/2020 at 1:33 PM, Dirk98 said:

mike, I put add ons on:

10.10.00.00.00.00.00.00 and some on 00.10.10.00.00.00.00.00

But my ASP3D and P2A run on networked PCs. If all your add-ons are on the main sim PC, then you may want to try 01.01.01.01.01.01.01.01 AM for P3D and compare. In my case 62805 works best as I wrote.

Well mate, I have tried as suggested.

I reverted back to HT off, no affinity mask.

Smoother.

Thanks though

mike

Share this post


Link to post
