Ray Proudfoot

CPU core 0 used more than GPU in v5.2.


Hi all.

Now that I've finally done some flying in 5.2, I thought I'd report my slightly odd results...

My sim uses two PCs, both running 5.2. Once airborne, I could see that one machine had Core 0 pinned and the other cores were spiking up and down as textures were loaded - which is pretty usual for P3D in my experience, at least on this system. But the other machine - the 'client' machine - had all cores pinned at 100%. The difference between them? Almost nothing. Settings are identical. Different CPUs, though - 8th gen to 9th gen i7.

A quick check of temps on the client machine showed Core 0 spiking up to 95C, which is high for this rig - usually flat-out temps are 80-85C - though not high enough for thermal throttling, and the cooling is pretty strong so it was only peaking at 90+ but quickly going down to 75-80 again. The other cores were around 70-80C and occasionally lower.

Now you might think - if all cores were at 100%, why weren't they all that hot? Well, 100% is not necessarily 100%. That's a sampled average. So in this case the temps are telling me that Core 0 is busier than the other cores, despite the graphs showing 100% across the board. I'm starting to wonder if CPU usage is not being reported correctly. Perhaps different chipsets are read differently? Perhaps it's a Windows issue - though these two machines are at the same OS and patch level.

No AI aircraft on either machine, BTW - as I haven't finished re-installing AIG after nuking the host PC. And, in fact, that's probably the most significant difference between the machines; one has been upgraded over several versions of Windows 10, the other was flattened and re-installed a couple of weeks ago (pre-5.2). 

Make of all that what you will... I'm at a loss. But I will say, apart from a regular stutter on the ground which I think is to do with animations, performance is great on both machines. Better than 5.1 for certain. I can have EA sliders higher (including godrays) than I could in 5.1, although High/Ultra clouds do tank my FPS. Hopefully AIG will not hit my FPS too badly once it's working again.


Temporary sim: 9700K @ 5GHz, 2TB NVMe SSD, RTX 3080Ti, MSFS + SPAD.NeXT

5 hours ago, Ray Proudfoot said:

@SteveW, I’m happy to try anything that will improve performance. Before that rainy day I’m hoping LM will have identified the problem and fixed it.

The reason I use SimStarterNG is because I have around five supporting programs to run and doing everything via SSNG is the easy option. It’s a solid program I have used for years.

I used to hate it; now I can't fly without it :>)

Funny how things change.


@neilhewitt, odd results indeed. It would be very helpful if you could replicate my test as closely as possible. Open Task Manager and select the Performance tab so CPU usage is visible.

No AM, as mine wasn't set because of a misspelling. Comment it out in Prepar3D.cfg.

EGLC Rwy 29. 12:00 local time.

Default aircraft such as the Mooney Bravo.

Scenery\World\Scenery. Ensure Aitraffic.bgl is enabled (actual name may be different).

In P3D, turn off all scenery object settings: sliders full left, in the top-right section.

Take off and change heading to 250. Tune Nav1 to 109.50, OBS 273°, to intercept the 27L localiser at EGLL. Once the localiser and glide slope are captured, land and check Task Manager for CPU usage. Core 0 should be close to 100% throughout the test.

Second test: as above, but rename Aitraffic.bgl to bg# so it's not used. Same flight procedure, but this time, with no AI to process, Core 0 should not max out.
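Since the AM problem here came down to a misspelled cfg entry, it can help to sanity-check the mask value itself before flying the test. A minimal sketch in Python (the helper names are mine, not from P3D; the AffinityMask entry under [JOBSCHEDULER] in Prepar3D.cfg is a decimal bitmask where bit n enables Logical Processor n):

```python
# Decode/encode a Prepar3D AffinityMask value so a typo'd cfg entry is
# easy to spot. Helper names are my own invention for illustration.

def decode_affinity_mask(mask: int) -> list[int]:
    """Return the list of logical processors enabled by the mask."""
    return [lp for lp in range(mask.bit_length()) if mask & (1 << lp)]

def encode_affinity_mask(lps: list[int]) -> int:
    """Build a mask value from a list of logical processor numbers."""
    mask = 0
    for lp in lps:
        mask |= 1 << lp
    return mask

# Example: 254 = 0b11111110 enables LP1..LP7 and leaves LP0 free.
print(decode_affinity_mask(254))                     # [1, 2, 3, 4, 5, 6, 7]
print(encode_affinity_mask([1, 2, 3, 4, 5, 6, 7]))   # 254
```

If the number you decode doesn't match the cores you intended, the cfg line (or its spelling) is the first thing to check.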


Ray (Cheshire, England).
System: P3D v5.3HF2, Intel i9-13900K, MSI 4090 GAMING X TRIO 24G, Crucial T700 4Tb M.2 SSD, Asus ROG Maximus Z790 Hero, 32Gb Corsair Vengeance DDR5 6000Mhz RAM, Win 11 Pro 64-bit, BenQ PD3200U 32” UHD monitor, Fulcrum One yoke.
Cheadle Hulme Weather

5 hours ago, Alexx Pilot said:

I noticed the same thing; v5.2 definitely uses the GPU less but the CPU more versus past versions.

This is good news. Now we don't have to spend thousands on a GPU. Get that cheaper CPU and all its cores to do something for a change. And control the (now budget and cheaper) GPU with the scenery sliders and AA settings.
6 hours ago, neilhewitt said:

Now that I've finally done some flying in 5.2, I thought I'd report my slightly odd results...

My sim uses two PCs, both running 5.2. Once airborne, I could see that one machine had Core 0 pinned and the other cores were spiking up and down as textures were loaded - which is pretty usual for P3D in my experience, at least on this system. But the other machine - the 'client' machine - had all cores pinned at 100%. The difference between them? Almost nothing. Settings are identical. Different CPUs, though - 8th gen to 9th gen i7.

A quick check of temps on the client machine showed Core 0 spiking up to 95C, which is high for this rig - usually flat-out temps are 80-85C - though not high enough for thermal throttling, and the cooling is pretty strong so it was only peaking at 90+ but quickly going down to 75-80 again. The other cores were around 70-80C and occasionally lower.

Now you might think - if all cores were at 100%, why weren't they all that hot? Well, 100% is not necessarily 100%. That's a sampled average. So in this case the temps are telling me that Core 0 is busier than the other cores, despite the graphs showing 100% across the board. I'm starting to wonder if CPU usage is not being reported correctly. Perhaps different chipsets are read differently? Perhaps it's a Windows issue - though these two machines are at the same OS and patch level.

I'm not sure it's super-mysterious. As I read the system specs in your sig, one machine (which I assume to be the "server") is an 8-core, the other a 6-core with 33% more L3 cache per core. I think it's quite conceivable that the 8700K might have higher actual throughput per core due to better memory bandwidth (fewer idle cycles waiting for data) and the fact that a similar processing workload is being spread over fewer cores. If HT is enabled on the "client" machine, then even more workload is being taken on to manage the thread-swapping between the virtual processors on each physical core.

The temperature differences across the various cores could be explained by the type of workload on each core - core loading is only the ratio of busy cycles to total cycles, not actual work being done. Something like a simple data-move instruction is still "busy" but exercises far fewer heat-generating transistors than an AVX instruction working the SIMD registers in the processor core. So the variance in the types of work being done by the various threads could easily account for similar CPU load figures while varying considerably in work done, as evidenced by the core temperatures.
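The busy-cycles point can be put in code. A toy model (the per-instruction "work units" below are invented for illustration, not real instruction costs): two cores can both report 100% load while doing very different amounts of real work.

```python
# Toy model: "load" is busy_cycles / total_cycles, so it says nothing
# about how much work each busy cycle actually does. The relative op
# costs here are made up purely to illustrate the point.

def core_load(busy_cycles: int, total_cycles: int) -> float:
    """Utilisation as a ratio of busy cycles to total cycles."""
    return busy_cycles / total_cycles

WORK_PER_OP = {"mov": 1, "avx_fma": 16}   # invented relative costs

# Both cores are busy every cycle of a 1000-cycle window...
core0_ops = ["avx_fma"] * 1000   # heavy SIMD maths
core1_ops = ["mov"] * 1000       # simple data moves

load0 = core_load(len(core0_ops), 1000)   # 1.0 -> shows as 100%
load1 = core_load(len(core1_ops), 1000)   # 1.0 -> shows as 100%

# ...but the work done (and heat generated) differs enormously.
work0 = sum(WORK_PER_OP[op] for op in core0_ops)   # 16000 units
work1 = sum(WORK_PER_OP[op] for op in core1_ops)   # 1000 units
print(load0, load1, work0, work1)
```

Same 100% in Task Manager, sixteen times the "work" on one core: which is exactly why the temperatures can diverge while the load graphs look identical.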

Cheers


Bob Scott | President and CEO, AVSIM Inc
ATP Gulfstream II-III-IV-V

System1 (P3Dv5/v4): i9-13900KS @ 6.0GHz, water 2x360mm, ASUS Z790 Hero, 32GB GSkill 7800MHz CAS36, ASUS RTX4090
Samsung 55" JS8500 4K TV@30Hz,
3x 2TB WD SN850X 1x 4TB Crucial P3 M.2 NVME SSD, EVGA 1600T2 PSU, 1.2Gbps internet
Fiber link to Yamaha RX-V467 Home Theater Receiver, Polk/Klipsch 6" bookshelf speakers, Polk 12" subwoofer, 12.9" iPad Pro
PFC yoke/throttle quad/pedals with custom Hall sensor retrofit, Thermaltake View 71 case, Stream Deck XL button box

Sys2 (MSFS/XPlane): i9-10900K @ 5.1GHz, 32GB 3600/15, nVidia RTX4090FE, Alienware AW3821DW 38" 21:9 GSync, EVGA 1000P2
Thrustmaster TCA Boeing Yoke, TCA Airbus Sidestick, 2x TCA Airbus Throttle quads, PFC Cirrus Pedals, Coolermaster HAF932 case

Portable Sys3 (P3Dv4/FSX/DCS): i9-9900K @ 5.0 Ghz, Noctua NH-D15, 32GB 3200/16, EVGA RTX3090, Dell S2417DG 24" GSync
Corsair RM850x PSU, TM TCA Officer Pack, Saitek combat pedals, TM Warthog HOTAS, Coolermaster HAF XB case


I often wonder whether, in another 20 years, in 2041, we will still be farting about with all this nonsense, or whether I will be able to install a sim and just use it (like I can with a plethora of other software). I was so dumb to think MSFS 2020 was going to be the answer to this.

How, in 2021, do we not have a flightsim we can just use, instead of all this tweaking nonsense with affinity masks, LODs, fibers, texture_exps, etc.? We can put a little rover on Mars, but how we seemingly can't just double-click an icon, load up the sim and have it run smoothly is beyond me.


 
 
 
 

5 hours ago, w6kd said:

I'm not sure it's super-mysterious. As I read the system specs in your sig, one machine (which I assume to be the "server") is an 8-core, the other a 6-core with 33% more L3 cache per core. I think it's quite conceivable that the 8700K might have higher actual throughput per core due to better memory bandwidth (fewer idle cycles waiting for data) and the fact that a similar processing workload is being spread over fewer cores. If HT is enabled on the "client" machine, then even more workload is being taken on to manage the thread-swapping between the virtual processors on each physical core.

The temperature differences across the various cores could be explained by the type of workload on each core - core loading is only the ratio of busy cycles to total cycles, not actual work being done. Something like a simple data-move instruction is still "busy" but exercises far fewer heat-generating transistors than an AVX instruction working the SIMD registers in the processor core. So the variance in the types of work being done by the various threads could easily account for similar CPU load figures while varying considerably in work done, as evidenced by the core temperatures.

Cheers

All great points, @w6kd. CPU load is not a simple thing; as you point out, different operations involve different parts of the processor and generate different amounts of waste heat. As a software guy I tend to take the hardware for granted; OTOH, I've written CPU emulators (albeit not x86), so I know how processors work, at least in theory. I currently have HT disabled on the hexa-core, so there are two more physical cores available on the octa-core CPU to handle a similar load, which will also contribute to the loading patterns. Also, I have no AVX step-down on the overclock, so I would expect the CPU to run hot as a matter of course, and TBH I haven't sat and watched temps under load for a long time, so maybe spiking to 95C is normal for that machine with that config.

At the end of the day if you have good cooling then heavy load is nothing to fear (and like I've said, it can't break your machine even if cooling is inadequate - it'll just cause throttling). 

Having just done a test with AI (and lots of it) turned on, I see no difference in the CPU loading patterns on my machines vs AI off. I do see a noticeable difference in FPS - AIG BGL AI at 50% drops my FPS by 5 and AIGFP AI also at 50% seems to cost another 3-4 FPS. But I've yet to tweak it fully.

4 hours ago, fluffyflops said:

I often wonder whether, in another 20 years, in 2041, we will still be farting about with all this nonsense, or whether I will be able to install a sim and just use it (like I can with a plethora of other software). I was so dumb to think MSFS 2020 was going to be the answer to this.

How, in 2021, do we not have a flightsim we can just use, instead of all this tweaking nonsense with affinity masks, LODs, fibers, texture_exps, etc.? We can put a little rover on Mars, but how we seemingly can't just double-click an icon, load up the sim and have it run smoothly is beyond me.

Welcome to the wonderful world of software engineering. The thing is, if you stick with the defaults you generally get a good experience, but we simmers tend to want the best possible experience (best quality, best performance, best detail, lots of 'extras') and that always involves tweaking and changing things under the hood; just like you can own a modern car and have it work reliably most of the time, and get it serviced by experts when it doesn't, but if you want a hot rod then you either have to pay through the nose for someone to tweak it for you, or you have to learn to do it yourself.

Actually, that's probably not the best analogy, but you get my point. Of course, the fact that P3D is currently somewhat experimental out of the box (EA plus 'beta' features, though these are nothing new - remember 'DX10 preview' in FSX?) invalidates my point slightly. Let's say, on the faff scale of 0-10, that default P3D is maybe a 3, but P3D as many of us run it is more like a 9.



16 hours ago, Ray Proudfoot said:

@SteveW, I’m happy to try anything that will improve performance. Before that rainy day I’m hoping LM will have identified the problem and fixed it.

The reason I use SimStarterNG is because I have around five supporting programs to run and doing everything via SSNG is the easy option. It’s a solid program I have used for years.

Just to be sure you understood me: when I said to start the sim from its own desktop shortcut instead of SimStarter, that was because your AM wasn't working, and you had to be sure your sim could run its AM properly before you included other things in the mix. You didn't know whether it was your cfg statements or some contention between your cfg statements and your SimStarter settings. I know why you use SimStarter, so obviously it wasn't a suggestion to never use SimStarter again; it was advice to help you sensibly eliminate variables to find the cause of your error - basic problem-solving theory. The error turned out to be the simple spelling mistake I could see in your cfg statements, which had not been tested properly.


Steve Waite: Engineer at codelegend.com

5 hours ago, w6kd said:

If HT is enabled on the "client" machine, then even more workload is being taken on to manage the thread-swapping between the virtual processors on each physical core.

One of the main principles of HT is that it reduces the workload of thread-swapping: each logical processor maintains the context of its thread, reducing the cycles required to reload each context compared with the non-HT setting.




Hyperthreading is not called slow-go-threading. Perhaps an aspect of HT can be more easily understood with the following over-simplified explanation. Take two identical threads to be run on one non-HT core - a core with one Logical Processor. They are, as we know, time-sliced so that they appear to be running concurrently. The core must save the context of the stopping thread and reload the context of the starting thread before it can continue; the context of each is stowed and reloaded, which takes up extra cycles.

On an HT-enabled core with two Logical Processors, these two threads would be distributed, one on each LP. They run the same as two threads on one non-HT core, time-sliced to appear concurrent, but each LP maintains the context of its thread so it is already available when that LP continues the thread. The two threads are time-sliced just as on the non-HT core, except they don't stow and reload their context each time, so together they can complete sooner with fewer cycles consumed. Obviously, with many thousands of threads running concurrently, many contexts are stowed and reloaded anyway, but with threads distributed across two HT LPs the switching overhead is reduced overall compared to all threads running on one LP. It's not perfect, but it's more than we had with one LP per core.

Another simplified aspect of HT: with two LPs, a program with two threads appears to have two cores (LPs) available to it, and the job scheduler attempts to distribute those threads onto separate LPs. If that program has only one LP, those threads can very often end up waiting for each other. With HT enabled they continue concurrently, using fewer cycles with less apparent waiting.

So when using core affinity settings it is usually better to assign a minimum of two LPs to each exe, depending on what they do.
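To make the "two LPs per exe" advice concrete, here is a small sketch. The helper and program names are mine, and it assumes the common Intel enumeration where LPs 2n and 2n+1 share physical core n - that numbering can differ by system, so verify it on your own CPU before relying on it:

```python
# Sketch: build affinity masks that give each helper exe both logical
# processors of one hyperthreaded physical core, keeping core 0 (LP0/LP1)
# free for the sim's main thread. Assumes LP 2n and LP 2n+1 share
# physical core n, which is typical but not guaranteed.

def lps_of_core(core: int) -> tuple[int, int]:
    """The two logical processors of a physical core under HT."""
    return (2 * core, 2 * core + 1)

def mask_for_core(core: int) -> int:
    """Bitmask selecting both LPs of the given physical core."""
    lp_a, lp_b = lps_of_core(core)
    return (1 << lp_a) | (1 << lp_b)

# Give each helper its own physical core, starting after core 0.
# Program names are illustrative only.
helpers = ["ASP3D", "ChasePlane", "LittleNavConnect"]
for core, name in enumerate(helpers, start=1):
    mask = mask_for_core(core)
    # e.g. on Windows:  start /affinity C ASP3D.exe   (hex C = LP2+LP3)
    print(f"{name}: core {core}, mask {mask:#x}")
```

The printed hex values are what you would feed to something like the Windows `start /affinity` option, which takes a hexadecimal processor mask.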




Another aspect of the modern CPU is a technique whereby it looks ahead in the running threads to determine program branching: branch prediction. This is so that a pre-arranged order of memory transactions (that is, data and instructions) can be maintained to more often benefit the program in terms of execution speed. If the branch prediction algorithm is not very good, time is wasted dropping one set of memory transactions and arranging another, using more memory bandwidth. The HT-enabled CPU allows the branch prediction algorithm to be improved and the pre-ordering of memory transactions to be less wasteful.



16 hours ago, Ray Proudfoot said:

I put them all on core11. They just report the aircraft’s position to software running on a WideFS computer.

So, Ray, if you take something from my posts above, you might find those programs work better if you allocate at least two LPs each, making sure not to allocate LPs from the main core. Two LPs on the same HT core is fine, or distributed over the other least-used cores. It's hard to say whether that will help much for any of them, but I expect they are multi-threaded, and with their use of SimConnect they might at least benefit a little. Experimentation required.



1 hour ago, SteveW said:

Just to be sure you understood me: when I said to start the sim from its own desktop shortcut instead of SimStarter, that was because your AM wasn't working, and you had to be sure your sim could run its AM properly before you included other things in the mix. You didn't know whether it was your cfg statements or some contention between your cfg statements and your SimStarter settings. I know why you use SimStarter, so obviously it wasn't a suggestion to never use SimStarter again; it was advice to help you sensibly eliminate variables to find the cause of your error - basic problem-solving theory. The error turned out to be the simple spelling mistake I could see in your cfg statements, which had not been tested properly.

Steve, for testing this week I've been using the desktop shortcut all the time; it's only when I want a proper flight that I use SimStarter. I'm fully conversant with the AM setting. It was just a simple typo that caused it not to be executed.

I'm going to try those three flights again to see what difference there is compared to running with no AM yesterday.

And I'm also going to load the same scenery into P3Dv4.5 with the same scenery settings to compare results.

It's a bit disappointing that, having performed these tests yesterday and posted on the LM thread, there's been no response from them.



7 minutes ago, SteveW said:

So, Ray, if you take something from my posts above, you might find those programs work better if you allocate at least two LPs each, making sure not to allocate LPs from the main core.

These are very lightweight executables, Steve: ChasePlane, GIT, ASP3D/ASCA, LittleNavConnect and Aivlasoft's EFB.

I have no problem with their performance. Far bigger fish to fry with 5.2 performance.




I understand that, Ray; I'm not trying to divert your efforts away from the fish-frying. I only mention those ideas to help your attempts to improve performance at some stage along the way. With those exe's it's not necessarily their small demands that count: since they are SimConnect clients they can produce hold-ups depending on how they behave. So I would say to try, at some stage, a minimum of two LPs per exe in any case, to be sure.

In your posts on the LM forum, your screenshots show LP0 and LP1 both near maximum throughput; that condition would definitely be reducing the performance of the main task in P3D, and the overall performance of the sim. I hope that hasn't put them off responding to your concerns.



