SledDriver

P3D multicore usage anomaly


Noticed this today. Does anyone understand what is happening here?

If I fire up P3D4.4 as normal and go flying, when I check Taskman it shows all 12 cores on my PC being used nicely, but core0 is at 100% usage all the time. The rest are churning along between 20 and 50%.

Yet if I go into the CPU affinity settings and tell it not to use core0, then core0 as expected stops being hammered at 100%, and the 100% hammering does not transfer to another core.

However, if I then re-enable core0 for P3D within the same sim session, core0 now acts just like all the others, jogging along nicely at 20-50%. The 100% hammering is gone, yet the sim continues to work just fine with no visible difference in performance.
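For anyone who would rather script the disable/re-enable sequence than click through Task Manager, here is a minimal sketch. It assumes Linux, because Python's standard library only exposes `os.sched_getaffinity` and `os.sched_setaffinity` there; on Windows you would need a third-party library or the Win32 API instead. The function name `toggle_first_cpu` is made up for illustration, and the sketch says nothing about how P3D itself reacts.

```python
import os


def toggle_first_cpu(pid=0):
    """Drop the lowest-numbered allowed CPU from the affinity set, then restore it.

    pid=0 means the calling process. Returns the affinity set after the
    round trip, or None where the Linux-only calls are unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. Windows or macOS: no standard-library equivalent
    original = os.sched_getaffinity(pid)
    if len(original) < 2:
        return original  # only one CPU allowed, nothing to toggle
    reduced = original - {min(original)}
    os.sched_setaffinity(pid, reduced)   # "untick" the first core
    os.sched_setaffinity(pid, original)  # "tick" it again
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    print(toggle_first_cpu())
```

After the round trip the allowed set is the same as before; any change in per-core load comes from where the OS scheduler chooses to place the threads afterwards.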

Has this been discussed before? Anyone know what is going on here? Is this some legacy of bad programming from the FSX days? FSX always hammered one core at 100% regardless of what you did to affinity, at least in my experience on exactly the same machine.

It would be nice to get to the bottom of this and get it 'fixed', because obviously something is not quite right in this department.


I can't offer an answer, but I can confirm that your experience is correct, as I can reproduce it. Also, it is not always core0: if you mask core0 on start with an AM (affinity mask), the next core will be maxed out. Repeating the process, it will use all cores equally.

 

Jorge


This was discovered quite some time ago with varying degrees of effect.

I suspect that the way P3D was designed included core 0 at or about 100% for whatever reason. Just because that core is at 100% does not mean that something is wrong or it needs to be fixed.

But, you never know.

Cheers,

Mark


Ultimately, the assignment of threads to specific CPUs (physical and/or virtual) is done by the operating system, not by the application.  The application (P3D in this case) only has limited input into that process, e.g. which cores the OS should be allowed to use, but beyond that, this is OS behavior.

The OS will assign threads to cores (within the affinity mask constraints) based on the processor loading at the time, so when P3D first loads up on an idle PC, it assigns the first thread(s) associated with the processor-intensive main program to core 0, more than likely because not much else is running and it's the first available core.  When you change the CPU affinity after it's already running, those threads from the main program are likely being reassigned based on the core loads at that time, which is going to look quite a bit different than an initial load from idle.

That said, I don't think there's anything here to fix.  Looks like a normal multi-processor OS at work.

Regards


There has been some debate about how much control P3D has over which cores to use for its various threads, but the behavior that you noted is indeed typical. It looks to me that the first two available cores are specifically assigned by P3D, and the remaining cores get used as required for scenery loading tasks.

If you go and reassign the affinity, you change this behavior. Whether that is for the better or the worse, I do not know. ;)

Edited by Bert Pieke


I would like to disagree to a certain extent here about "the assignment of threads to specific CPUs (physical and/or virtual) is done by the operating system, not by the application."

Besides the OS (operating system), how an application is coded for multiple cores is also a factor. Some games and applications are better or worse optimized to take advantage of multiple cores; even LM (P3D) acknowledged that they improved multicore usage in our game. Unfortunately it is not good enough, and they still have issues (mostly related to one core being saturated at 100%, which causes stutters, etc.).

I've been a P3D user since day one and I'm still not ready to switch to XP11, even though XP11 has far better core usage across the software and fewer issues related to core optimization.

Even if you use affinity mask settings, you will unfortunately always have a core at 100% (in P3D), and that one will create some issues because of improper load balancing.

We can always hope for a fix, or switch to something else with fewer headaches.


This is a bit tangential, but I've run my old SB-E with HT on and an AM of 4084, versus HT off with essentially the same AM (111110) for the 6-core non-HT configuration, in the exact same flight. While the non-HT config runs about 4°C cooler, it's not nearly as smooth. With HT enabled I had a truly perfectly smooth flight: side-to-side panning, taxiing. With HT off I was able to overclock 0.12 GHz more, but ultimately the flight was considerably less smooth. For my next upgrade I am considering the i9-9900K just to have headroom for better multicore utilization. As near as I can tell, the future of development is exploiting multicore and GPU, so having at least 8 physical cores with HT potential is the way to go for today.
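In case the mask numbers look cryptic, here is a small sketch of how an affinity mask maps to logical processors. It assumes the usual bitmask convention (bit 0, the rightmost, is logical processor 0); the helper names `cores_from_mask` and `mask_from_cores` are made up for illustration.

```python
def cores_from_mask(mask):
    """Return the logical processor indices whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def mask_from_cores(cores):
    """Build the integer mask that allows exactly the given logical processors."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


if __name__ == "__main__":
    print(cores_from_mask(4084))              # [2, 4, 5, 6, 7, 8, 9, 10, 11]
    print(cores_from_mask(0b111110))          # [1, 2, 3, 4, 5]
    print(mask_from_cores([1, 2, 3, 4, 5]))   # 62
```

So 4084 (binary 111111110100) leaves logical processors 0, 1 and 3 out, and 111110 (decimal 62) leaves only core 0 out.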

1 hour ago, killthespam said:

Unfortunately it is not good enough, and they still have issues (mostly related to one core being saturated at 100%, which causes stutters, etc.).

Even if you use affinity mask settings, you will unfortunately always have a core at 100% (in P3D), and that one will create some issues because of improper load balancing.

You always have a core at 100%?

Are you using P3Dv4.4?


Hmm. With 12 cores available, having one at 100% and the rest merely jogging along has to be bad programming/OS assignment/whatever.

It certainly cannot be a good thing in a real-time critical app like a simulator.

I'm always amazed in the forum world when people try to defend and explain away plain, simple bad things like bad programming from a multimillion-dollar company. This 100% core0 thing has existed since MS programmed FSX many years ago. The fact that the sim runs very nicely with all cores just jogging along after a simple disable/enable of core zero, and continues to do so for the rest of that sim session, shows that something is wrong, probably in the startup assignment of threads, and it needs fixing.

Or do you recommend that we keep the 100% core0 and all continue to buy more and more expensive CPUs to cope with the core0 demand, while we have plenty more cores not even breaking a sweat?

Especially when it appears that a quick line of code after sim start to disable/enable core0 might well cover this issue with 5 minutes of coding.


I'm pretty sure core0 is getting hammered for very good reasons. Programming across multiple cores isn't a straightforward thing; you don't just add more cores and have the code handle them in a nice and orderly way. Core0 is no doubt doing the heavy lifting, making calls and trying to orchestrate everything you see going on in game by getting other cores to do some of the work, but ultimately it is that one base core that's running the show.

It would be ace if you could just add more cores and things simply got faster and better, but sadly this is most definitely not the case.

I'd also check how the sim is running after you mess with the affinity settings whilst it's running. I'm pretty sure it's detrimental to the overall running of the sim.


Sure. I'm still testing this. I still think it's legacy coding from FSX. I have plenty of other detailed graphical games which manage photoreal scenery without any one core taking a beating.


With hardware VSync and a 30 Hz (4K) monitor, my core 0 in P3Dv4 typically sits around 70% on the ground with complex acft/scenery and AI, and often in the mid 30% range inflight, certainly not 100%.

@SledDriver--do you have 12 cores, or are you running a 6-core CPU with hyperthreading?  There is a significant difference between a physical core and a virtual CPU on an HT-enabled processor.

@killthespam--a good multi-threaded application enables the OS to spread its workload out by splitting the workload across threads.  Unfortunately, with real-time processes with lots of thread dependencies (like P3D), that's a lot harder than it sounds. 

Regards

6 minutes ago, SledDriver said:

Sure. I'm still testing this. I still think it's legacy coding from FSX. I have plenty of other detailed graphical games which manage photoreal scenery without any one core taking a beating.

That's why I mentioned that LM (p3D) needs to find a way of optimizing the core usage properly, not as it is now.

2 minutes ago, w6kd said:

 

@killthespam--a good multi-threaded application enables the OS to spread its workload out by splitting the workload across threads.  Unfortunately, with real-time processes with lots of thread dependencies (like P3D), that's a lot harder than it sounds. 

Regards


Bob,

That's why I mentioned that IMHO it's not optimized properly when, in their coding, one core gets hammered at 100% and the others barely reach 40 or 50%. And when one core gets saturated, the CPU as a whole no longer works properly (overheating, stutters, etc.), based on what I saw on my PCs with an 8700K, 8086K and 9900K and a GTX 1080 Ti or RTX 2080 Ti. On the same PCs running XP11 or other games, I did not notice this issue.


 

4 hours ago, killthespam said:

I would like to disagree to a certain extent here about "the assignment of threads to specific CPUs (physical and/or virtual) is done by the operating system, not by the application."

Besides the OS (operating system), how an application is coded for multiple cores is also a factor.

There is no disagreement. The application, as it's coded, can divide the work between separate threads. Which CPUs those threads are executed on is up to the OS.

Cheers!

 

21 minutes ago, killthespam said:

That's why I mentioned that LM (p3D) needs to find a way of optimizing the core usage properly, not as it is now.

You make it sound like it's actually possible. Actually there are very few cases in algorithm design where you can scale infinitely based on core count, and they're usually related to graphics processing or stream processing where there are no dependencies between the different data elements.

Imagine it's Thanksgiving, and you're cooking an elaborate five course meal. It will take you a long time. Add a person to help you, and things go much faster because you can divide the work. Add a third person, then a fourth, and you'll discover that your marginal productivity is decreasing because you're getting in each other's way, or you're waiting for a shared resource (cutting board, oven, large pot, microwave, etc). It's why stuff doesn't scale linearly upwards with core counts.

It's Amdahl's Law - which was identified a half century ago. https://en.wikipedia.org/wiki/Amdahl's_law
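To put rough numbers on the kitchen analogy, here is a minimal sketch of Amdahl's formula. The 50% parallel fraction is an arbitrary assumption for illustration, not something measured for P3D, and the function name `amdahl_speedup` is made up.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speedup when only part of the work can be spread over cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed: half of the work can run in parallel, half must run on one core.
    for cores in (1, 2, 4, 8, 12):
        print(cores, round(amdahl_speedup(0.5, cores), 2))
```

With half the work stuck on one thread, the speedup can never reach 2x however many cores you add, which is why the serial part ends up dominating.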

Cheers!

2 hours ago, SledDriver said:

Hmm. With 12 cores available, having one at 100% and the rest merely jogging along has to be bad programming/OS assignment/whatever.

With P3D 4.4, you do not have to (should not) run with one core at 100%.   In fact, it is not advisable.  The application will indeed spread loads much better than in the past, it just needs to be configured to do so.

As Bob mentions above -- 

"...With hardware VSync and a 30 Hz (4K) monitor, my core 0 in P3Dv4 typically sits around 70% on the ground with complex acft/scenery and AI, and often in the mid 30% range inflight, certainly not 100%...."

If you have modern, top-end hardware, and you aren't seeing the Core0 cpu usage that Bob (and I) see, then I'd suggest revisiting P3D configuration threads.

1 hour ago, Mace said:

If you have modern, top-end hardware, and you aren't seeing the Core0 cpu usage that Bob (and I) see, then I'd suggest revisiting P3D configuration threads.

Got any pointers to useful config threads? More than happy to read up. I have years of experience tweaking FSX, and about 1 week on P3D, so am all ears for learning right now.


Gents,

What I don't understand is the following. With no HT and no affinity mask, the first core is core 0, and most of the time it is at a very high percentage, sometimes 100% for long periods, depending on a/c complexity, scenery, and other sim settings. The other cores sit lower, at around 30% to 50% across the board. When a core is maxed out and the demand goes higher, we get high temps and stutters.

Now if I again use no HT but set an affinity mask to remove core 0 and make only the other cores available to P3D, the first core for the sim is now, let's say, core 1 (it could also be 2, 3, 4, etc.). That core will now be at a very high percentage, close to or at 100%, and the others at around 30% to 50%. At this point I tend to believe that no matter what you do, the way P3D is programmed you will always have that problem core: whichever core comes first will be at 100%, or at a much higher value than the others, and when demand rises it will create stutters or high temps. By the way, I saw this in Task Manager every time I used a different affinity mask.

I did not see this issue with XP11. Not being a programmer, I of course don't have the knowledge to fully understand this. All I can see are the facts: one core loaded at 100% for long periods, high CPU temps, and stutters in P3D, versus better usage of computer resources in the other sim.

At this point, when I can compare facts and numbers, I still think that something is not properly optimized and there is much room for improvement.

Edited by killthespam


I don't know the technical background behind the load on cores, but I advise that you just leave things the way they are and don't turn off and on core0. It is true that when you load a scenario and you are at an airport not yet flying, you will have core0 used to 100% and the other cores relatively low. However, when you actually fly, you will see that the other cores (especially 3 and 4 if you have a quad-core CPU) will be hammered to almost 100% when scenery is being loaded en route - especially if you have complex scenery.

If you equalize the CPU load by turning off and on core0 in the CPU affinity setting, you may initially get the impression that an equalized load is better. However, when you overfly complex scenery, you won't have as much CPU resources available as without doing the said thing. I tested it myself a while ago and found delayed autogen loading and blurry textures on approach when doing the procedure. You can test it as well if you like.

4 hours ago, Luke said:

You make it sound like it's actually possible.

It's Amdahl's Law - which was identified a half century ago. https://en.wikipedia.org/wiki/Amdahl's_law

Cheers!

In your analogy there is no question there is an optimal configuration of workers, tasks, and dependencies.   Are you suggesting LM has already fully optimized this configuration in V4.4 and if so, why is that your conclusion?

Majestic, our beloved Q400 developer, uses some methodology to offload I think the flight modeling off of P3D/FSX's main thread.  That particular quite complex aircraft is as easy on processing demand as the default planes, and this makes me think there might be headroom to creatively access w/ more cores.  From your citation it appears to come down to this:  

"the theoretical speedup is limited to at most 20 times. For this reason, parallel computing with many processors is useful only for highly parallelizable programs."

Ok then the question becomes what more can be 'highly parallelized'.  I get the sense terrain loading must be, and that is already distributing over multiple cores.  Can all of the AI aircraft be offloaded off the main thread?  How about ATC?  How about the flight model, as w/ the incredibly performing Majestic Q400?
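For what it's worth, the 20x figure in the quoted passage is what Amdahl's formula gives for a program that is 95% parallelizable, which is the example the linked article uses. It is not a number measured for P3D. With $p$ the parallelizable fraction and $N$ the core count:

$$S(N) = \frac{1}{(1-p) + \frac{p}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1-p} = \frac{1}{1-0.95} = 20$$

So the ceiling is set entirely by the part that cannot be parallelized: every task moved off the main thread raises $p$ and with it the limit.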

 

 

Edited by Noel

48 minutes ago, Noel said:

In your analogy there is no question there is an optimal configuration of workers, tasks, and dependencies.   Are you suggesting LM has already fully optimized this configuration in V4.4 and if so, why is that your conclusion?

It's not, but my suggestion is that the optimal configuration may take more effort and risk than it's worth, and we may be closer to it than we think. We need to realize that we're never going to get 8 cores pegged at 100% - there are probably enhancements to be gained but they are not going to be trivial in effort. We also have to remember what L-M's "other" customers consider acceptable may be very different.

 

51 minutes ago, Noel said:

Majestic, our beloved Q400 developer, uses some methodology to offload I think the flight modeling off of P3D/FSX's main thread.  That particular quite complex aircraft is as easy on processing demand as the default planes, and this makes me think there might be headroom to creatively access w/ more cores.  From your citation it appears to come down to this:  

I would ask you the question - "are you suggesting that the flight model consumes a significant amount of CPU/GPU time?" I seriously question this, mostly because calculations on small data sets are something CPUs do very, very well. Additionally, when I am in a PMDG aircraft I switch from the VC to spot view and my frame rate will easily go up 50%, even though the flight modeling remains active.

Cheers!


I don't know, Luke, but I do know the Q400 is very sophisticated and yet is easy on frames, and yes, that may be a red herring.

Terrain texture loading seems to be a process that can occur in parallel to a higher degree and hence gets distributed over my 8 available logical processors.  What others are there?   I know when I have lots of commercial airport traffic performance gets impacted.  Why not have ALL ground and air traffic off of the main thread for example?  

9 hours ago, Noel said:

Terrain texture loading seems to be a process that can occur in parallel to a higher degree and hence gets distributed over my 8 available logical processors.  What others are there?   I know when I have lots of commercial airport traffic performance gets impacted.  Why not have ALL ground and air traffic off of the main thread for example?  

Texture loading is I/O, which is trivial to multi-thread since those threads aren't doing much except waiting for said I/O to finish. That's why there was such a huge win with FSX SP1. (If you want to feel nostalgic, set your AFFINITYMASK to 1 and get that good old fashioned single-core experience).
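A rough sketch of why I/O-bound work threads so easily. The `time.sleep` call stands in for a disk read, and `load_texture`, `load_all` and the tile names are made up for illustration; this is not how P3D actually loads textures.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_texture(tile_id, io_delay=0.1):
    """Stand-in for a disk read: the thread just waits, using almost no CPU."""
    time.sleep(io_delay)
    return f"tile-{tile_id}"


def load_all(tile_ids, workers):
    """Load every tile with the given number of worker threads.

    Returns the loaded tiles (in request order) and the elapsed seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tiles = list(pool.map(load_texture, tile_ids))
    return tiles, time.perf_counter() - start


if __name__ == "__main__":
    for workers in (1, 8):
        tiles, seconds = load_all(range(8), workers)
        print(f"{workers} worker(s): {len(tiles)} tiles in {seconds:.2f}s")
```

With eight workers the waits overlap, so the total time is close to one read instead of eight, and none of the threads needs to coordinate with the others.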

Your example is more difficult, because the traffic isn't independent of you, or each other. There's a single global world state, and the traffic is part of it. My (uninformed) guess is that the rendering isn't something that they can easily break out into different threads, or that the rendering engine itself is something that makes so many assumptions about being single-threaded that it would take a complete (risky, expensive, but probably needed) rewrite.
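As a contrast to independent I/O waits, here is a hypothetical sketch of traffic threads that all write to one shared world state. The `WorldState` class and `simulate` function are invented for illustration and are not P3D's actual design. (In CPython the interpreter lock already serializes threads; the explicit lock is there to show the pattern that applies in any language.)

```python
import threading


class WorldState:
    """Hypothetical single global state that every traffic object reads and writes."""

    def __init__(self):
        self.lock = threading.Lock()
        self.positions = {}
        self.updates = 0

    def move(self, aircraft_id, step):
        # Only one thread may touch the shared state at a time, so the
        # threads queue up here instead of running side by side.
        with self.lock:
            self.positions[aircraft_id] = self.positions.get(aircraft_id, 0) + step
            self.updates += 1


def simulate(aircraft_count, ticks):
    """Run one thread per aircraft, each moving its aircraft once per tick."""
    world = WorldState()

    def fly(aircraft_id):
        for _ in range(ticks):
            world.move(aircraft_id, 1)

    threads = [threading.Thread(target=fly, args=(i,)) for i in range(aircraft_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return world


if __name__ == "__main__":
    world = simulate(aircraft_count=4, ticks=1000)
    print(world.updates, world.positions)
```

The result is correct, but every update passes through the same lock, so adding more traffic threads adds waiting rather than throughput.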

Cheers!
