12 hours ago, MDFlier said:

Good advice. My only concern with per core is, as I stated, that not a lot of thought was given to thread safety (preventing threads from stepping on each other's resources) or thread scheduling when FSX was compiled. ... So... Modern SW written and compiled with multi threading in mind should handle unsynced cores just fine, because all of the threads are aware of each other and the SW has built in provisions for playing "traffic cop" to make sure that everything gets done in the proper order. When we run FSX on a multi core CPU, all of the thread distribution and timing is being handled by the hardware and OS. FSX isn't really managing all of those threads. I'm just not 100% confident that given a lack of thread awareness in FSX that the HW and OS will always get things done in the correct way. It's a bit of a hack.

This is completely backwards. Modern software does NOT attempt to handle thread awareness or scheduling. It spawns multiple threads and then lets the OS kernel prioritize their execution. The kernel is the only entity that knows what is going on across the entire system; it's designed to ensure that threads aren't starved for execution time, stay on the same core whenever possible, etc. Different core speeds aren't a problem either - even if you are running at the same clock speed you may have vastly different execution times between threads as they spend time blocked waiting for memory access, I/O, etc. If you've been moved to a different core and your data isn't in that core's cache, you can be waiting hundreds of cycles for it to arrive from main memory.
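To make the "spawn threads and let the kernel do the rest" point concrete, here is a minimal sketch in Python. The names `worker` and `run_workers` are made up for illustration and have nothing to do with FSX's code. Note that the code never asks which core it runs on and never sets affinity or ordering; in CPython the interpreter lock also limits true parallelism, so treat this as a picture of the structure rather than a benchmark.

```python
import threading


def worker(name, results):
    # Each thread just does its work; it never asks which core it is on.
    results[name] = sum(i * i for i in range(10_000))


def run_workers(count=4):
    # Hypothetical helper: start `count` threads and wait for all of them.
    results = {}
    threads = [
        threading.Thread(target=worker, args=(f"worker-{n}", results))
        for n in range(count)
    ]
    for t in threads:
        t.start()  # from here on, the OS scheduler decides when and where it runs
    for t in threads:
        t.join()   # the only coordination: wait until every thread has finished
    return results
```

Each thread writes to its own key, so no lock is needed in this particular sketch.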

The number of people in the world who can write a proper thread scheduling engine in an OS kernel can probably fit into my living room, which is why good software just spawns threads, gives them a priority and lets the OS kernel do the rest. Same thing for memory management - proper virtual memory managers are really hard to write, so applications should just memory-map ALL of their data and let the OS do the rest. Fibers in the Win32 world (which is what FSX uses) are a bad idea and I expect that nothing modern is using them today. They were questionable even a dozen years ago:

https://blogs.msdn.microsoft.com/larryosterman/2005/01/05/why-does-win32-even-have-fibers/

I've been writing multi-threaded software for almost 20 years. My first rule was "this is hard, and I am stupid" and I follow it even today. I suspect that's why I've been somewhat good at it, and trying to get into thread-awareness or scheduling violates that prime directive. I leave all of that stuff to people who are much, much smarter than me.

BTW, FSX (and FS9) are multi-threaded and thread-safe. If they weren't, you'd very rapidly run into data corruption issues as threads stomp on each other. FSX and P3D's problems have nothing to do with thread scheduling; they just place too much work on a single thread rather than spreading the load.

Cheers!

 


Luke Kolin

I make simFDR, the most advanced flight data recorder for FSX, Prepar3D and X-Plane.

32 minutes ago, dmiannay said:

The good news, though... just finished a 7 hour flight in the PMDG747 from KSDF to PANC with no issues.  CPU package temp peak was 77*C, which is acceptable.

Great to hear!  Have fun!

Greg

10 hours ago, lownslo said:

Ahh, thanks Martin.  I recalled that the 5xxx series were the first to use non-soldered TIM... not so though.

It's Haswell E, so soldered. But some crazy individuals have delidded them. Very risky though, and not much of a temp improvement, as solder is already top notch for thermal conductivity.

9 hours ago, Luke said:

BTW, FSX (and FS9) are multi-threaded and thread-safe. If they weren't, you'd very rapidly run into data corruption issues as threads stomp on each other. FSX and P3D's problems have nothing to do with thread scheduling; they just place too much work on a single thread rather than spreading the load.

That's very interesting, Luke.  Based on your experience, would it be a massive rewrite for LM to implement better thread distribution in P3D?



Doug Miannay

PC: i9-13900K (OC 6.1) | ASUS Maximus Z790 Hero | ASUS Strix RTX4080 (OC) | ASUS ROG Strix LC II 360 AIO | 32GB G.Skill DDR5 TridentZ RGB 6400Hz | Samsung 990 Pro 1TB M.2 (OS/Apps) | Samsung 990 Pro 2TB M.2 (Sim) | Samsung 990 Pro 2TB M.2 (Games) | Fractal Design Define R7 Blackout Case | Win11 Pro x64

4 hours ago, dmiannay said:

That's very interesting, Luke.  Based on your experience, would it be a massive rewrite for LM to implement better thread distribution in P3D?

Probably, but I've never seen the code. I suspect the low-hanging fruit got captured in FSX SP1 and LM has been nibbling away at it since then (IIRC gauges are run on a separate thread in v4).

The best analogy is to think of a big Thanksgiving dinner for your family and relatives. It's a lot of food and different courses, and for you to do it by yourself in your kitchen will take a long time. You enlist your wife to help you, and things go a lot faster. But as you start adding more and more people to help you, there are certain things you simply cannot speed up.

For example, the turkey takes a set amount of time in the oven. You can't go from 4 hours to 2 hours by getting a second oven. If it's at a different temperature than what the pie needs, you're stuck being unable to bake the pie until the bird is done. When there are several of you in the kitchen, you start getting in each other's way, or the ingredients you need may not have been prepared yet by another person.

That's the challenge with multi-threaded code - it's a process problem more than anything else: dividing up the work in a logical fashion where things can be done independently and conflicts or stalls minimized. That's rarely easy.
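The turkey-in-the-oven limit is usually formalized as Amdahl's law (my label, not something named earlier in this thread): if a fraction of the work has to run serially, extra workers only speed up the remainder. A small sketch follows; the function name and the 50% figure are illustrative assumptions, not measurements of FSX or P3D.

```python
def amdahl_speedup(serial_fraction, workers):
    """Upper bound on speedup when `serial_fraction` of the work cannot be split up."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative only: if half the frame time were strictly serial,
# four workers would give at most a 1.6x speedup, and no number of
# workers could ever reach 2x.
print(amdahl_speedup(0.5, 4))
```

The takeaway matches the kitchen analogy: past a certain point, the serial part (the turkey) dominates no matter how many helpers you add.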

In a previous job I built a system for automated processing of weather imagery - it was pretty easy because we got a set of images that we split into tiles, resized, shrunk the color palette and then converted to PNG. It was an embarrassingly easy problem because there were almost no data dependencies so the work could be done independently without waiting on anything else. Just like our Thanksgiving example - you and I can each cook Thanksgiving dinner and gain double the productivity because we're each using our own family and kitchen to make it.
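For a sense of why that kind of job is so easy to parallelize, here is a rough sketch, not the actual system. The "image" is just a flat list of 0-255 values, and `split_into_tiles`, `process_tile` and `process_image` are hypothetical names; the palette step is a stand-in for the real resize and PNG conversion.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(pixels, tile_size):
    # Cut the flat pixel list into independent fixed-size tiles.
    return [pixels[i:i + tile_size] for i in range(0, len(pixels), tile_size)]


def process_tile(tile):
    # Stand-in for the real per-tile work: shrink the palette by
    # snapping each 0-255 value down to one of 16 levels.
    return [(value // 16) * 16 for value in tile]


def process_image(pixels, tile_size=4, workers=4):
    tiles = split_into_tiles(pixels, tile_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # No tile depends on any other, so no locks are needed;
        # map() returns results in the same order as the input tiles.
        return list(pool.map(process_tile, tiles))
```

There is no shared state between tiles, which is exactly what makes the problem scale with the number of workers.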

I don't think P3D is that way - it probably could get better but it likely involves some significant rewrite.

Cheers!

 


Luke Kolin

I make simFDR, the most advanced flight data recorder for FSX, Prepar3D and X-Plane.


Interesting, indeed. 

I've been writing multi threaded applications for a bit longer than you have. I wrote Unix applications (AT&T System V Releases 3 and 4, and SCO Unix based on AT&T 4.2) from 1989-94, and I've got 25-30 Windows releases under my belt. Our software is a distributed transaction processing system that coordinates over 2 million transactions per day coming in from over 150,000 people, working on 43,000 distinct devices in over 13,000 individual locations. We do inventory, accounting, ordering, payroll, shipping and receiving, and general retail transactions. We have paid very close attention to threading in every single project I have ever worked on. Not all OSes and hardware behave the same. The only guarantee for us as developers to ensure that the product we build will work reliably on all of the various platforms is to be meticulous in making sure that worker and background threads do what they are supposed to do, when they are supposed to do it, without conflicting with other threads' resources, and completing in a time frame that prevents race conditions and other potential blocking or locking situations with other threads.

The article that you posted the link to was from 2005, and is quite a bit dated. Here's one 3 months old that gives some basic considerations for the current generation of MS development tools. https://docs.microsoft.com/en-us/dotnet/standard/threading/managed-threading-best-practices

And here's a brief comment from Phil Taylor of the ACES studio discussing the threading capabilities of FSX (https://blogs.msdn.microsoft.com/ptaylor/2006/11/30/fsxtoday-and-tomorrow/)

4) CPU architecture and moving forward.

Aces made its architectural decisions about FSX 2-3 years ago.

It wasn’t clear to me, and I am sure it wasn’t clear to the rest of Aces and many of our readers in 2003 and 2004, that multicore was the future. Since those sorts of design decisions are baked in early, as it became clear in late 2005 and 2006 that the CPU landscape had changed it was just too late to make the major architectural changes required to make our internal architecture more parallel.

We use fibers and threads, but still have serialization issues to work out. Which is why our second core (and beyond) usage is low, on the order of 20%. And the changes required are not trivial changes, like simply shifting thread affinity. The order of operations required for correct rendering and sim behavior and the linkage between subsystems is what it is, and it means that none of our options include simple fixes.

Once you are on the glide path it is a very risky decision to change the architecture underneath the product. For better or worse, we decided to not do that and ship the product.

-----

My system running FSX puts a heavy load on 4 cores. By Phil Taylor's own admission, FSX is clearly not responsible for doing that. It's Win 10 and Intel working together to make a "best effort". 


 i9-10850K, ASUS TUF GAMING Z490-PLUS (WI-FI), 32GB G.SKILL DDR4-3603 / PC4-28800, EVGA GeForce RTX 2080 Ti BLACK EDITION 11GB running 3440x1440 


Thanks for the analogy, Luke.  A complex problem, indeed.



Doug Miannay

PC: i9-13900K (OC 6.1) | ASUS Maximus Z790 Hero | ASUS Strix RTX4080 (OC) | ASUS ROG Strix LC II 360 AIO | 32GB G.Skill DDR5 TridentZ RGB 6400Hz | Samsung 990 Pro 1TB M.2 (OS/Apps) | Samsung 990 Pro 2TB M.2 (Sim) | Samsung 990 Pro 2TB M.2 (Games) | Fractal Design Define R7 Blackout Case | Win11 Pro x64

1 hour ago, MDFlier said:

Our software is a distributed transaction processing system that coordinates over 2 million transactions per day coming in from over 150,000 people, working on 43,000 distinct devices in over 13,000 individual locations.

Not bad - we topped out around 480m transactions per day but we're definitely in the same ballpark and faced some of the same issues. (And yours was I expect more write-heavy than mine.) The biggest difference that you may have faced that I didn't (through deliberate design) or that MS faces is the notion of cross-platform compatibility. In the FS world, we're all running on x86(-64) on Intel. (thankfully)

1 hour ago, MDFlier said:

The only guarantee for us as developers to ensure that the product we build will work reliably on all of the various platforms is to be meticulous in making sure that worker and background threads do what they are supposed to do, when they are supposed to do it, without conflicting with other threads' resources, and completing in a time frame that prevents race conditions and other potential blocking or locking situations with other threads.

I am in violent agreement with you on this statement.

But it's important to note that this statement explicitly doesn't cover the notion of scheduling your own threads and processes. I agree with you that your threads should be limited in scope and efficient in their execution, and that locks or other exclusive-access sections should be as short as possible to avoid blocking other threads. I agree, because we don't know what other threads are running or where they are - threads are simple and dumb because our apps are simple and dumb and let the kernel handle the scheduling.
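As a sketch of what "keep the exclusive section short" can look like in practice: do the expensive work on private data first, and hold the lock only for the one shared update. `RunningTotal` and `run_batches` are hypothetical names for illustration, not anything from either of our systems.

```python
import threading


class RunningTotal:
    """Shared accumulator whose lock is held only for the final update."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total = 0

    def add_batch(self, values):
        # Expensive part: runs on thread-private data, no lock held.
        subtotal = sum(v * v for v in values)
        # Exclusive part: a single addition, so other threads barely wait.
        with self._lock:
            self.total += subtotal


def run_batches(batches):
    shared = RunningTotal()
    threads = [
        threading.Thread(target=shared.add_batch, args=(batch,))
        for batch in batches
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.total
```

Nothing here tries to decide which thread runs when; the lock only protects the shared value, and the kernel still does all the scheduling.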

1 hour ago, MDFlier said:

The article that you posted the link to was from 2005, and is quite a bit dated. Here's one 3 months old that gives some basic considerations for the current generation of MS development tools. https://docs.microsoft.com/en-us/dotnet/standard/threading/managed-threading-best-practices

And here's a brief comment from Phil Taylor of the ACES studio discussing the threading capabilities of FSX (https://blogs.msdn.microsoft.com/ptaylor/2006/11/30/fsxtoday-and-tomorrow/)

I agree with you that the article is old, and that's the point. Even over a decade ago, there were questions regarding whether Fibers were appropriate. They were a "solution" to the problem that context switches between real threads were too expensive, and even then people were suspecting that it would be a minor or non-existent problem on faster chips. As you point out, modern standards don't use fibers at all - they use 'real' threads, which are up to the OS kernel to schedule and run.

1 hour ago, MDFlier said:

My system running FSX puts a heavy load on 4 cores. By Phil Taylor's own admission, FSX is clearly not responsible for doing that. It's Win 10 and Intel working together to make a "best effort". 

Honest question - what does Intel have to do with it? The CPU is just running a set of instructions on a given core, no more, no less. The chips may do a little bit of speculative execution, but beyond that they don't do what you don't tell them to do.

Here's the fundamental business problem I have - let's assume that you're right, and FSX had a vastly better code execution engine than the Windows kernel. If that was the case, why on Earth would Microsoft not immediately steal that person and have him (or her!) rewrite the NT kernel? Why would that developer want to work for a significantly lower salary? Companies aren't dumb. Talent ends up in the right place (even if at a competitor).

Again, I've never actually met anyone who can write a world-class thread scheduler or VM manager. I've seen or heard of a lot of people who thought they could, and that's where they got into trouble.

Cheers!

Luke

PS: On a side note, I wonder when P3D drops its own memory management for textures and models and merely memory maps everything into VAS and lets the OS handle it.
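For anyone unfamiliar with what "memory map everything" means, here is a minimal sketch using Python's standard `mmap` module. `read_slice_via_mmap` and `demo` are hypothetical helpers, and the temp file stands in for a texture or model file; I have no idea how P3D actually stores or loads its data.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    # Map the whole file read-only; the OS pages data in on demand
    # and is free to evict it again under memory pressure.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]


def demo():
    # Create a small stand-in data file, then read part of it via the mapping.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
    finally:
        os.close(fd)
    try:
        return read_slice_via_mmap(path, 3, 4)
    finally:
        os.remove(path)
```

The application never decides what stays resident; it just touches addresses and the virtual memory manager does the caching.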


Luke Kolin

I make simFDR, the most advanced flight data recorder for FSX, Prepar3D and X-Plane.

1 hour ago, Luke said:

Here's the fundamental business problem I have - let's assume that you're right, and FSX had a vastly better code execution engine than the Windows kernel. If that was the case, why on Earth would Microsoft not immediately steal that person and have him (or her!) rewrite the NT kernel? Why would that developer want to work for a significantly lower salary? Companies aren't dumb. Talent ends up in the right place (even if at a competitor).

Luke, I am having difficulty understanding this portion of your response. 

I never said that FSX was a "code execution engine". FSX is the code being executed. The NT kernel is fine. Why would anyone want, or need, to do anything with it?

Application design specifies if and how threads will be used within the application.  

An example would be if I write an application in Visual C++, VB, or C# that does not explicitly use background or worker threads. When I run my application on an i7-8700K with 12 threads, it is entirely possible that some of the instructions from my single threaded application will end up executing on more than one core (or a hyperthread) even though I made no actual attempt at all to make it do so. Some of the x86 instructions utilize more than one "execution pipeline" by design. If my application writes to or reads from a file, the kernel handles that request, which may result in a driver launching the IO requests as a new thread or threads. Graphics calls made by my application might get serviced by a driver that uses multiple threads.

I might choose to write that same app and intentionally design it to utilize background or worker threads. If I chose to, I could make my IO calls their own thread while the main program continues on. I could split up large array manipulation into 4 threads each doing 1/4 of the work, or 12 threads doing 1/12 of the work. But in doing so, I need to provide a safeguard (via mutex, lock, etc.) so that the next section of code that utilizes the manipulated array doesn't start working with the array until all 12 of the manipulation threads have finished. If only 11 are done, and I start using the array, I suspect that the results would be "unexpected".
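A minimal sketch of that array example, assuming Python and made-up names (`scale_chunk`, `scale_in_parallel`). Here the safeguard is simply joining every worker thread before the array is used again; a mutex or barrier would be alternatives.

```python
import threading


def scale_chunk(data, start, stop, factor):
    # Each worker touches only its own slice, so workers never overlap.
    for i in range(start, stop):
        data[i] *= factor


def scale_in_parallel(data, factor, workers=4):
    chunk = (len(data) + workers - 1) // workers  # ceiling division
    threads = []
    for w in range(workers):
        start = w * chunk
        stop = min(start + chunk, len(data))
        if start >= stop:
            break  # more workers than elements; nothing left to hand out
        t = threading.Thread(target=scale_chunk, args=(data, start, stop, factor))
        threads.append(t)
        t.start()
    # The safeguard: do not touch `data` again until ALL workers are done.
    for t in threads:
        t.join()
    return data
```

Skipping the join loop is the "only 11 of 12 are done" case: the caller could read elements that have not been scaled yet.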

The Windows kernel schedules time slices for applications. It will not split up your application into multiple threads. It provides a context (set of registers and memory space) for each application to run in, and it manages the switching of each context to run on the processor cores in succession.  

Application design dictates whether or not an application will utilize multiple threads. All applications have one thread. Many have more than one thread. When the compiled program is run, any API calls (either to the OS, to a driver, or to another application) might result in multiple threads if the called API, service, or driver is written to do so. All programs, APIs, and drivers are compiled into the x86 instruction set. Many of the actual x86 instructions in the x86 instruction set use multiple "execution pipelines".

As Phil Taylor said, they didn't consider that they would need to incorporate proper safeguards into FSX in order to prevent threads from getting finished out of order when they first started writing FSX. In his statement, he said that they intentionally designed FSX so that only 20% of the workload was offloaded to the 2nd (and above) core(s) precisely because they were worried about things not getting done in the proper order. These are not my thoughts, they are directly from Phil Taylor, who was one of the lead developers of FSX. 

When we run FSX on a 6 core CPU, we clearly see more than 20% utilization on the cores beyond #1. So... What causes that since we know that FSX is not intentionally doing it? I'll bet the Nvidia driver API routines are multi threaded. I'd bet that Windows IO calls and task scheduling routines are multi-threaded. Throw in the additional 20% usage of every core above the primary by FSX, and it looks like FSX all by itself is utilizing 90% of 4 or more cores. In reality, it is the entire FSX context that is using that much CPU, not just the FSX application running in that context.    


 i9-10850K, ASUS TUF GAMING Z490-PLUS (WI-FI), 32GB G.SKILL DDR4-3603 / PC4-28800, EVGA GeForce RTX 2080 Ti BLACK EDITION 11GB running 3440x1440 

36 minutes ago, MDFlier said:

Luke, I am having difficulty understanding this portion of your response. I never said that FSX was a "code execution engine". FSX is the code being executed. The NT kernel is fine. Why would anyone want, or need, to do anything with it?

I think the statement of yours I have most difficulty with and am addressing is "I'm just not 100% confident that given a lack of thread awareness in FSX that the HW and OS will always get things done in the correct way". The hardware makes no difference, and to be honest the OS should make no difference either unless the scheduler has changed. Looking back on your statement, in retrospect you're agreeing with me more than I initially gave you credit for - you're saying that FSX isn't managing the threads. I just am unsure why you think it's a bad thing or that the operating system will do so poorly. It will be correct.

39 minutes ago, MDFlier said:

As Phil Taylor said, they didn't consider that they would need to incorporate proper safeguards into FSX in order to prevent threads from getting finished out of order when they first started writing FSX. In his statement, he said that they intentionally designed FSX so that only 20% of the workload was offloaded to the 2nd (and above) core(s) precisely because they were worried about things not getting done in the proper order. These are not my thoughts, they are directly from Phil Taylor, who was one of the lead developers of FSX.

Agreed. The core of FSX (and FS9 and others before it) was guaranteed to run on a single thread, so they didn't make it thread-safe. Why would one? That's probably why performance hasn't scaled well and part of the challenge.

Keep in mind that IIRC Phil Taylor was not a developer, he was the Program Manager for FSX. That's a significant role in the Microsoft model, making decisions and tradeoffs on the product which by necessity requires some technical understanding, but it's very different than being a core developer. I would be very surprised if he wrote even a single line in the actual code base. That doesn't mean his statements are invalid; it just means that there's plenty of other reasons why you may see more core utilization.

1) They estimated incorrectly. TBH if I am in the same order of magnitude before further optimization with my estimates I consider it a win.

2) They (gasp) under-promised and over-delivered.

3) They made assumptions based on the cores of the time, which were often P4s with HyperThreading or dual-core machines. Perhaps they were able to scale better than they expected.

4) They made assumptions based on the data of the time. Perhaps our scenery, landclasses and terrain mesh are denser than 2006 and their work scales better with demand.

5) I'm not sure about the API calls or driver work - wouldn't that be reflected under kernel, not user, time? I don't see substantial kernel usage on the other threads. The API calls won't be multi-threaded under the covers unless they say so, and that's almost certainly not going to change behind the scenes. Microsoft (for good reason) is exceptionally averse to changing the behaviors of core Windows APIs.

6) What are your add-ons? I'd venture that things like FSUIPC, AS and other add-ons that wholly or partially execute within FSX are adding to this number.
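On point 5, the user/kernel split is something you can measure rather than guess at. Here is a small sketch using Python's `os.times()`, which reports CPU time for the current process only; `cpu_time_split` is a hypothetical helper, and for FSX itself you would read the same two numbers from a tool such as Task Manager or Process Explorer rather than from a script like this.

```python
import os


def cpu_time_split(work):
    # Measure how much CPU time `work` spent in user code versus in the kernel.
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system


# A pure computation should show up almost entirely as user time.
user, system = cpu_time_split(lambda: sum(i * i for i in range(200_000)))
print(f"user={user:.3f}s kernel={system:.3f}s")
```

If driver and API work were behind the extra load, you would expect the kernel figure to grow, not the user one.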

I'm just saying that I wouldn't take a statement from Phil of "on the order of 20%" and draw too much extrapolation if you're getting 40 or even 60% more CPU usage.

Cheers!

Luke


Luke Kolin

I make simFDR, the most advanced flight data recorder for FSX, Prepar3D and X-Plane.
