P3Dv5 CPU utilizes all cores - an observation

Hi all,

I recently upgraded from the 7700K to the 10900K and would like to share this observation. For years we have been complaining that FSX/P3D do not utilize all CPU threads. The image below shows that P3Dv5 is using all cores, a very positive sign of how much P3D has evolved since its first iteration.

Sjylugh.png

Edited by ComSimPilot

Simulators: Prepar3D v5.4  | X-Plane 12 | DCS  World  MSFS 2024 | 
PC Hardware: Dell U3417W AMD Ryzen 7 9800 X3D | ASUS TUF 5070 Ti ASUS TUF B580 Plus Wifi | G.Skill Z5 Neo 64GB 3000Mhz CL30 | Samsung 990 Pro 2TB + 970 EVO Plus 1TB + 860 EVO 2TB + 850 EVO 1TB, Western Digital Black Caviar Black 6TB Corsair RM1000i Corsair 280 Titan RX | VRM Fan | Fractal Design Define S2 Gunmetal |
Flight Controls: Fulcrum One Yoke Virpil VPC WarBRD Base Virpil VPC MongoosT-50CM Grip, Thrustmaster Warthog+F/A-18C Grip VIER IM POTT Sidestick CPT Side | Thrustmaster TPR Rudder Pedals | Virtual Fly TQ6+Throttle Quadrant | Sismo B737 Max Gear Lever Monsterteck Desk Mounts WINWING EfisL+FCU+MCDU |
My fleet catalog: Link                                                                                                                                                       

It did in P3D v4 also. Just sayin'.
(Ryzen 1700X and now Ryzen 3700X user). 😊
 

Edited by F737NG

AMD Ryzen 5800X3D; MSI RTX 3080 Ti ; 32GB Corsair 3200 MHz; ASUS VG35VQ 35" (3440 x 1440)
Fulcrum One yoke; Thrustmaster TCA Captain Pack Airbus edition; MFG Crosswind rudder pedals; miniCockpit FCU; CPFlight MCP 737; Logitech FIP x3; TrackIR

MSFS; Fenix A320; A2A PA-24; HPG H145; PMDG 737-600; AIG; RealTraffic; PSXTraffic; FSiPanel; REX AccuSeason Adv; FSDT GSX Pro; FS2Crew RAAS Pro; FS-ATC Chatter

  • Commercial Member

FSX is the same. With HT enabled you can see that both Logical Processors (LPs) of core zero are nearly maxed; therefore the main rendering thread is sharing the bandwidth of core zero with a second thread. With FSX and P3D we want to see only the first LP utilised. We use an Affinity Mask (AM) to do that; there's lots of info around the site regarding this problem.

Steve Waite: Engineer at codelegend.com

  • Author
2 minutes ago, SteveW said:

FSX is the same. With HT enabled you can see that both Logical Processors (LP) of core zero are nearly maxed therefore the main rendering thread is sharing bandwidth of core zero with the second thread. With FSX and P3D we need to see only the first LP utilised. We use an Affinity Mask to do that, there's lots of info around the site regarding this problem.

Steve, thanks for the reply. Trying to understand your comment: do you mean that the first LP is not fully utilized? As I see it, my processor is more than 90% utilized, which is a good sign, isn't it? Would you suggest a specific AM for the 10900K? Currently I run with no AM set in the .cfg file.

  • Commercial Member

No. I am saying the two LPs of HT core 0, the two top leftmost graphs, are both nearly fully utilised. That means they are each getting only about 50% of the possible throughput. So what we do is use an AM with an "01" for each HT core, so that these tasks gain up to 100% of core bandwidth because the core is not shared.

We could turn HT off and by that we would enable only one LP per core, so that we are avoiding the sharing of the core.

Rather than disable HT we can use the AM to enable only one LP per core.

With HT disabled we get ten cores: 1111111111

and with HT enabled we get 20 LPs: 01,01,01,01,01,01,01,01,01,01

So we could use an AM = 349525 instead of disabling HT and we are using ten cores.

With the ten-core CPU we get 20 LPs with HT enabled, two per core. It is still only ten cores. Even so, we might only want to use 8 cores and leave two cores (4 LPs) for the system, giving an example AM of 00,00,01,01,01,01,01,01,01,01. See that each pair, separated by commas, belongs to one core.

So we are using 8 cores (8 LPs) for P3D and leaving two cores (4 LPs) for the system. Remember that the "01" on the far right represents the two top graphs on the left. We can copy and paste that list of 01's into the binary field of Windows Calculator set to Programmer mode and we get the decimal value 21845. So we can use an AM of 21845 for the ten-core CPU and get good performance.
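For anyone who prefers a script to Windows Calculator, the same binary-to-decimal conversion can be sketched in Python. The function name `affinity_mask_to_decimal` is made up for this illustration; it only strips the comma or dot delimiters and reads what is left as a binary number.

```python
def affinity_mask_to_decimal(mask: str) -> int:
    """Convert a delimited binary affinity mask such as '01,01' to decimal."""
    # Remove the delimiters that are only there to make the core pairs readable.
    bits = mask.replace(",", "").replace(".", "").replace(" ", "")
    if not bits or any(b not in "01" for b in bits):
        raise ValueError(f"not a binary mask: {mask!r}")
    return int(bits, 2)


# One LP on each of the ten HT cores.
print(affinity_mask_to_decimal("01,01,01,01,01,01,01,01,01,01"))  # 349525
# Eight cores for the sim, two cores (4 LPs) left for the system.
print(affinity_mask_to_decimal("00,00,01,01,01,01,01,01,01,01"))  # 21845
# HT disabled: ten cores, one LP each.
print(affinity_mask_to_decimal("1111111111"))  # 1023
```

The rightmost digit is LP0, so the string reads the same way as the masks written out above.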

Edited by SteveW

Steve Waite: Engineer at codelegend.com

  • Commercial Member

But you are quite right to point out that P3D will use every Logical processor it finds.

With Hyperthreading (HT) disabled P3D will find all ten cores, since there is only one LP per core, and will make a task on each LP. However, when HT is enabled there are 2 LPs per core and we see P3D filling all 20 LPs with 20 tasks. That's still the ten cores we started with, each core now shared between two tasks. With a movie converter we probably want all 20 LPs. But with P3D (and FSX) we have one task that renders the screen; the remaining tasks pull in the data.

We want to be sure that the main rendering task gets 100% of the core. So by applying the AM in HT mode we can enable only the first LP of core zero (the first of the two top leftmost graphs, which is the rightmost "01" in the binary AM) without switching off HT.

We also see an overhead on the CPU for each task set up, so we don't necessarily want too many tasks: after a certain point, no matter how many tasks, we can't pull in data any faster because the system is saturated. Again, with HT disabled we halve the task count, or leaving HT enabled we restrict the task count with the AM.

Edited by SteveW

Steve Waite: Engineer at codelegend.com

Well, this whole discussion has me scratching my head.  The pic posted by the OP shows no activity on LPs 1, 12, and 15, which is odd with no AM unless the OP is using per-core hyperthreading on the 10900K to disable HT on physical cores 0, 6, and 7.

And Steve...I'm having a hard time connecting the poster's original thesis with your posts here...you talk about a "problem", but the OP is saying the opposite--that the software is finally using all the LPs (though it isn't in this case).

One question comes to mind as I try to make sense of this...if you do use per-core HT, I wonder what the affinity mask looks like (e.g. if HT is disabled on cores 0 and 1, and enabled on the other 8, does the affinity mask reflect an 18 LP processor?).

 

Bob Scott | President and CEO, AVSIM Inc
ATP Gulfstream II-III-IV-V

Sys1 (MSFS20+24/XPlane12+11): AMD 9800X3D, water 2x240mm, MSI MPG X670E Carbon, 64GB GSkill 6000/30, nVidia RTX4090FE
Alienware AW3821DW 38" 21:9 GSync, 2x4TB Crucial T705 PCIe5 + 2x2TB Samsung 990 SSD, EVGA 1000P2 PSU, 12.9" iPad Pro
Thrustmaster TCA Boeing Yoke, TCA Airbus Sidestick, Twin TCA Airbus Throttle quads, PFC Cirrus Pedals, Coolermaster HAF932 case

Sys2 (P3Dv5/v4): i9-13900KS, water 2x360mm, ASUS Z790 Hero, 32GB GSkill 7800MHz CAS36, ASUS RTX4090
Samsung 55" JS8500 4K TV@60Hz,
3x 2TB WD SN850X 1x 4TB Crucial P3 M.2 NVME SSD, EVGA 1600T2 PSU
Fiber link to Yamaha RX-V467 Home Theater Receiver, Polk/Klipsch 6" bookshelf speakers, Polk 12" subwoofer, 12.9" iPad Pro
PFC yoke/throttle quad/pedals with custom Hall sensor retrofit, Thermaltake View 71 case, Stream Deck XL button box

Sys3 (DCS/P3Dv4/ATS/ETS): AMD 7800X3D, MSI MPG X870E Carbon, Noctua NH-D15S, 64GB GSkill 6000/30, EVGA RTX3090
Alienware AW3420DW 34" 21:9 GSync, Corsair HX1000i PSU, 4TB Crucial T705 PCIe5 + 2TB Samsung 970Evo Plus,
TM TCA Officer Pack, Saitek combat pedals, TM Warthog, TM RS300 FF wheel/pedals, Coolermaster HAF XB case

  • Commercial Member

In the image the top leftmost two graphs are nearly maxed, that's two LPs of one core shared, maybe look again.

Steve Waite: Engineer at codelegend.com

  • Commercial Member

...double checked and sure enough LP0 is almost maxed (top left), and LP1 (just right of top left) is fully maxed. Both those LPs are getting around 50% of the core throughput.

Disabling HT would allow only one task per core. Using the AM method of "01" for each core also allows only one task per core.

The overall CPU throughput is also showing around 94% because most of the 20 LPs are maxed.
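Going the other way, a decimal AM can be decoded to check that no core has both of its LPs enabled. This is only a sketch: the helper names `enabled_lps` and `shared_cores` are invented here, and it assumes HT is enabled on every core so that LPs 2n and 2n+1 belong to core n.

```python
def enabled_lps(mask: int) -> list:
    """Return the LP numbers whose bit is set in the mask (bit 0 is LP0)."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def shared_cores(mask: int) -> list:
    """Return the HT cores on which both LPs are enabled.

    Assumes HT is on for every core, so LPs 2n and 2n+1 sit on core n.
    """
    lps = set(enabled_lps(mask))
    # lp ^ 1 is the sibling LP on the same core (0<->1, 2<->3, ...).
    return sorted({lp // 2 for lp in lps if (lp ^ 1) in lps})


print(enabled_lps(21845))     # [0, 2, 4, 6, 8, 10, 12, 14]
print(shared_cores(21845))    # [] : one task per core
print(shared_cores(0xFFFFF))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] : all 20 LPs in use
```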

Steve Waite: Engineer at codelegend.com

  • Commercial Member
10 minutes ago, w6kd said:

if you do use per-core HT, I wonder what the affinity mask looks like (e.g. if HT is disabled on cores 0 and 1, and enabled on the other 8, does the affinity mask reflect an 18 LP processor?

First of all, always use LP0 (or call it core zero if you like, with HT disabled). Generally HT is on or off across the whole CPU. However, with HT disabled on cores 0 and 1 and enabled on the remaining eight, the CPU looks like an 18-LP CPU, and you would use an AM of 01,01,01,01,01,01,01,01,1,1, the two rightmost ones representing cores 0 and 1. Always use the comma- or dot-delimited nomenclature when using or mixing in HT.
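As a rough sketch of that mixed layout, the mask can be built from a per-core list of HT on/off flags. The helper `build_mask` is hypothetical, and the code simply follows the numbering described in this post (core 0 rightmost, one digit for a non-HT core, "01" for an HT core); it has not been checked against a real per-core HT setup.

```python
from typing import List, Tuple


def build_mask(ht_enabled_per_core: List[bool]) -> Tuple[str, int]:
    """Build a one-LP-per-core affinity mask.

    ht_enabled_per_core[0] describes core 0, which ends up rightmost.
    An HT core contributes "01" (first LP only); a non-HT core contributes "1".
    """
    groups = ["01" if ht else "1" for ht in ht_enabled_per_core]
    text = ",".join(reversed(groups))
    return text, int(text.replace(",", ""), 2)


# HT disabled on cores 0 and 1, enabled on the remaining eight cores.
text, value = build_mask([False, False] + [True] * 8)
print(text)   # 01,01,01,01,01,01,01,01,1,1
print(value)  # 87383
```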

Steve Waite: Engineer at codelegend.com

  • Commercial Member

...in technical discussions it's also more professional and less disconcerting to avoid terms such as "scratching of head", "having a hard time" and so on, as they are not necessary and put off those trying to learn something.

Edited by SteveW

Steve Waite: Engineer at codelegend.com

  • Commercial Member

Going back to the example of converting a video stream, the program can render a frame on each LP. With HT enabled the program can render two frames at once on one core, although those two frames are each rendered at about half speed because the core is shared. Even so, HT mode saves time swapping the context (that is, saving and loading the registers of each thread for its share of the core time, or time-slice), so the overall time is shorter.

Similarly we can use that gain in performance with some of the tasks in P3D (and FSX) by allowing two tasks per core, one on each LP. We can see that P3D (and FSX) do that with every task. We avoid sharing the first task with another task on the first core (core zero) by disabling HT or using "01".

However, with many-core CPUs (such as the ten-core) we easily reach the maximum draw on scenery from the system without using all ten cores. With fewer cores we might gain a small amount of performance pulling in the scenery data with two tasks per core.

A test can be made by measuring how quickly the system gets to the first render of the simulator screen. We can use a stopwatch and compare as we add cores or LPs to the AM. We see that the time taken reduces as we add cores. At some point the decrease becomes very small: we have reached the maximum pull on the data. However, we still continue to see small gains, as each LP added enables the data to be assembled more quickly. Since these tasks are not time-sensitive like the main rendering task, enabling pairs of HT LPs continues to reduce the time taken to get to the start of the simulation. However, as I mentioned, too many tasks become a burden on the system overall and contribute to poorer performance in the main rendering task.
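To run that stopwatch test methodically, it may help to list the AM values to try as cores are added one at a time. A small sketch, assuming HT is enabled and one LP ("01") is used per core; `one_lp_per_core_mask` is a name invented for this example, and the load times themselves still have to be measured by hand.

```python
def one_lp_per_core_mask(cores: int) -> int:
    """AM enabling the first LP of each of the first `cores` HT cores."""
    return int("01" * cores, 2)


# Print the AM to try at each step, from one core up to all ten.
for cores in range(1, 11):
    mask = one_lp_per_core_mask(cores)
    print(f"{cores:2d} cores  AM = {mask:6d}  binary = {mask:b}")
```

The eight-core and ten-core rows give 21845 and 349525, matching the values worked out earlier in the thread.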

Edited by SteveW

Steve Waite: Engineer at codelegend.com

16 minutes ago, SteveW said:

...double checked and sure enough LP0 is almost maxed (top left), and LP1 (Just right of top left) is fully maxed. Both those LPs are getting around 50% of the core throughput.

Disabling HT would allow only one task per core. Using the AM method of "01" for each core allows only one Task per core.

The overall CPU throughput is also showing around 94% because most of the 20 LPs are all maxed.

OK, the 94% overall CPU load supports that.  The graphs show the same heavy line at 0 and 100%, and I find it rare that a core would be firewalled at 100% without even a momentary dip.

7 minutes ago, SteveW said:

...in technical discussions it's also more professional and less disconcerting to avoid the terms "scratching of head", "having a hard time" and so on as are not necessary and put of those trying to learn something. 

I consider the terms quite appropriate to impart a sense of confusion about what was being said.  Maybe it's a US vs British language thing.  If anyone is truly so weak-kneed that "I'm scratching my head" (in America, a colloquialism for "this has me confused") makes them run away and hide, well, so be it. 

Bob Scott | President and CEO, AVSIM Inc
ATP Gulfstream II-III-IV-V

  • Commercial Member
3 minutes ago, w6kd said:

I consider the terms quite appropriate to impart a sense of confusion about what was being said.  Maybe it's a US vs British language thing.  If anyone is truly so weak-kneed that "I'm scratching my head" (in America, a colloquialism for "this has me confused") makes them run away and hide, well, so be it. 

That is often referred to as supercilious - it only serves to make your post appear to be saying that another is wrong. But they were not.

 

Edited by SteveW

Steve Waite: Engineer at codelegend.com

  • Commercial Member

Let's look at how a single core handles two identical threads:

SwitchingImproves.jpg

On the right the core has HT disabled (no Hyperthreading): each thread (one orange and one purple) gets a time slice, and between slices the CPU has to save the thread's state so it can be loaded again later.

On the left, shown in a simplified way, is the same core with HT enabled and the same two threads arranged so that each occupies an LP to itself. The two threads are still time sliced, since there is only one core, but they finish sooner with HT enabled because each thread's context stays in place on its own LP instead of being saved and reloaded.

Steve Waite: Engineer at codelegend.com

Archived

This topic is now archived and is closed to further replies.
