Cruachan

P3D v3.1 Performance Testing (RESULTS with 8-Core CPU) (Updated)


Hi,

 

A few of you will be aware that I started another thread recently introducing this somewhat contentious topic as it applies to an 8-Cored CPU: http://www.avsim.com/topic/483992-8-core-cpu-performance-data-fps-delta-prepar3d-v31/

 

Some very helpful advice and guidance has been posted in that thread and it soon became clear that I needed to take this a stage further in an effort to cast further light on my experience.

 

Very few of you have invested in an 8-core CPU like the i7-5960X. The majority are using 4-core CPUs and a smaller number have the 6-core variety. So, quite understandably, attention tends to be focused on these units, as FSX, and now Prepar3D, have been shown to work optimally using 4 unmasked cores with Hyperthreading enabled in the system BIOS. Yet my subjective observations suggest that Prepar3D can spread active threads over more cores and still deliver a smooth visual experience, provided Hyperthreading has been disabled. All the theory states that this should not be the case, and it flies in the face of the hard data retrieved to date.

 

At this stage I feel it is best not to labour the point, so I will crack on and present my method and results. The data was collected using Ideal Flight 10 Professional (IF10 Pro) and processed in Microsoft Excel. IF10 Pro allows the user to set the time interval over which data is collected, and this was set to 120 seconds. IF10 Pro is capable of consistent weather generation, and this feature was employed to ensure repeatability across the two scenarios, which were created specifically for testing.
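
The post does not say how "FPS Delta" was calculated in Excel. As a rough sketch only, assuming it means the spread between the highest and lowest frame rate in one 120 second collection window, and assuming the samples are available as a plain list of FPS readings (neither is confirmed by the post or by IF10 Pro), a per-window summary could look like this:

```python
from statistics import mean, pstdev


def summarise_fps(samples):
    """Summarise one collection window of FPS readings.

    'delta' is assumed here to mean max minus min; the original post
    does not define the term.
    """
    if not samples:
        raise ValueError("no samples collected")
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
        "delta": max(samples) - min(samples),
        "stdev": pstdev(samples),
    }


# Hypothetical readings for illustration, not data from this thread.
window = [42.0, 55.5, 38.0, 61.0, 47.5]
print(summarise_fps(window))
```

The standard deviation is included because a single max-minus-min figure can be dominated by one outlying frame, whereas the spread of all samples gives a better feel for smoothness.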

 

Each TestFlight used the Prepar3D default Beech Baron 58 and started paused at an altitude of 2000ft. The first flight begins over Seattle (KSEA) and the second a few miles NW of Eglin (KVPS). Autopilot was engaged in each saved TestFlight scenario. In the first, the B58 banks slowly to heading 180 degrees from the initial heading and, in the second, the B58 banks to heading 90 degrees from the initial heading.

 

PILOT'S FS Global 2010 FTX Compatible, ORBX FTX Global, Global Vector and FTX Trees HD were all active for both scenarios.

 

FTX Global Vector (Active Features):

Transportation: Highways (all), Primary Roads (all)

Water Features: Small Rivers & Streams

Various Features: Beaches, Golf Courses, Parks, Power Lines

Airport Elevation Corrections: Any conflicts automatically disabled

 

Air Traffic Manager was used to restrict air traffic to a target count of 30 for each TestFlight.

 

Prepar3D.exe (32) Process Lasso Rules:

1. Excluded from ProBalance Restraint

2. Classified as a Game

3. Application Power Profile: High Performance

 

Each 8 minute procedure led to the 2 minute data-collection TestFlight. As a demonstration of the procedure followed, I recorded a 7.5 minute video showing an example for a TestFlight NW of Eglin (KVPS). The flight ends following an overflight of nearby Eglin Test Site B6 (FL34). The video will display at up to and including 1080p60, the higher the better! Be patient, sometimes it may appear as if nothing is happening. Grab a hot (or cold) beverage and watch :smile:

 

http://youtu.be/9rFRzxqzUHo

 

Unless otherwise stated, GPU SLI remained enabled for each test.

 

KSEA is harder on frame rates than KVPS. I thought it would be useful for comparison purposes to include both test areas. Stuttering, when it occurred, always seemed more likely under demanding conditions such as those imposed by the scenery and the plethora of objects at KSEA and Seattle.

 

RESULTS:

 

NOTE: Unless otherwise stated, Antialiasing - Transparency Supersampling: 4x Sparse Grid Supersampling (SGSSAA) remained enabled for each test. This was matched with MSAA = 4 samples in Prepar3D.

 

EFSA = ExpandedFSAffinity. This entry appears in the Config.ini file of IF10 Pro under [FSSetup]. The default setting is 'True'. Perhaps Steve Waite will be kind enough to explain its significance. I was asked to test with an Affinity Mask:1360 and Hyperthreading: Enabled using EFSA=True and EFSA=False.
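
For anyone unsure what a decimal Affinity Mask such as 1360 actually selects, here is a minimal sketch that decodes it. It assumes the usual convention that bit 0 (the least significant bit) corresponds to logical processor 0; the function name is made up for illustration.

```python
def unmasked_lps(mask):
    """Return the logical processor indices whose bit is set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


mask = 1360
print(format(mask, "b"))   # binary form of the mask
print(unmasked_lps(mask))  # logical processors the sim may use
```

For 1360 the binary form is 10101010000, so logical processors 4, 6, 8 and 10 are unmasked. With Hyperthreading enabled, and assuming logical processors are numbered in pairs per physical core, that is one logical processor on each of four different cores.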

 

Prepar3D v3.1 Settings:

 

P3DSettings-1.jpg

P3DSettings-2.jpg

P3DSettings-3.jpg

 

Prepar3D World Maps showing TestFlight start points:

 

Maps_TestFlightStartPoints.jpg

 

FPS Delta results from original thread (see first paragraph) taken at (KSEA) - included as a reminder of the original testing results, this time in expanded line graph form:

 

PerfDataOLD_LineGraphs%201.jpg.jpg

 

Compilation of latest results with Hyperthreading enabled/disabled and using various Affinity Masks:

 

PerfData_LineGraphs%202_1.jpg

PerfData_LineGraphs%202_2.jpg

PerfData_LineGraphs%202_3.jpg

PerfData_LineGraphs%202_4.jpg

PerfData_LineGraphs%202_5.jpg

PerfData_LineGraphs%202_6.jpg

PerfData_LineGraphs%202_7.jpg

PerfData_LineGraphs%202_8.jpg

PerfData_LineGraphs%202_9.jpg

PerfData_LineGraphs%202_10.jpg

 

The NVIDIA Inspector Prepar3D profile was returned to default NVIDIA settings (No SGSSAA):

 

PerfData_LineGraphs%20NoSGSSAA_3_1.jpg

PerfData_LineGraphs%20NoSGSSAA_3_2.jpg

PerfData_LineGraphs%20NoSGSSAA_3_3.jpg

PerfData_LineGraphs%20NoSGSSAA_3_4.jpg

PerfData_LineGraphs%20NoSGSSAA_3_5.jpg

PerfData_LineGraphs%20NoSGSSAA_3_6.jpg

PerfData_LineGraphs%20NoSGSSAA_3_7.jpg

 

Maxwell sample interleaving (MFAA): Enabled in the NVIDIA Inspector Prepar3D profile. SGSSAA: 2x selected in the NVIDIA Inspector Prepar3D profile. MSAA: 2 Samples selected in Prepar3D:

 

PerfData_Line%20Graphs%20MFAA-On%204_1.j

 

Hopefully, the included text added to each of the above images will be self-explanatory.

 

My System Specs are as below in the signature area.

Monitor Resolution: 2560x1440x32. GSync: Disabled

 

As stated at the beginning of this post my impression is that the smoothest, most fluid performance is achieved with 8 Cores active and Hyperthreading: Disabled. 7 Cores with masking of Core 0 may be marginally better. 4 Cores in any configuration produced, at times, intolerable stuttering rather than micro-stuttering and this was always worse at KSEA.

 

The fluidity approaches and, indeed, frequently matches that seen in Rob Ainscough's excellent series of videos. However, his solution is different as he prefers to use DSR (Edit: no longer the case - see Rob's post below) with a 4K monitor refreshing at 30Hz and anchoring the Prepar3D frame rate also to 30fps. I have not used VSync or Triple Buffering. I see no tearing and very little micro-stuttering using my two GTX 980s configured in SLI and the higher frame rates achieved doubtless help to mask/suppress the appearance of any periodic hitching that can occur.

 

As the graphs show, the situation gets even better now that I can use Multi-Frame Sampled Anti-Aliasing (MFAA) along with a reduction of SGSSAA and MSAA. :smile:

 

Despite all I have said, I think I can say that my mind still remains open in the same way as any enthusiastic AVSIM member is keen to improve performance of his favourite sim without significant loss of detail. I would be reluctant to change any of my current settings without a convincing explanation as to why it would be a good idea.

 

Finally, I believe I've reached the point when this topic should be opened for discussion, and I can instead return, at long last, to engage in some real world activities...at least for a time..LOL!

 

Cheers!

Mike


 

 


his solution is different as he prefers to use DSR with a 4K monitor refreshing at 30Hz and anchoring the Prepar3D frame rate also to 30fps.

 

Excellent post Mike, but I'm NOT using DSR?  For my normal flights I'm 8X MSAA no NI, default NCP, 4K res.

 

 

 


As stated at the beginning of this post my impression is that the smoothest, most fluid performance is achieved with 8 Cores active and Hyperthreading: Disabled.

 

Stick with what works for you ... my results are the same as yours for my 5960X, but I'm always open to suggestions and testing out other settings when I have time.  GHz and RAM latency/bandwidth rule the day for my setup ... and that is best achieved with HT OFF.

 

Going from 4.3 GHz HT ON to 4.6 GHz HT OFF (I can't get very stable at 4.6+ with HT ON) gets me that extra FPS to maintain 30 (30Hz monitor) and still keeps my memory bandwidth at 60 GB/s.  I've also run at 4.7+ GHz but stability required 1.5v - it started to turn my Sim Pit into a sauna ... diminishing returns in terms of room heat.  4.6 GHz seemed to be that sweet spot where fan speed isn't too noisy and room temps stay comfortable.

 

Cheers, Rob.


Hi Rob,

 

Thanks, much appreciated. I must say that this exercise has given me renewed respect for all the time and effort you must put into this. There comes a point when such activity crosses the boundaries of being a hobby and enters the realms of becoming an unpaid vocation. Where on earth do you find the time?!

 

Regards,

Mike


Nice results. However wasn't the consensus in the other thread that MFAA is still not supported by P3D? It could be that the performance improvement coming from enabling MFAA is just essentially coming from reducing MSAA and SGSSAA.


However wasn't the consensus in the other thread that MFAA is still not supported by P3D? It could be that the performance improvement coming from enabling MFAA is just essentially coming from reducing MSAA and SGSSAA.

 

Hi, (sorry, I don't know who you are)

 

Yes, I believe you are correct. I've spent the last couple of days, as time permitted, testing with every combination of MSAA, SGSSAA and MFAA settings, with and without SLI enabled, and I have been unable to convince myself that MFAA is having any effect whatsoever. So I guess I should forget about MFAA for the time being. The only visible improvements have been with MSAA along with SGSSAA.

 

On my screen the best visual results generally are seen with MSAA (4 samples) in Prepar3D coupled with 4x SGSSAA in NVIDIA Inspector. The main body of graphs in my first post was derived from testing with these settings in place. There is, of course, a penalty in terms of a slight drop in performance, but I judge that as being acceptable.

 

Ah well, it just underlines, once again, that the maintenance of an initial level of healthy scepticism in the face of new knowledge is usually the wisest policy.

 

Regards,

Mike



 

Well at least we finally have some proof that P3D can take advantage of more cores with the right AM. MFAA is the least of our concerns  here.

 

Good work on the tests. Steve was saying the other day that nobody has provided substantial results on P3D working better on more than 4 cores and I believe it is finally here.

 

Regards,

Shanan


 

 


Well at least we finally have some proof that P3D can take advantage of more cores with the right AM.

 

Yeah, don't block P3d from using core 0 with the AM, if HT=off.

 

Else, the only reason to use AM with HT off, is to prevent P3d and specific add-ons from "fighting" over specific physical cores. I have 12 physical cores and I only use AM to keep addons like ASN, the GTN 750, etc. on the top 3 cores. P3d runs smoothly on the other 9 with HT=off.
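
To illustrate how such a split could be expressed as masks, here is a small sketch for a 12-core CPU with HT off. It assumes "top 3 cores" means core indices 9, 10 and 11, and the sim mask is shown only for comparison, since the post says the mask is applied to the add-ons; the helper name is hypothetical.

```python
def mask_for(cores):
    """Build an affinity mask with one bit set per listed core index."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


TOTAL_CORES = 12                     # HT off: one logical processor per core
addon_cores = range(9, TOTAL_CORES)  # assumed meaning of "top 3 cores"
sim_cores = range(0, 9)              # the remaining nine cores

print(mask_for(addon_cores))  # 3584
print(mask_for(sim_cores))    # 511
```

The two masks share no bits, which is the whole point: the add-ons and the sim never compete for the same physical core.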


To round things off, I thought some of you might like to see a couple of videos of the TestFlight recorded at KSEA. Each lasts approximately 2 minutes, so no time for anyone to get bored :smile:

The first, like the video of the TestFlight procedure taken at KVPS (link in original post), is from an external viewpoint. The second is the exact same flight viewed from the virtual cockpit. This latter view was not used in the testing procedure, but I have included it for comparison, and it does serve to demonstrate why it is usually better to fly out of the more complex airports from within the cockpit rather than viewing the take-off externally. As ever, your results will vary.

KSEA imposes a significantly greater load than KVPS and frame rates are lower until the P3D default B58 clears KSEA. You will notice some intermittent microstuttering (not nearly as severe as seen with other settings which, frankly, produce what can be described as unacceptable stuttering), observed especially in the external view during the steep bank of the B58, but these abate and the experience quickly becomes very smooth as the aircraft levels out and moves away from the airport.

External View:

Virtual Cockpit: 

 

Sorry about the quality. Somehow during processing YouTube seems to have eliminated any vibrance displayed in the original recordings. Also, posting videos to YouTube is a new experience for me so perhaps there are things I should be doing to improve their appearance.

Remember, the externally viewed 2 minute flight is exactly the same as that used to collect the data displayed in the series of line graphs in my original post. So, you can now compare the graph with the performance displayed in the video of the TestFlight which ran with the following settings:

8 Cores (HT=Off) Affinity Mask= 0 (1,1,1,1,1,1,1,1)

MSAA = 4 Samples along with 4x SGSSAA is assumed for this flight and, indeed, all the rest as that is currently my preferred antialiasing solution. Screen resolution: 2560x1440x32. Monitor refresh rate: 120Hz

BTW, did anyone spot the error? In the series of graphs under the heading "The NVIDIA Inspector Prepar3D profile was returned to default NVIDIA settings (No SGSSAA)", the graph second up from the bottom on the left is labelled incorrectly. It should, of course, be Affinity Mask=0 (1,1,1,1,1,1,1,1).

Cheers,
Mike


Very well documented review of the 5960 and its best settings for use with P3Dv3.

You showed the same as Rob experienced: with no HT and all cores active it works the best with P3D.

 

Myself, I am using a 5820 , which is a 6-core processor and works the best on AM=340 or 1360.

I am curious why this 8-core is working better on no HT / all cores enabled than a 6-core,...

Both more than 4 cores...

 

Perhaps Steve ( or someone else ) can explain this.


Gerard,

 

Could be the L3 cache ... it's shared across cores ... no HT so less concurrency and increased probability of an L3 cache hit (5960X = 20MB, 5820K = 15MB).

 

But that's just a guess; to really know why, you would need access to the source code and specialized analysis tools from Intel, or write your own tools to integrate into the source code for logging.  If I'm not mistaken, such analysis is done on an attached PC so that the logging doesn't skew results on the monitored PC ... I haven't ever needed to do this type of analysis myself, but I'm going way back to my days working with Spectrum HoloByte and some of the custom tools they used to look at performance.

 

Cheers, Rob.


Unfortunately all the results show massive fps Delta. It still appears that you are falling into the same trap of halving the thread count per core by turning off HT.


I see the problem Mike, you used lasso. You cannot "touch" the affinity of P3D, that's complicated. With HT enabled you'll get two jobs per core instead of one, and so you will find HT disabled works better. And don't lasso IF10 either...



Hi Steve,

 

Thanks very much for contributing. I have been hoping you would join the party as, to be honest, you were the main drive behind my decision to embark on this interesting exercise.

 

Prepar3D.exe (32) Process Lasso Rules:

1. Excluded from ProBalance Restraint

2. Classified as a Game

3. Application Power Profile: High Performance

 

In fact, I was careful to pay heed to your advice in this and other respects. If you have time, please watch the video of the testing method followed prior to each 2 minute flight:

 

You will see that in each case the affinity mask was applied via Prepar3D.cfg rather than Process Lasso. The only rules applied were as listed above and these ensured that Prepar3D ran optimally. No rules were applied to IF10 Pro although it may have been subject to some automatic ProBalance Restraint.
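
For readers who have not set this before, the affinity mask lives in the [JOBSCHEDULER] section of Prepar3D.cfg as an AffinityMask entry. The sketch below parses a two-line snippet held in a string, not a real Prepar3D.cfg, whose full contents may not suit Python's configparser; the value 1360 is just one of the masks tested above.

```python
import configparser

# Illustrative snippet only; a real Prepar3D.cfg contains many more sections.
SNIPPET = """
[JOBSCHEDULER]
AffinityMask=1360
"""

parser = configparser.ConfigParser(strict=False)
parser.optionxform = str  # keep key names in the case they are written
parser.read_string(SNIPPET)

mask = parser.getint("JOBSCHEDULER", "AffinityMask")
print(mask, format(mask, "b"))
```

Reading the value back like this is a quick way to confirm which mask a given test run actually started with.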

 

Regards,

Mike


If you're seeing worse performance with HT enabled, that's only because there's more stuff happening on the core - period.

 

It is a mathematical fact that with the proper configuration, your simulator will work better with HT enabled. It's just up to you to work out how to do that.

 

I'm afraid you can't run P3D and touch it with any affinity rules. I've no idea why you would want to have anything running that may bring some kind of uncertainty to the testing.

 

Forget lassoing P3D or IF10, whatever the rules, and can you honestly describe to me what those rules mean?

 

No rules were applied to IF10 Pro although it may have been subject to some automatic ProBalance Restraint

But anyway, what sort of testing methodology includes: "it may have been subject to some automatic ProBalance Restraint"?

 

Very well documented review of the 5960 and its best settings for use with P3Dv3.

You showed the same as Rob experienced : with no HT and all cores active it works the best with P3D.

 

Myself, I am using a 5820 , which is a 6-core processor and works the best on AM=340 or 1360.

I am curious why this 8-core is working better on no HT / all cores enabled than a 6-core,...

Both more than 4 cores...

 

Perhaps Steve ( or someone else ) can explain this.

Short answer Gerard, it's not. There's a lot of really bad performance, and some slightly less bad performance, shown in Mike's readings.

 

Well at least we finally have some proof that P3D can take advantage of more cores with the right AM. MFAA is the least of our concerns  here.

 

Good work on the tests. Steve was saying the other day that nobody has provided substantial results on P3D working better on more than 4 cores and I believe it is finally here.

 

Regards,

Shanan

Unfortunately the tests are fatally flawed for the reasons I've mentioned earlier.

 

All that's been proved is that really poor performance is beaten by less than really poor performance - meaning slightly better than poor performance doesn't mean best performance.


IF10 provides the exact same environment from the exact same frame in the sim, same cloud, same thermal. So it makes sense to use this facility to produce meaningful repeatable results. Use the save flight menu item in IF10 in-sim menu. If you're on AP with a GPS plan you can save there and when the sim is loaded you'll start there and you can even set it to go unpaused. Paused is OK, as long as the sim is unpaused at the same time.

 

IF10 starts the sim and the affinity assigned in the P3D cfg AM is utilised. The sim lays itself out depending on the number of LPs it sees through the mask. If it's interfered with, it will make as many jobs as there are free LPs on the CPU, through that process, and then those jobs will cluster onto the unmasked LPs. This shows performance gravitating toward an equal count of unmasked bits to cores, and HT enabled shows very poor results since there's double the thread count being shovelled onto fewer LPs. The apparent anomaly with HT enabled and the first LP masked, with AMs such as 254, is that, put simply, there's only half the threads on that core (zero) if the sim starts with 254, or is lasso'd to it and there's more than four jobs.
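
The point about 254 can be checked with a few lines of code. This is a sketch only: it assumes the common Windows numbering where, with HT enabled, logical processors 2k and 2k+1 sit on physical core k, and the function name is invented for illustration.

```python
from collections import Counter


def lps_per_core(mask, ht_enabled):
    """Count unmasked logical processors on each physical core.

    Assumes that with HT on, logical processors 2k and 2k+1 belong to
    physical core k; with HT off, each logical processor is its own core.
    """
    per_core = Counter()
    for lp in range(mask.bit_length()):
        if (mask >> lp) & 1:
            core = lp // 2 if ht_enabled else lp
            per_core[core] += 1
    return dict(sorted(per_core.items()))


print(lps_per_core(254, ht_enabled=True))   # {0: 1, 1: 2, 2: 2, 3: 2}
print(lps_per_core(1360, ht_enabled=True))  # {2: 1, 3: 1, 4: 1, 5: 1}
```

With 254, core 0 exposes one LP while cores 1 to 3 expose two each, so core 0 carries half the threads of its neighbours. With 1360, each of four cores exposes a single LP.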

 

 

I'm waiting for proof that 5 cores (or more) is better than four, and some sensible theory to go with it...

