Kanade

Members
  • Content Count

    15
  • Donations

    $0.00 
  • Joined

  • Last visited

Community Reputation

15 Neutral

Profile Information

  • Gender
    Female
  • Location
    Japan/Hokkaido

Flight Sim Profile

  • Commercial Member
    No
  • Online Flight Organization Membership
    none
  • Virtual Airlines
    No

  1. Yes, but the 80 MB figure quoted here is the total of all caches added together, not L3 alone. The breakdown per CCD is 8 cores x 1 MB of L2 = 8 MB, plus 32 MB of unified L3, for 40 MB, and there are two CCDs, so the cache is split into two separate sets. This is not a configuration like the 5800X3D, which has 96 MB of unified L3; in character the 7950X is closer to a faster 5950X. For these reasons, I believe the 7950X will probably not outperform the 5800X3D in frame rate jitter or highest FPS. However, it will probably do better on lowest FPS, because its IOD is faster and runs cooler, and it has higher memory bandwidth. I will wait for the rumored 7800X3D/7950X3D to be announced at CES in January 2023.
  2. I agree that benchmarking is difficult because everyone has a different PC environment, settings, and time (and physical network location). I think we all benchmark simply because we love flight sims and want to make them better. I see people on the forums saying that SMT is no longer effective since the recent SU10 update, so it is possible that something was changed. DX12 lets the programmer control the timing of draw submission, so it behaves differently from DX11, where this is left to the device driver. Since MSFS can run under both DX11 and DX12, the timing with which the main thread submits work to the render thread may have been adjusted.
  3. Supplement: here is a link for the high-load section. At high load, the CPU package draws more power and the FPS drops. Since the package temperature stays constant and the clock curve is also trapezoidal, we can infer that the CPU is reaching its performance limit in the Shibuya-Shinjuku area. https://forums.flightsimulator.com/uploads/default/original/4X/b/d/4/bd4d559f337c647c40e69ad197b4f35e32310b45.png
  4. @Virtual-Chris Yes, I agree with that understanding: SMT off may give better performance in scenarios where thread contention occurs under SMT. In the CPU utilization graph I mentioned earlier, the section where utilization stays below 100% with SMT on but reaches 100% with SMT off is the Shibuya-Shinjuku area. Even on the X3D, the objects being processed overflow the cache and cause cache misses. It is difficult to analyze cache hits and misses for an external program (MSFS) whose source code we do not have, unless the CPU provides specific instructions for it. (Some Xeon SPs have these instructions.) As additional information, the latency between SMT logical processors on the same core depends on L2, which is a shared resource within the core (about 7 ns on the Zen3 architecture, roughly three times better in latency and bandwidth than the 20 ns of L3; this is very similar to the relationship between L3 and main memory).
  5. Supplementary explanation. The location I used for my measurements is Tokyo, a photogrammetry city included in WU01, which is already installed. The flight path runs from Haneda Airport to Shinjuku. It passes many buildings and is a relatively "heavy" course for Japan as a whole. In the first report, the low-FPS section at 80-100 seconds corresponds to Shibuya-Shinjuku, and this part is difficult to speed up even with the 5800X3D. If the results do not change between SMT on and off, it means there is no thread contention caused by SMT.
  6. The higher the display resolution, the more dominant the GPU load generally becomes. HT OFF "appears" to double the load because the number of logical CPUs is halved. Either way, MSFS2020 does not scale well on CPUs with many cores. Below are the per-core loads for the same scenario with HT ON and HT OFF. HT ON https://forums.flightsimulator.com/uploads/default/original/4X/9/4/b/94b9c52f70d81adfc81732e7ffa83431b3fc9e51.jpeg HT OFF https://forums.flightsimulator.com/uploads/default/original/4X/2/8/2/2820fdf536867411856f83fbe5418850bbee802c.jpeg Assuming the same workload (FPS), the load should appear higher with HT OFF. SU10 has been revised three times since the benchmark was taken, and the load trend has changed each time. The benchmark was taken on 1.27.09, and the current version is 1.27.13. SU9 may show different trends again.
  7. Yes, the new Zen4 series is announced for next month. I think waiting is a good option. A VCache product may come out in the not too distant future. Zen5 is DDR5 only, though, and DDR5 may be 2x more expensive than DDR4. I'm torn. (The reason I dared to buy and evaluate the 5800X3D just before the new CPUs are released is that the similar VCache CPUs, EPYC Milan-X/Genoa-X, are not readily available for personal use. I needed to investigate the performance of numerical simulations and various VM benchmarks on a VCache CPU. MSFS2020 has workloads similar to those of tightly coupled simulations, and it is fun to play with and investigate.)
  8. I have a VR HMD (Quest 2), and in my experience it is effective there too. However, I am holding back on a precise report because I do not yet have a logging and reporting method that quantifies VR performance the way I have done so far. The reason: the NVIDIA FrameView tool I used cannot handle Oculus' internal data such as Asynchronous SpaceWarp frequency or motion-to-photon latency. Also, the Oculus Debug Tool cannot directly capture GPU performance. (It can only evaluate in terms of FPS and motion-to-photon latency.) So a benchmarking method different from my conventional reports has to be established, tested, and reported, otherwise it cannot be called quantification, and that takes a certain amount of effort. (The report must be as accurate as possible, eliminating human perception as much as possible.)
  9. Answering my own question: I have resolved the CTD that occurred with Process Lasso. I switched from the 5950X to the 5800X3D, and the affinity mask was still defined for non-existent processors (16 and above).
  10. Yes, SMT on the 5800X3D cannot be controlled from Ryzen Master, so it is turned off in the UEFI. There are many scenarios where SMT is useful, and outside of games it is often better to leave it on, so it is a trade-off. I have done this kind of SMT on/off experiment, so I simply wanted to share the results with the community. Nothing more. I also own the Pro version of Process Lasso and have used it for several years, but with Xbox Live I have had problems with the MSFS2020 client crashing to desktop immediately after startup. (Probably a processor affinity issue defined in Process Lasso. I haven't been able to track down the cause of the CTD yet.)
  11. The Zen architecture is generally more sensitive to memory bandwidth and latency than the Intel architecture. Therefore, I expect a reasonable speed change to be observable, especially up to DDR4-3600. (Above DDR4-3600 it depends on the motherboard manufacturer's implementation, since the Infinity Fabric standard is 2:1 with memory and only supports up to 1800 MHz.) However, on the current Ryzen 5000 series CPUs there are already frequent memory accesses that overflow the cache, and in the scenarios I have seen it is hard to hope for that much of a performance boost, since at times the package is already pinned at 142 W, which is the PPT. We could remove the PPT limit, but I have studied the performance/power of Intel and AMD processors in the past, and I know from experience that the Zen3 architecture's PPT of 142 W is an excellent balance point, so I believe removing it would not do much good. As an aside: in the past, I repeatedly tested CineBench on many CPUs in 10 W increments. The goal was to validate the processor manufacturer's PPT = 142 W and to see for myself that manufacturers do not make these decisions at random. Benchmarking and plotting multiple CPUs in 10 W increments from 84 W (Eco Mode) to 230 W (SMT + all-core static 4.9 GHz), I found that the default 142 W was an excellent point.
  12. The 5800X3D is only effective in a limited number of scenarios, because it makes no difference to the cache hit rate of applications optimized to fit in a typical cache size (L3 of 16-32 MB), which includes most office and benchmark applications. In the traditional scenario, the maximum processor load is under the programmer's control: performance metrics are monitored during development, and if speed falls below a certain level, the number of objects is limited or the relevant parts are removed. In such applications, performance is determined by the strength of the processor itself, including its ALUs and instruction decoder. This is the "more obvious and typical" case for the average PC user. MSFS2020 uses machine learning to automatically generate vast terrain. Not all conditions are under programmer control, and there are probably many untested scenarios. (This is not due to a lack of optimization, but to the fact that the content is growing to a size beyond human control.)
  13. I remember that the performance of MSFS2020 was very confusing from the time it shipped, and that the contribution of each PC component was very difficult to understand. The correlation between various benchmark results and MSFS2020 (especially FPS and responsiveness) was quite low, and I did not see anyone doing a satisfactory analysis, so I decided to collect my own data. MSFS2020 is (probably) not suited to conventional methods that summarize an entire session as "average nn FPS". The load fluctuations are so severe that averaging them out would miss the real result. Many users say that their CPU should be capable of more but MSFS2020 is not using it to its full potential. That is true, yet not quite accurate. It is not the ALU (the execution part) that is lacking, but the data supply. The execution part has no choice but to stop and wait for data to arrive. Bandwidth matters, but even excellent bandwidth does not mean data can be supplied without delay. Latency matters more in multi-processor, tightly coupled applications. Compared to other processors, the 5800X3D offers roughly 3x lower latency than main memory, but "only for the roughly 70 MB of additional L3 cache area". That is the only difference. (Main memory latency is typically 70-100 ns in modern systems, and L3 is around 20 ns, so on a hit the ALU waits only about 1/3 as long.) I think MSFS2020 is a typical case of an application that happens to have a cache requirement of 32 MB or more.
  14. I ran benchmarks focused on MSFS2020 (under generally standardized conditions) on two processors (5950X and 5800X3D) that differ in almost nothing but the cache, to see the effect of the cache. The discussion can be found in the official MSFS2020 forum, in the thread "FPS difference verification between AMD 5950X and 5800X3D at Haneda Airport (RJTT) 34L straight out flight". Two of the main charts are also described here.
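
The cache breakdown in post 1 can be checked with a few lines of arithmetic. This is only a sketch: the helper name and its parameters are made up for illustration, and the figures are the ones quoted in the post (8 cores, 1 MB of L2 per core, 32 MB of L3 per CCD, two CCDs).

```python
def cache_per_ccd_mb(cores, l2_per_core_mb, l3_mb):
    """Total L2 + L3 capacity of one CCD in MB (illustrative helper)."""
    return cores * l2_per_core_mb + l3_mb


# Figures quoted in post 1 for the 7950X.
per_ccd = cache_per_ccd_mb(cores=8, l2_per_core_mb=1, l3_mb=32)
total = 2 * per_ccd  # two CCDs, each with its own separate L3

print(per_ccd, total)  # 40 80
```

The 80 MB headline number is therefore a sum over two CCDs and two cache levels, while any single core only sees the 32 MB of L3 on its own CCD.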
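
The stale affinity mask from post 9 can be illustrated with plain bit arithmetic. This is a hedged sketch, not Process Lasso's actual behavior or API: `clamp_affinity_mask` and the example mask are hypothetical, and the only facts assumed are that the 5950X exposes 32 logical processors and the 5800X3D exposes 16 with SMT on.

```python
def clamp_affinity_mask(mask, logical_cpus):
    """Drop bits that refer to logical processors the CPU does not have.

    Hypothetical helper: bit n of the mask selects logical processor n.
    """
    valid = (1 << logical_cpus) - 1
    return mask & valid


# Hypothetical rule written for a 5950X: pin to logical CPUs 16-31.
old_mask = 0xFFFF0000

print(hex(clamp_affinity_mask(old_mask, 32)))  # 0xffff0000 (5950X: all bits valid)
print(hex(clamp_affinity_mask(old_mask, 16)))  # 0x0 (5800X3D: nothing left)
```

An empty result means the rule no longer selects any processor that exists, which is consistent with the kind of misconfiguration described in the post.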
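
The latency argument in post 13 can be made concrete with a very simple model. This is an assumption on my part, not the measurement method used in the posts: it treats every access as either an L3 hit or a trip to main memory and ignores L1/L2, prefetching, and overlap. The default latencies are the approximate figures quoted in the post (about 20 ns for L3, 70 ns for main memory).

```python
def average_latency_ns(l3_hit_rate, l3_ns=20.0, mem_ns=70.0):
    """Average access latency for a two-level L3/main-memory model.

    Simplified illustration only; real CPUs overlap and prefetch accesses.
    """
    return l3_hit_rate * l3_ns + (1.0 - l3_hit_rate) * mem_ns


# How the average wait shrinks as more of the working set fits in L3.
for hit_rate in (0.0, 0.5, 0.9, 1.0):
    print(hit_rate, average_latency_ns(hit_rate))
```

Under this model, a larger L3 helps only to the extent that it raises the hit rate, which matches the point that the extra cache is the only difference between the two processors.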
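
Post 13 also argues that a single "average nn FPS" figure hides the severe load fluctuations. A minimal sketch of that effect, using made-up frame times rather than any data from the benchmarks (the function name and sample values are hypothetical):

```python
def fps_summary(frame_times_ms):
    """Return (average FPS, worst instantaneous FPS) for a list of frame times."""
    total_s = sum(frame_times_ms) / 1000.0
    average_fps = len(frame_times_ms) / total_s
    worst_fps = 1000.0 / max(frame_times_ms)
    return average_fps, worst_fps


# Made-up sample: nine smooth 10 ms frames and one 110 ms stall.
frame_times = [10.0] * 9 + [110.0]
average_fps, worst_fps = fps_summary(frame_times)

print(round(average_fps, 1), round(worst_fps, 1))  # 50.0 9.1
```

The average still reads as a comfortable 50 FPS even though one frame ran at roughly 9 FPS, which is why per-frame logs are more informative here than a session average.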