November 15, 2006

Hi Kev,

Great thread, and I hope something good comes of it. I don't have anything overclocked. I recently set FIBER_FRAME_TIME_FRACTION=.50 and TEXTURE_BANDWIDTH_MULT=50. Those two changes seem to help ever so slightly with the stutters, though there is certainly room for improvement. I started with frame rates locked at 15, bumped that up to 20, and can still generally maintain them. Flying into KEWR/KLGA/KJFK will, of course, reduce that number substantially.

I have an ATI X800XT with 2 x 1 GB of PC3200 system RAM.
_________________________________________________
Processor : AMD X2 4800+ @2.4GHz
L1 Cache : 64k X 2
L1 Instruction Cache : 64k X 2
L2 Cache : 1024k X 2
FPS : 20 - locked and maintained 80% of the time
Fiber setting : 0.50

Jim Karn
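For anyone who wants to try the two settings from the post above, here is a minimal sketch of applying them to an INI-style config with Python's standard `configparser`. The section names (`[Main]`, `[Display]`), the starting values, and the helper `apply_tweaks` are assumptions made for illustration; check your own config file and back it up before changing anything.

```python
import configparser
import io

# Hypothetical excerpt of an fsx.cfg-style file. The section names and the
# starting values are assumptions, not taken from the post above.
SAMPLE_CFG = """\
[Main]
FIBER_FRAME_TIME_FRACTION=0.33

[Display]
TEXTURE_BANDWIDTH_MULT=40
"""

def apply_tweaks(cfg_text, tweaks):
    """Return cfg_text with each (section, key) -> value in tweaks applied."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names in their original upper case
    parser.read_string(cfg_text)
    for (section, key), value in tweaks.items():
        if not parser.has_section(section):
            parser.add_section(section)
        parser.set(section, key, value)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

# The two values reported in the post above.
TWEAKS = {
    ("Main", "FIBER_FRAME_TIME_FRACTION"): "0.50",
    ("Display", "TEXTURE_BANDWIDTH_MULT"): "50",
}

new_cfg = apply_tweaks(SAMPLE_CFG, TWEAKS)
print(new_cfg)
```

Working on a string rather than the real file keeps the sketch safe to run; writing the result back to disk is left to the reader.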
November 15, 2006

"When Win32 Fibers switch in and out, the CPU cache all but clears. Fibers renders CPU cache's useless. FSX uses fibers. FSX seems to run sweet on some (not all) lower end processors. Could this be why?"

No. The L2 cache does not cache ONE chunk of memory. The cache is divided into many pages. Granted, the past several versions of FS all used fibers, and they don't seem to have any problem.

"I think we will see a trend. Faster processors with larger CPU caches will probably perform similar to a lower end processor with a smaller cache. Just an experiment really... could be worse, I could be asking you all to take your temperatures and PH levels :)"

I have to ask what your background in computer science and processor design is.

"I want that to happen because if indeed, fibers are destroying the usefulness of large CPU caches, then it would mean probably months before a patch would be released because it wont be an easy thing to fix :("

If this is in fact true, it will spell the doom of Intel and AMD. Because, as you said, if fibers are having a severe effect on large caches, native threads will magnify the effect by a few factors, and you can forget about a fix for this.

Wai Wu
November 15, 2006

Gentlemen,

I hope you don't mind my poking my snout into this conversation. I did a fair amount of lengthy benchmarking for another MS/ACES flight sim of some renown (CFS3, yes, there I said it) and used the tightly controlled results to build a lengthy, quantitative "Tweak and Tune" guide for the community, based on tweaks FROM the community. That work used a consistent, solid flight designed to exercise the machine, which is very much what you are after. So you might entertain this suggestion, which I definitely would have used had it been available on that platform:

There is a simple, excellent utility you may know about from FS9 called "FS Recorder", which has now been ported to FSX by the author. It allows you to record an entire flight, with great fidelity, and play it back. One of the wonderful benefits it offers is the ability to share the flight recording file with others so they can replay it on their PC. The recorded data is very efficient, the files are small, and the processor overhead is negligible.

Granted, there will be some very slight performance hit, but it may be hardly measurable in this case, and the ability to have a consistent measurement could be well worth it. It would require everyone involved to install the small utility (it's a DLL that goes in the modules folder, that's it) and use it to play back the test flight recording while in-cockpit. Views are not recorded or played back, but turning towards and then away from a detailed area might suffice.

FRAPS logging could also be used, which will provide an average FPS for the entire timeframe, in black and white, no subjectivity. I think there is still a freeware demo version of FRAPS.

Again, just a simple suggestion to possibly assist with obtaining a consistent yardstick.
More info on this freeware tool can be found here (and no, I am not a salesman for it, just an enthusiastic user):

Link -- http://fs-recorder.net/

And not to jinx FSX by mentioning CFS3, but they both come from the same garage, so to speak... so keep the faith. If some of us could get whole large squadrons flying that sim happily enough, and the sim became a gem in its own right over time, I'm sure the same thing and more can and will be done for FSX.

Best of luck. I'd be glad to assist but am still waiting on my copy to show up, and as far as I know the recorder won't work in the FSX demo... which is, at least, quite fine-tuned by now and running beyond the 'acceptable' level... to me.
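To make the "consistent yardstick" idea concrete, here is a minimal sketch of turning a frame-time log from a replayed test flight into an average FPS and a worst-frame FPS. The input format (a list of cumulative frame timestamps in milliseconds) and the function name `fps_summary` are assumptions for illustration; the layout of a real FRAPS log is not described in this thread.

```python
def fps_summary(timestamps_ms):
    """Average FPS and worst single-frame FPS from cumulative frame times (ms)."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    # Time taken by each individual frame.
    deltas = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    total_s = (timestamps_ms[-1] - timestamps_ms[0]) / 1000.0
    average = len(deltas) / total_s   # frames rendered per second overall
    worst = 1000.0 / max(deltas)      # FPS implied by the slowest frame
    return average, worst

# Made-up sample: four 50 ms frames followed by one 200 ms stutter.
sample = [0, 50, 100, 150, 200, 400]
avg, worst = fps_summary(sample)
print(f"average {avg:.1f} fps, worst frame {worst:.1f} fps")
```

Reporting the worst frame next to the average matters here because stutters can hide behind a healthy-looking average.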
November 15, 2006

It may page that 1024k into many smaller pages, but how many threads are we running, and how many fibers on each? Too many fiber switches will result in the cache completely vaporizing in some cases, even on 1 MB processors.

Native threads do not see much of this effect, as they use the Win32 thread pool's I/O completion port model to limit context switches. Fibers don't, meaning that the context switching is completely in the hands of the developer, and he had better be sure he is doing it right or performance is going to suffer dramatically. The developer has to write their own switching scheduler and must find a way to limit context switches in order to minimize page after page of cache clearing, because too much of it can render even a meg of cache pretty much useless.

What is my background? I have no formal background; I'm 100% self-taught. I tried college but got bored because I already knew what they offered and more. I have extensive experience in application design, graphics programming, web design, digital photography, etc., all of it self-taught.

Before anyone jumps on this to say that I am far from being a pro on this whole subject, I would like to point out that this self-taught individual has held a few decent, high-profile jobs in the past. For example, I have worked for Nikon Canada in a few different capacities: IT infrastructure, web design, security, database design... you name it. A company like that wouldn't have paid me what they did if I didn't know my stuff. That being said, I am not a processor specialist. I have rudimentary knowledge of fibers and threads, but I wholly comprehend the information I've been reading on the subject. When I read about some far-out technology advancement, I usually find myself thinking about how it works, in depth and all. I'm not saying I'm a know-it-all... I'm just saying that I tend to comprehend some pretty complex stuff.

So yes, I'm not a pro when it comes to threads, fibers and caches... but I can have decent input on the subject because I understand them and the surrounding technologies.

There are smarter people than me out there. There are people who are knowledgeable about processors and fibers. If one could chime in and show that I've drawn completely wrong conclusions, I would be more than happy to hear it. If this is, in fact, the performance 'bug', then we know there's nothing we can do about it other than wait for a patch... otherwise, we keep looking for a root cause and find a temporary solution until that patch is released. There is definitely something wrong with FSX, and it may be a while before a patch, so it makes perfect sense to explore any theories in an effort to find a temporary band-aid for us all until the patch comes out. In the case of fibers being the root cause, there won't be any band-aid. The application itself is responsible for the fibers and how they execute, therefore we can't use a typical Windows setting or registry tweak to change their behavior :(

That being said, I will reiterate something I mentioned earlier... this is only a theory, one that has to be explored. Whether I'm right or wrong has zero impact on me; I'm simply offering up an explanation as to why it appears that even very similar processors are getting unexpected performance in FSX. If I'm wrong, I'm wrong... I don't mind that. In fact it would be welcome, because it would mean the fix doesn't require a complete rewrite... but if I'm right, and inefficient context switches are causing high-end processors to drop in performance, then at least the validation would serve a purpose.
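The point that the developer has to write the switching scheduler can be sketched with Python generators standing in for fibers. This is only an analogy, not the Win32 fiber API and not how FSX is implemented; the task names `terrain` and `textures` are hypothetical.

```python
from collections import deque

def fiber(name, steps, trace):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        trace.append((name, i))
        yield  # explicit hand-off; nothing preempts this task

def run_round_robin(tasks):
    """Minimal hand-written scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    resumes = 0
    while ready:
        task = ready.popleft()
        resumes += 1
        try:
            next(task)          # run the task until its next yield
            ready.append(task)  # still alive: back of the queue
        except StopIteration:
            pass                # task finished; drop it
    return resumes

trace = []
resumes = run_round_robin([fiber("terrain", 2, trace),
                           fiber("textures", 2, trace)])
print(trace, resumes)
```

Everything about ordering and switch frequency lives in `run_round_robin`, which is application code. That is the sense in which the scheduling burden moves from the operating system to the developer.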
November 15, 2006

I can tell you the issue is not the processor cache. I increased the memory latency and saw no impact on performance. The way it looks, FSX is spending too much time deciding which 3D objects to put in the scene and where. Only ACES will know, provided the people who designed this part of the code are still with them.
November 16, 2006

> I can tell you the issue is not the processor cache. I increase the memory latency and see no impact in performance. The way it looks like is that FSX is spending too much time on what 3D object to put on the scene and where. Only ACES will know provided the people who designed this part of the code are still with them.

Memory latency and CPU cache are two completely different beasts. Memory latency will affect overall performance to a certain point (texture load times, transfer-to-GPU time, etc.), but it doesn't necessarily result in any loss of raw processor throughput. The CPU cache does, though. It's lightning fast in comparison to system RAM, and most higher-end processors really rely on it to post the kind of benchmarks they are getting.

You know that the Windows page file is a disk location that Windows uses to swap things in and out of memory. For example, if you don't have enough RAM to run Photoshop, FSX, and Firefox all at once, you can still do it because of the page file. Basically, when Windows wants to run threads from one of those applications, it is swapped from disk into system RAM so that it can run... while the dormant application is swapped out to disk. That is a much simplified explanation of what the system page file does.

What a CPU cache does is almost exactly the same, though at a much higher level. As programs sit in system memory, some of the instructions or data from an application are swapped from system memory into the processor cache. The cache is lightning fast in comparison to system RAM, as system RAM is compared to the disk page file. This is why we use caches, really.

So basically, as a thread runs, the working instructions and data are within the cache and running at an intense speed. When a new thread gets processor time, the processor has to swap the cache out to the slower system RAM and load the new thread's working set into the cache... it then gives that thread its processing time and goes on to the next thread, and so on.
Continually, the processor cache will swap out. However, threads don't usually use the whole cache, just part of it. You can have many threads running and the cache won't need to be completely reloaded at any one time, only small pages of memory at a time. Thread switching is efficient, and scheduling is very dependable.

Fibers, however, are a different beast from threads. They require that the developer do all the work. Everything in the context event is done by the developer, including scheduling and exit routines. The developer must have written the code that will save the existing fiber's cache, load its own data, execute that data, determine the status of its runtime allowance, then save its data to system RAM and recall the previous cache data so that when it hands control back to its governing thread, it hasn't missed a beat.

It is a lot to accomplish, and everything must be done right. If you have 20 fibers all running and doing their own things, you can be sure that the whole 1 MB cache is continually being used to swap data in and out instead of crunching the larger numbers that it's used to.

Someone here has actually raised a very good argument, and I've been giving it much thought too. The fact that FS2004 also used fibers, yet doesn't have these performance issues, counts against this theory. Indeed, I believe that it is a good argument.

However, with all that has been added to FSX, with the new shaders, bloom, and many other enhancements, we can rest assured that the use of fibers has increased dramatically as well. This sharp increase in dependency on fibers may still fit the theory... this would be why FS2004 works well in comparison. Just my observation, though.

Another thing I've thought about is shaders. ACES developed this NextGen software for DX10, yet they have in fact provided us with DX9 shaders in the software. This is why I don't get people who claim that the software runs sluggishly because it is designed for tomorrow's hardware. No.
When they release the DX10 patch for FSX, it will have DX10-based shaders... for now, it has DX9 shaders that are running horribly slowly in comparison to other games that use shaders. I know some will say, "but it renders the world!!!"... but really, your screen in FSX is no more exotic and complex than in some other game titles out there that render a few square miles in very high detail, right down to the window frames.

The fact that the DX9 shaders they provided are so sluggish tells me that there is a possibility that other development mistakes exist. Could the fiber scheduling system be buggy and inefficient? It could very well be. Is it the reason FSX runs like crap on very high-throughput processors while running just as badly on older processors? Who knows. We either need a knowledgeable techie on the subject, or conclusive proof via enough test data. Thus, this thread.
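The working-set argument made earlier in this post (a few tasks fit in the cache together, while 20 tasks push each other out) can be illustrated with a toy model. The sketch below assumes a fully associative LRU cache and tasks that each touch a fixed set of lines per time slice. Real CPU caches are set-associative with other replacement policies, and all sizes here are made up, so this illustrates the theory rather than measuring FSX.

```python
from collections import OrderedDict

def hit_rate(cache_lines, n_tasks, lines_per_task, rounds):
    """Hit rate of an LRU cache shared by tasks that run round-robin.

    Each task touches its own `lines_per_task` distinct lines every time slice.
    """
    cache = OrderedDict()
    hits = accesses = 0
    for _ in range(rounds):
        for task in range(n_tasks):
            for line in range(lines_per_task):
                key = (task, line)
                accesses += 1
                if key in cache:
                    hits += 1
                    cache.move_to_end(key)      # mark as most recently used
                else:
                    cache[key] = True
                    if len(cache) > cache_lines:
                        cache.popitem(last=False)  # evict least recently used
    return hits / accesses

# Combined working set (4 x 200 lines) fits in a 1024-line cache.
fits = hit_rate(cache_lines=1024, n_tasks=4, lines_per_task=200, rounds=10)
# Combined working set (20 x 200 lines) is far larger than the cache.
thrash = hit_rate(cache_lines=1024, n_tasks=20, lines_per_task=200, rounds=10)
print(fits, thrash)
```

In this model the small case misses only during the first round, while the large case never hits at all, because every line is evicted before its task runs again. Whether FSX behaves anything like the second case is exactly what the thread is trying to find out.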
November 16, 2006

> What I think you really need to do is get someone with an Intel E6600 or better to test these findings
>
> I dont know enough about the different Kernal Software to give an opinion - BUT I notice that so far you are all using AMD's and there IS a difference in the way Intel has worked the Conroe with it's 4Mb Cache
>
> Just an observation

I do know that AMD and Intel handle cache differently. One uses an inclusive cache setup and the other an exclusive cache? I think the inclusive setup keeps a copy in L2 of some of the instructions that are in L1. I'm not at all sure whether that's the reason for the difference in performance, or what.

I fly PMDG's 747 a lot. The FSX 747 is flyable, but KJFK looks more like KBGR, pretty much barren. I can see a few cars and trucks going by, though. With all settings set low I get 12 FPS on the ground and 12-14 in the air.

I've got an old 754-pin CPU, an SC A-3400 with 1 MB L2, and 1 GB of Corsair memory. Rather dated now.

With PMDG my FPS never goes below 35 at any crowded airport. In the air, looking up in spot view, I've seen the FPS bounce off 100.

The only setting maxed in FSX is the global one; the install instructions say to advance it for AMD CPUs.
November 16, 2006

Boy oh boy, this is fascinating!

Keep it up Kev, we're with you every step of the way.

Cheers!
Mike

Edit: BTW, I second the proposal that we use FS Recorder as a tool to help with this research.
November 16, 2006

Here ya go:

AMD64 3500 Venice 512KB L2 Cache 2.2GHz
ASUS A8N Deluxe SLI
EVGA 7900GT KO 256MB
2G Corsair PC3200 400DDR
SB Audigy 2
80G ATA FS only
200G SATA
Track IR3 w/vector
FS Genesis mesh for FSX

My settings are below. Sitting on 34R at KSEA in the default 737 VC I get 14-16 FPS, and when airborne it rises to the target framerate or 1-2 below. If I use it for bush flying, then I crank autogen ALL the way up, as well as water effects, and consistently get between 20 and 25. This is after reducing cloud textures. I have not reduced scenery textures. I have added the fiber frame tweak, set at .25.

Target Framerate 25
1280x1024x32
Trilinear Filtering (app controlled)
AA checked (app controlled)
Both program controlled
Global Texture High
Advanced Animations
Text single line
High Res Cockpit
Aircraft Casts Shadows
Aircraft landing lights
Level of Detail Radius-midrange
Mesh Complexity 100
Mesh Resolution 10m
Texture Resolution 1m
Water Effects 1x
Scenery Complexity Dense
Autogen none
Special Effects High
Airline Density 40
General Aviation 23
Airport Vehicle Medium
Road,Ship,Leisure-0
Detailed Clouds Medium
Cloud Draw 0
Weather Changes Medium

I have been wondering myself why I am fairly happy with my performance while those who rushed out and bought the Core 2 Duos are complaining. Not sure why, but mine works well.

Craig
November 16, 2006

> Memory latency and CPU cache are two completely different beasts. Memory latency will affect overall performance to a certain point, texture load times, transfer to GPU time etc, it doesn't necessarily result in any loss to the raw processor throughput. The CPU cache does though, it's lightning fast in comparison to system RAM, and most higher end processors really utilize it to post the kind of benchmarks they are getting.

That's exactly what I was trying to say: if every CPU memory access results in a cache miss, then slowing the memory should reduce CPU performance considerably.

> Fibers however, are a different beast from threads. They require that the developer do all the work. Everything in the context event is done by the developer, including scheduling or exit routines. The developer must have written the code that will save the existing fiber's cache, load its own data, execute that data, determine status of runtime allowance, then save its data to system ram and recall the previous cache data so that when it hands control back to its governing thread, it hasn't missed a beat.
>
> It is a lot to accomplish, and everything must be done right. If you have 20 fibers all running and doing their own things, you can be sure that the whole 1meg cache is continually being used to swap data in and out instead of crunching the larger numbers that its used to.

Not that I like the use of fibers, but ACES did give a reason for using them. Fibers are cooperative threads. If one of them is in an extended loop and the programmer forgot to insert a yield call, it will still slow down the system considerably, and who knows how many programmers at ACES are writing code that is supposed to work cooperatively. However, ACES prefers fibers. What can I say. But then again, what the heck does this have to do with the L2 cache?
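The "forgot to insert a yield call" problem described above can be shown with Python generators as stand-ins for cooperative fibers. This is an analogy only, not the Win32 fiber API; the task names `A`, `B` and `G` are hypothetical.

```python
def polite(name, trace):
    """Cooperative task that hands control back after every unit of work."""
    for _ in range(3):
        trace.append(name)
        yield

def greedy(name, trace):
    """Task whose author forgot to yield inside the loop."""
    for _ in range(6):
        trace.append(name)  # six units of work with no hand-off in between
    yield

def run(tasks):
    """Resume every live task in turn until all of them have finished."""
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)

trace = []
run([polite("A", trace), greedy("G", trace), polite("B", trace)])
print("".join(trace))
```

The trace shows the greedy task doing all six of its steps back to back while `A` and `B` wait. With preemptive threads the operating system would have interrupted it; with cooperative scheduling nothing can.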
November 16, 2006

> That's exactly what I was trying to say: if every CPU memory access results in a cache miss, then slowing the memory should reduce CPU performance considerably.

Yes, but I think we have to take into consideration that an AMD64 4000+ would perform just as well mathematically (true throughput) on relaxed memory timings as on even overclocked memory. This is because the barebones throughput is designed around average-quality main system RAM. With that in hand, it would be logical to assume that even if main system RAM is heavily taxed in a game, the processor is still able to perform, because there will always be enough available memory bandwidth to meet the processor's minimum needs.

I don't know if that comes across right. Basically what I'm suggesting is that while system RAM plays a large role in the performance of most games, it doesn't actually reduce the overall processor effectiveness via cache starvation. I could be wrong though... any insight would help.

> Not that I like the use of fibers, but ACES did give a reason for using them. Fibers are cooperative threads. If one of them is in an extended loop and the programmer forgot to insert a yield call, it will still slow down the system considerably, and who knows how many programmers at ACES are writing code that is supposed to work cooperatively. However, ACES prefers fibers. What can I say. But then again, what the heck does this have to do with the L2 cache?

Yes, it's very hard to keep even a small team of people on the same page when designing something as complex as a simulator. Indeed, hats off to ACES for even pulling off what they have.
The problem with fibers isn't just that they, by nature, can clear a whole cache if used in excess while being poorly managed, but that switching so many fibers over and over, even with good scheduling, can severely cripple any processor that depends on its cache for sheer throughput.

What does it have to do with the L2 cache? Well, the L2 caches of today's processors are what enable our systems to run many threads seemingly simultaneously on a single core. In essence, the large L2 cache is responsible for the multitasking abilities of our day... as well as large single-process working datasets. If the L1 cache is constantly cleared to the L2 cache as rampant fibers run loose, the L2 cache will eventually serve but one purpose... a place to swap to. Run enough fibers and that place gets full as well, resulting in a swap to even slower system memory.

If the L2 cache is being used solely for context switching, it's not being used for performance. This means that even a single-threaded application would run like a slug on this processor if its fibers were inefficient enough to overtax the L2... add to the equation that our OS is demanding several threads, as is FSX... and you've got a whole lot of bad performance right there.

Again, I know nothing concrete; this is just speculation based on my own understanding of the architecture. Fibers have two problems. First, when a fiber stops its own execution in order to wait on something, the whole thread stops. If a thread is responsible for several fibers, we have to be sure that the fibers do not block, or every other fiber halts along with the thread.

Secondly, the other thing fibers are bad for is context switches. The overheads in thread switching are a lot less severe and better managed, through a time-proven set of API calls developed long ago and expanded upon with every release of Windows. Fibers do not have this infrastructure... they depend on the developer having a clue about what he's doing and being very proficient at logical analysis of his application as a whole. It's a grand feat for someone to actually pull this off... in fact it'll never happen. No one is perfect, and no developer can ever keep tabs on every part of his code in such a large project. So really, it's not a surprise that this may be the cause; it's actually expected.
November 16, 2006

Talking about cache performance: if the fiber switching costs a lot of cache misses, shouldn't that be reflected in the performance monitor showing the processor at less than 100%? Whenever there is a cache miss, the processor waits until the data is brought into the cache.
November 16, 2006

> BTW, I second the proposal that we use FS Recorder as a tool to help with this research.

Me too, it makes sense really. Especially since Fraps really bites the biggie right now. Because Fraps is writing to an AVI file continually, the texture loading and scenery in my movie really suck.

Also, for some reason it loses sync right off the bat. I don't know why, but the AVI doesn't play back in real time; it simply plays back the recorded frames at 30 fps regardless of what my in-game fps was, which means that the AVI constantly speeds up and slows down. I'm working on the sync problem though. The Fraps idea was just to give people an idea of what the flight is, but I like the FS Recorder idea better... so maybe we'll just use that.

Before going that far though, I would like to wait for more people to chime in with how their systems are performing and what their processor specs are... specifically, I'm dying to hear from someone with a 6800 Ultra and a processor slower than 2.6 GHz. I'm betting that they too are outperforming my machine, unless they bought the high-performance version of that particular processor.

BTW: Something else I've thought about lately. I know that some people were annoyed with the FSX survey that was posted. If anyone would like to propose a proper survey, I will be more than happy to not only host it but design it from the ground up... it won't look pretty though; I don't want to invest too much time making a very good-looking site around it. But the idea is there anyway... see what people truly think of FSX in real time, since it displays results even while the survey is still running.
November 16, 2006

What is the resolution of the performance monitor? I think it samples every half second or so? Not sure. Anyway, good point, and one that I'll have to play with, that's for sure.

Off the top of my head though, a blown cache is something that lasts for mere microseconds. Sure, when it happens 20-30 times per second it should be visible by some means. However, I think that we would need to sample processor utilization at a very high resolution to see the peaks and troughs related to the problem.

Actually... a thought.

The cache misses, and the processor has to wait for more data to supply the L2 cache, which in turn has to repopulate the L1 cache. During this time, nothing is done save for the basic management of this internal process... all threads on the PC basically halt for that microsecond or so.

If this is the case, the performance monitor itself, being a process in the operating system, cannot actually do anything... it can't run the code to get the CPU usage sample. This could be why the Perf Monitor would be ineffective in judging whether the cache is being dumped or not... we wouldn't even see a drop in CPU usage at all :(

I'm not completely satisfied with my own explanation, really. I'm going to read up on it a little.
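A back-of-the-envelope check helps put numbers on the "mere microseconds, 20-30 times per second" estimate in the post above. The sketch below assumes a full 1 MB cache refill streamed at the peak rate of single-channel PC3200 memory (3.2 GB/s). Real refills are slower because of latency and happen line by line, and the flush frequency is the poster's guess, so treat the result as an order-of-magnitude figure only.

```python
CACHE_BYTES = 1024 * 1024    # 1 MB L2, as on several CPUs listed in this thread
PEAK_BANDWIDTH = 3.2e9       # bytes/s, single-channel PC3200 peak (assumption)
FLUSHES_PER_SECOND = 30      # upper end of the "20-30 times per second" guess

# Time to stream one full cache's worth of data from main memory.
refill_seconds = CACHE_BYTES / PEAK_BANDWIDTH
# Share of each second spent refilling if that happens 30 times a second.
lost_fraction = refill_seconds * FLUSHES_PER_SECOND

print(f"one full refill: {refill_seconds * 1e6:.0f} microseconds")
print(f"time lost per second: {lost_fraction * 100:.2f}%")
```

Under these assumptions a full refill costs a few hundred microseconds and 30 of them add up to roughly 1% of each second. A drop that small would be hard to spot in a utilization graph sampled a couple of times per second, whatever the mechanism.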
November 16, 2006

> Actually.... a thought.
>
> The cache misses and the processor has to wait for more data to supply the L2 cache which, in turn has to repopulate the L1 cache. During this time, nothing is done save for the basic management of this internal process.... all threads on the PC basically halt for that microsecond or so.
>
> If this is the case, the performance monitor itself, being a process in the operating system, cannot actually do anything.... it can't run the code to get the cpu usage sample. This could be why the Perf Monitor would be ineffective in judging whether the cache is being dumped or not... we wouldn't even see a drop in CPU usage at all :(
>
> I'm not completely satisfied with my own explanation really, I'm going to read up on it a little.

LoL. Again, the entire L2 cache does not need to be refreshed. If perfmon is a very small program, say less than 128K (code + data), that fits within ONE page in the cache, then it can still run during another program's cache misses, just with no output.