rsrandazzo

[07MAY15] Detailed discussion of resolved 777 VAS leak...


Captains,

 

I am putting this information into a thread separate from the update release announcement because it is detailed, lengthy, and deserves its own topic.  I hope you will find the following information educational and useful!

 

As we have discussed here in the forum a few times, I want to give you some specific information related to a memory leak that, with the help of some customers, we were able to identify and kill.  For those who didn't follow the thread in which the item was discussed, the short version of the problem was that we had a function in the path calculation logic of the 777 FMS that would run constantly under a **very specific and very rare** set of circumstances.  In other words, most of you never experienced the issue...  But it did exist and has been resolved.

 

I want to give you a detailed discussion of it in order to avoid some common misconceptions, and also to help you understand what exactly we found and fixed.  Most people see the term "memory leak" and attribute a whole host of outcomes to it that are actually unrelated.  You guys are a bright bunch, so we figured we would throw this information out to you so that you have a better understanding of what happened, what was done, and what you can expect in your own memory management efforts.

 

Anatomy of a Leak:

To describe the problem:  What we had was a function that, when running, would consume memory addresses and not return them to the pool when finished, because it would never finish.  Normally this computation should run, pass along a result, and then return its consumed memory to the pool.  However, in this case, under some very specific circumstances, it would just keep starting the process over, without ever finishing.  The end result was that it would consume blocks of memory as time went on without ever returning them.  (This is a very basic definition of a memory leak, for those who don't know.)
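For readers who like to see the pattern in code, here is a minimal Python sketch of it. This is a toy model, not the 777's actual code: `BlockPool`, `healthy_path_calc` and `runaway_path_calc` are hypothetical names invented purely for illustration.

```python
class BlockPool:
    """Toy fixed-size memory pool (hypothetical; not the real allocator)."""

    def __init__(self, total_blocks):
        self.free_blocks = total_blocks

    def take(self, count):
        if count > self.free_blocks:
            raise MemoryError("pool exhausted")
        self.free_blocks -= count

    def give_back(self, count):
        self.free_blocks += count


def healthy_path_calc(pool, blocks_needed=4):
    """Runs once, produces a result, then returns its blocks to the pool."""
    pool.take(blocks_needed)
    result = "path"
    pool.give_back(blocks_needed)
    return result


def runaway_path_calc(pool, blocks_needed=4, max_restarts=1000):
    """Keeps starting over before reaching cleanup, so blocks pile up."""
    restarts = 0
    try:
        while restarts < max_restarts:
            pool.take(blocks_needed)  # allocate for this attempt
            restarts += 1             # ...then restart without finishing
    except MemoryError:
        pass  # the toy equivalent of an OOM
    return restarts
```

The healthy version leaves the pool exactly as it found it, while the runaway version drains it at a constant rate until nothing is left.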

 

With me so far?

 

Memory leaks like this are actually really easy to find in a debugger.  All you do is graph memory use over time while running the simulation in a debugger, and as long as your memory use curve has a zero slope, you are fine.  We did quite a bit of this sort of testing prior to the 777 release and we were satisfied that we didn't have any issues.
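As a rough illustration of the "zero slope" check, the sketch below fits a straight line to memory samples taken over time and reports the slope in MB per hour. The function names and the 5 MB/hr tolerance are assumptions made for this example, not anything from our own tooling.

```python
def memory_slope(samples):
    """Least-squares slope of (hours, megabytes) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must cover more than one point in time")
    return num / den


def looks_like_leak(samples, tolerance_mb_per_hr=5.0):
    """Flag a run whose memory use climbs faster than the (assumed) tolerance."""
    return memory_slope(samples) > tolerance_mb_per_hr
```

A run that hovers around the same value has a slope near zero, while a run that gains a steady amount every hour shows up as a clearly positive slope.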

 

In order to trigger this leak however, you had to load an ILS procedure with a heading-to-radial-intercept maneuver located at a specific location within the missed approach.  Not only that, but you had to load this procedure while on the ground prior to departure (optimally) AND on top of all of that you had to be flying a segment that would take a significant amount of watch-on-your-wrist time.  The amount of time it would take to fly the flight didn't matter- what mattered was how long you let the simulation run on your computer.  Then, finally, the airplane had to be in motion.

 

For example, if you had a 16hr flight planned, but flew it using time compression and completed it in 2hrs, you likely weren't negatively impacted.  If, however, you had a six hour flight planned and flew it in real time, or near real time you were likely affected by the leak.

 

So, let me define it for you again, because I want to make it clear how hard it was to run into this particular problem:

  • Prior to departure you had to load an approach with a heading-to-radial-intercept maneuver.
  • That heading-to-radial-intercept maneuver had to be in a specific location in the missed approach procedure.  (Presence alone was not enough- it was also related to location of the maneuver...)
  • You had to then run the simulation for 4-10 hours before you would begin to notice the effects of the runaway computation or experience an OOM that wasn't related to scenery loading.
  • The airplane had to be in motion.

 

If you attempted to fly a flight in real time over a 15hr segment, AND you managed to trigger the path calculation by loading one of these specific approaches/missed approaches, you probably saw an OOM about 8hrs into your flight.  If you did all of the correct steps, you could expect to lose 100-225MB of VAS per actual-time hour that the computation was allowed to run.  This would push you over the edge into an OOM somewhere between 5 and 8 hours into the flight, depending on how much VAS scenery loading had consumed at your departure airport.
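The arithmetic is simple enough to sketch. In the snippet below the leak rates come from the figures above, but the 1000MB of free VAS at departure is purely an assumed number for illustration; your own headroom depends on your scenery and other addons.

```python
def hours_until_oom(headroom_mb, leak_mb_per_hr):
    """Hours of wall-clock time before a constant leak eats the free VAS."""
    if leak_mb_per_hr <= 0:
        return float("inf")  # no leak, no leak-driven OOM
    return headroom_mb / leak_mb_per_hr


# Assumed headroom of 1000MB at departure (hypothetical figure).
slow = hours_until_oom(1000, 100)   # slowest leak rate quoted above
fast = hours_until_oom(1000, 225)   # fastest leak rate quoted above
print(f"{fast:.1f} to {slow:.1f} hours")
```

Less headroom at departure (heavier scenery, more AI traffic) shortens both ends of that range.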

 

Now- I want to make a very clear point here:  This is a very unusual set of circumstances that most users of the 777 would never run into.  It was hard to trigger, even when you knew what to look for and how to create the issue in the first place.  If you succeeded in triggering the leak, you would see a constant slope increase in VAS use as related to time because the computation ran at a constant rate, consuming memory addresses as it ran without them being properly cleaned up.  If you triggered this problem, chances are you OOM'd in cruise flight out over open water or a large expanse of landscape with very little scenery.  It was rare in our testing that a user would actually make it into the descent on a long range over-water flight.

 

During the course of our research, we interacted with a handful of folks here in the forum who were able to help us narrow the test process down significantly- and I want to thank those folks for their help.  We sincerely couldn't have found this without your help because the circumstances were so rare they hardly constituted a pattern for us to begin unwinding.

 

So What About OOMs?:

I think it is also important to point out, very clearly, some of our thoughts on OOMs generally- as they are a constant topic of discussion among FSX/FSX-SE/P3D users. 

 

If you go back in time to our early FSX experience in, say, 2007-2008, OOMs were really quite rare.  As you move forward to 2015, they have been an increasing menace to simmers using FSX, FSX-SE and P3D (which is based on the original Microsoft ESP, which was a code drop from FSX's state in late 2005...) 

 

As we have researched OOMs, VAS, and this memory leak in particular, we reached some conclusions that are really not surprising- but are apparently not entirely obvious to many simmers- so I figure I will state them out loud here so that maybe we can help a few folks avoid them.

 

OOMs happen (obviously) because we are collectively asking our sim to do more than it is capable of doing in the memory space available to it.  As an aircraft developer, PMDG puts a HUGE amount of effort into controlling growth of VAS use.  We continually evaluate increases in the visual quality, fidelity, capability and appearance of our products against the amount of VAS they consume, and we work very hard to prevent VAS growth since VAS needs to be available for the airplane, AI traffic, scenery and weather engines alike.  The less we can use, the better off everyone will be.

 

We did a relatively good job of this between the NGX and the 777, and the 744 thus far uses less VAS than the 777.  Overall our VAS use is not dramatically higher than it was with the original 400X released in 2005 and even though it is common for people to "blame the 777" the reality is that the 777 is using around 650MB of VAS, which puts it in the lower half of commonly used addons during a flight.

 

By necessity, scenery takes up far more VAS than the airplane does.  (Re-read that sentence.)  Scenery designers by-and-large do a great job of controlling their VAS use- but by virtue of what they are doing- their products usually take the lion's share of VAS.  Just like aircraft developers, some scenery developers do a better job of controlling VAS use than others- but as a simmer you can also play a role by being very selective about what you run and what you load during a flight.

 

If you are flying a long range segment from highly dense scenery to highly dense scenery using AI traffic, a custom weather engine with custom cloud textures, and custom scenery enroute, in anything other than the default Cessna- you are running the risk of hitting an OOM during the descent when your destination airport scenery begins to load.

 

During our testing, we ran the 777 into a bunch of scenery areas and cataloged the different VAS consumption of various scenery titles, and it was not uncommon to see certain scenery packages consume 1.2GB of VAS as we were in the descent.  Think about that a moment:  The arrival scenery was consuming nearly twice as much VAS as the 777...  Now consider that same scenario where the 777 had been slowly growing at the rate of 200MB/hr...  OOM was a certainty- even if it was only happening in rare instances.
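To put rough numbers on that scenario, here is a small budget sketch. The 650MB, 1.2GB and 200MB/hr figures are the ones quoted in this post; the 4096MB ceiling is the usual 32-bit address space limit, and the 1500MB for "everything else" (the sim itself, weather, AI traffic) is an assumed value chosen only for illustration.

```python
VAS_LIMIT_MB = 4096  # 32-bit address space ceiling; real OOMs can come earlier


def vas_at_descent(aircraft_mb, scenery_mb, other_mb, leak_mb_per_hr, hours_running):
    """Estimated VAS in use when the arrival scenery loads."""
    return aircraft_mb + scenery_mb + other_mb + leak_mb_per_hr * hours_running


# other_mb=1500 is a hypothetical figure, not a measurement.
without_leak = vas_at_descent(650, 1200, 1500, 0, 6)
with_leak = vas_at_descent(650, 1200, 1500, 200, 6)

print(without_leak, without_leak < VAS_LIMIT_MB)
print(with_leak, with_leak < VAS_LIMIT_MB)
```

Under these assumptions the same flight fits within the ceiling without the leak and blows through it after six hours with the leak running.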

 

Most of the time, the sim gets away with what we are asking of it. Collectively, all of us developers are pushing the sim to the very brink by jamming in all the detail you have come to know and love, so simmers have to start employing smart strategies to conserve memory so that they can run long haul flights.  We have been making recommendations to you via our INTRODUCTION document for a few years now, and those recommendations are even more valid if you like to fly long haul flights.  (Open the 777 INTRODUCTION and go to page 31, even if you think you already know everything there is to know about VAS... you might learn something that will help you!)

 

I too have come to love highly detailed airport scenery thanks to our friendship with Amir at Flightbeam.  Prior to interacting with Amir I didn't understand how much depth good airports can bring to the sim.  (Yes- that is a plug... I know I don't do it very often- but seriously... if ever a fellow was deserving of accolades..)

 

In Conclusion:

A few of you here in the forum were incredibly helpful when bringing your findings to our attention.  I cannot be more complimentary of the folks who helped get our attention focused on this issue, because their personal efforts to find a method of repeating the problem in a scientific manner made all the difference in our ability to replicate the issue in-house under the microscope of the debugger and our own diagnostic tools.

 

When you have as many customers as we have, our primary method for finding problems in the field is to watch for patterns and a strong signal-to-noise ratio.  If a measurable percentage of our customers are being impacted by a particular problem, we are very likely to devote developer time to researching and resolving the issue.  If only a small percentage of customers are reporting to us a problem that doesn't happen frequently enough for it to develop a discernible pattern, then chances are we will make note of the issue and continue to watch without taking much action.

 

In this particular case, this issue was so incredibly obscure, it took a good cooperative partnership to make the issue appear in an environment where we could then approach it with the correct problem solving tools.

 

I love it when a system works.  B)

 

 


I've had my fair share of strange issues to debug, but this is a real corker. Congratulations are deserved for finding and fixing it!


And I thought all you need is a bit of bubblegum to stop the leak...  :wacko:

 

 

Seriously though, thanks for the insight and your continued efforts to improve the experience of your products.


Thanks for letting me help contribute to the testing, Robert and Ryan, very much appreciated. Even more so, thank you for analyzing the root cause and fixing this! Can't wait to try the update on a long haul tonight.


Captain Randazzo and PMDG staff,

 

Thanks for the professional and scientific approach dedicated to the best aircraft ever built in the FS existence.

 

Regards,


How on earth did you folks figure out that the leak was happening in the first place, especially given the specificity of the scenario in which it would occur? Kudos, gentlemen.


Greg-

 

We had help.  The key piece was that a couple of users replicated the problem.  They came to us with a very specific method that they found would occasionally cause a steady increase in VAS consumption.

 

We took that information, and ran the whole thing in debug mode to collect some data, then we began developing very specific, controlled scenarios until we were able to replicate the problem with certainty.

 

Once we had a certain replication process, we began to remove pieces of the process until the problem ceased.  This took us into the portion of the airplane that was exhibiting the problem... and we just kept paring down from there until we found it.
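The paring-down idea can be sketched in a few lines of Python. This is one generic way to do removal-based isolation, not a description of our actual test harness; `pare_down`, `reproduces` and the component names in the usage example are all hypothetical.

```python
def pare_down(components, reproduces):
    """Drop components one at a time, keeping only those the problem needs.

    `reproduces` is a caller-supplied function that runs the scenario with
    the given components and returns True if the problem still appears.
    """
    needed = list(components)
    for part in list(components):
        trial = [c for c in needed if c != part]
        if reproduces(trial):
            needed = trial  # problem persists without this part: not required
    return needed


# Hypothetical scenario: the problem needs two specific pieces to appear.
parts = ["weather", "ai_traffic", "fms_path_calc", "missed_approach_leg", "autopilot"]

def still_leaks(active):
    return "fms_path_calc" in active and "missed_approach_leg" in active

culprits = pare_down(parts, still_leaks)
```

In real life each call to `reproduces` can mean an overnight flight, which is why this kind of search takes so long when the problem only shows itself after hours of running.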

 

In this case it took an exceptional amount of time because it wasn't easy to test solutions since it required someone to fly the thing over long periods of time.  As a result, we had many a 777 flying overnight in test scenarios to determine results.

 

It wasn't fun- but, you know...  we fixed it.  Somehow we always do.


Very impressive what the PMDG team has accomplished.  Congratulations and thanks for all of your hard work.  Time for me to pick up the P3D 300ER from the shop :smile:


I have had some good luck with avoiding OOM errors even with the T7 and VAS as high as 3.2GB, but with the new update, I can't wait to see the difference in how much VAS is used in FSX. Thank you, PMDG staff.


Troy-

 

For the most part, you really shouldn't see improvement unless you were routinely triggering the item I discuss here.

 

That is why I am giving you so much detail:  I want people to understand that the 777, overall, really is not a major consumer of VAS, and the item we found and killed is not a magical cure-all for folks with problems.  It was a very specific, corner case that should not impact most users.


This very (robust) post is exactly why I love PMDG!  Thanks for the explanation Robert! 


Your perfectionism is highly appreciated........ Good Job, now go and have a beer

 

cheers


Please forgive my ignorance, but I have been suffering constant OOMs recently through VAS.

 

However, how do we get the fix that Robert is talking about?

 

If it is an update, how do we access it?

 

I am sorry if the answer is obvious.

 

Michael



The initial post in this topic explains the procedure. First, run the PMDG Operations Center program on your computer, then close it. This is just to make sure that the OC database of your installed liveries is current.

 

Log into your account at PMDG, re-download the full installers for the 777. It should NOT be necessary to uninstall your existing version first. Just run the newly downloaded installer. It should detect your existing 777 installation, and offer to update it. Confirm you want to run the update, and let it proceed.

 

When the update is complete, run the Operations Center again, and select your 777. The OC should ask you if you want to repair your liveries. Answer "yes". This livery update takes just a few seconds.

 

I believe that installing the update will roll back your PMDG nav data to an older version, so if you subscribe to Navigraph or Aerosoft nav data, you will need to run the appropriate nav data manager in order to bring the AIRAC back to the latest version.

 

The VAS fix in this new update only cures one very specific VAS leak that only affects a minority of users flying long haul flights to a small number of specific airports in the FSX/P3D world.


 

 


By necessity, scenery takes up far more VAS than the airplane does.

 

Not to sidetrack the thread (absolutely excellent post that'll help folks understand the VAS problem) I'd add this.  I think that the VAS used by scenery is much worse than it needs to be.  It seems that any 'hand placed' scenery objects are never freed once loaded by the sim.  If correct, that would mean that, when you land, your departure airport is still taking up VAS.  I was flying (FSX) with Xtreme Cities X - USA SouthCentral (a massive amount of hand-placed scenery) loaded and flew over Dallas and watched my VAS climb to just under 4Gig.  I set my heading and flew for another hour and the VAS never went down.  VAS continued to slowly, slowly climb...probably due to the small fixed buildings and antennas along the way.  If P3D can get that 'behind you, too far away' memory freed our precious 4Gig of space might be able to handle much more than it currently can.


I think that the VAS used by scenery is much worse than it needs to be. It seems that any 'hand placed' scenery objects are never freed once loaded by the sim.

 

Exactly.  The code that loads scenery does a very bad job of taking out the trash after the scenery is no longer needed.  This was just sloppy programming by a very small team at Microsoft that was probably constrained by tight budgets and schedules, with the need for eye candy taking a higher priority than building a platform that would be relevant for at least the next decade. To Microsoft, it was just a game.  My faith is in Lockheed-Martin to take it into 64-bit using modern development environments to produce a product that professionals will admire.


Thanks for that detailed elucidation, Jim. I now have a better understanding. I've done the whole updating sequence, etc. It was the mention of long-haul VAS that grabbed my attention, because that is a bane for me at the moment in P3D.


Rick-

 

I am not well informed enough about P3D's VAS management to comment with authority- but I think the problems in FSX and P3D are similar- but different.

 

With some luck, DTG and Lockheed Martin will independently sort them out- and then we can all go back to enjoying our sims.  B)


Robert,

Many thanks for all your efforts. I am already getting used to reloading my B777-300ER over un-busy airspace!


Rsrandazzo, thanks for that comprehensive and impressive post. Thank you indeed, Sir.

 

 

Best regards.

 

John.


I have a question more on the operational/procedure side of flying the 777.

 

      - Prior to departure you had to load an approach with a heading-to-radial-intercept maneuver.

      - That heading-to-radial-intercept maneuver had to be in a specific location in the missed approach procedure.  (Presence alone was not enough- it was also related to location of the maneuver...)

 

I am assuming the procedure described above is one that is already programmed in as part of say the Navigraph procedure data and not a manually programmed procedure.  I wonder if someone could post a chart of the type of procedure above that this refers to.  thanks.

 


 


Take a look at the missed approach procedure for KLAX ILS 25L and 25R, those were two of the troublesome approaches. (And I'm wanting to say KJFK ILS 22L also maybe, don't have my notes in front of me.)


Take a look at the missed approach procedure for KLAX ILS 25L and 25R, those were two of the troublesome approaches. (And I'm wanting to say KJFK ILS 22L also maybe, don't have my notes in front of me.)

 

OK got it, thanks.
