Cruachan

STOP ERRORS - (Diagnostic arcane art?)


Guest Nick_N

I am tied up, Mike.

You're digging in the right place, but what you find may not be the reason or cause. You need to ask these questions when approaching it like that:

1. Is the STOP error address the same each time?
2. Is the STOP error type the same each time?
3. Is the STOP error IRQL the same each time?

The reason I said to check memory timing and voltage first was to eliminate instability in that area. Setting sub timings and tweaking is not going to tell you much, especially if the memory is testing stable all the time and the system is not overclocked. Therefore, the issue is more than likely driver related, or a motherboard IRQ assignment related to a device or card in a slot.

Increasing the FAB 4 timings one click up and making sure the voltage is correct is the way to eliminate a setting problem. It does mean the system runs a bit slower, but it eliminates the timing issue as a cause. It can also help identify a defective memory stick in some cases, but trying to set all the sub timings manually is not needed and won't point you anywhere. AUTO reads those off the SPD, and it's rare to find a sub timing causing something like this except in the case of a defective stick or poor memory code on the SPD.

Here is where it gets tricky. This is either:

1. Memory instability due to:
   a. Clock or timing - user error
   b. Defective parts - memory manufacture
   c. BIOS issue - motherboard manufacture
   d. Voltage issue - product or motherboard
   e. Motherboard + memory manufacture conflict
2. Motherboard IRQ conflict:
   a. Card hogging the bus (SBLive cards were known for this)
   b. Motherboard IRQ slot sharing with another
   c. BIOS setting
3. Driver conflict:
   a. Incorrect driver code
   b. Windows ACPI or IRQ routing (usually caused by a poorly written driver or an oversight)
   c. Resource conflict with another device
   d. Chipset register conflict with device (BIOS or chipset driver)
4. System instability:
   a. CPU clocked too high
   b. Memory clocked too high (goes back to 1a)
   c. Defective card
   d. Defective motherboard or card slot issue
   e. Corrupt Windows install
5. Application code error when addressing hardware
6. A combination of one or more of the above

And I can add subs to every item in that list too.

OK, so you can sit there and track down code and backtrack addresses, or you can attack the situation based on process of elimination FIRST, then look at the error code to either confirm or pinpoint (if necessary). Usually, with the process of elimination method you remove the problem child and the error, and no error code sniffing is necessary, which can be inaccurate or a red herring to begin with. It is good to write down the errors in full each time they appear so you can reference back to them for a common denominator if needed.

Ask yourself:

1. Does this error occur all the time or very rarely?
2. Is this error seen in the same application or in different applications?
3. Are there any consistencies when it occurs?

The hard part is when something only shows up rarely and the user cannot spot anything consistent with it, such as the same application, sound, video, where in the application it happens, etc.

So the way to attack something like that is a real PITA, but it usually nets the culprit eventually. It
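
The advice above about writing each error down in full and looking for a common denominator can be sketched in a few lines of Python. This is only an illustration, not anything Nick posted: the record layout, the field names and every value in `crash_log` are made up, and `consistent_fields` and `most_common` are hypothetical helpers.

```python
from collections import Counter

# Hypothetical crash notes. Every value here is invented for illustration;
# in practice each entry would be copied from the blue screen as it appears.
crash_log = [
    {"stop_type": "0x0000000A", "address": "0x80500010", "irql": 2, "application": "game A"},
    {"stop_type": "0x0000000A", "address": "0x80500010", "irql": 2, "application": "e-mail"},
    {"stop_type": "0x0000000A", "address": "0x80500044", "irql": 2, "application": "idle"},
]


def consistent_fields(records):
    """Return {field: value} for every field that is identical in all records."""
    if not records:
        return {}
    first = records[0]
    return {
        field: value
        for field, value in first.items()
        if all(record.get(field) == value for record in records)
    }


def most_common(records, field):
    """Return (value, count) for the value seen most often in one field."""
    return Counter(record[field] for record in records).most_common(1)[0]


if __name__ == "__main__":
    print("Same every time:", consistent_fields(crash_log))
    print("Most frequent address:", most_common(crash_log, "address"))
```

With the made-up entries above, the STOP type and IRQL are the same every time while the address and the running application are not, which is the sort of pattern the three questions are meant to bring out.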

Guest SoarPics

Hi Mike,

I just returned from spending a couple of hours working on a C2D system plagued with the "IRQL_NOT_LESS_OR_EQUAL" experience.

Certainly you should follow Nick's suggestion for troubleshooting (i.e. ignore the fault messages).

But the first thing I look at when I see this problem is the sound card... specifically I'm looking for an SB card. So I'd start with Nick's point #2a.

FWIW, after thoroughly checking this system today for malware/viruses etc. (he likes visiting pirate sites, so it'd be easy for him to pick up some sort of ugly thing) I removed the SB card, and now we'll be watching what it does (he uses the computer mainly for e-mail/surfing... no gaming).

And once home I told my own Montego that I love it. :-)

Good luck,


Hi Greg,

Nice to know you're never far away ;)

"And once home I told my own Montego that I love it."

Blimey, if my box of tricks doesn't know it's loved by now then I really don't know what else I can do...lol!

Thanks a lot for your advice - helpful as ever :)

Regards,
Mike


Hi Nick,

I do appreciate you taking the time to help me again, especially since I am aware you have been distracted recently with another set of more urgent problems much closer to home. Superb product BTW - kudos to the development team!

Well, you have given me some food for thought, haven't you!

You point out many possibilities and, unfortunately, this is one of those occasions when the problem appears so infrequently and without warning that I think the best I can do in the meantime is to copy your fantastically detailed and helpful post for future reference and guidance. Once I have more info recorded I should be in a much better position to troubleshoot this issue successfully. As things stand at the moment I don't think there is much hope of a successful outcome.

The system continues to perform well and remains stable at present, so I think I'll just maintain a watching brief and see what develops. Assuming I do manage to crack this some day in the future, I will resurrect this thread with any follow-up.

Once again, many thanks for all your trouble.

Best regards,
Mike :)


Hi Jim,

"Have you analyzed your crash dump file?"

I did touch on that possibility a few days ago, but the whole process seemed so complicated that I confess I ducked the challenge.

You have now resurrected my desire to have another go and this is what I've discovered so far:

1. The crash dump files (*.dmp) are deposited in the Windows\Minidump folder.
2. Currently I have 6 files in that folder - one with yesterday's date (which is strange - see later) and the rest with dates in March 2007, so no help there :(
3. I needed a debugging tool to read these, preferably one with a GUI rather than one forcing everything to be done from the command prompt. That would really underline the arcane aspect of the STOP error troubleshooting process and I would much rather avoid it if at all possible. I tried dumpchk.exe but, oh dear, forget it!
4. The various error reporting options had previously been disabled in XP :( This has now been corrected so, hopefully, it should prove fruitful in the future. However, this doesn't explain how a minidump file appeared yesterday, as logically it shouldn't have since XP was not configured to produce it! Anyway, everything has now been cleared out in the Event Viewer in preparation for a fresh start.

I have now installed WinDbg and so far this seems much more user friendly.

The system did crash (lockup?) yesterday, but not with a BSOD or STOP error this time. I had been looking after my 5 year old grandson and inevitably this resulted in some time spent on the computer. He loves 'HL2 Episode Two', 'Live For Speed' (isn't that a cracker of a sim?!) and lastly we played 'Live Pool'. The system was never rebooted throughout and at approx. 30 mins into 'Live Pool' the system froze with a static plain screen filled with a series of vertical lines. A simple reset rectified the situation and I ran CHKDSK /f /r on C: just in case. All has remained well thus far a day later.

Examining yesterday's .dmp file revealed the probable cause as being ati2cqag.dll ( ati2cqag!_NULL_IMPORT_DESCRIPTOR+c3c ). This dynamic link library is a component of the Catalyst 7.10 WHQL driver set.

While I was at it, I examined the other .dmp files I mentioned were present from last year in the Minidump folder and found in each case that the file involved was csrss.exe. I understand that this file "is the main executable for the Microsoft Client/Server Runtime Server Subsystem. This process manages most graphical commands in Windows. This program is important for the stable and secure running of your computer and should not be terminated."

In passing I was made aware of a recurring message appearing in WinDbg relating to the debugger not using the correct symbols:

"In order for this command to work properly, your symbol path must point to .pdb files that have full type information. Certain .pdb files (such as the public OS symbols) do not contain the required information. Contact the group that provided you with these symbols if you need this command to work. Type referenced: mssmbios!_SMBIOS_DATA_OBJECT"

So, a little research has now resulted in the installation of the entire collection of Microsoft's symbols for XP SP2, and the path is specified in WinDbg. Unfortunately the resultant output remains the same, with the above message still appearing repeatedly.

"A symbol file contains the same debugging information that an executable file would contain. However, the information is stored in a debug (.dbg) file or a program database (.pdb), rather than the executable file. Therefore, you can install only the symbol files you will need during debugging. This reduces the file size of the executable, saving load time and disk storage."

Although it's not an executable, could this, I wonder, have anything to do with the fact that ati2cqag.dll is not a Windows file?

Anyway, some further progress has been made and I'm feeling a little wiser for all that :)

Thanks for the nudge!

Mike
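
For keeping track of which dumps are new and which date from last year, as in point 2 above, a short Python sketch along these lines may help. It is only an illustration: `list_minidumps` is a made-up helper, and the folder path used at the bottom is an assumption based on the usual XP location, so adjust it if Windows lives elsewhere.

```python
from datetime import datetime
from pathlib import Path


def list_minidumps(folder):
    """Return (file name, last-modified time) pairs for *.dmp files, newest first."""
    folder = Path(folder)
    if not folder.is_dir():
        # Nothing to list if the folder is missing (e.g. dumps were never enabled).
        return []
    dumps = [
        (path.name, datetime.fromtimestamp(path.stat().st_mtime))
        for path in folder.glob("*.dmp")
    ]
    return sorted(dumps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumed location of the minidump folder; change it to match the system.
    for name, modified in list_minidumps(r"C:\WINDOWS\Minidump"):
        print(f"{modified:%Y-%m-%d %H:%M}  {name}")
```

Listing newest first makes it easy to match a dump to a crash that was noted down by hand, and to spot a file that appeared when none was expected.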

Guest Nick_N

Mike,

A minidump will occur during a crash from this setting:

http://forums.avsim.net/user_files/190526.jpg

It has nothing to do with the error reporting options.

In order for that to work you must have a minimum size page file set to allow it:

http://forums.avsim.net/user_files/190527.jpg

As I recall, Windows will still drop a minidump as long as the page file is 10 MB or larger, no matter what that first box above is set to.

IRQL_NOT_LESS_OR_EQUAL is caused more often than not by overclocking, unstable memory, or a PCI card hogging the bus, and it is usually BIOS or sound card related, although that is not always the case and it can relate to anything in the list I posted earlier.

As for other STOP errors, they can be a bit easier to figure out. IRQL_NOT_LESS_OR_EQUAL tends to be a bit tougher because of the list of items that can be involved, and the error logged can be nothing but a red herring.

Another reason for seeing IRQL_NOT_LESS_OR_EQUAL and strange STOP errors is older chipsets/motherboards trying to run new cards and drivers. There can be issues with such devices and their driver design because the motherboard/chipset design is older and the cards need a higher level of support, or expect chipset register values/support that are not present. There can also be game issues with the driver/card and older motherboards.

It's always good to collect the error data and review it for possible clues, and to have that information as you make changes and checks. However, actually resolving the issue many times comes down to the strip diagnostic to eliminate all the possible elements that can invoke a crash. By removing items from the system and running the application that crashes, you are eliminating a very large number of questionable possibilities and narrowing the issue down to one card, the BIOS, its settings or the drivers in use, and lastly a possible Windows issue.

You may find that the data you collect from the dumps either checks out or was completely useless and was never the center of the problem, just a final symptom. Nonetheless, knowing what is in the reports is good to have as you test.

I have even seen PSUs cause strange and intermittent crash errors which record the file that crashed, but the problem had nothing to do with the error log. Rare, but it does happen.

Guest Nick_N

Yes Mike, that's fine.

You will get a dump report when a BSOD shows.


Hi Nick,

I've seen you are using XP64.

What improvements/issues did you experience in FSX (if any) with respect to XP32?

Thanks,
Fulvio


Okay, after a few days of apparent stability it's happened again - twice today :( The first time the computer was idle and the second time it happened while composing an e-mail.

First occasion (unfortunately I didn't take note of the second):

IRQL_NOT_LESS_OR_EQUAL
STOP: 0x0000000A (0x024AA128, 0x00000002, 0x00000001, 0x80521B27)

A MiniDump was created successfully the second time, but not on the first.

Running it through WinDbg:

Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\WINDOWS\Minidump\Mini072008-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path.           *
* Use .symfix to have the debugger choose a symbol path.                   *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is:
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
*                                                                   *
* The Symbol Path can be set by:                                    *
*   using the _NT_SYMBOL_PATH environment variable.                 *
*   using the -y argument when starting the debugger.               *
*   using .sympath and .sympath+                                    *
*********************************************************************
Unable to load image ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Kernel base = 0x804d7000 PsLoadedModuleList = 0x8055c700
Debug session time: Sun Jul 20 00:20:19.015 2008 (GMT+1)
System Uptime: 0 days 0:40:57.605
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
*                                                                   *
* The Symbol Path can be set by:                                    *
*   using the _NT_SYMBOL_PATH environment variable.                 *
*   using the -y argument when starting the debugger.               *
*   using .sympath and .sympath+                                    *
*********************************************************************
Unable to load image ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Loading Kernel Symbols
..................................................................................................................................................................
Loading User Symbols
Loading unloaded module list
..............
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 1000000A, {2620b48, 2, 0, 80521aa0}

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information. Contact the group that       ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************
*************************************************************************
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information. Contact the group that       ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
*                                                                   *
* The Symbol Path can be set by:                                    *
*   using the _NT_SYMBOL_PATH environment variable.                 *
*   using the -y argument when starting the debugger.               *
*   using .sympath and .sympath+                                    *
*********************************************************************
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
*                                                                   *
* The Symbol Path can be set by:                                    *
*   using the _NT_SYMBOL_PATH environment variable.                 *
*   using the -y argument when starting the debugger.               *
*   using .sympath and .sympath+                                    *
*********************************************************************
Probably caused by : ntoskrnl.exe ( nt+4aaa0 )

Followup: MachineOwner
---------

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 02620b48, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, bitfield :
    bit 0 : value 0 = read operation, 1 = write operation
    bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 80521aa0, address which referenced memory

Debugging Details:
------------------

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information. Contact the group that       ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************
*************************************************************************
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information. Contact the group that       ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
*                                                                   *
* The Symbol Path can be set by:                                    *
*   using the _NT_SYMBOL_PATH environment variable.                 *
*   using the -y argument when starting the debugger.               *
*   using .sympath and .sympath+                                    *
*********************************************************************
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
*                                                                   *
* The Symbol Path can be set by:                                    *
*   using the _NT_SYMBOL_PATH environment variable.                 *
*   using the -y argument when starting the debugger.               *
*   using .sympath and .sympath+                                    *
*********************************************************************

MODULE_NAME: nt

FAULTING_MODULE: 804d7000 nt

DEBUG_FLR_IMAGE_TIMESTAMP: 45e5484a

READ_ADDRESS: unable to get nt!MmSpecialPoolStart
unable to get nt!MmSpecialPoolEnd
unable to get nt!MmPoolCodeStart
unable to get nt!MmPoolCodeEnd
 02620b48

CURRENT_IRQL: 2

FAULTING_IP:
nt+4aaa0
80521aa0 8b7e0c mov edi,dword ptr [esi+0Ch]

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: WRONG_SYMBOLS

BUGCHECK_STR: 0xA

LAST_CONTROL_TRANSFER: from 80521e80 to 80521aa0

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
a24e1a08 80521e80 00000000 c0006528 00000009 nt+0x4aaa0
a24e1a24 8051fc91 00ca5000 00ca5000 a24e1b0c nt+0x4ae80
a24e1a78 80543908 00000000 00ca5000 00000000 nt+0x48c91
a24e1a90 80614451 badb0d00 fffff000 ffdff120 nt+0x6c908
a24e1b0c 8060faab 00ca0000 00010000 00000004 nt+0x13d451
a24e1d4c 805409ac 00000005 00ca0000 00010000 nt+0x138aab
a24e1d64 7c90eb94 badb0d00 0013f928 a25fdd98 nt+0x699ac
a24e1d68 badb0d00 0013f928 a25fdd98 a25fddcc 0x7c90eb94
a24e1d6c 0013f928 a25fdd98 a25fddcc 00000000 0xbadb0d00
a24e1d70 a25fdd98 a25fddcc 00000000 00000000 0x13f928
a24e1d74 a25fddcc 00000000 00000000 00000000 0xa25fdd98
a24e1d78 00000000 00000000 00000000 00000000 0xa25fddcc

STACK_COMMAND: kb

FOLLOWUP_IP:
nt+4aaa0
80521aa0 8b7e0c mov edi,dword ptr [esi+0Ch]

SYMBOL_STACK_INDEX: 0

SYMBOL_NAME: nt+4aaa0

FOLLOWUP_NAME: MachineOwner

IMAGE_NAME: ntoskrnl.exe

BUCKET_ID: WRONG_SYMBOLS

Followup: MachineOwner
---------

Does this help in any way? Double Dutch to me I'm afraid.

Mike
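
Nick's first question earlier in the thread (is the address the same each time?) can be checked against the two crashes recorded in this post. The Python sketch below is only illustrative: `decode_0a` and its field names are made up here, the meaning of each argument is taken from the `!analyze -v` text above, and it assumes that the kernel base of 0x804d7000 reported for the second crash also applied to the first, which the post does not confirm.

```python
# "Kernel base" as reported in the WinDbg output for the second crash.
# Assumption: the first crash, on an earlier boot, used the same base.
KERNEL_BASE = 0x804D7000


def decode_0a(arg1, arg2, arg3, arg4, module_base=KERNEL_BASE):
    """Label the four IRQL_NOT_LESS_OR_EQUAL arguments as !analyze -v describes them."""
    return {
        "memory_referenced": hex(arg1),
        "irql": arg2,
        # Bit 0 of Arg3: 0 = read operation, 1 = write operation.
        "operation": "write" if arg3 & 1 else "read",
        "faulting_address": hex(arg4),
        # Offset of the faulting instruction from the start of the kernel image.
        "offset_in_nt": hex(arg4 - module_base),
    }


# First crash: values from the on-screen STOP message.
first = decode_0a(0x024AA128, 0x00000002, 0x00000001, 0x80521B27)
# Second crash: values from the BugCheck line in the minidump.
second = decode_0a(0x02620B48, 0x2, 0x0, 0x80521AA0)

if __name__ == "__main__":
    for label, crash in (("first", first), ("second", second)):
        print(label, crash)
    gap = int(first["faulting_address"], 16) - int(second["faulting_address"], 16)
    print("distance between faulting addresses:", hex(gap))
```

Under that assumption, both crashes were at IRQL 2 and the faulting addresses come out at nt+0x4ab27 and nt+0x4aaa0, 0x87 bytes apart: close, but not identical, and one was a write while the other was a read. Without proper symbols it is not possible to say from this alone whether the two addresses fall in the same kernel routine.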


Hi,

I'm guessing we're all stumped!

I wonder, could this be a graphics driver issue? I know I've been a little reluctant to change from the 7.10's, but perhaps now would be a good time to consider an update?

I described my symptoms to another computer savvy individual a few days ago and he appeared to recognise the STOP error and subsequent failure to warm boot with a reset (see original post). He suggested reseating the RAM. Any thoughts anyone?

Unfortunately I can't try any of this out at the moment as my wife and I are in the early days of a Norwegian cruise.

Mike


Perhaps it is time to configure verifier.exe and see what happens. I'm no expert, but that is where I would go next, as well as continuing to collect data. Every picture tells a story!!!


Hi Jim,

Okay Jim, but it'll have to wait until I return from this wonderful cruise! Pity it hasn't taken my mind off this problem....yet ;)

Mike


Hi all,

Well, here I am, as promised, with an update.

I was querying the possibility that this might have been a graphics driver issue. It turns out that this may have been correct, although it's still early days and it was perhaps not a WHQL driver issue per se.

When I returned from holiday over a week ago I revisited the 3D advanced tweaks in ATI Tray Tools. I remembered one tweak in particular and thought this might be a good candidate: 'Multi Thread Settings'. I had enabled MT Support some time ago and had set the maximum number of working threads to 2 (as per Koroush Ghazi's recommendations). Prior to the increase in frequency of these crashes I had been reading the 'Unofficial ATT Tweaks Thread' over at Guru3D.com: http://forums.guru3d.com/showthread.php?t=217673 when I was reminded about this tweak, and this had prompted me to try a higher setting of 4 for 'Maximum working threads'.

This value has now been reduced again to 2 and, fingers crossed, stability reigns once more.

I'm still not sure that this would also explain those crashes which were occurring while the system was idling unattended.

Time will tell, but so far so good!

Mike

