Cruachan

STOP ERRORS - (Diagnostic arcane art?)

Hi,

I've never understood 'STOP ERROR' messages and, I suspect, am far from being alone in this regard. Nevertheless, I'm now being forced into trying to understand the significance of at least one, the latest:

IRQL_NOT_LESS_OR_EQUAL
STOP: 0x0000000A (0x00000000, 0x00000002, 0x00000001, 0x80521AE8)

For some time now, I've been experiencing the occasional BSOD with the STOP: 0x0000000A error being reported. These are very intermittent, can occur several weeks apart and usually happen while playing a graphically intensive game. However, this latest instance was unusual in that yesterday it occurred, apparently spontaneously, during an extended period (several hours) while the computer had been left to idle. Not only that, I found, for the first time, that I was unable to recover by executing a warm reboot. Nor would it POST or boot after being powered down briefly.

The only clue was a single brief beep (AMI BIOS) heard within a second or two of powering on the system. The monitor screen remained blank and I did not hear the familiar beep following the detection of the graphics card. The system fans continued to run at full speed without the usual throttling down noted invariably after keyboard detection and it was clear that the boot would not progress.

Thankfully, the system did boot after being powered down for 20 mins or so and has been booting and running normally since.

Now, some of you may have noted that I recently upgraded by installing the ASRock AM2CPU daughter board, which has allowed me to upgrade my CPU and also move from DDR to DDR2:

ASRock 939Dual-SATA2 (AM2CPU Board)
AMD Athlon 64X2 6400+ (BE, 3200MHz, Windsor)
Arctic Cooling Freezer 64 Pro
2GB Crucial Ballistix DDR2 PC2-6400 4-4-4-12 (2T) (Dual Channel)
(PCI-E) Sapphire ATI Radeon X1950 Pro 512MB (Catalyst 7.10 WHQL)
SB Audigy2 ZS Platinum (Drivers version 5.12.0001.1196 WHQL)
Antec NeoHE 650W PSU
Windows XP Home Edition (SP2)
DirectX 9.0c

As you can see, I took the opportunity to upgrade the PSU as well.

I should emphasize that this STOP error was occurring long before the upgrade. This suggests to me that the AM2CPU board and its installed components and the PSU are not responsible and the problem instead may lie with the 939Dual-SATA2 mainboard.

So far, I have the following info:

AMI BIOS Beep Codes - 1 short beep = "DRAM refresh failure. The programmable interrupt timer or programmable interrupt controller has probably failed"

The 2GB Crucial Ballistix DDR2 RAM has been checked over 13-14 hours (25 passes) using Memtest86+ and no errors were reported.

I reproduced the same STOP error yesterday while running the memory bandwidth benchmark module in SiSoftware Sandra Professional Home XII.SP2c. It occurred during the second run. Apparently the value of the 3rd parameter suggests that this occurred during a write operation. I had changed the memory command rate back from 1T to 2T but it made no difference.

What I now need to know is: does all the above confirm that an intermittent fault may be developing in either the Northbridge or Southbridge chipsets on the mainboard? If so, it's possible that I would have a solution. Believe it or not I do have a spare board (don't ask!) so swapping out would not be a major problem, although first I'd have to reinstate the previous configuration to allow the flashing of the BIOS update before applying the AM2CPU Board upgrade.

What do you think? Am I on the right track?

Thanks,
Mike


Do you see any corresponding entries in the system logbook?


Hi jfri,

If you are referring to the Event Viewer then, no, nothing under System or Application going back to 1st July, which covers the recent period when these STOP errors have been observed.

I've been error-free for the past 48 hours and the system has been cold and warm booted on several occasions. It's a bit like taking your car to the garage with an intermittent fault - chances are the engineers won't be able to reproduce the problem :(

Mike


Further to my last post, I have just re-run several benchmark modules from SiSoftware Sandra:

Memory Bandwidth
Memory Latency
Cache and Memory

All completed successfully and without any STOP errors manifesting.

Looks like the memory controller is having a good day..LOL! Assuming, that is, that this issue is down to the memory controller.

Mike

Guest firehawk44

I personally think you should be very concerned with this stop error. I got this exact BSOD error occasionally for 3-4 weeks and suddenly my system would not boot. I had the RAID0 config and one of the hard drives crashed. I lost almost everything as I had not completed a backup for 6 months (the system was only 6 months old!). The only good thing was the fact the hard drive was still under warranty and I got it replaced for free. I have learned my lesson and if I ever get the IRQL_NOT_LESS_OR_EQUAL BSOD again, I'm doing a thorough check of my hardware and immediately making sure everything I need is backed up. I hope you are able to resolve your problem soon.

Best regards,
Jim


>I personally think you should be very concerned with this
>stop error. I got this exact BSOD error occasionally for 3-4
>weeks and suddenly my system would not boot. I had the RAID0
>config and one of the hard drives crashed. I lost almost
>everything as I had not completed a backup for 6 months (the
>system was only 6 months old!). The only good thing was the
>fact the hard drive was still under warranty and I got it
>replaced for free. I have learned my lesson and if I ever get
>the IRQL_NOT_LESS_OR_EQUAL BSOD again, I'm doing a thorough
>check of my hardware and immediately making sure everything I
>need is backed up. I hope you are able to resolve your problem
>soon.
>

I have also experienced this STOP error, and in my case too the hard drives were failing and they were in a RAID0 setup. The logbook had entries hinting at the hard drives; that's why I asked if he had any logbook entries.

Also, my experience was that the STOP errors could cease for a short while but the hard drive issues were still there.


Hello Mike,

The Microsoft TechNet site will give you a better idea of the possible causes of any STOP error. Here is the link for searching TechNet ==> http://search.microsoft.com/advancedsearch...S&setlang=en-US

You can enter the stop error description, like IRQL_NOT_LESS_OR_EQUAL, or even the exact code or the last portion of the hexadecimal code, like 0x80521AE8.

I did a search for you and the link below shows the results. I entered "IRQ_NOT_LESS_OR_EQUAL" as the search parameter:

http://search.microsoft.com/results.aspx?q...pe=1&OtherSite=

Good luck.
John

Guest Nick_N

If I had to guess I would say this is probably a driver issue... older hardware and new software sometimes clash.

IRQ LESS THAN is usually either memory crash related or driver crash related, one of the two, and assuming we have retarded the memory timing and have verified voltages, then I would start looking at a hardware problem with drivers or a new card having problems with an old chipset or BIOS.


Hi Jim, jfri and John,

Thanks for your input. Perhaps reassuring to hear I'm not alone ;)

However, I'm pretty sure my hard drives are okay as both (Western Digital Caviar SE16 (WDC WD5000AAKS-65YGA0) SATAII, 16MB Cache, 500GB + Hitachi Deskstar (HDP725050GLAT80) UDMA-6 / ATA133, 7MB Cache, 500GB) replaced their lower capacity sibs a couple of months or so back. This STOP error was first noted when the previous drives were in place.

Also, I use Active SMART to monitor my drives and the only SMART parameters that change are Temperature and occasional slight oscillations in the value for Spin Up Times. The Temperatures don't seem to rise above 35 degrees C and on the vast majority of occasions the Spin Up Times are reported as 'OK'.

I do keep recent backups - that lesson was learnt a long time ago and it has saved my bacon on several occasions.

My understanding is that the 4th parameter, in this case 0x80521AE8, is the memory address accessed when the error occurred.

System is still booting okay :)

Regards,
Mike


Hi Nick,

"IRQ LESS THAN is usually either memory crash related or driver crash related, one of the two, and assuming we have retarded the memory timing and have verified voltages, then I would start looking at a hardware problem with drivers or a new card having problems with an old chipset or BIOS"

Funny you should mention the memory timing, as the default timings for the new 2GB Crucial Ballistix DDR2 PC2-6400 modules were 5-5-5-12 (2T) (I think it was 12, although Sandra seems to be indicating that it might have been 18) and I was trying them at 4-4-4-12 (1T). I have run Memtest86+ with settings of 4-4-4-12 (1T) for 12 hours and 4-4-4-12 (2T) for 13+ hours respectively and no errors were reported.

Also, blasting the system with Stress Prime 2004 (Orthos) produced no instability or errors and was terminated only because I felt the CPU (AMD Athlon 64X2 6400+ (BE, 3200MHz, Windsor)) core temps were climbing a bit high. In fact, I found I was able to provoke a reset at 67 degrees C. Fortunately the Arctic Cooling Freezer 64 Pro solution does a pretty good job and the core temps (using Core Temp 0.99) never seem to rise above 57 degrees C now while playing FSX. FSX is the only software installed that challenges the CPU to this extent. I use Cool 'n' Quiet so most of the time the CPU core temps are in the low to mid 20's.

Back to the memory timings. System instability did become apparent with the previous dual channel DDR memory modules when the command rate was tried at 1T. Backing off to the default 2T setting did resolve this to a large degree. As you will have noted, I am currently trying this with the DDR2 modules.

Maybe this all boils down to my lack of understanding of how to tweak the memory timings properly.

Here is an image of the values reported for one of the memory modules:

http://forums.avsim.net/user_files/190375.jpg

I suspect the clues are there if only I was able to interpret them correctly. Again, my limited understanding of such matters is that the only parameters worth changing are CL, RCD, RP, RAS and the Command Rate.

Regards,
Mike

Edit: I've now checked the BIOS under 'Chipset Settings' and I find that there are 3 settings for DRAM Voltage, viz. AUTO, NORMAL and HIGH. I am guessing NORMAL = 1.8V and HIGH = 2.2V. Current setting is 'AUTO'.
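
For anyone comparing the two sets of timings mentioned above, here is a rough sketch of the arithmetic in Python. Each timing is a count of memory clock cycles, so the delay in nanoseconds is the cycle count multiplied by the clock period. It assumes the modules run at the nominal PC2-6400 (DDR2-800) memory clock of 400 MHz, which may not be what the board actually sets, and the function name is only illustrative.

```python
# Sketch: convert DDR2 timings (counted in clock cycles) to nanoseconds.
# Assumption: nominal DDR2-800 (PC2-6400) memory clock of 400 MHz.
# The real clock depends on what the BIOS sets, so treat this as a guide.

MEMORY_CLOCK_MHZ = 400.0  # assumed value, not read from the system


def timings_to_ns(cycles, clock_mhz=MEMORY_CLOCK_MHZ):
    """Return each timing (CL, tRCD, tRP, tRAS) as a delay in nanoseconds."""
    cycle_ns = 1000.0 / clock_mhz  # length of one clock period in ns
    return [count * cycle_ns for count in cycles]


if __name__ == "__main__":
    for label, cycles in (("5-5-5-12", (5, 5, 5, 12)),
                          ("4-4-4-12", (4, 4, 4, 12))):
        print(label, "->", timings_to_ns(cycles), "ns")
```

Under that assumption the move from CL5 to CL4 shortens the CAS delay from 12.5 ns to 10 ns, so the tighter setting gives the modules less time to respond.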

Guest harleyman52

I set mine to auto... Just 'cause I was told to... Don't really know...


Ah, Hah! Someone's showing their age..lol!

I had to look this one up. From Wikipedia:

"An ABEND (also abnormal end or abend) is an abnormal termination of software, a crash. This usage derives from an error message from the IBM OS/360 operating system. It is used jokingly by hackers but seriously mainly by code grinders. Usually capitalized, but may appear as "abend".

OS/360, officially known as IBM System/360 Operating System, was a group of batch processing operating systems developed by IBM for their then-new System/360 mainframe computer, announced in 1964. They were among the earliest operating systems to make direct access storage devices a prerequisite for their operation."

You learn something new every day and, once again, Avsim provokes that desire in me :)

Is it possible the poster's username (Scottish?) might suggest a certain level of experience on this subject? ;)

-----------------------------------------------------------------

Back on topic:

While I have experienced no further BSODs since my original post, I remain a little uneasy. The system seems stable and continues to boot normally. Yet I am still unclear as to the cause. Furthermore, this STOP error has always appeared intermittently and is hard, if not impossible, to reproduce at will. I stated that I had managed to provoke the error once (when the command rate was set at 2T) by running the memory bandwidth benchmark in Sandra, but haven't been able to reproduce this since.

Nick's suggestion re. memory timings could be the answer, and relaxing the command rate from 1T to 2T may be all that was required. However, as I said above, I have seen the STOP error once since having done so. Unfortunately, he has been a bit preoccupied of late and, quite understandably, has not, as yet, been able to respond to my reply to his post.

I've run a few searches but can't find a decent recent tutorial for optimizing DDR2 memory performance. Crucial don't provide any documented help that I am aware of.

As far as I can ascertain, with any degree of certainty, the optimal timings for my Crucial Ballistix DDR2 PC2-6400 modules are 4-4-4-12 (2T)...which are the current settings in the BIOS. However, Sandra appears to indicate that the DRAM Voltage should be 2.2V at these settings. I have it set to 'AUTO', so can I assume that this is taken care of automatically? As I indicated in my previous post in this thread, the only options available are AUTO, NORMAL and HIGH. Again, my research on this AMIBIOS setting has been unhelpful.

I flashed the BIOS on my mainboard to P2.30E prior to installing the AMD Athlon 64X2 6400+ and the CPU is recognized correctly. This is the latest BIOS available for my board.

Mike


Well, I think I'm making some headway with this - no answer as yet, but gradually I am cultivating a better understanding of what's going on.

First a definition:

IRQL = Interrupt Request Level

http://blogs.technet.com/askperf/archive/2...-important.aspx

"An interrupt request level (IRQL) defines the hardware priority at which a processor operates at any given time. In the Windows Driver Model, a thread running at a low IRQL can be interrupted to run code at a higher IRQL. The number of IRQL's and their specific values are processor-dependent.

Processes running at a higher IRQL will pre-empt a thread or interrupt running at a lower IRQL. An IRQL of 0 means that the processor is running a normal Kernel or User mode process. An IRQL of 1 means that the processor is running an Asynchronous Procedure Call (APC) or Page Fault. IRQL 2 is used for deferred procedure calls (DPC) and thread scheduling. IRQL 2 is known as the DISPATCH_LEVEL. When a processor is running at a given IRQL, interrupts at that IRQL and lower are blocked by the processor. Therefore, a processor currently at DISPATCH_LEVEL can only be interrupted by a request from an IRQL greater than 2. A system will schedule all threads to run at IRQL's below DISPATCH_LEVEL - this level is also where the thread scheduler itself will run. So if there is a thread that has an IRQL greater than 2, that thread will have exclusive use of the processor. Since the scheduler runs at DISPATCH_LEVEL, and that interrupt level is now blocked off by the thread at a higher IRQL, the thread scheduler cannot run and schedule any other thread. So far, this is pretty straightforward - especially when we're talking about a single processor system.

On a multi-processor system, things get a little complicated. Since each processor can be running at a different IRQL, you could have a situation where one processor is running a driver routine (Device Interrupt Level - aka DIRQL), while another processor is running driver code at IRQL 0. Since more than one thread could attempt to access shared data at the same time, drivers should protect the shared data by using some method of synchronization. Drivers should use a lock that raises the IRQL to the highest level at which any code that could access the data can run. We're not going to get too much into Locks and Deadlocks here, but for the sake of our discussion, an example would be a driver using a spin lock to protect data accessible at DISPATCH_LEVEL. On a single processor system, raising the IRQL to DISPATCH_LEVEL or higher would have the same effect, because the raising of the IRQL prevents the interruption of the code currently executing."

IRQL_NOT_LESS_OR_EQUAL

http://wiki.answers.com/Q/What_is_IRQL_LES...ue_screen_error

"The IRQL_NOT_LESS_OR_EQUAL bug check has a value of 0x0000000A. This indicates that Microsoft Windows or a kernel-mode driver accessed paged memory at DISPATCH_LEVEL or above.

Parameters: The following parameters are displayed on the blue screen.

Parameter   Description
1           Memory referenced
2           IRQL at time of reference
3           0: Read, 1: Write
4           Address which referenced memory

Cause: This bug check is issued if paged memory (or invalid memory) is accessed when the IRQL is too high. The error that generates this bug check usually occurs after the installation of a faulty device driver, system service, or BIOS."

Now, turning to my BSOD with STOP error:

STOP: 0x0000000A (0x00000000, 0x00000002, 0x00000001, 0x80521AE8)

Looking at the 4 parameters, it would appear that:

a) the memory address referenced was 00000000 (hex). According to Device Manager, this was the System Board (? physical memory).

b) the instruction that referenced 00000000 came from address 80521AE8 (hex), which is identified as the PCI bus.

c) the Interrupt Request Level at that moment in time was 2 "DISPATCH_LEVEL" (see above) and the operation was a "Write".

So, while my understanding of the language of STOP errors is now a little less arcane, I remain unable to take this to the next logical step whereby the actual cause can be identified. Could it still be a device driver and, if so, which? Nothing has changed since the AM2CPU board/CPU/DDR2 upgrade and, remember, this STOP error was appearing on my previous setup. The only driver update in recent times was the Catalyst 7.10 WHQL drivers and these have been in place now for several months.

System continues to remain stable at present.

Mike

Edit: I don't know whether it helps or not, but I have now run PSTAT from the command prompt and the memory address 80521AE8 lies between the following 2 Load Addresses: 804D7000 (ntkrnlpa.exe) and 806E2000 (hal.dll)
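
To make the reading of the four parameters concrete, here is a small Python sketch, offered as an illustration rather than a diagnostic tool. It follows the parameter table quoted above and uses only the two PSTAT load addresses given in the edit. The function names are made up for this example, and the lookup assumes the address belongs to the module with the nearest lower load address, since no module sizes were listed.

```python
# Sketch: decode the four parameters of a STOP 0x0000000A bug check and
# guess which loaded module contains the faulting instruction address.
from bisect import bisect_right

# Names for the IRQL values discussed in the quoted TechNet text.
IRQL_NAMES = {0: "PASSIVE_LEVEL", 1: "APC_LEVEL", 2: "DISPATCH_LEVEL"}

# Only the two load addresses quoted from PSTAT; a full listing would
# contain many more modules. No sizes are known (assumption noted above).
MODULES = [(0x804D7000, "ntkrnlpa.exe"), (0x806E2000, "hal.dll")]


def decode_stop_0a(p1, p2, p3, p4):
    """Label the parameters according to the table quoted in the post."""
    operations = {0: "read", 1: "write"}
    return {
        "memory_referenced": "0x%08X" % p1,
        "irql": IRQL_NAMES.get(p2, "IRQL %d" % p2),
        "operation": operations.get(p3, "unknown"),
        "instruction_address": "0x%08X" % p4,
    }


def owning_module(address, modules=MODULES):
    """Return the module with the highest load address <= address."""
    ordered = sorted(modules)
    bases = [base for base, _name in ordered]
    index = bisect_right(bases, address) - 1
    return ordered[index][1] if index >= 0 else None


if __name__ == "__main__":
    params = (0x00000000, 0x00000002, 0x00000001, 0x80521AE8)
    print(decode_stop_0a(*params))
    print("instruction lies in:", owning_module(params[3]))
```

On these figures the instruction address falls in the range that starts at ntkrnlpa.exe, which is the kernel image itself rather than a separately loaded driver, and a referenced address of 0x00000000 is commonly the mark of a null pointer being dereferenced. Neither observation names the culprit on its own.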
