HiFlyer

Intel to address bug in Skylake CPUs


Whoa, now we'll all have to stop doing that. J/K, I know that the bug is a symptom of something that is wrong and could be much worse.


> Whoa, now we'll all have to stop doing that. J/K, I know that the bug is a symptom of something that is wrong and could be much worse.

 

Yup! Last time anything similar to this happened, it caused such a stink that Intel had to recall the affected processors. https://en.wikipedia.org/wiki/Pentium_FDIV_bug

 

They are going to jump all over this to make sure that doesn't happen this time.


> Yup! Last time anything similar to this happened, it caused such a stink that Intel had to recall the affected processors.

 

Read through all the comments in the link you posted and you'll see that the old Pentium bug was a very different beast which couldn't easily be fixed with a BIOS update like this one.


> Read through all the comments in the link you posted and you'll see that the old Pentium bug was a very different beast which couldn't easily be fixed with a BIOS update like this one.

 

I'm not sure that can be decided just yet. Until the "fix" is out in the wild and tested en masse, there's no way to know if there's going to be any sort of performance (or other) penalty.


"When calculating prime numbers"... so when running a synthetic stress test like Prime95, or some scientific or financial applications. And it only has a "slim" chance of occurring, under intense workloads.

 

Won't affect most of us then. Prime95 isn't advised for the latest processors anyway.

 

BIOS update already done, sounds like it's not going to be a big deal.
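
For anyone wondering what "calculating prime numbers" looks like in practice: Prime95's main job is hunting for Mersenne primes (numbers of the form 2^p - 1), and the classic check for those is the Lucas-Lehmer test. Below is a minimal sketch in Python using plain big-integer arithmetic. It is only an illustration of the maths; Prime95 itself relies on heavily optimised FFT-based multiplication, which this does not attempt to reproduce, and the function name is just a placeholder.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    Assumes p itself is a prime number (the test is only defined for those).
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1          # the Mersenne number under test
    s = 4                     # standard starting value of the sequence
    for _ in range(p - 2):
        s = (s * s - 2) % m   # square, subtract 2, reduce modulo 2**p - 1
    return s == 0


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19, 23, 31):
        print(p, lucas_lehmer(p))
```

The repeated squaring of a huge number is what makes this kind of workload so demanding on the CPU's arithmetic units.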


Looks like they are doing the fix via Motherboard BIOS updates.

 

Most likely Intel will just disable some aspect of the CPU ... prime numbers are used A LOT, especially in security/encryption, so it's definitely a significant issue.

 

What I find somewhat surprising is that the issue was brought to Intel's attention prior to the CPU being officially released, but Intel ignored it ... that's the bigger concern I have with Intel. It's not so much that Intel had a problem, it's how they dealt with the problem before it became public.

I don't have a Skylake CPU, but regardless ... it's the trying to avoid a $475M write-down and hoping "they don't notice" which I don't think was a wise management decision.

 

Cheers, Rob.


But the issue has a "slim chance of occurring" and "only under intense workloads"... So that's probably why Intel chose to ignore it. They knew it was a rare occurrence and they knew they could fix it.

 

Still don't think it will be a significant issue. It can't be a significant issue if it's defined as rare and fixable.


> Prime95 isn't advised for the latest processors anyway.

What is, now? I used Prime95 to overclock my Sandy Bridge, and I understand it's been updated since to support the Haswell instruction set.


What I know is that Asus don't recommend stress tests like Prime95 or IBT.

 

Asus say they aren't fully validated for the latest CPUs, and can over-stress some aspects and under-stress others. It may have been updated for Haswell, but Haswell isn't Skylake. In a recent video the Asus rep once again advised against Prime95. Clearly many still use it though.

Personally, I haven't used it since Ivy Bridge. I would favour AIDA64, ROG RealBench or Intel Extreme Tuning Utility. And certainly not IBT.

ROG RealBench uses open source applications to test the CPU as it would be used in real life. In other words it's not synthetic like Prime95. Because Asus are using open source apps, there is clearly no bias. Asus found that no free synthetic stress test offered a rounded idea of the value of the overclock, so they created one.


No software should be able to "freeze/lockup" a PC at the hardware level when executing valid instructions. It is possible for CPUs to get into a deadlock/contention state (especially with HT enabled) but that's not the same as a "freeze/lockup".

Malicious software may try to "crash" a PC or put it in a vulnerable state but again that's not the same as a "freeze/lockup". Prime95 is a rather simple program executing valid instructions (it's not malware) ... for Intel or Asus to say one "shouldn't run" any valid software is VERY dubious.

Asus have warned that Prime95 can damage a CPU ... if that claim is accurate (which I don't see how or why it could be), then there is a problem with the CPU and/or other components (motherboard/chipset/RAM).

An analogy would be me saying P3D caused my PSU to fail.

 

Cheers, Rob.


> Malicious software may try to "crash" a PC or put it in a vulnerable state but again that's not the same as a "freeze/lockup". Prime95 is a rather simple program executing valid instructions (it's not malware) ... for Intel or Asus to say one "shouldn't run" any valid software is VERY dubious.

It isn't valid though Rob. It isn't validated for the latest architectures. Thus it doesn't test certain newer instruction sets.

 

 

> Asus have warned that Prime95 can damage a CPU ... if that claim is accurate (which I don't see how or why it could be), then there is either a problem with the CPU and/or other components (motherboard/chipset/RAM).

 

They haven't warned it can damage CPUs. Simply that when a new architecture is released, it doesn't test newer instruction sets and could possibly over-stress others. That doesn't necessarily equate to damaged CPUs; BSODs or lockups, maybe.

It isn't dubious, Rob. They simply don't recommend its use for stress testing. ROG RealBench is free of course, so it's no skin off Asus's nose if we use it or not, no bias. Asus have nothing against the creator of Prime95, just that in their experience it's not ideal. Asus also point out that it's a synthetic stress test, thus it doesn't stress a CPU the way the vast majority of us utilise our PCs 24/7. Which was why they created ROG RealBench. It's not just Prime95, Asus aren't fans of any of the synthetic stress tests.

My Ivy Bridge system for example was 100% stable in Prime95, but as soon as I fired up Battlefield, BSOD. I was therefore forced to tweak my overclock further to achieve "real" stability, stability that reflected real world use. Asus are correct, Prime95 and the other synthetic stress tests aren't ideal, as they bear little resemblance to how the vast majority of us use our PCs.

If Prime95 isn't validated for the latest architecture, then it makes perfect sense to me that it's not the ideal choice. If it doesn't stress a CPU the way I do 24/7, then it makes perfect sense to me that it's not the ideal choice.

We build and overclock our PCs to be stable when running our favoured applications, to be stable in "real world use"... we don't build and overclock our PCs to be "Prime95 machines", or "IBT machines".

Asus have been saying this since Ivy Bridge, I'm surprised you've not heard it before.

Asus are pretty good at this stuff, it's what they do. They literally test thousands of CPUs. Therefore I give their opinions consideration, and on this occasion, I agree with them.

 

 

http://rog.asus.com/275272013/overclocking/realbench-benchmarking-stress-test-insights/


I wasn't contending that Prime95 is not using "new instruction sets" ... but Prime95 executes valid instructions, and a CPU should be able to handle those instructions. I'm not contending what should or shouldn't be used for stress testing either (I use a combination of many different products for stress testing: RealBench, Prime95, SiSoftware Sandra, etc.).

 

From Asus:

 

http://rog.asus.com/365052014/overclocking/rog-overclocking-guide-core-for-5960x-5930k-5820k/

 

 

> Stress Testing
>
> Users should avoid running Prime95 small FFTs on 5960X CPUs when overclocked. Over 4.4GHz, the Prime software pulls 400W of power through the CPU. It is possible this can cause internal degradation of processor components.

 

The above statement is not accurate and is dubious.  I agree that no "single" application should be used when stress testing (overclocked or not).

 

Cheers, Rob.


> I wasn't contending that Prime95 is not using "new instruction sets" ... but Prime95 executes valid instructions, and a CPU should be able to handle those instructions.

That may be a bit simplistic. We are dealing with a new CPU architecture, and testing it with software not validated for that architecture. Neither of us are expert enough to state anything definitively. It may be that Prime95, in this scenario, would not exercise some of the newer instruction sets at all and would exercise others inappropriately. Or there may be another scenario that neither of us are knowledgeable enough to hypothesize. Asus obviously believe something like that to be the case. And given that they are far more knowledgeable than me, and have carried out tests that I haven't, I give their opinion consideration.

 

 

> The above statement is not accurate and is dubious.

 

 

Why? Evidence of that? I'd be very surprised if Asus made a complete balls up with that statement, given their experience with the platform. Don't forget Rob, Asus get the CPUs quite a while before the consumer, and test a multitude of examples. They become experts before we even sniff the CPU. I doubt they'd claim such a thing unless they had evidence it was the case.

I only quickly glanced at the article, but perhaps they were considering offset or adaptive voltage, well known to be a concern when running synthetic stress tests?


 

 


> Neither of us are expert enough to state anything definitively.

 

I'm not pretending to be an expert, but I can definitely say that is NOT the case ... we'll just have to disagree ... that's just not how CPUs are designed.

 

 

 


> I'd be very surprised if Asus made a complete balls up with that statement, given their experience with the platform.

 

I wouldn't ... they're using scare tactics to get people to use RealBench instead of Prime95 ... CPUs have thermal protection; they shut down before any damage (unless you disable it, which in some CPUs I'm not sure you can). Asus is just trying to control the benchmark ;) -- fair enough ... Asus, AMD, nVidia, Gigabyte, MSI, etc. have all at some point in time come up with "tricks" to make their products appear better than they really are ... no different than any ad you see on TV (especially food ads) ... just marketing.

 

Anyway, I'm not suggesting everyone rush out and dump their Skylake CPUs, but I will suggest that once Intel release a "fix", you run the same stress/performance tests as you did before the fix, just to make sure the numbers are the same and all is good.
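
A minimal sketch of that kind of before/after comparison, in Python. Everything here is a placeholder for illustration: `workload` stands in for whatever benchmark you actually run, and in real use the "before" number would be recorded prior to flashing the BIOS rather than measured in the same session.

```python
import statistics
import time


def workload(n: int = 200_000) -> int:
    """Hypothetical stand-in for a real benchmark: sum of squares in a loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def median_runtime(fn, repeats: int = 5) -> float:
    """Median wall-clock seconds over several runs, to damp one-off noise."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def percent_change(before: float, after: float) -> float:
    """Positive result means the 'after' run was slower than 'before'."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    before = median_runtime(workload)  # in practice: recorded before the update
    after = median_runtime(workload)   # in practice: recorded after the update
    print(f"change: {percent_change(before, after):+.1f}%")
```

Using the median of several runs rather than a single run matters here, because run-to-run noise can easily be larger than any penalty from a microcode change.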

 

Cheers, Rob.


> I'm not pretending to be an expert, but I can definitely say that is NOT the case ... we'll just have to disagree ... that's just not how CPUs are designed.

Then I would have liked you to tell me how CPUs are designed and precisely why Asus are wrong. But I guess at this point, as you say, we'll just have to agree to disagree. :smile:

 

> I wouldn't ... they're using scare tactics to get people to use RealBench instead of Prime95 ...

 

 

You finally said it! :BigGrin:

 

Asus were saying this long before Real Bench was developed though. And Real Bench is free, Asus make nothing from it. Doubt they'd slag off the fine work of George Woltman just to push a utility they make nothing from. Just as some kind of marketing ploy. As you say, neither of us are experts, but there are a multitude of experts out there that would have cottoned on to the lack of validity in Asus's words.

 

> CPUs have thermal protection; they shut down before any damage (unless you disable it, which in some CPUs I'm not sure you can).

 

 

 

Not exactly. Running any CPU just short of the point where it throttles back, just short of TjMax, causes accelerated degradation. Don't forget that many OC enthusiasts run Prime95 for many, many hours. The same applies to the link you posted. Strictly speaking, Asus are correct: degradation is accelerated under those circumstances.


> How will they release a fix ... in the form of a BIOS or driver or what?

It will be in the form of a BIOS update from the motherboard manufacturer.

> Stress Testing
>
> Users should avoid running Prime95 small FFTs on 5960X CPUs when overclocked. Over 4.4GHz, the Prime software pulls 400W of power through the CPU. It is possible this can cause internal degradation of processor components.

 

 

 

Rob might be interested to know that the link he posted from Asus, and the above statement, refer to the CPU automatically over-volting when AVX instructions are used. This is Intel's spec, of course, and not the fault of the Prime95 developer. However, it is right that Asus warn us of this behaviour when running small FFTs on a CPU overclocked above 4.4GHz. If the CPU behaves this way, and the user has already increased voltage markedly, then the excess voltage and heat can indeed accelerate degradation of the CPU.
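
As a rough feel for why extra voltage matters so much: dynamic (switching) power in a CMOS chip scales approximately with voltage squared times frequency. The back-of-envelope sketch below uses made-up voltages and clocks purely for illustration; they are not figures from Asus or Intel, and leakage power (which also rises with voltage and temperature) is ignored.

```python
def relative_dynamic_power(v_old: float, v_new: float,
                           f_old: float, f_new: float) -> float:
    """Ratio of new to old dynamic power, using the approximation P ~ V^2 * f."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


if __name__ == "__main__":
    # Hypothetical example: voltage rises from 1.20 V to 1.30 V at the same clock.
    ratio = relative_dynamic_power(1.20, 1.30, 4.4, 4.4)
    print(f"dynamic power changes by {100 * (ratio - 1):+.1f}%")
```

Because voltage enters as a square, a voltage bump that looks small on paper adds noticeably more heat than the same percentage bump in clock speed.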


Here is a good read from the Intel forums regarding the discovery of this bug (before it was officially announced). Note that overclocked, stock or underclocked conditions have no impact on the issue; it can happen regardless of clock speed, given the right series of complex instructions.

 

https://communities.intel.com/thread/96157?tstart=0

 

It's a somewhat humorous read with Intel finally admitting the problem.  I'd be interested to see what the publication contains when released and exactly how they "work around" this problem without a CPU design change ... since it doesn't appear to be clock related, I'd hazard a guess they use a less efficient set of instructions to do the processing (implications being performance penalty).

 

Cheers, Rob.


People tend to think CPUs just work, and don't normally associate them with issues. So just for a laugh, check out the Haswell errata. Some 5 pages of flaws, most labelled as "no fix".
 
 
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf
 
 

Sometimes bugs are disclosed, sometimes they aren’t — Piledriver has a significant problem with 256-bit AVX instructions, for example, that injects an 18-20 cycle delay into executing multiple consecutive instructions. Every original Intel Atom (before Bay Trail) had a floating point flaw that could insert a NOP (no operation) into every other cycle, effectively doubling FPU compute time. No one bought an Atom for its FPU performance, so the bug didn’t get talked about.

 

 

 

http://www.extremetech.com/computing/220953-skylake-bug-causes-intel-chips-to-freeze-in-complex-workloads

 

Interestingly, there was an Asus UEFI fix for my board just a couple of days ago. And it is labeled as "update microcode".

 

It seems this bug is sporadic. Some have encountered it and some haven't. Some after a few minutes, some after hours.


I think a good part of Asus' stance on "potential issues" with Prime95 comes from the Haswell introduction of adaptive voltage control, which when enabled (its default) boosts voltage significantly when AVX or FMA instructions are executed.  If someone is overclocking with increased voltages and has adaptive voltage enabled, running Prime95 will spike voltages to dangerous levels.

 

For information only: the latest rev of Intel's LINPACK benchmark, which is really all that IBT is, uses FMA instructions (which are tailor-made for super-fast execution of the BLAS routines at the core of LINPACK). If you really want to see serious heat generation, get the latest version from Intel and run its batch script. My "everyday" 4770K at a 4.5 OC will instantly spike over 95°C, and at all-stock settings it will hit above 70°C. BTW, I am not recommending this for anything other than personal entertainment.

