01-25-2008, 09:46 AM
mikecentola
the director.

 
Join Date: Mar 2004
Location: Webster, NY
Posts: 6,857
Info on the Xbox 360 Red Ring of Death

Got this from my friend Jim over on Roclife....

Quote:
DO NOT DO THE ABOVE AS SUGGESTED UNDER ANY CIRCUMSTANCES!


/resident xbox guru checking in... I've fixed several RRoD 360s including my own, and I'm sure I've modded more Xboxes and 360s total than anyone here... I know both consoles like the back of my hand.

The towel trick doesn't "clear codes". It is HURTING the console, despite alleviating the error condition TEMPORARILY. There is no way to "fix" the RRoD without opening the console. PERIOD. Also never do that fucking "heat gun trick" or the god-awful "eraser or penny" tricks either. THEY'RE CALLED TRICKS FOR A REASON. THEY'RE NOT FIXES. You damage your console and decrease its lifespan with any of them. NOT TO MENTION VOID YOUR WARRANTY, WHICH IS NOW THREE YEARS IF YOU GET A RRoD!

In order to understand this, you need to know what the RRoD actually IS and why it happens. It's complex and multifaceted, and it takes several bad design decisions by Microsoft coming together to cause it, which is why the problem wasn't caught by their design team.

The typical RRoD that people are referring to is secondary error code 0102. Its literal definition is "unknown error". The most common cause is the GPU literally lifting itself off the motherboard. When the GPU no longer makes contact through its array, the Xbox panics and throws the error. It would be like partially removing your CPU from its socket in your PC.

But what does that mean? The GPU and CPU of the 360 are not in a "socket" like a PC processor. Instead they are mounted more like a laptop's. They use a "BGA" - Ball Grid Array. This means all of the little "pins" on the bottom of the processor have miniature spheres of solder placed onto them. These solder balls line up with matching pads on the motherboard, and then the whole assembly is heated up to melt the solder, permanently securing the array to the motherboard. That way the processor can't fall out, move, etc. IN THEORY.

The main problem in this case is HEAT. The Xbox's GPU gets so hot that the motherboard "flexes" and distorts, enough that the bond between the solder balls and the array is broken. When this happens you get an intermittent error condition. AS the Xbox gets HOTTER, sometimes it will actually bend in a way that is FAVORABLE - everything "expands" when it gets hot, and the array might make contact again. That's why the "towel trick" will temporarily clear the error state. YOU'VE OVERHEATED YOUR XBOX TO THE POINT OF BENDING THE MOTHERBOARD! This is NOT a good thing. In fact it makes the problem worse, because when it does finally cool, the array will shrink FURTHER, damaging the array WORSE and cracking more solder connections.
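To put rough numbers on "everything expands when it gets hot", here is a minimal sketch of the textbook linear expansion formula (change in length = length x expansion coefficient x temperature change). Every value in it is an assumed ballpark figure for illustration, not a measurement from a 360: the 30 mm span, the 60 C temperature swing, and the expansion coefficients for a fiberglass board and a silicon chip are placeholders.

```python
def expansion_mm(length_mm, cte_ppm_per_c, delta_t_c):
    """Linear thermal expansion: dL = L * alpha * dT (alpha given in ppm per deg C)."""
    return length_mm * cte_ppm_per_c * 1e-6 * delta_t_c


# All of these are ASSUMED illustrative values, not 360 measurements.
SPAN_MM = 30.0     # assumed width of the ball grid under the chip
BOARD_CTE = 16.0   # assumed ppm/C, ballpark for fiberglass board material
CHIP_CTE = 3.0     # assumed ppm/C, ballpark for silicon
DELTA_T_C = 60.0   # assumed swing between "off" and "hot"


def mismatch_mm():
    """How much more the board grows than the chip over the same span."""
    board = expansion_mm(SPAN_MM, BOARD_CTE, DELTA_T_C)
    chip = expansion_mm(SPAN_MM, CHIP_CTE, DELTA_T_C)
    return board - chip


if __name__ == "__main__":
    print(f"Board vs. chip mismatch: {mismatch_mm() * 1000:.1f} microns")
```

With these made-up inputs the mismatch comes out to a couple hundredths of a millimeter per heat cycle. That sounds tiny, but the solder balls are tiny too, and they take that stress every time the console heats up and cools down.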

This can get so bad that the processor completely lifts the array off the board and it will never make contact again! Then you have a PERMANENT RRoD. Your chances of a permanent, UNFIXABLE situation like this DRASTICALLY increase when you overheat the Xbox with the "heat gun" or "towel trick" methods.

No amount of heat is going to "resolder" or "reflow" the solder balls on the BGA either. The solder is now super hardened (physically harder and tougher to melt because of the repeated high heat cycles of turning the console on and off), so by the time you get it hot enough to melt, you're damaging components on the board.

But why does this even happen? Here is a list of the things that CONTRIBUTE to the actual conditions that allow this to occur:

1. The Xbox's thermal sensing for fan speed is based on the CPU temp. But the CPU has better cooling. So the GPU sits there frying while the CPU is happy and the fan speeds are low.

2. Microsoft's thermal transfer material is junk. It's a low quality pad between the processor and heatsink.

3. Microsoft's heatsink for the CPU is great. It's an aluminum heatsink with a copper base and copper heatpipes. But the GPU heatsink is a VERY low quality aluminum-ONLY heatsink and it's very short/undersized. It also has a poor finish on it.

4. The ducting for the fans isn't divided. Thus both fans are pulling air in the easiest way they can - from the larger side with the CPU.

5. The solder used in the BGA is lead free and the usage of it was poor.

6. The motherboards use super thin, low cost fiberglass. They are thinner and MUCH easier to bend than normal PC motherboards.

7. The motherboard isn't SECURED to the metal case under the processors like a PC's would be. Thus all the weight of the heatsinks etc. is just suspended on the motherboard, which is only supported at the edges. This allows the board to "bend" or "warp".

8. The "x-clamps" used to hold the heatsinks in place don't secure to the case either. And all of the pressure they exert is from UNDER the board, and ONLY in the very center. Unlike a PC heatsink, which exerts pressure from ON TOP of the board, and from the EDGES. This "center only" pressure allows the heatsink AND subsequently the processor to "rock", thus allowing it to lift up at its edges.
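Point 1 in the list above is easy to picture as code. This is only a toy model, not Microsoft's actual fan firmware: the function names, the temperature thresholds, the linear fan curve, and the example readings are all made up for illustration.

```python
def fan_duty(temp_c, low=40.0, high=80.0):
    """Map a temperature to a fan duty cycle from 0.3 to 1.0 (hypothetical linear curve)."""
    if temp_c <= low:
        return 0.3
    if temp_c >= high:
        return 1.0
    return 0.3 + 0.7 * (temp_c - low) / (high - low)


def duty_cpu_only(cpu_c, gpu_c):
    # What the post describes: fan speed follows the CPU sensor and ignores the GPU.
    return fan_duty(cpu_c)


def duty_hottest_chip(cpu_c, gpu_c):
    # One possible alternative: follow whichever chip is hotter.
    return fan_duty(max(cpu_c, gpu_c))


if __name__ == "__main__":
    cpu_c, gpu_c = 50.0, 85.0  # made-up readings: well-cooled CPU, hot GPU
    print(f"CPU-only control:     {duty_cpu_only(cpu_c, gpu_c):.3f}")
    print(f"Hottest-chip control: {duty_hottest_chip(cpu_c, gpu_c):.3f}")
```

In this sketch the CPU-only controller leaves the fans running at under half speed while the GPU is past the top of the curve, which is the "CPU is happy, GPU is frying" situation from point 1.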


These are just MOST of the things they did wrong that contribute to this. There are more little ones. To fix this, Microsoft changed later consoles: their processors are now LITERALLY GLUED to the board with epoxy! And they have a secondary heatsink now for the GPU, which has a heatpipe to help pull some of the heat away from the primary GPU heatsink and get it to run cooler.

There are other causes for the 0102. There is a problem with the RAM getting too hot too, something Microsoft recognized by putting thermal gap filler material between the RAM on the underside of the motherboard, where there is almost no ventilation, and the metal case, so the case acts as the heatsink for the bottom RAM.

Also, the CPU can have the same thing happen, though it usually runs cooler due to better designed cooling. So it's usually the GPU.

To fix this yourself (if you aren't covered by the 3 year warranty for some reason), you need to not only make sure the processor is better secured, but also make sure the board can't flex, and make the system run cooler. It's the heat that ultimately "causes" this. So the following is a list of things I did to mine to GREATLY reduce heat for "cheap"; some are free. All of these things are GOOD - some of them, like X-CLAMP replacement, are CRITICAL to even get it to stop erroring:

Hopefully this will explain this to some people
__________________




team one.

Technotic Media | True Negative | Allstar Graphics