This is how capable the PS3's Cell processor really was

The Cell processor is probably the most mythical chip in console history. Many say the Cell had buckets of untapped potential, citing its complex hardware design as the primary reason that potential was never realised. But do these claims hold water in any scenario, practical or not? The truth about the chip isn't as straightforward as a set of statistics or a benchmark. The claims are partially correct, in fact; the PS3 was more capable than any game produced for the system ever showed. The issue runs deeper than the design simply being difficult to work with, however.

Let's get to grips with what the Cell really consists of before we get into the problems. The Cell consists of one 64-bit, dual-threaded 3.2GHz PowerPC core (the "PPE", or Power Processing Element) and 8 SPEs (Synergistic Processing Elements), of which 7 are enabled on the PS3 (one is disabled to improve manufacturing yields). Some people count these SPEs as 8 extra cores, but that's a misconception - I'll explain why below. The SPEs are optimised for SIMD operations (Single Instruction, Multiple Data - doing the same arithmetic across several values at once) and each has 256KB of local store (SRAM). This doesn't sound so bad; what's the issue?

Well, the SPEs are fatally flawed in a few vital ways. The SPEs can talk to each other, but they have no hardware branch prediction - in other words, branching code (the ordinary "if"/"else" decisions and loops that general-purpose code is full of) is very problematic, as a wrongly guessed branch flushes the SPE's pipeline and the PPE has to babysit the SPEs whenever branch-heavy code is involved. This puts a lot of strain on the single PPC core, because along with feeding the GPU and likely doing work of its own, it has to manage 7 active SPE units. The Cell is designed to compensate for this with compiler assistance, in which prepare-to-branch ("branch hint") instructions are generated, but even then it's not a perfect solution, because the hints have to be issued well ahead of the branch and can't help with unpredictable, data-dependent branches, which still stall the SPE.
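
To make that concrete, here's a rough sketch in plain C of the kind of transformation SPE code pushed you towards - computing both outcomes and selecting one, instead of branching. (The spu_cmpgt/spu_sel intrinsics mentioned in the comments are from IBM's SPU toolchain; I'm naming them from memory, so treat the mapping as illustrative.)

/* Branchy version: the SPE has no branch predictor, so every time the
 * hardware's static guess is wrong, the SPE pipeline gets flushed. */
void clamp_branchy(float *v, int n, float limit)
{
    for (int i = 0; i < n; i++) {
        if (v[i] > limit)
            v[i] = limit;
    }
}

/* Branch-free version: compute the comparison and select the result,
 * so the instruction stream never forks. A decent compiler turns the
 * ternary into a compare-and-select; on the SPE the same idea maps to
 * 4-wide vector compare/select intrinsics (spu_cmpgt + spu_sel). */
void clamp_branchless(float *v, int n, float limit)
{
    for (int i = 0; i < n; i++) {
        float x = v[i];
        v[i] = (x > limit) ? limit : x;
    }
}
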
Let's suppose this isn't an issue and you have very efficient PPE code that cleanly manages this process - time to deal with the next big issue with the Cell: the pipeline. The PPE has a long 23-stage pipeline, and the SPEs have lengthy in-order pipelines of their own; that's a lot of stages for every instruction to travel through. The problem is exacerbated by the fact that while the PPE has some very limited out-of-order capability (it can let load instructions complete out of order), the SPEs are strictly in-order - if one instruction stalls waiting on a result, everything queued behind it stalls too, because nothing can be reordered around it. A long, in-order pipeline is a further tax on performance.
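
As a quick illustration of why in-order execution bites, here's a plain C sketch (not Cell-specific) of the kind of manual scheduling this forces onto the programmer:

/* Naive sum: on an in-order core, every add has to wait for the previous
 * add's result before it can issue, so the pipeline sits half empty. */
float sum_naive(const float *v, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Four independent accumulators give the pipeline unrelated work to issue
 * while earlier adds are still in flight - exactly the sort of hand
 * scheduling SPE code relied on, since the hardware won't reorder for you. */
float sum_unrolled(const float *v, int n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; i++)  /* leftover elements */
        s0 += v[i];
    return (s0 + s1) + (s2 + s3);
}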

So let's assume you have written optimised code that minimises branching operations and has the PPE efficiently babysitting the SPEs. Surely that's not unreasonable?
Still not the end of the problems, unfortunately. Each SPE only has its 256KB of local store, so unless your code and data fit within that 256KB, everything else has to be streamed in from main memory. To rub salt in the wound, the SPEs cannot access RAM directly; every transfer has to be a DMA (direct memory access) operation issued through the SPE's Memory Flow Controller, and whatever you bring in has to share that same 256KB with your code.
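
For the curious, the standard workaround was double-buffering: process one chunk in local store while the MFC is already pulling in the next one. Here's a rough sketch of the idea, using the MFC intrinsic names from IBM's SPU headers as I remember them (treat the exact signatures as illustrative, and do_work() is a made-up placeholder for the actual compute kernel):

#include <stdint.h>
#include <spu_mfcio.h>            /* MFC DMA intrinsics on the SPU side */

#define CHUNK 4096                /* bytes per transfer; a single MFC command caps out well below the 256KB local store */

static char buf[2][CHUNK] __attribute__((aligned(128)));

void do_work(char *data, int len);   /* hypothetical compute kernel */

void process_stream(uint64_t ea, int nchunks)
{
    int cur = 0;

    /* kick off the first transfer; the DMA tag is just the buffer index */
    mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);

    for (int i = 0; i < nchunks; i++) {
        int next = cur ^ 1;

        /* start fetching chunk i+1 into the other buffer while we work */
        if (i + 1 < nchunks)
            mfc_get(buf[next], ea + (uint64_t)(i + 1) * CHUNK, CHUNK, next, 0, 0);

        /* block only until the buffer we're about to touch has arrived */
        mfc_write_tag_mask(1 << cur);
        mfc_read_tag_status_all();

        do_work(buf[cur], CHUNK);

        cur = next;
    }
}
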
On top of all these issues in a game scenario, the SPEs often had to spend much of their processing time compensating for the PS3's weak, lacklustre and in some cases broken (see: antialiasing) GPU.

That sounds pretty bad, really. In order to fully "tap" the Cell, you have to write code that barely branches, fits (along with its data) within 256KB of local store, and is parallelised efficiently enough that no SPE is left waiting on the pipeline. In a game scenario that is realistically impossible, so the peak throughput of the Cell is unattainable and the real-world results fall significantly short of the brute computational power it offers on paper. So maybe you're wondering about scientific calculations - like, you know, all those simulations run on Cell clusters? That could be feasible, right?
Unfortunately, even then, the Cell's design backfires. In theory, this use case is very practical - and indeed, physicists used this setup to simulate black holes among other things, and in terms of theoretical throughput, a cluster of Cells remained in the top 50 supercomputers for longer than the average system does. So what's the issue?
The problem is that many scientific calculations rely on double-precision floating point - 64-bit non-integer numbers, needed for the precision and range these workloads demand - and this slaughters performance on the Cell. In theory, each SPE @ 3.2GHz is capable of 25.6 GFLOPS in single precision; using double-precision floating point, each SPE manages a pitiful 1.8 GFLOPS (with the entire system, PPE included, capable of 20.8 GFLOPS in total in these cases). At 25.6 GFLOPS each, all 8 SPEs would have a combined theoretical maximum throughput of 204.8 GFLOPS; with double-precision floating point, 14.4 GFLOPS is the collective total the SPEs can attain in an ideal situation, which is paltry.
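
For context, the single-precision figure breaks down like this (assuming the commonly quoted 4-wide fused multiply-add per cycle per SPE):

3.2 GHz x 4 single-precision lanes x 2 FLOPs per fused multiply-add = 25.6 GFLOPS per SPE
8 SPEs x 25.6 GFLOPS = 204.8 GFLOPS theoretical peak
vs. double precision: 8 SPEs x 1.8 GFLOPS = 14.4 GFLOPS - over 90% of the peak simply evaporates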

To summarise, in a game situation it is effectively impossible to "fully" utilise the Cell's potential: you would need to write small, tight code with the absolute minimum of branching (if/else and so on), parallelise your calculations across seven SPEs while keeping them all busy and timing operations so that none of them waste cycles stalling the pipeline, and do all of that while also compensating for a weak GPU. Even in a scientific context, the best fit for the Cell is single-precision (32-bit) work with very efficient code. So while the Cell is very competent in theory, the usable potential of the chip is a fraction of what it appears to offer on paper.

Comments

I’m curious, what’s the scoop on that part where you implied that antialiasing on the PS3 is broken? I remember reading that the PS2 has a broken AA solution as well, like it required something potentially too performance intensive and/or difficult to program in order to get it to work, so very few games actually used it. It was like, polygon... polygon sorting, something like that.
It’d be kind of funny if it turned out that Sony didn’t learn their lesson and made the same mistake on the PS3.
 
The PS3's antialiasing was broken because Sony threw in a GPU late in the design process (the original intent was to have the Cell do everything... oh golly) and many bugs went unfixed. The antialiasing bug, in particular, is that the GPU's hardware AA (MSAA) must be performed at 960x1080, 1280x1080, 1440x1080 or 1920x1080. So if you want to antialias on the PS3's GPU, you either run at one of those arbitrary resolutions or waste memory and resources upscaling first and then antialiasing.

Edit: this was commonly circumvented by using a flavour of antialiasing, usually a post process AA solution, on one of the SPEs. The most common one afaik was MLAA or a straight up edge blending/blurring technique like The Saboteur uses.

Edit 2: Double-checked, and not only was MSAA broken because of the resolutions it had to be used at, but according to the lead shader programmer on Call of Juarez, hardware MSAA can't be used while rendering to multiple render targets (buffers), which is hilarious.
 
The initial pitch was that the PS3 didn't need a GPU since Cell would do everything on a software level for rendering.

And then you know the rest. Sony rushed to get any sort of dedicated graphics processor in, Microsoft ended up in the position everyone thought Sony would occupy, and for most games the PS3 ended up as more or less a crippled Xbox 360. There were a few developers that could genuinely manage some insane results out of the thing, however.
 
BiggieCheese: The PS2 does indeed have very poor AA capabilities, and while there are certainly hardware/rendering issues at play, I think that's more a product of its time and of keeping hardware costs manageable than strictly an oversight on Sony's part. As an aside from the tech specs alone: much like Blu-ray on the PS3 (even at the launch price of $599 USD, the PS3 cost less than most Blu-ray players, and that's speaking almost strictly of the entry to mid-range players; if I recall correctly many at the time could not be had for less than $1000), many people forget - or were perhaps too young at the time to notice - that the PS2 was a rather affordable DVD player, and I believe it had a measurable impact on the widespread adoption of the DVD format.

Depending on the disposable income of your family and friends, the PS2 was the first DVD player I saw in many homes, my own and friends' included... despite reference-level players being available something like 5 years prior. Unless you knew families with LaserDisc players and AVRs capable of decoding the AC-3 RF surround signal in the mid-to-late 90s (those people had disposable incomes dumped into home theater that were, uh, quite high at the time, by the way), DVD-on-PS2 was a huge selling point of the system for the average consumer. The PS2 really embraced the 'media center' idea of a console long before that became a marketing gimmick.

It also ended up stuck in an in-between era - the AA problems became very noticeable when we tried to plug PS2s into LCD TVs with fixed resolutions and rather poor early upscaling... never mind that many people were doing so with the composite cables included in the box. At the PS2's inception, the gaming market still consisted primarily of 480i CRT TVs with a bit of 'built-in' AA capability... as in, the CRT itself could provide the 'effect' of AA via its technical shortcomings alone lol. In fact, PS2 games on a CRT that supports component video input actually look a bit worse and jaggier (despite better color separation) in many circumstances, because you're ultimately circumventing the use case the system was somewhat designed around. It's a wonder they weren't more forward-thinking, considering the Dreamcast of all things supported progressive scan on 99% of its library through VGA output.

I often wondered if perhaps this is why the PS3 supported 1080p, HDMI, and PCM audio on day one, aside from meeting Blu-ray standards. At the time, any 1080p TV could easily cost double or triple the 720p models on the sales floor. So contrary to your point that they didn't learn their lesson, so to speak, I think they did... they made sure the PS3 was 'capable' of doing what consumer-level TVs would support a few years down the line rather than just on day one. Unfortunately, 'capable' is kind of a bare-minimum word, isn't it?

To your point about the MSAA, TheMrIron2: I recall that a lot of high-profile multiplatform games had comparison articles on PS3 vs. Xbox 360 looking at the actual resolution/framerate the games were truly running at, which, as you mentioned, were often rather odd, arbitrary values (ones that don't map cleanly onto the standard output resolutions) while still being marketed as running at 720p/1080p. I think it was often found that, for performance reasons, the lower end of that resolution spectrum was used, especially for the online competitive modes versus the single-player campaign. I seem to remember the Xbox 360 compensated in software with an odd trick that imitated interlacing: if I'm recalling correctly, post-processing effects such as AA were applied before final output by splitting each frame into two halves and applying the effects on a half-and-half basis across alternating frames. Perhaps I'm misremembering, though.

At any rate, this was a very informative writeup. If you don't mind sharing, I'm assuming that unless you're just a very dedicated enthusiast, you have a background in computer engineering or electrical engineering? I would tend towards CPE (I have that background myself) because of your knowledge of the software interactions and the implications the hardware carries; most EEs I've met love the phrase "They can fix it in software!" If you have any interest in the console yourself, I would be interested in a similar writeup on the Sega Saturn. That was another rather powerful system on paper, and it really should have been able to outperform the PS1 in some regards, yet most if not all multiplatform games looked quite poor - and often ran even more poorly - on the Saturn.
 
@Idontknowwhattoputhere Don't take this the wrong way, but... did you actually read the article? The entire point is that the PS3 doesn't beat the Switch at all in any practical situation, unless you're performing some esoteric supercomputer calculation that happens to deal with nothing but 32-bit (single-precision) maths and happens to be straightforward/brute-force enough to fit comfortably within the SPEs' local store.

@Ryccardo I'm assuming it's in jest to some degree (measuring "fun"), but where did they get "G" for speed? GHz? The PS3 had one 3.2GHz PPC core (running two hardware threads) and the 8x 3.2GHz SPEs. Even then, GHz is a terrible measurement. It really highlights the Wii's power efficiency though, in all fairness.

@PerfectB I think with the PS3, it would have been silly not to support 1080p from day one... the PS2 officially supported 1080i output, and hidden 720p/1080p output options have since been dug up by homebrew programmers. And that was over component, so HDMI with 1080p was the natural step forward.

As for AA, it's quite fascinating what was done most of the time - the 360 had 10MB of eDRAM on the GPU that was obscenely fast, so developers used arbitrary resolutions (1024x600 being the most common, from memory, and some games like Black Ops used 880x720 to keep the vertical resolution intact) to fit their framebuffers in there and get 4xMSAA (and post-process effects) for virtually no cost, since bandwidth within that eDRAM was effectively a non-issue. This worked especially well on the 360 because it had a dedicated hardware upscaling chip (ANA/HANA) combined with a good dose of AA, and I think developers just carried those oddball resolutions over to the PS3 because they saved precious memory. Not sure about the half-frame trick you're talking about, but I wouldn't be surprised - it sounds like field rendering? Except for half the image instead of alternating lines.

To clarify though, I have no computer qualifications. I just learn about the things I'm interested in. Regarding the Saturn, it was a ridiculous hardware setup (in a good way, for people curious about hardware!) and I think the lack of floating point was ultimately what killed its ability to effectively produce 3D games -- it might well be worth a write-up, though I'd like to do the Jaguar (in the same "could have been more" boat) before writing about the Saturn.
 