What would 128 bits computing look like?

[ - ]

80 points

1 year ago

The PS3 had a 128-bit CPU. Sort of. “Altivec” vector processing could split each 128-bit word into several values and operate on them simultaneously. So for example if you wanted to do 3D transformations using 32-bit numbers, you could do four of them at once, as easily as one. It doesn’t make doing one any faster.

Vector processing is present in nearly every modern CPU, though. Intel’s had it since the late 90s with MMX and SSE. Those just had to load registers 32 bits at a time before performing each same-instrunction-multiple-data operation.

The benefit of increasing bit depth is that you can move that data in parallel.

The downside of increasing bit depth is that you have to move that data in parallel.

To move a 32-bit number between places in a single clock cycle, you need 32 wires between two places. And you need them between any two places that will directly move a number. Routing all those wires takes up precious space inside a microchip. Indirect movement can simplify that diagram, but then each step requires a separate clock cycle. Which is fine - this is a tradeoff every CPU has made for thirty-plus years, as “pipelining.” Instead of doing a whole operation all-at-once, or holding back the program while each instruction is being cranked out over several cycles, instructions get broken down into stages according to which internal components they need. The processor becomes a chain of steps: decode instruction, fetch data, do math, write result. CPUs can often “retire” one instruction per cycle, even if instructions take many cycles from beginning to end.

To move a 128-bit number between places in a single clock cycle, you need an obscene amount of space. Each lane is four times as wide and still has to go between all the same places. This is why 1990s consoles and graphics cards might advertise 256-bit interconnects between specific components, even for mundane 32-bit machines. They were speeding up one particular spot where a whole bunch of data went a very short distance between a few specific places.

Modern video cards no doubt have similar shortcuts, but that’s no longer the primary way the perform ridiculous quantities of work. Mostly they wait.

CPUs are linear. CPU design has sunk eleventeen hojillion dollars into getting instructions into and out of the processor, as soon as possible. They’ll pre-emptively read from slow memory into layers of progressively faster memory deeper inside the microchip. Having to fetch some random address means delaying things for agonizing microseconds with nothing to do. That focus on straight-line speed was synonymous with performance, long after clock rates hit the gigahertz barrier. There’s this Computer Science 101 concept called Amdahl’s Law that was taught wrong as a result of this - people insisted ‘more processors won’t work faster,’ when what it said was, ‘more processors do more work.’

Video cards wait better. They have wide lanes where they can afford to, especially in one fat pipe to the processor, but to my knowledge they’re fairly conservative on the inside. They don’t have hideously-complex processors with layers of exotic cache memory. If they need something that’ll take an entire millionth of a second to go fetch, they’ll start that, and then do something else. When another task stalls, they’ll get back to the other one, and hey look the fetch completed. 3D rendering is fast because it barely matters what order things happen in. Each pixel tends to be independent, at least within groups of a couple hundred to a couple million, for any part of a scene. So instead of one ultra-wide high-speed data-shredder, ready to handle one continuous thread of whatever the hell a program needs next, there’s a bunch of mundane grinders being fed by hoppers full of largely-similar tasks. It’ll all get done eventually. Adding more hardware won’t do any single thing faster, but it’ll distribute the workload.

Video cards have recently been pushing the ability to go back to 16-bit operations. It lets them do more things per second. Parallelism has finally won, and increased bit depth is mostly an obstacle to that.

So what 128-bit computing would look like is probably one core on a many-core chip. Like how Intel does mobile designs, with one fat full-featured dual-thread linear shredder, and a whole bunch of dinky little power-efficient task-grinders. Or… like a Sony console with a boring PowerPC chip glued to some wild multi-phase vector processor. A CPU that they advertised as a private supercomputer. A machine I wrote code for during a college course on machine vision. And it also plays Uncharted.

The PS3 was originally intended to ship without a GPU. That’s part of its infamous launch price. They wanted a software-rendering beast, built on the Altivec unit’s impressive-sounding parallelism. This would have been a great idea back when TVs were all 480p and games came out on one platform. As HDTVs and middleware engines took off… it probably would have killed the PlayStation brand. But in context, it was a goofy path toward exactly what we’re doing now - with video cards you can program to work however you like. They’re just parallel devices pretending to act linear, rather than they other way around.

permalink

report

reply

[ - ]

rickdgray@lemmy.world

50 points

1 year ago

They would look the same really. The word size being 128 instead of 64 doesn’t really change anything about the architecture. It just means the proc’s registers are 128 bits in size, the system bus is 128, each RAM address and data is 128, etc. The only difference would be significantly more expensive to crunch ridiculously large numbers. So really not much benefit. I expect 64 to be the standard for quite a long time, maybe forever, because we have much bigger bottlenecks to worry about.

permalink

report

reply

[ - ]

hansl@lemmy.ml

16 points

1 year ago

There are already special instruction sets that deal with 128 and up bits. Many SIMD. AVX-512 for example deals with 512 bits at a time.

At this point the advantage is parallelization and specialization of operations. AVX can be used for video encoding/decoding for example, or crypto, …

permalink

report

parent

reply

[ - ]

TehBamski@lemmy.world

6 points

1 year ago

maybe forever, because we have much bigger bottlenecks to worry about.

Well now I’m wondering what bottlenecks you have in mind. What do you believe to be the biggest bottlenecks for PCs in the near future?

permalink

report

parent

reply

[ - ]

Obsession@lemmy.world

19 points

1 year ago

We’re getting to the point where we can’t really make transistors much smaller, for one

permalink

report

parent

reply

[ - ]

mindbleach@lemmy.world

12 points

1 year ago

Mostly heat. Every gate destroys information, which is kinda the definition of entropy, so it necessarily generates heat. There’s goofy plans for “reversible computing” that swap bits - so true is 10 and false is 01 - and those should only produce heat through the resistance in the wires. (I personally suspect you’d have to shuttle data elsewhere and destroy it anyway. That’d be off-chip, so it could be arbitrarily large, instead of concentrating hundreds of watts in a thumbnail of silicon. But you’d still have a motherboard with a north bridge, a south bridge, and a woodshed.)

The other change that’d make wider lanes less egregious is 3D chip design. We’re pretty far from 2D, already. There’s dozens of layers of stuff going on in any complex microchip. AMD’s even stacking a couple naked dies on top of one another for higher memory bandwidth. But what’d be transformative is the ability to fold any square layout into a cube, with as much fine detail vertically as it has horizontally. 256-bit data paths could be 16 traces wide and tall. Some could have no presence at all, because the destination is simply atop the source, and connected by a bunch of 10nm diagonals.

But aside from the design and manufacturing complexity of that added dimension, current technology would briefly turn that cube into an incandescent lightbulb. The magic smoke would escape with unprecedented efficiency.

permalink

report

parent

reply

[ - ]

vrighter@discuss.tchncs.de

41 points

1 year ago

exactly the same as 64 bit computing, except pointers now take up twice as much ram, and therefore you need mire baseline momory throuput/ more cache, for pretty much no practical benefit. Because we aren’t close to fully using up a 64-bit address space .

permalink

report

reply

[ - ]

botengang@feddit.de

8 points

1 year ago

Our modern 64 bit processors do use 128 bits for certain vector operations though, don’t they? So there is another aspect apart from address space.

permalink

report

parent

reply

[ - ]

I need NOS@lemm.ee

5 points

1 year ago

Yes, up to 512 bits since Skylake. But there are very few real-world tasks that can make use of such wide data paths. One example is media processing, where a 512-bit register could be used to pack 8 64-bit operands and act on all of them simultaneously, because there is usually a steady stream of data to be process using similar operations. In other tasks, where processing patters can’t make use of such batched approaches, the extra bits would essentially be wasted.

permalink

report

parent

reply

[ - ]

Kevin@lemmy.world

17 points

1 year ago

It wouldn’t be much different. Was it noticeably different when you went from a 32 bit to 64 bit computer?

permalink

report

reply

[ - ]

vis4valentine@lemmy.mlOP

3 points

1 year ago

For me it was, actually. Maybe because I was late to the party so people stopped developing shit for 32 bits, and when I did the transition was like “Finally, I can install shit” also my computer was newer and the OS worked better.

permalink

report

parent

reply

[ - ]

red@feddit.de

13 points

1 year ago

So your PC was old (thus the new one faster) and its HW no longer supported by some software developers (because it was outdated and not enough users were on it anymore). The same can hold true if you have a 5 year old PC now. You didn’t notice this due to going 64bit, you noticed it due to going away from a heavily outdated system.

permalink

report

parent

reply

[ - ]

kglitch@kglitch.social

15 points

1 year ago

The big shortcoming of 32 bit hardware was that it limits the amount of RAM in the computer to 4 GB. 64 bit is not inherently faster (for most things) but it enables up to 16 exabytes of RAM, an incomprehensible amount. Going to 128 bit would only be helpful if 16 exabytes wasn’t enough.

permalink

report

reply

[ - ]

FelipeFelop@discuss.online

7 points

1 year ago

Slightly off topic, but the number of bits doesn’t necessarily describe the size of memory. For example most eight bit processors had 16bit data busses and address registers.

What would 128 bits computing look like?

No Stupid Questions

!nostupidquestions@lemmy.world

No such thing. Ask away!

Rules (interactive)

Credits

Community stats

Community moderators