Data width has less to do with processing speed than almost anything else.
Just to be clear, I'm not the one who claimed that data width was a performance issue.
Wider fixed-point integers are trivial to implement in software without a huge loss in efficiency. Greater-precision floating point is probably trickier (don't ask me, I'm an integer guy).
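To make that concrete, here's a minimal sketch in C of a 128-bit add built out of two 64-bit halves (assuming a 64-bit uint64_t): the whole overhead is one extra add and one compare for the carry.

```c
#include <stdint.h>
#include <stdio.h>

/* A 128-bit unsigned integer as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128;

/* Add two 128-bit values: two 64-bit adds plus a carry check. */
static u128 u128_add(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* carry out of the low half */
    return r;
}

int main(void) {
    u128 a = { UINT64_MAX, 0 };          /* 2^64 - 1 */
    u128 b = { 1, 0 };
    u128 r = u128_add(a, b);             /* expect 2^64: hi=1, lo=0 */
    printf("hi=%llu lo=%llu\n",
           (unsigned long long)r.hi, (unsigned long long)r.lo);
    return 0;
}
```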
x86 has had 80-bit floating point since the 8087 math coprocessor, so extra precision clearly isn't tied to whether we're talking about 16-, 32-, or 64-bit CPUs. IEEE 754 specifies a 128-bit format (binary128), but AFAIK x86 doesn't implement it in hardware - though, by the same logic, it could be supported without turning x86 into a 128-bit architecture.
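In fact that's exactly what happens today: GCC gives you binary128 in software on a plain 64-bit x86 (this is a GCC extension, not standard C). A quick illustration, assuming GCC and linking with -lquadmath:

```c
#include <stdio.h>
#include <quadmath.h>  /* GCC's software binary128; build with -lquadmath */

int main(void) {
    /* long double is the 80-bit x87 format, around since the 8087
       (padded out to 16 bytes in memory on x86-64). */
    printf("sizeof(long double) = %zu bytes\n", sizeof(long double));

    /* __float128 is IEEE 754 binary128, computed in software on a
       64-bit CPU - no 128-bit architecture required. */
    __float128 third = 1.0Q / 3.0Q;
    char buf[64];
    quadmath_snprintf(buf, sizeof buf, "%.33Qg", third);
    printf("__float128 1/3 = %s\n", buf);
    return 0;
}
```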
x86 has 128-bit registers for SIMD instructions, but they're used to hold multiple smaller values: 2 x 64-bit numbers, 4 x 32-bit numbers, etc.
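So a 128-bit register doesn't make it a 128-bit machine; the lanes are independent. For instance, with the standard SSE2 intrinsics (x86 only):

```c
#include <emmintrin.h>  /* SSE2 intrinsics, x86 only */
#include <stdio.h>

int main(void) {
    /* One 128-bit XMM register holding four independent 32-bit lanes. */
    __m128i a = _mm_set_epi32(4, 3, 2, 1);
    __m128i b = _mm_set_epi32(40, 30, 20, 10);
    __m128i sum = _mm_add_epi32(a, b);   /* four 32-bit adds; no carry
                                            propagates between lanes */
    int out[4];
    _mm_storeu_si128((__m128i *)out, sum);
    printf("%d %d %d %d\n", out[0], out[1], out[2], out[3]);  /* 11 22 33 44 */
    return 0;
}
```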
The VAX, bless it, had some 128-bit operations back in the late 1970s (though it was only a 32-bit machine).
The physical size would just be too large, which is why you see multiple cores rather than longer word sizes.
That seems implausible. The cost of a 128-bit processor is, what, doubling the register size of a couple of dozen registers, and (probably) making ALU data paths twice as wide.
The reason there aren't 128-bit general-purpose processors is that 128 bits of virtual address (VA) space aren't yet needed. And if a machine doesn't have 128 bits of VA, it's not a 128-bit machine. That is all I meant.
EDIT: I just read on Wikipedia that the amount of data stored on Earth is around 2^70 bytes; i.e., around 70 bits is enough to uniquely address every byte we've got. So it'll be a while before any individual computer needs 128-bit addressing.
EDIT2: I was once part of a team that defined a software shared-memory architecture that used 128-bit addresses, but the idea there was that addresses were never reused, i.e., the address space was sparse. Once you'd destroyed the object at address 0x1234, nothing else would ever have address 0x1234.
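I can't reproduce that design here, but the core of the idea is easy to sketch: hand out addresses from a monotonically increasing 128-bit counter and never recycle them. The space is big enough that you can afford it - at a billion allocations per second, running through 2^128 addresses would take on the order of 10^22 years. Something like this (a hypothetical single-threaded toy, not the actual architecture):

```c
#include <stdint.h>
#include <stdio.h>

/* Toy never-reuse allocator: a 128-bit counter as two 64-bit halves.
   Once an address is handed out, it is never handed out again. */
typedef struct { uint64_t lo, hi; } addr128;

static addr128 next_addr = { 0, 0 };

static addr128 alloc_addr(void) {
    addr128 a = next_addr;
    if (++next_addr.lo == 0)  /* low half wrapped: carry into high half */
        ++next_addr.hi;
    return a;
}

int main(void) {
    addr128 a = alloc_addr();
    addr128 b = alloc_addr();  /* distinct from a, forever */
    printf("a=%016llx%016llx\nb=%016llx%016llx\n",
           (unsigned long long)a.hi, (unsigned long long)a.lo,
           (unsigned long long)b.hi, (unsigned long long)b.lo);
    return 0;
}
```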