John said:
On Thu, 7 Aug 2008 07:44:19 -0700, "Chris M. Thomasson" wrote in message [...]
Using multicore properly will require undoing about 60 years of
thinking, 60 years of believing that CPUs are expensive.
The bottleneck is the cache-coherency system.
I meant to say:
/One/ bottleneck is the cache-coherency system.
I think the trend is to have the cores surround a common shared cache;
a little local memory (and cache, if the local memory is slower for
some reason) per CPU wouldn't hurt.
For small N this can be made to work very nicely.
Cache coherency is simple if you don't insist on flat-out maximum
performance. What we should insist on is flat-out unbreakable systems,
and buy better silicon to get the performance back if we need it.
Existing cache hardware on Pentiums still isn't quite good enough. Try
probing its memory with large power-of-two strides and you fall over a
performance limitation caused by the cheap and cheerful way it uses the
lower address bits for cache associativity. See Steven Johnson's post in
the FFT Timing thread.
I'm reading Showstopper!, the story of the development of NT. It's a
great example of why we need a different way of thinking about OS's.
If it is anything like the development of OS/2, you get to see very
bright guys reinvent things from scratch that were already known in the
mini and mainframe world (sometimes with the same bugs and quirks as the
first iteration of big-iron code suffered from).
Yes. Everybody thought they could write from scratch a better
(whatever) than the other groups had already developed, and in a few
weeks yet. There were "two inch pipes full of piss flowing in both
directions" between graphics groups.
Code reuse is not popular among people who live to write code.
NT 3.51 was a particularly good vintage. After that bloatware set in.
CPU cycles are cheap and getting cheaper; human cycles are expensive
and getting more expensive. But that also says we should be using
better tools and languages to manage the hardware.
Unfortunately time to market advantage tends to produce less than robust
applications with pretty interfaces and fragile internals. You can after
all send out code patches over the Internet all too easily ;-)
NT followed the classic methodology: code fast, build the OS,
test/test/test looking for bugs. I think there were 2000 known bugs in
the first developer's release. There must have been ballpark 100K bugs
created and fixed during development.
Since people buy the stuff even with all its faults (I would not wish
Vista on my worst enemy, by the way), the market rules, and market
forces are never wrong...
Most of what you are claiming as advantages of separate CPUs can be
achieved just as easily with hardware support for protected user memory
and security privilege rings. It is more likely that virtualisation of
single, dual or quad cores will become common in domestic PCs.
Intel was criminally negligent in not providing better hardware
protections, and Microsoft a co-criminal in not using what little was
available. Microsoft has never seen data that it didn't want to
execute. I ran PDP-11 timeshare systems that couldn't be crashed by
hostile users, and ran for months between power failures.
There was a Pentium exploit documented against some brands of Unix, e.g.:
http://www.ssi.gouv.fr/fr/sciences/fichiers/lti/cansecwest2006-duflot.pdf
Loads of physical CPUs just create a different set of complexity
problems. And they are a pig to program efficiently.
So program them inefficiently. Stop thinking about CPU cycles as
precious resources, and start thinking that users matter more. I have
personally spent far more time recovering from Windows crashes and
stupidities than I've spent waiting for compute-bound stuff to run.
If the OS runs alone on one CPU, totally hardware protected from all
other processes, totally in control, that's not complex.
As transistors get smaller and cheaper, and cores multiply into the
hundreds, the limiting resource will become power dissipation. So if
every process gets its own CPU, and idle CPUs power down, and there's
no context switching overhead, the multi-CPU system is net better off.
What else are we gonna do with 1024 cores? We'll probably see it on
Linux first.