
Supercomputer OS's

Discussion in 'Electronic Design' started by joseph2k, Feb 15, 2007.

  1. joseph2k

    joseph2k Guest

    The entire Top 500 list runs either a Unix variant or a Linux variant.
    Linux "owns" the list, with Unix claiming just 92 of the 500 machines,
    Mac OS X on three of them and not a single one running any M$ OS. This
    includes Japan's Earth Simulator, a vector machine (now ranked 14). IBM
    and HP are the top hardware vendors. Some interesting surprises for me.
  2. Interesting yes, but what does it prove? That's like going to a large
    parking lot and saying oh look, none of these 500 cars run on
    kerosene. Well duh, no. Not designed to. You would think if they can
    afford to buy a super computer they could afford a custom OS designed
    for the super computer. Maybe there isn't such a thing.

    What does this have to do with installing Vista?
  3. Ian Bell

    Ian Bell Guest

    I am not surprised. Linux has been a big player in supercomputing for
    several years now and ISTR that THE fastest supercomputer runs Linux.

    Neither am I surprised MS figures nowhere, they only make toy OSs.

  4. Gibbo

    Gibbo Guest

    Nascar, Formula 1 and Indy cars don't have CD players in them either.

    If you can't see the relevance of this statement then you're not too bright.
  5. MassiveProng

    MassiveProng Guest

    I can already hear the helo blades...
  6. That you should stay clear of it for high performance :)
  7. Well, a Unix or a Linux is an open source system that
    can be adapted for the rare case of a supercomputer;
    a closed source MS OS cannot be adapted. And MS sees
    little reason to support supercomputers. Too little

  8. Fred Bartoli

    Fred Bartoli Guest

    Rene Tschaggelar wrote:
    Well, here's a valid one:

    The next release after Vista will probably need clusters of supercomputers.
    At least:
    - one to run notepad
    - one to run a DOS window
    - one to report what you're doing to MS
    - two to run the auto update
    - one to report bugs
  9. Iwo Mergler

    Iwo Mergler Guest

    This is not as surprising as it sounds.

    There are thousands of people worldwide capable of
    porting Linux to a new architecture, and the source
    code is readily available. For a new processor
    or new computer architecture, Linux is the obvious
    first choice.

    You pick an already supported architecture similar
    to your own and make a few modifications. It's a few
    tens to a few hundred lines of code.

    As we speak, Linux already supports about 50 different
    processor families, with hundreds of different

    Can you imagine the amount of money required to get
    MS to port one of their OSes to a computer of which
    less than 10 will be sold?


  10. ISTR that some years ago IBM gave up on most proprietary mainframe OSs and
    went to Unix / Linux.

    My experience with most mainframe OSs was they were vile beyond belief.
  11. Unfortunately (for Microsoft) some of these low volume technologies, or
    their spin-offs eventually grow into mainstream applications. Microsoft
    doesn't try to meet the demands of the marketplace so much as to form
    the marketplace to suit them. Their typical answer to meeting a
    requirement that they do not support is, "Why do you want to do that?"
  12. Err ...a tad optimistic that!

    In my very recent experience a Linux port to a slightly different environment is
    about 8 people's time for a year and thousands of lines of code - mostly test
    cases, but also new drivers and customising things in "sysfs" and /etc/.
    Boot scripts take a good deal of effort to get right too. Thousands of emails
    to the kernel list as well because, "surprise", one will find stuff when testing
    that has been there for years and does not work because nobody used it - except
    the customer that wanted the port!

    The customer did not port the core application: They built a Virtual Machine to
    run the old application on the Linux we provided. The old app is millions of
    lines of crufty Erlang code that has been running since 1970! Nobody will change
    that just for the sake of some fashion in OS's!!
    ... Or the development Tools.
  13. joseph2k

    joseph2k Guest

    MS used to support about 6 different hardware platforms. Now they only
    support 2. They are after the millions of copies market. The list is only
    the fastest 500 supercomputers, not all of them. And, oh my, Mac OS X
    shows up there. There is a difference in the scalability of the two OS's.
  14. Iwo Mergler

    Iwo Mergler Guest

    8 man-years sounds excessive. Did any of those people
    have previous Linux-porting experience? Otherwise a
    year of learning curve is realistic. Recent kernels
    are a lot easier to port, but take longer to understand.

    I suppose I should qualify "Linux port". In my understanding
    that means porting the kernel. That usually involves adjusting
    a few addresses and rewriting or adapting the low level
    assembler stuff. Most new drivers tend to be unnecessary,
    as most jellybean hardware is compatible with the same thing
    on a different platform. It's not always obvious.

    If done correctly, there should be no change necessary
    in userspace, not even boot scripts. Your description
    sounds like you have included system setup from scratch
    and applications in "Linux port". Am I right?

    Kind regards,

  15. Rich Grise

    Rich Grise Guest

    I've just done a quick google:"supercomputer+linux"
    gets about 9,120 hits, and
    gets "about 1,320,000" hits.

    It's strangely reassuring. :)

  16. 2 are hardened expert developers, 1 intermediate (me) 1 project manager, 1
    system manager, 2 testers and 1 writing documentation and handling the hand-off of
    releases back to Open Source (which needs a process so at least one does not
    deliberately leak patented IP).
    Keep Dreaming ;-)

    This particular setup was a disk-less dual Opteron card (actually *using* IPMI
    for syslog reporting to a manglement system). The card must reserve one core on
    each CPU for a high priority process, a virtual machine, while the rest of the
    kernel, IRQ's and user stuff goes on the second core. In the case of a kernel
    panic, a capture kernel is kept in memory, the only task of that is to perform a
    memory dump over TCP, clean up the mess and restart without losing the data in
    the VM (because there is a database in there). If that fails the board reboots.
    The Init process deals with reserving the cores and redirecting IRQ's. Athlon
    was kind-of new two years ago.
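
    [Illustration, not the poster's actual init code: a minimal sketch of the
    core split described above, assuming a 4-core board with cores 0 and 2
    reserved for the VM (the core numbering is a guess). The helper names are
    hypothetical. The mask is the hex CPU bitmask that Linux accepts in
    /proc/irq/<n>/smp_affinity; os.sched_setaffinity is the standard
    Linux-only Python call for pinning a process.]

```python
import os


def cpu_mask_hex(cpus):
    """Return the hex bitmask for a set of CPU numbers.

    This is the format written to /proc/irq/<n>/smp_affinity on Linux.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def split_cores(all_cpus, reserved):
    """Split CPUs into (reserved, housekeeping) sorted lists."""
    reserved_set = set(reserved)
    housekeeping = sorted(set(all_cpus) - reserved_set)
    return sorted(reserved_set), housekeeping


def pin_current_process(cpus):
    """Restrict the calling process to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, set(cpus))


# Assumed layout: two dual-core CPUs, one core per CPU kept for the VM.
vm_cores, other_cores = split_cores(range(4), reserved=[0, 2])
irq_mask = cpu_mask_hex(other_cores)  # mask to steer IRQs to cores 1 and 3
```

    [With this assumed layout the IRQ mask covers cores 1 and 3 only, so
    interrupts and ordinary user processes stay off the cores that the VM
    would be pinned to with pin_current_process(vm_cores).]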

    Now, the easy way, the one with "a few addresses adjusted" would be a Linux-BIOS.
    Boot up, job done.

    However, that is not "industry standard", the BIOS must be allowed to piss over
    everything first (just in case someone wants to run windows on the board).
    Instead PXE-Linux loads the kernel and boots (but PXE-Linux's path naming
    algorithm is soo gross that it needs a patch: It assumes one unique kernel per
    board id'ed by macaddr; we want the same kernel for *many* boards in a location
    *we* name). Then the init process must undo what the BIOS+PXE Linux broke (as
    far as possible) and ... there are some gross hacks to find the address where
    the capture kernel is to be loaded (the kernel cannot touch that memory area).
    Because the kernel cannot touch that memory, ECC error correction will not be
    triggered and the capture kernel might be corrupt when the real kernel
    eventually panics so we need memory scrubbing. Memory scrubbing is not supported
    in the kernel for K8 (Athlon/Opteron) so we have to fix the bleeding-edge EDAC
    driver so it does.
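
    [Illustration of the per-board naming complained about above: a sketch of
    the stock PXELINUX configuration file search order as I understand it from
    the PXELINUX documentation (MAC-based name first, then the client IP in
    upper-case hex shortened one digit at a time, then "default"). The UUID
    lookup that can precede these is left out, the function name is
    hypothetical, and the poster's patch itself is not shown in the thread.]

```python
import ipaddress


def pxelinux_config_candidates(mac, ip):
    """List the config file names PXELINUX tries, in order (UUID step omitted)."""
    # "01" is the ARP hardware type for Ethernet; the MAC is lower-case, dash-separated.
    mac_name = "01-" + mac.lower().replace(":", "-")
    # The client IPv4 address as 8 upper-case hex digits.
    ip_hex = format(int(ipaddress.IPv4Address(ip)), "08X")
    names = [mac_name]
    # Strip one trailing hex digit at a time: C0A8025B, C0A8025, ..., C.
    names += [ip_hex[:n] for n in range(8, 0, -1)]
    names.append("default")
    return ["pxelinux.cfg/" + name for name in names]


for path in pxelinux_config_candidates("88:99:AA:BB:CC:DD", "192.168.2.91"):
    print(path)
```

    [The first candidate is unique to one board, which is the behaviour the
    post objects to. The later, shorter names match whole groups of boards.]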

    Then there are thousands of things that do not quite work as advertised - like
    truncating of core dumps, for example, which apparently nobody ever used. Getting
    kexec to work from one kernel to another was easy - getting kexec to work once more,
    back to the old kernel, was HARD; nobody kexec's twice, apparently. Etc.

    The high-resolution timer patch was not available for Athlon back then (kernel
    2.6.16) so we had to use /dev/rtc for some "microsleep" stuff. There is a BUG in
    /dev/rtc - some interaction between the HPET hardware and the software emulation
    of the RTC device used in new kernels we never quite got to the bottom of ...
    but RTC is a fossil so it will never get fixed. RTC also goes offline every 11
    minutes when NTP updates the kernel time. Oh!

    The testing, fixing, workarounding and documenting of all the niggles and
    failures take time. However we are now quite convinced that this system *will*
    run for 10 years without a hard reboot and the customer knows how to use it from
    the documentation!

    The build system takes a lot of time too - getting from the standard kernel
    source and to the thing we ship in a sane way takes a lot of design. Basically
    we do as RedHat does: take a standard kernel and patch the hell out of it before
    the build. But just to make the process more fragile and suck more disk and
    network bandwidth, we must use Clearcase - the corprat standard!
  17. Iwo Mergler

    Iwo Mergler Guest

    Quite an impressive story. I still maintain that you didn't actually
    port Linux (x86 was already supported on PC compatibles). You were
    probably able to boot a standard distribution to start with.

    What you did was a hell of a lot harder - creating a new OS based
    on Linux, to do stuff Linux wasn't able to do initially. You did
    very non standard things with a standard PC. I can see how this could
    be relevant to supercomputers.

    My experience is with completely new architectures - new ASIC designs,
    incompatible with anything that went before. The aim is usually to
    get 'normal' Linux running on them.
    I can feel your pain...

    Kind regards,

  18. Hmm, this implies that your port is not available in its entirety under
    the GPL license. Since the kernel itself is GPL, you must be very careful
    to keep your proprietary stuff at arm's length---at least as kernel modules
    if not delegated to userspace.