Processor question

Discussion in 'Electronic Design' started by Tim Williams, May 16, 2008.

  1. Tim Williams

    Tim Williams Guest

    I read that, most of the time, jumps make the processor flush the pipeline,
    or the cache, or the other cache, I suppose in order of severity (short /
    near / far jump). What kind of preprocessing and caching strategies are
    people mucking around with these days?

    I thought of this question while contemplating QBasic. You see, even with
    over 1.8GHz available today (I say 1.8 because that's only what I'm sitting
    in front of now :) ), a simple "sit in the corner and count to a billion"
    takes QBasic over a minute! In contrast, a "somewhat" more optimized loop
    written in assembly does 2^32 in just seven seconds.
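
As a rough sanity check on those two timings, the figures can be turned into average clock cycles per loop iteration. This is only a back-of-envelope sketch: the 60 second value is an assumed lower bound standing in for "over a minute", and the helper name `cycles_per_iteration` is made up for illustration.

```python
# Back-of-envelope estimate using only the figures quoted in the post.
CLOCK_HZ = 1.8e9  # the 1.8 GHz machine mentioned above


def cycles_per_iteration(iterations, seconds, clock_hz=CLOCK_HZ):
    """Average number of clock cycles spent on each loop iteration."""
    return clock_hz * seconds / iterations


# Assembly loop: 2^32 counts in seven seconds.
asm_cycles = cycles_per_iteration(2**32, 7.0)

# QBasic loop: a billion counts in "over a minute".
# 60 s is an assumed lower bound, so the result is a lower bound too.
qb_cycles = cycles_per_iteration(1e9, 60.0)

print(f"assembly: about {asm_cycles:.1f} cycles per iteration")
print(f"QBasic:   at least {qb_cycles:.0f} cycles per iteration")
print(f"ratio:    at least {qb_cycles / asm_cycles:.0f}x")
```

Under these assumptions the assembly loop comes out at roughly 3 cycles per count and the QBasic loop at 108 or more, a gap of at least about 37 times.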

    Tim
     
  2. Tim Williams

    Tim Williams Guest

    It is often interpreted, but I don't think it's so simple (and so slow!) as
    a runtime environment. I don't know just what Microsoft put into the
    interpreter, but given the similar speed of interpreted vs. compiled
    programs, I would imagine it's similar.

    You mean like, QBASIC? -- I'm using QuickBasic 4.5, which includes a
    compiler, and it makes bare naked .EXE files. So there. ;-) Executables
    usually run a little faster than in the IDE, but not by the orders of
    magnitude I've been seeing.

    Tim
     
  3. Guest

    I've found that QBASIC, the version that shipped with DOS, seems to
    have a somewhat edgy relationship with Windows' idea of how to run a
    DOS program... it could be that the IDE is being slowed by its poor
    interaction with Windows in a way that the compiled program
    (presumably with little to no user interface) would not be.
     
  4. Tim Williams

    Tim Williams Guest

    Could be. If all the modules are installed, then that would explain how a
    "Hello World!" can take up 30kB...

    I debugged the bare FOR loop, and discovered it uses a FAR call for
    comparing the DWORD variable (which is moved out of memory to increment,
    back into memory, then pushed onto and off of the stack for the call). My
    question is, is the far call the fatal operation that slows it down, and
    why, in terms of today's hardware?

    Tim
     
  5. JosephKK

    JosephKK Guest

    The historical execution time cost ratio between interpreted and
    compiled is centered at about 10:1 in favor of compiled. More
    recently that seems to have decreased to about 5:1 due to interpretive
    build environments and execution environments that use a fractional /
    partial JIT compiler. Java development / runtime environments are a
    fine example of this.
     
  6. Tim Williams

    Tim Williams Guest

    So having established a bit about QuickBasic's behavior, what kind of
    preprocessing and caching strategies are people mucking around with these
    days?

    Tim
     
  7. Tim Williams

    Tim Williams Guest

    Hardware question != software answer...?

    Maybe I phrased the OP too problematically...it was a curiosity... maybe I
    should get D from BC to write for me...

    Tim
     
  8. Tim Williams

    Tim Williams Guest

    I mean hardware, like, what's on the chip, what flushes cache, etc......
    and maybe anything that controls that (BIOS?!).

    Tim
     
  9. JosephKK

    JosephKK Guest

    In most cases the cache manages itself with an LRU algorithm.
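
To illustrate what least-recently-used replacement means, here is a minimal software sketch of a small, fully associative cache. It is a toy model only: the class name `LRUCache`, the capacity of 4 lines and the integer "addresses" are all assumptions for illustration, and real processor caches are set-associative and commonly use cheaper approximations of LRU rather than exact tracking.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache that evicts the least recently used line."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # keys are line addresses, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Touch one line; return True on a hit, False on a miss."""
        if address in self.lines:
            self.lines.move_to_end(address)  # now the most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used
        self.lines[address] = None
        return False


# A tight loop that keeps touching the same three lines fits in a
# four-line cache, so only the first pass misses.
cache = LRUCache(capacity=4)
for _ in range(10):
    for addr in (0, 1, 2):
        cache.access(addr)
print(cache.hits, cache.misses)
```

In this model nothing is ever flushed by a jump as such: a line only leaves the cache when something else needs its slot, so a loop whose working set fits keeps hitting however it branches.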
     
  10. Tim Williams

    Tim Williams Guest

    Hmm, interesting.. so by the sound of it my experience may in fact be
    rather specific to the processors I've used?

    Tim
     
  11. Tim Williams

    Tim Williams Guest

    Sure it does. Would you like an MS-DOS executable that runs alone? I have
    many. It makes 8086 code, since the compiler is copyright 1985...

    I'm sure the compiler is awfully naive though, putting pieces together.

    Well to be completely specific, I looked into it, and it seems to run a
    general loop, holding the long (32 bit) integer in some memory location,
    and making a far call (pushing values onto the stack) to compare the
    variable to the constant. Now if far calls don't cost much, I would expect
    this to run maybe 20 times slower than the most optimized loop I can
    conceive of, but we're talking several orders of magnitude here.

    Indeed. But if you may recall, optimization wasn't my question, it was
    much more general, which is why I asked here.

    Tim
     
  12. Tim Williams

    Tim Williams Guest

    Alright, well I count 23 in the loop I observed. So naively I might
    assume the code runs about 20 times slower than the most optimal code, or
    even 40 or 80 times slower counting memory writes and stuff. But it seems
    to be a lot slower than that. A simple FOR i& = 1 TO 1000000: NEXT,
    interpreted, takes 7 seconds, evidently 4000 times slower than the assembly
    code I used (which was itself 4 opcodes).

    And something else that's weird, the time taken seems nonlinear. A million
    took 7 seconds, but as I said in my original post, a billion took "over a
    minute", which is a whole lot less than a thousand times longer. But I
    don't see how the processor might be optimizing after a few dozen, let
    alone a few million... huh maybe load sharing in Windows at work? May have
    to test this in DOS mode for total concentration...

    Indeed. Still runs near 1 opcode per clock cycle, so short jumps aren't a
    problem. I'm thinking long jumps are what really trash performance, but if
    you suggest they may still be going in cache, then I don't know what would
    be taking out so many orders of magnitude.
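
The arithmetic behind the "4000 times" figure and the nonlinearity puzzle in this post can be checked in a few lines. This is a sketch using only the timings quoted in the thread. The 60 second value for the billion-count run is an assumed lower bound for "over a minute", so the final factor is an upper bound, not a measurement.

```python
def seconds_per_iteration(iterations, seconds):
    """Average wall-clock time spent on one loop iteration."""
    return seconds / iterations


# Interpreted FOR loop: a million counts in 7 s.
qb_million = seconds_per_iteration(1_000_000, 7.0)

# Assembly loop from the original post: 2^32 counts in 7 s.
asm = seconds_per_iteration(2**32, 7.0)

# Slowdown of the interpreted loop relative to the assembly loop.
slowdown = qb_million / asm

# If the interpreted loop scaled linearly, a billion counts would take
# a thousand times longer than a million counts.
expected_billion = 7.0 * 1000

# Assumed lower bound for "over a minute" (the exact time is not given).
observed_billion = 60.0

print(f"slowdown vs assembly:  about {slowdown:.0f}x")
print(f"linear extrapolation:  {expected_billion:.0f} s for a billion")
print(f"observed is at most    {expected_billion / observed_billion:.0f}x faster than that")
```

The slowdown works out to about 4295, matching the "4000 times" estimate. The linear extrapolation gives 7000 s (nearly two hours) for a billion counts against the minute or so reported, so the two measurements disagree by up to a factor of about 117. That points at measurement conditions (for example, the two runs not being the same code path or environment) more than at anything the processor does after a few million iterations.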

    Tim
     
  13. Phil Hobbs

    Phil Hobbs Guest

    Also main memory access is incredibly slow compared with cache. If your
    executable is writing through to main memory every time it stores the
    variable, that will take a while.

    Cheers,

    Phil Hobbs
     
  14. JosephKK

    JosephKK Guest

    Very true. Even with recent 1000MB/s and faster memory interfaces
    writes are typically 3 to 10 times slower than reads.
     