If clocks too slow then switch to asynchronous ?

Discussion in 'Electronic Design' started by Skybuck, Jun 8, 2007.

  1. Skybuck

    Skybuck Guest


    If the limit has been reached for generating clock signals then switch
    to asynchronous circuitry design ?

    For now the cpu makes multiple cycles per clock tick. (That's what the
    cpu multiplier is for)

    How long can that be a solution ?


  2. To the contrary, actually. Asynchronous reception
    means there has to be a clock, usually a power-of-2
    multiple of the bit rate.
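
    As a rough illustration of the point above: a UART-style receiver shares
    no clock with the sender, so it oversamples the line with a local clock
    running at a multiple of the bit rate, finds the start-bit edge, and
    samples each bit near its middle. The 16x factor, the 8N1 framing and the
    names `encode_frame` and `receive_byte` are assumptions made for this
    sketch, not something taken from the thread.

```python
OVERSAMPLE = 16  # local clock ticks per bit (assumed power-of-2 multiple)


def encode_frame(byte, idle_ticks=5):
    """Line samples for one 8N1 frame: idle, start bit, 8 data bits (LSB first), stop bit."""
    bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
    samples = [1] * idle_ticks  # an idle line sits high
    for b in bits:
        samples.extend([b] * OVERSAMPLE)
    return samples


def receive_byte(samples):
    """Recover one byte by locating the start edge and sampling mid-bit."""
    # The first high-to-low transition marks the beginning of the start bit.
    start = next(i for i in range(1, len(samples))
                 if samples[i - 1] == 1 and samples[i] == 0)
    mid = start + OVERSAMPLE // 2
    if samples[mid] != 0:
        raise ValueError("false start bit")
    value = 0
    for n in range(8):
        # Data bit n is sampled (n + 1) bit times after the middle of the start bit.
        value |= samples[mid + (n + 1) * OVERSAMPLE] << n
    if samples[mid + 9 * OVERSAMPLE] != 1:
        raise ValueError("framing error")
    return value
```

    Calling `receive_byte(encode_frame(0xA5))` round-trips the byte. The
    receiver only needs its local clock to be close to 16 times the bit rate,
    not phase-locked to the sender.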

  3. MooseFET

    MooseFET Guest

    Asynchronous designs are *way* harder to do. It is much harder to
    automate the process.

    When you go through a register, the setup and hold times can be
    checked at the input and then the timing of the output can be assumed
    for further checking. In the asynchronous case, you have to follow
    through all the logic paths and figure the delays at each step. If
    there are many parts and many paths the number of computations gets

    Some people are going with a very fast clock and declocking sections
    of the chip when they are not needed. This way they can lower the
    average power to prevent overheating without lowering performance in most
    cases. They include a bit of logic that slows things down if the CPU
    gets too hot.

    There is a new direction where the grain size of the declocking is
    made very small. This gets most of the reduction in power that an
    asynchronous design could do without making the design so much
    harder. I predict that the next step on this path will be the local
    monitoring of temperature.
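
    A back-of-the-envelope sketch of the declocking argument, using the
    usual first-order dynamic power estimate P = a * C * V^2 * f and ignoring
    leakage. The block capacitances and duty cycles below are made-up
    numbers, and the function names are assumptions for the example.

```python
def dynamic_power_w(cap_farads, vdd_volts, freq_hz, activity=1.0):
    """First-order switching power: activity * C * V^2 * f (leakage ignored)."""
    return activity * cap_farads * vdd_volts ** 2 * freq_hz


def gated_power_w(blocks, vdd_volts, freq_hz):
    """Average power when each block is clocked only part of the time.

    blocks: list of (switched capacitance in farads, fraction of time clocked).
    """
    return sum(dynamic_power_w(c, vdd_volts, freq_hz, duty) for c, duty in blocks)


# Hypothetical chip: one block always busy, one needed a quarter of the time.
blocks = [(1e-9, 1.0), (2e-9, 0.25)]
always_on = gated_power_w([(c, 1.0) for c, _ in blocks], 1.0, 1e9)
declocked = gated_power_w(blocks, 1.0, 1e9)
```

    With these numbers the always-clocked chip comes out at about 3 W and
    the declocked one at about 1.5 W, at the same peak clock rate. The finer
    the grain, the closer each block's duty cycle tracks its real use.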
  4. Skybuck

    Skybuck Guest

    Huh ?

    Asynchronous CPUs should not need a clock.

    It's like dominoes: use it to signal stuff.

  5. Oops, I was thinking about a UART and SPI.
    What speed are you talking about? I know synchronous
    circuits with 3 GBits. Beyond that?

  6. sirinath

    sirinath Guest

    Does anybody know which research groups are looking into this
    area?
  7. MitchAlsup

    MitchAlsup Guest

    For the most part, the clock rate of CPUs has stopped progressing
    because of power dissipation issues, not because we cannot make the
    clock signal go faster. Secondarily, the wire wall (wires are getting
    slower as gates are getting faster) means that more clock cycles are
    necessary to talk to remote parts of the chip. And finally, the memory
    wall means that even if we sped up the clock rates, {Donning Nomex}
    little performance drops to the bottom line due to the vast latencies
    of a main memory read.

    So, in effect, that limit has only been reached under the assumption
    that power dissipation is limited (to about 100 Watts). If somebody
    comes up with a scheme whereby 1 kW of power can be removed from a
    chip of 13mm**3, and it costs about $10 in volume, then the clock rate
    race will be "ON" again.

    But even if the second paragraph becomes true, there is good reason to
    believe that more performance can be placed on a die via
    multiprocessors than through ever faster/bigger CPUs with more cache/
    predictors/function-units that deliver ever less advancement per unit
    area or per unit power (performance per Watt is often negative right
    now as these things are added/extended).

    Basically as long as the input clock can be detected with less than a
    handful of picoseconds of (short term) jitter, the PLL multipliers can
    multiply up that frequency to at least 10 GHz (maybe as high as 30
    GHz) with adequate end point jitter control. The Cray {1, XMP, YMP,
    2,...} computers kept refrigerator-sized boxes within a fraction of
    a nanosecond of uncontrolled skew. All it takes is the power needed to
    run the clock distribution network and a determined engineering staff
    to distribute the clocks.

    It is that power that contributes to the lack of clock scaling you see

    Mitch Alsup
    No longer at AMD.
  8. If somebody comes up with a scheme whereby 1KWatts of power can be removed
    I'm not even sure that's true. The $10 cost will be dwarfed by the cost of
    the 1kW of power (plus air-conditioning, ...).
    Yes, there would surely be some interest in such monsters, but such
    a renewed "clock rate race" would probably stay confined to a fairly small
    market compared to what we've seen at the end of last century.

  9. acd

    acd Guest

    Last year an issue of the IEEE Journal of Solid-State Circuits
    presented an asynchronous flip-flop and logic design style. As an
    example they used a multiplier. I was shocked by the overhead required
    for the asynchronous handshaking. Comparing this with the aggressive
    11 SOI (if I am not mistaken) design of the Cell's SPEs, synchronous
    design gets us much further.
    The on-chip clock generation I think is in principle not harder than
    the handshaking of an asynchronous circuit.

  10. Robert Myers

    Robert Myers Guest

    Hmmm. My brief review of the subject a couple of years back led me to
    the perception that one of the reasons for going asynchronous is that
    it can result in lower power operation for comparable performance. I
    also came away with the perception that asynchronous isn't common
    because it isn't common; i.e., little design experience, inadequate
    tools, formidable design challenges.

    You've proposed two walls: a power wall and a memory wall. The memory
    wall has been pounded to the end of the earth and I'd rather not go
    there again. If you could beat the power wall with asynchronous
    operation, I'll bet there's a market.

  11. sirinath

    sirinath Guest

    Asynchronous is the way forward. There are various synchronisation
    mechanisms. Recently I was reading an article on sunlabs about a
    processor called FleetZero which they have made
  12. sirinath

    sirinath Guest


    Is there any possibility of a Ph.D. studentship position there?

    Regards Suminda
  13. MooseFET

    MooseFET Guest

    I disagree. I don't see it as a path to any major breakthroughs. I
    think it is tuning to a local maximum.

    Google has been having trouble posting. Before I go further I will
    post this
  14. MooseFET

    MooseFET Guest

    It seems to be working so I will say more.

    You can get about as much reduction in power by using a fine grained
    declocking of the chip. Declocking allows all the normal design
    methods to be used and reduces the troubles in following the prop.
    delays through all paths.

    Asynchronous design only reduces the number of transistors and the
    power consumption by a nearly fixed percentage. It doesn't make the
    growth in each follow a slower curve. To break the growth off the
    curve it is on, we need a technology that goes away from using a logic
    gate for each logic operation.

    To explain what I mean by this, take the case of an AND logic gate
    implemented with a relay. The coil is connected to one signal and the
    NO contact is connected to the other. The NC contact is perhaps
    grounded and the COM is the output. This makes a logic gate that does
    the needed function. If you need to implement (A and B) and (A and C)
    and (A and D), you would be tempted to put in three relays and need
    about 3 times the power. You could, however, use a relay that has
    three sets of contacts and require less than three times the power.
    A silicon version of this sort of thing would allow us to break off
    the current power growth curve.
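
    The relay example can be sketched as a tiny model: one coil (signal A)
    drives several changeover contact sets, each with its NO contact wired
    to another signal and its NC contact grounded. The `Relay` class and its
    method names are hypothetical, and the model is idealised (no contact
    bounce, no switching time).

```python
class Relay:
    """Idealised relay: one coil driving several changeover contact sets."""

    def __init__(self, contact_sets):
        self.contact_sets = contact_sets

    def outputs(self, coil, no_inputs, nc_level=False):
        """COM follows the NO input while the coil is energised, else the NC level."""
        if len(no_inputs) != self.contact_sets:
            raise ValueError("expected one NO input per contact set")
        return [bool(x) if coil else nc_level for x in no_inputs]


# One three-pole relay yields (A and B), (A and C), (A and D) from a single coil.
three_pole = Relay(contact_sets=3)
example = three_pole.outputs(True, [True, False, True])
```

    Here `example` is `[True, False, True]`. Three logic functions are
    produced while only one coil is powered, which is the sharing the post
    is pointing at.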
  15. neon


    Oct 21, 2006
    This reminds me of a true case. There was a clock that timed days, and after some days it was totally wrong. The solution, after days of research, was to make it count in Gray code; after that there were no more mistakes. Synchronous or not, one mistake ripples through. With Gray code it is absolute.
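
    For reference, a minimal sketch of the reflected binary Gray code, which
    is presumably what is meant here. Exactly one bit changes between
    consecutive counts, so a value sampled while the counter is changing is
    off by at most one step. The function names are assumptions for the
    example.

```python
def to_gray(n):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Convert a Gray-coded value back to the plain binary integer."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

    Going from 7 to 8 flips four bits in plain binary (`bits_changed(7, 8)`)
    but only one in Gray code (`bits_changed(to_gray(7), to_gray(8))`).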
  16. How does async design compare to latch-based skew-tolerant design? With
    skew-tolerant design, you care only about the propagation delay through the
    latch, and don't care so much about the clock. This lets you borrow time
    from a shallow pipeline stage for use in a deep stage. Anyway, it seems
    that this methodology has many of the advantages of async design, but
    without its problems: mainly that you don't have to worry about glitches.
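
    A very simplified sketch of the time-borrowing idea, under the
    assumption that each stage gets one nominal time slot and a transparent
    latch lets a late result spill up to `max_borrow_ns` into the next slot.
    The model, the numbers and the function names are illustrative only;
    real latch timing also involves setup times, skew and the clock phases.

```python
def flop_pipeline_ok(stage_delays_ns, period_ns):
    """Edge-triggered registers: every stage must fit in its own slot."""
    return all(d <= period_ns for d in stage_delays_ns)


def latch_pipeline_ok(stage_delays_ns, period_ns, max_borrow_ns):
    """Transparent latches: a long stage may borrow time from the next one.

    `borrowed` is how far the data is running behind the nominal slot
    boundary; slack in a shallow stage pays that debt back.
    """
    borrowed = 0.0
    for d in stage_delays_ns:
        borrowed = max(0.0, borrowed + d - period_ns)
        if borrowed > max_borrow_ns:
            return False
    return True
```

    With stage delays of 0.6, 1.3 and 0.8 ns and a 1.0 ns slot, the
    flip-flop check fails on the 1.3 ns stage, while the latch check passes
    with a 0.5 ns borrowing window because the shallow third stage absorbs
    the overrun.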

    Another question I have is logic size: yes with async design you do not have
    a large global clock network, but the async design elements tend to be
    larger (to avoid hazards). It would be interesting to see a comparison
    between the best clocked logic and the best async logic. Both with scan
    chains, or whatever is used to check for silicon defects.
  17. Quite true. But in some cases you can make the async circuit smaller,
    as you don't need to optimize for rare worst-case delays.

    An example: The simplest adder is a ripple-carry adder, but that can
    in the worst case take O(N) to settle (where N is the number of bits).
    Hence, sync designs tend to use carry-lookahead or carry-select adders
    that have a worst case propagation of O(log(N)), but are considerably
    larger than ripple-carry adders. However, a ripple-carry adder has a
    much shorter average delay (for random operands the longest carry chain
    averages roughly log2(N) bits), so an async ripple-carry adder can be
    faster (on average) than a sync carry-lookahead or carry-select adder.
    And smaller too.
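
    The average-case argument can be checked empirically with a short
    sketch. It assumes the settle time of a ripple-carry adder tracks the
    longest run of bit positions a single carry has to travel through, and
    that operands are uniformly random, which real workloads may not be.
    The function names are hypothetical.

```python
import random


def longest_carry_chain(a, b, width):
    """Longest run of consecutive bit positions one carry travels through.

    A chain starts where both operand bits are 1 (carry generated) and
    continues while exactly one operand bit is 1 (carry propagated).
    """
    run = 0
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:
            run = 1            # a new carry is generated here
        elif (x ^ y) and run:
            run += 1           # the incoming carry propagates through
        else:
            run = 0            # the carry is absorbed, or there was none
        longest = max(longest, run)
    return longest


def average_chain(width, trials=2000, seed=1):
    """Mean longest carry chain over random operand pairs (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(longest_carry_chain(rng.getrandbits(width), rng.getrandbits(width), width)
                for _ in range(trials))
    return total / trials
```

    The worst case, such as `longest_carry_chain(0xFF, 1, 8)`, spans all 8
    bits, whereas `average_chain(64)` comes out far below 64. A
    self-timed adder that signals completion can take advantage of that
    gap; a clocked one has to budget for the worst case.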
