fpga memory issues

Discussion in 'Electronic Design' started by Jon Slaughter, Oct 8, 2009.

    I'm trying to determine if the A3PN010 can do what I need. It's relatively
    cheap and will probably handle almost everything I need for my application
    except I need some memory.

    The fpga will simply act as a translator/deserializer/"hub". Several
    identical ICs will be connected to it, all using simple serial comm at up
    to about 25 MHz. A uC will communicate with the fpga. Because I want to
    minimize the information the uC has to send, I will be using a lookup table
    for common things.

    This is an LED application and the fpga is taking care of refreshing all the
    ICs that drive the LEDs. It has to run at least 100 times faster than the
    uC, which is the reason for using the fpga: the uC then doesn't have to
    communicate this fast. The fpga needs to store the state of each LED since
    it is a matrix of LEDs, i.e., the fpga will comm one row to the drivers,
    then the next row, and so on. It has to know the state of each row and not

    I'll also build in common functions such as "Blank", "All On", "Row X on",
    etc. I'll also need a LUT for color to PWM information. The uC will give an
    index into the LUT to specify the "color".

    The problem is that these fpgas do not have memory as far as I can tell. No
    RAM or bitblocks, etc... I know I can use off-board memory but it would
    require very fast memory (or a lot of them). The fpga is supposed to,
    at least I hope, parallel all the IC comm so it can run at a decent speed.
    Each IC will have to be communicated with at about 10 MHz. By paralleling
    them, the fpga can run at 10 MHz rather than 300 MHz (30 ICs in serial
    at 10 MHz each = 300 MHz).
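
A quick check of the arithmetic above, as a minimal sketch. The function and constant names are made up for illustration; the 30-IC and 10 MHz figures are simply the ones quoted in the post.

```python
# Back-of-envelope clock comparison: one shared serial line versus
# one serial line per driver IC. Figures are those quoted in the post.

def shared_line_rate_hz(num_ics: int, per_ic_rate_hz: float) -> float:
    """Bit rate needed if every IC is fed in turn over a single line."""
    return num_ics * per_ic_rate_hz


def per_line_rate_hz(per_ic_rate_hz: float) -> float:
    """Bit rate on each line if every IC gets its own line from the fpga."""
    return per_ic_rate_hz


NUM_ICS = 30
PER_IC_RATE_HZ = 10e6

shared = shared_line_rate_hz(NUM_ICS, PER_IC_RATE_HZ)
per_line = per_line_rate_hz(PER_IC_RATE_HZ)
print(f"shared line: {shared / 1e6:.0f} MHz, per-IC lines: {per_line / 1e6:.0f} MHz")
```

The same factor of 30 applies to any single memory that has to feed all the lines, which is the worry about off-board memory raised later in the thread.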

    Very simple question really. Do these fpgas have the capability of "memory"
    even though there are no RAM blocks and such? I'm not sure what the
    VersaTiles are and whether the gates or macrocells can be used as memory.
    (Remember, I'm a newb to fpgas.)

    I don't need the more powerful fpgas since the processing is pretty simple
    and I'm looking to minimize space (I don't have much room as it is).

    If I have to use off-board memory then it becomes a big problem, I think,
    because then the memory has to run as fast as the total speed of the IC's in
    serial. Unless it's one memory per IC which increases the board space and
    cost significantly.

    I was hoping to do this with a cheap fpga because it's relatively simple. Of
    course, if I don't have memory then I don't think I can use it?

    (The thing would be a piece of cake with a uC, but it would require
    significant speeds since the IC comm would have to be serialized.)
  2. krw

    krw Guest

    I'm considering the ProASIC3 for a project, as well. To answer your
    question, you need to go to the 060 to get memory. AFAICT, there is
    no memory in a ProASIC3, other than registers. Since the LUT
    configuration isn't SRAM, rather flash, it's not available for
    distributed RAM. It's a severe limitation of the small Actel stuff.

    That said, I'm either going to use the Actel A3PN030 or an Altera
    EPM570. I'm in the process of writing the VHDL and will target it for
    both. The Altera has a better package selection and is cheaper
    overall, though a little smaller. I'll use the Altera the design

    Small is good, but I can't use the .4mm or .5mm pitch BGAs. Our
    comfort limit is .8mm. Altera has a 100 ball 1mm pitch package.

    Can't you deserialize and run parallel memory?

    Could be. They like to have some reason for you to use expensive

    If that's the case, just do the serdes in the FPGA and go parallel
    from there.
  3. krw

    krw Guest

    1. Inferred memory from logic. A "fully" specified CASE with 2^n terms
    will generate ROM. Memory, even dual port, can be inferred directly
    from the VHDL, as long as you keep it obvious.

    2. You can instantiate either block memory or distributed memory
    blocks directly (not recommended)

    3. You can use their memory builder macro. Xilinx (haven't gotten
    that far with Actel or Altera yet) has a pretty substantial library of
    RAMs, ROMs, Synchronous and Asynchronous DualPorts with various
    read/write configurations. This stuff is trivial with the Xilinx
  4. I'll have to check some other makers then. I was hoping I could use some of
    the flipflops for memory but I guess not.

    I don't want bga, at least not at this point. But basically I can't add a
    lot of support circuitry (about 1 sq in of room).

    Not sure what you mean. If you mean one memory per IC then I can, but it
    requires too much space and is probably overly costly.

    How? The fpga is right between the uC and the ICs. Doing the serdes is
    just outputting in parallel. But because it is a matrix application it
    requires state information. That is, the fpga needs to remember the on/off
    state of the LEDs and, in fact, compute them because it will also be doing
    PWM.

    Basically each IC needs a 16x3*(5 to 10) bit memory. The 5 to 10 is the PWM
    resolution. The fpga will look up the value for the current LED and
    determine if it should turn on the bit. The uC will modify those values
    when needed.
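
To put numbers on that, here is a small sketch. It assumes "16x3" means 16 driver channels times 3 rows per IC, that "5 to 10" is the number of PWM bits stored per LED, and that there are about 30 driver ICs as mentioned elsewhere in the thread. The names are illustrative only.

```python
# Storage estimate for the PWM table, under the assumptions stated above.

CHANNELS_PER_IC = 16   # assumed: 16-channel LED driver
ROWS_PER_IC = 3        # assumed: each driver serves 3 multiplexed rows
NUM_ICS = 30           # "about 30" drivers, per the thread


def bits_per_ic(pwm_bits: int) -> int:
    """PWM storage one driver IC needs, in bits."""
    return CHANNELS_PER_IC * ROWS_PER_IC * pwm_bits


def total_bits(pwm_bits: int) -> int:
    """PWM storage for all drivers together, in bits."""
    return NUM_ICS * bits_per_ic(pwm_bits)


for n in (5, 8, 10):
    print(f"{n}-bit PWM: {bits_per_ic(n)} bits per IC, {total_bits(n)} bits total")
```

Under these assumptions the whole table stays in the range of roughly 7 to 15 kbit, which is small for a memory chip but a lot to build out of individual flip-flops.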

    If I had some small memory IC that had "banks" that could be read in
    parallel and written simultaneously then that would solve the problem. I'll
    look for an fpga that has this stuff.

  5. They don't seem to have enough (except the largest one).

    In the datasheet they say

    "Memory Cascading

    Larger and deeper blocks of RAMs can be created using EBR sysMEM Blocks.
    Typically, the Lattice design tools cascade memory transparently, based on
    specific design inputs."

    I'm not sure what that means though. They have distributed RAM and I don't
    know what that means either. One has 2k distributed RAM. Does this mean that
    each "meta block" has 2k, or is it 2k for all the meta blocks together?
  6. No. The fpga needs about 1500 bits for the LEDs' state (16-ch drivers driving
    3 rows each and about 30 drivers). But I would need to store the PWM of each
    LED. This is where it costs a lot of memory: 1500 LEDs with a PWM
    resolution of 2^n. For n = 5, or 32 PWM steps, that requires a memory of
    48k. (I would like to get 2^8 PWM steps.)

    The uC then needs random access to the PWM table (since it would be too
    slow to update the whole block if only one value changed).

    The fpga basically looks at the PWM value for the LED and determines if it
    should send a 0 or 1, e.g. if PWM < counter then send 1, else 0.
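
A behavioural sketch of that compare step, in Python rather than HDL. It uses the comparison exactly as worded in the post; the function names and the 5-bit resolution are only for illustration.

```python
# Counter-compare PWM: one output bit per LED per counter step.

def pwm_bit(pwm_value: int, counter: int) -> int:
    """Comparison as worded in the post: send 1 when pwm_value < counter."""
    return 1 if pwm_value < counter else 0


def on_slots(pwm_value: int, pwm_bits: int) -> int:
    """Count the 1 bits sent over one full sweep of the counter."""
    steps = 1 << pwm_bits
    return sum(pwm_bit(pwm_value, counter) for counter in range(steps))


PWM_BITS = 5  # 32 steps, the n = 5 case from the post
for value in (0, 10, 31):
    print(f"stored value {value}: {on_slots(value, PWM_BITS)} of {1 << PWM_BITS} slots on")
```

With the comparison written this way a larger stored value gives fewer on-slots. Flipping it to `counter < pwm_value` makes the on-time grow with the stored value instead; which polarity is wanted depends on the driver and is not stated in the thread.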

    That is basically all it does but just organizes the data to the right IO
    lines. It would also need to communicate with the uC(or maybe USB or some
    other method) for changing the PWM values but this doesn't require any

    It also needs to change the rows by driving the appropriate row mosfets but
    this is just a simple counter, one for each IC.

    If I were to use external SRAM then I have to decide if I can use one large
    single SRAM or have to use one per IC. The one per IC is probably not going
    to fit on the board and would cause other problems (I'm trying to get this
    all on a 2-layer board also). The single SRAM has to be accessed very fast
    because all the ICs are in parallel while the PWM data can only be accessed
    serially (hence 30 times faster than the IC data rate).

    I don't think this is necessarily an issue with a 300 MHz fpga and SRAM but
    I'd like to find another option if possible.
  7. Nial Stewart

    Nial Stewart Guest

    I'd be very careful here.

    You can probably design an FPGA implementation on two layers that will work
    with low speed IO but if you're synchronously driving a large number of IO
    then you will probably have problems with ground bounce etc.

    This will cause problems with comms to the uC.

    For a reliable FPGA implementation with fast IO you want _at_least_ a
    four layer board.

  8. I guess that should be 1500*5~=10k instead.
  9. Jan Lucas

    Jan Lucas Guest

    Distributed RAM is using the internal LUT4s as 16 entry RAM blocks.
    Every second LUT4 can be configured into a RAM Mode. Distributed Mem is
    nice for small 16-64 entry RAMs but you don't want to store larger
    amounts of data in distributed mem as it consumes logic resources and
    becomes ineffective for larger memory blocks. The 2k number is rather
    theoretical; you only get that amount of distributed memory if you use
    half your LUTs (and all LUTs with a RAM Mode) as RAM. For the 1500*8 =
    12kbits an EBR, block RAM etc. is what you need. He could also look at
    the older Lattice XP chips; these are available in a 1.8-3.3V version.
    The smallest XP chip contains 6 EBR Blocks with 9kbits each. So two EBR
    Blocks should be enough for his application, and he could even use 4 EBR
    and double buffer.
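
The sizing in this reply can be checked with a short sketch. It assumes "9kbits" means 9 x 1024 bits per EBR and that a LUT4 in RAM mode holds 16 x 1 bit; it ignores the port-width configurations a real EBR offers, so treat the block counts as a lower bound. All names are illustrative.

```python
import math

# Assumptions, not datasheet values: see the note above.
EBR_BITS = 9 * 1024      # "9kbits" per EBR block
LUT4_RAM_BITS = 16       # one LUT4 in RAM mode = 16 entries of 1 bit


def ebr_blocks_needed(total_bits: int, copies: int = 1) -> int:
    """EBR blocks needed for `copies` full copies of the table."""
    return copies * math.ceil(total_bits / EBR_BITS)


def ram_mode_lut4s_needed(total_bits: int) -> int:
    """LUT4s in RAM mode needed to hold the table as distributed RAM."""
    return math.ceil(total_bits / LUT4_RAM_BITS)


table_bits = 1500 * 8    # the 12 kbit figure from the reply

print("EBR blocks, single copy:     ", ebr_blocks_needed(table_bits))
print("EBR blocks, double buffered: ", ebr_blocks_needed(table_bits, copies=2))

ram_luts = ram_mode_lut4s_needed(table_bits)
# Only every second LUT4 has a RAM mode, so the device needs twice as many.
print("LUT4s in RAM mode:", ram_luts, "-> device LUT4s needed:", 2 * ram_luts)
```

Two blocks for a single copy and four for a double buffer match the counts given in the reply, and the LUT4 count shows why distributed RAM is a poor fit for a table this size.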
