Maker Pro

Any ideas on how to do this?

48 bytes

Jan 1, 1970
Hi everyone!
First of all, thanks for being such a help for about 5 years now.
However, now I'm in serious trouble.
I have to describe and design a 32-bit binary to 40-bit BCD converter.
I'm using VHDL, as it's my class's focus language.
I found a very interesting solution at
http://www.engr.udayton.edu/faculty/jloomis/ece314/notes/devices/binary_to_BCD/bin_to_BCD.html
, but when I extended it to 16 bits it didn't work as expected.
And imagine having to instantiate more than 40 of those adders... Even
using the special replication instruction, it would use up the gates
of the Xilinx Spartan-3 S3-200, considering that nearly 60% of
it is already in use and I have to keep that design (a DLX monocycle
processor... Computer Architecture II, geez).
Please, any help on this problem?

Thanks

J.P. Garcia
AKA 48 bytes
 
Tim Wescott

Jan 1, 1970

How many clock ticks do you have to do it in?

You can use the old algorithm of successive divides by ten. If you
don't mind using lots of clock ticks you can get the logic size down
quite a bit.
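
For reference, the successive divide-by-ten idea can be modelled in software along these lines. This is only a sketch in Python, not VHDL; the function name and the digit order (least significant first) are my own choices for illustration. In hardware, each pass of the loop would be one trip through a sequential divider.

```python
def bin_to_bcd_by_division(value, digits=10):
    """Return the decimal digits of value, least significant digit first."""
    if not 0 <= value < 10 ** digits:
        raise ValueError("value does not fit in the requested digit count")
    out = []
    for _ in range(digits):
        # Each remainder of a divide-by-ten is the next BCD digit.
        value, remainder = divmod(value, 10)
        out.append(remainder)
    return out


if __name__ == "__main__":
    # Largest 32-bit value, printed most significant digit first.
    print("".join(str(d) for d in reversed(bin_to_bcd_by_division(2**32 - 1))))
```

Ten divisions cover any 32-bit input, since the largest value has ten decimal digits.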

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Posting from Google? See http://cfaj.freeshell.org/google/

"Applied Control Theory for Embedded Systems" came out in April.
See details at http://www.wescottdesign.com/actfes/actfes.html
 
Jan Panteltje

Jan 1, 1970

Use Verilog and:
http://www.engr.udayton.edu/faculty/jloomis/ece314/notes/devices/binary_to_BCD/binary_to_bcd_v.html
This is open source, I think from www.opencores.com.
I use it in my frequency counter in a Spartan2-200 (or was it 300?) with
lots of space left for other stuff such as an LCD driver and whatnot.
Please respect the GPL.
<omitted about GPL violating people turning into frogs>
The link is on that same page you mentioned.
VHDL is too much typing anyway. Why use all the adders in parallel when you can do
it in a loop, one by one?
 
John Fields

Jan 1, 1970

---
If you can spare the time, load a binary down-counter with the
number you want to convert and clear your up-counting BCD chain with
the load pulse. Then use the same clock for both sets of counters
and count down until your binary counter gets to all zeroes, at
which time the BCD counter outputs will contain the BCD equivalent
of the number you loaded into the binary counter.

If your BCD counter has to reach 4 294 967 295 (the largest 32-bit value) and you're using a
10 MHz clock, it'll only take about 429 seconds. ;)
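
The worst-case figure is easy to check: the counter method needs one clock per count, so the time is just the loaded value divided by the clock rate. A quick Python sketch follows; the function name is made up, and the 25 MHz line is only a second clock rate picked for comparison.

```python
def countdown_seconds(value, clock_hz):
    """Time for a down-counter loaded with value to reach zero, one count per clock."""
    return value / clock_hz


MAX_32 = 2**32 - 1  # largest value a 32-bit binary counter can hold

if __name__ == "__main__":
    print(round(countdown_seconds(MAX_32, 10_000_000), 1))  # 429.5
    print(round(countdown_seconds(MAX_32, 25_000_000), 1))  # 171.8
```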
 
48 bytes

Jan 1, 1970
Multiple reply coming, so sorry for posting this at this level
instead of replying to each of you separately.

Replying to Tim Wescott: well, that solution would be good, but I
can't spare too many clock ticks, because the main use of this routine
is to show that 32-bit binary number as a decimal using a VGA output module,
so taking a lot of time is out of the question.
I also saw this solution (
http://www.vhdl-online.de/model_lib_patras/vhdl_sources/special/special.htm#bin_bcdN
), but it's unsynthesizable (?). I would be very grateful if you could
tell me more about your solution.
By the way, sorry: as I'm replying to several people here, I couldn't
apply the Usenet rules you wrote on the page in your sig. An
interesting read, though!

Replying to Jan Panteltje, about VHDL... It's a requirement. All my
Electronics II, Computer Architecture I and II classes have been based
on VHDL. I have never used Verilog, but I can understand it fairly well (I follow
my rule of thumb: if it's readable, then I can understand it).
In VHDL there is a language construct named GENERATE which can
create several components easily... But there's too much logic involved
in using those 40 or 50 adders. Using a loop could be good, though
(it's supported somehow in VHDL). I'm going to look at your solution
and try to find an equivalent in VHDL.

Replying to John Fields: as I wrote before, taking 429 seconds (at
10 MHz, although I would be using a 25 MHz clock) to show an answer is
out of the question, even though I know I won't be using such large
numbers (I was not given any range, other than the numbers being 32 bits).
Teachers love extreme cases that show the weaknesses of your
design.

Thanks all for your kind replies,

Juan Pablo Garcia
 
Jan Panteltje

Jan 1, 1970
48 bytes said:
Multiple reply coming, so sorry for posting this at this level
instead of replying to each of you separately.

Replying to Tim Wescott: well, that solution would be good, but I
can't spare too many clock ticks, because the main use of this routine
is to show that 32-bit binary number as a decimal using a VGA output module,
so taking a lot of time is out of the question.

I suspected it was for display.
But think for a moment about how many value changes of a 10-digit number your eyes and brain can grasp
per second.
1 per second would be fine in many cases.

In that case you grab the 32-bit counter once a second and do the BCD conversion in a loop.

The BCD values in the registers you use are then constant for 1 second and
presented to the VGA character generator, which runs at, say, 60 fps or more.
 
J.P. Garcia

Jan 1, 1970

The point that you're making here is very valid, actually. However,
I'm not using VGA just for display, but also for "entry".
Let me introduce you to my final project.
My goal during the class was to create a DLX MIPS monocycle processor
with a reduced instruction set. I've successfully made it. It has
to run the Greatest Common Factor (or Divisor) algorithm on two 32-bit
numbers, but first those numbers are entered into the FPGA using a
system similar to setting a clock (you push a button and the number in the
system increases or decreases), and those numbers have to be shown too.
Then, when the user activates the start switch, the processor does the work and
the VGA module displays the result when it finishes.
Everything else, except for the problem that I can't show hexadecimal
numbers for the answer or operands, has been done.
I'm being a bit picky, aren't I? But thanks for your suggestion, Jan.
 
Jan Panteltje

Jan 1, 1970

I am trying to understand this.
Monocycle processor (everything in one clock cycle?) Great if you did that!
But _however_ you slice it, a VGA display cannot display more values per
second than the number of frames per second it displays, no matter what.
Say you had a 50 MHz FPGA clock, and the VGA runs at a 50 Hz frame rate (50 fps).
This gives you 1 000 000 cycles between displays.

Even if your counter is incremented or decremented much faster, it would make sense
in the max speed case for the VGA to only grab its value every frame
(20 ms here), then do a binary to BCD conversion, add something for ASCII, and
then display it.

Why would this not be enough?
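
To put a number on that budget, here is the arithmetic as a small Python sketch. The 32 cycles per conversion is purely an assumption for illustration (one clock per input bit for some serial converter), not a figure for any particular design, and the names are made up.

```python
def cycles_per_frame(clock_hz, frames_per_second):
    """Clock cycles available between two consecutive VGA frames."""
    return clock_hz // frames_per_second


# Assumed cost of one serial binary-to-BCD conversion (one clock per input bit).
ASSUMED_CYCLES_PER_CONVERSION = 32

if __name__ == "__main__":
    budget = cycles_per_frame(50_000_000, 50)
    print(budget)                                   # 1000000
    print(budget // ASSUMED_CYCLES_PER_CONVERSION)  # 31250
```

Even if the real converter were many times slower than that assumption, one conversion per frame would use only a tiny part of the budget.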
 
J.P. Garcia

Jan 1, 1970

When we say monocycle, we mean that you execute
an instruction each clock tick. If you did everything in one clock
cycle, well, it would be... Amazing and superoptimized.
I know that VGA has a horizontal and a vertical refresh, and we need to
divide the original 50 MHz frequency down to something that's able to show
anything on the screen.
However, my VGA module uses 25 MHz. It's loosely based on this one (
http://www.cs.utsa.edu/~danlo/research/DigilentSpartan3/VGATest.htm ),
but I can freeze the display to show a fixed value for as long as I
want to.
However, doing the binary to BCD conversion would just freeze the screen
for a while with the previous value and then show the valid one
when it's over, and taking into account that we're using a 50 or 25 MHz
clock, this process would be fairly short.
We're not using ASCII because it would just fill our few
remaining free logic gates with a lot of stuff we don't need.
I'm trying to figure out how to translate the Verilog code you linked into
VHDL, and if I can do it, I'll release it under the GPL too, with full
credit to the authors and so on.
 
Jan Panteltje

Jan 1, 1970
J.P. Garcia said:
When we say monocycle, we mean that you execute
an instruction each clock tick.

Yes, that is what I meant.

J.P. Garcia said:
If you did everything in one clock
cycle, well, it would be... Amazing and superoptimized.

In an FPGA I have a DES decoder that runs in one clock (no processor though, I just expanded
the algorithm to gates).

J.P. Garcia said:
I know that VGA has a horizontal and a vertical refresh, and we need to
divide the original 50 MHz frequency down to something that's able to show
anything on the screen.
However, my VGA module uses 25 MHz. It's loosely based on this one (
http://www.cs.utsa.edu/~danlo/research/DigilentSpartan3/VGATest.htm ),
but I can freeze the display to show a fixed value for as long as I
want to.
However, doing the binary to BCD conversion would just freeze the screen
for a while with the previous value and then show the valid one
when it's over, and taking into account that we're using a 50 or 25 MHz
clock, this process would be fairly short.

I'm not sure here, but here is some pseudocode, assuming you have a VGA Vsync available:

allocate a spare 32-bit counter
allocate 10 7-bit counters for the decimal digits
if (negedge Vsync)
{
    inhibit main 32-bit counter   // do not want an undetermined toggled state
    copy main counter to spare counter
    allow main counter again

    spare counter to BCD

    add ASCII '0'

    // result now ready
    result to display RAM
}

As processes run in parallel in an FPGA, you would indeed get to see something like:
000 000 0001
000 000 9430
000 020 4501
when counting up.
Nothing would 'freeze', and you cannot see it any faster anyway; the human eye is not faster.
This dilemma is normally solved in digital clocks by using multiple speeds for the up and down buttons
(as you mentioned user input).
So the first short button touch is one up (or down).
Longer than 100 ms on the button steps 10 up (or down).
Longer than 1 s on the button steps 100 up (or down).
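
The button speed-up rule can be written down as a small function. This is a behavioural sketch in Python only; the name step_for_hold and the choice to treat the limits as strictly "longer than" are assumptions.

```python
def step_for_hold(held_ms):
    """Step size for an up/down button that has been held for held_ms milliseconds."""
    if held_ms > 1000:
        return 100   # held longer than 1 s
    if held_ms > 100:
        return 10    # held longer than 100 ms
    return 1         # short touch


if __name__ == "__main__":
    for held in (50, 500, 1500):
        print(held, step_for_hold(held))
```

In the FPGA this would be a hold-time counter clocked while the button is down, with two comparators selecting the increment.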

J.P. Garcia said:
We're not using ASCII because it would just fill our few
remaining free logic gates with a lot of stuff we don't need.

Well, if you want to display BCD 0, 1, 2, ... 9 as a number on the screen,
and also have any text, you just add ASCII '0' (= 48 decimal) to each BCD value,
and use a normal character generator (with character ROM).
The Spartan has enough memory for that, I think, to put the ROM in dual-port RAM
if need be.
However you slice it, the character symbols 0 through 9 must be somewhere if
you want to display them.
For a nice 8 (width) x 9 (height) font with 10 digits you would need only
90 bytes of ROM.... maybe just using defines would do it in Verilog.
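
Both points are simple enough to check in a few lines. The Python sketch below uses made-up names; it shows the add-'0' step and the ROM size for a digits-only font, assuming one byte per 8-pixel row.

```python
def bcd_digits_to_ascii(digits):
    """Turn a sequence of BCD digit values (0..9) into ASCII character codes."""
    if any(not 0 <= d <= 9 for d in digits):
        raise ValueError("not a BCD digit")
    return bytes(d + ord("0") for d in digits)  # ord("0") == 48


# Font ROM size for the digits only: one byte per 8-pixel-wide row.
FONT_WIDTH, FONT_HEIGHT, GLYPHS = 8, 9, 10
ROM_BYTES = GLYPHS * FONT_HEIGHT * (FONT_WIDTH // 8)

if __name__ == "__main__":
    print(bcd_digits_to_ascii([4, 2, 9]))  # b'429'
    print(ROM_BYTES)                       # 90
```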

If the number is just in a corner of the screen you can avoid a full VGA
size display RAM by using only that part, and doing the rest in timing :)

I think I have covered it 100% now?
 
Ken Smith

Jan 1, 1970
Binary to BCD or ASCII digits
Tim Wescott said:
You can use the old algorithm of successive divides by ten. If you
don't mind using lots of clock ticks you can get the logic size down
quite a bit.

You can do quite a bit better with the shifting method.

The binary value is initially divided by 2^(N-3) so that all but the top 3
bits are below the "binary point".

The top three bits go directly into the results area.

The rest of the result is cleared.

The remaining (N-3) bits of the input are in a shift register.

For (N-3) cycles, you shift a bit out of the top of the input and into the
bottom of the result. After each shift the results area is "decimal
adjusted", so each step doubles a BCD value, not a binary one.
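
A software model of the shifting method may help when checking a VHDL version against known values. The Python sketch below is the common shift-and-add-3 ("double dabble") formulation, which is an assumption about the variant: it shifts in all N bits rather than preloading the top three, and it applies the correction before each shift instead of a decimal adjust after it. Both give the same BCD doubling. The function name and the packed-BCD return format are made up for illustration.

```python
def bin_to_bcd(value, bits=32, digits=10):
    """Shift-and-add-3 conversion; returns packed BCD, 4 bits per decimal digit."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    bcd = 0
    for i in range(bits - 1, -1, -1):
        # A digit of 5 or more would pass 9 when doubled, so add 3 first;
        # after the shift that becomes +6, which produces the decimal carry.
        for d in range(digits):
            if ((bcd >> (4 * d)) & 0xF) >= 5:
                bcd += 3 << (4 * d)
        # Double the BCD value and bring in the next input bit, MSB first.
        bcd = (bcd << 1) | ((value >> i) & 1)
    return bcd


if __name__ == "__main__":
    # Printing packed BCD in hex shows the decimal digits directly.
    print(format(bin_to_bcd(0xFFFFFFFF), "x"))  # 4294967295
```

One loop pass corresponds to one clock in a serial implementation, so a 32-bit input takes 32 shifts in this formulation.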
 
Arlet

Jan 1, 1970

The simplest hardware implementation is to have a modified shift
register, and perform the conversion serially. See Xilinx app note 029
for details: http://direct.xilinx.com/bvdocs/appnotes/xapp029.pdf

As others have argued, there's no sense in doing it faster when it needs
to be displayed on a VGA screen. Doing it serially still allows multiple
conversions per vsync.
 
Arlet

Jan 1, 1970
Jan said:
Not sure here, here is some pseudo code, assuming you have a VGA V sync available:

allocate a spare 32 bits counter
allocate 10 7 bits counters for the decimals
if(negedge Vsync)
{
inhibit main 32 bits counter // do not want a undetermined toggled state
copy counter to spare counter
allow main counter again

spare counter to bcd

add ASCII '0'

// result now ready
result to display RAM.
}

You don't need to inhibit the main counter if you run this code in the
same clock domain as the counter, which I assume is the case. Even if
multiple clock domains are used, the counter can be copied across
without having to stop it.
 
Jan Panteltje

Jan 1, 1970

Hey, and I thought I had spotted a trouble spot :)
You are right.
But with different clock domains, isn't it so that when you toggle
from, say, 0x0fff to 0x1000 you may grab, for example, a state 0x1ff0,
depending on how the counter feedback works, what sort of counter
it is, and the related gate delays in any feedback path?
For a simple binary counter this would normally not be the case, but
the many postings in comp.arch.fpga about metastability made me stop it.
 
Arlet

Jan 1, 1970

True, you can't just copy an N-bit word from one clock domain to the
other, and expect it to be correct.

A safe way to do this is to make a copy in the original clock domain,
wait at least a cycle in the second clock, and then copy it again in
the other clock domain. As long as the copied value doesn't change
within the setup-hold interval of the second clock, there's no
ambiguity. It's similar to your original code, but instead of stopping
the counter, you make a copy.
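
To illustrate the hazard being discussed, here is a rough Python model. It makes a simplifying assumption: while a word changes from an old to a new value, every bit that differs may be sampled as either its old or its new state, independently of the others. Real metastability is messier than this, and the function name is made up.

```python
from itertools import product


def torn_reads(old, new, width=16):
    """All values a sampler could capture if each changing bit is seen as old or new."""
    changing = [b for b in range(width) if ((old ^ new) >> b) & 1]
    stable = old & ~(old ^ new)  # bits that are 1 in both old and new
    values = set()
    for choice in product((0, 1), repeat=len(changing)):
        v = stable
        for bit, take_new in zip(changing, choice):
            source = new if take_new else old
            v |= source & (1 << bit)
        values.add(v)
    return values


if __name__ == "__main__":
    reads = torn_reads(0x0FFF, 0x1000)
    print(0x1FF0 in reads, len(reads))
    # A copy that is held stable while it is sampled leaves only one possibility.
    print(torn_reads(0x1234, 0x1234))
```

Under this model the 0x0fff to 0x1000 step has 13 changing bits, so 0x1ff0 is one of many possible captures, while a held copy can only be read as itself.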
 
Jan Panteltje

Jan 1, 1970

OK, yes, seems right.
I have been thinking, and if I am right that he just wants to display
a 10-digit decimal number, and wants the user to be able to set it in
a simple way, then there is another route one could take:
Maybe some remember the 7490 decimal counter (BCD output).
You could have 10 BCD decimal counters, and use the BCD to binary
solution from the Xilinx paper to get the binary value if needed.
The advantage is that setting the counter becomes simpler.
You then use 2 buttons, one to select the digit (one of ten, sequentially),
and the other to increment the selected digit (1, 2 ... 0).
That way the user can really change each digit fast.
I am not sure what exactly he does with the counter, so this may or may
not be the better solution.
I have a radio transmitter that lets me set the frequency that way.
The advantage is that, for example, you can do 1 Hz steps, 10 Hz steps, 100 Hz steps, etc.
Anyway, there are so many suggestions now... let's see what comes of all this.
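
A behavioural sketch of that two-button entry scheme, in Python with made-up names. The to_binary method here is a plain multiply-by-ten accumulation, not the method from the Xilinx paper, and it only models what the hardware would compute.

```python
class DigitEntry:
    """Ten decade counters plus a digit selector, driven by two buttons."""

    def __init__(self, digits=10):
        self.digits = [0] * digits  # index 0 is the least significant digit
        self.selected = 0

    def select_next(self):
        """Button 1: move to the next digit, wrapping around after the last one."""
        self.selected = (self.selected + 1) % len(self.digits)

    def increment(self):
        """Button 2: step the selected digit 0, 1, ... 9 and back to 0, with no carry."""
        self.digits[self.selected] = (self.digits[self.selected] + 1) % 10

    def to_binary(self):
        """BCD to binary by multiply-and-accumulate, most significant digit first."""
        value = 0
        for d in reversed(self.digits):
            value = value * 10 + d
        return value


if __name__ == "__main__":
    entry = DigitEntry()
    for _ in range(3):
        entry.increment()      # units digit becomes 3
    entry.select_next()
    for _ in range(2):
        entry.increment()      # tens digit becomes 2
    print(entry.to_binary())   # 23
```

Note that ten free-running digits can represent values up to 9 999 999 999, which is more than 32 bits can hold, so the top digit would need limiting or an overflow check.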
 
John Larkin

Jan 1, 1970

Usually these FPGAs have unused block RAMs somewhere. So you could
pre-load some RAMs with lookup tables that translate, say, each nibble
or byte of the input to the corresponding BCD. Then look up and sum (in
a BCD accumulator!) sequentially, in four steps (byte lookup) or eight
(nibble). You could even do it broadside, in one clock maybe, with
three BCD adders and four byte-to-BCD lookup tables. Only one lookup
table and the last adder need to be the full 40 bits wide!

Brute force, but OK if the RAMs are free.
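
A rough software model of the byte-lookup variant, as a Python sketch. The names TABLES, bcd_add and bin_to_bcd_lut are invented for illustration, the tables stand in for preloaded block RAM contents, and all four tables are modelled as full width for simplicity even though only the top one needs 40 bits.

```python
def to_packed_bcd(n):
    """Pack the decimal digits of n, 4 bits each (used only to build the tables)."""
    out, shift = 0, 0
    while n:
        n, digit = divmod(n, 10)
        out |= digit << shift
        shift += 4
    return out


# TABLES[k][b] is the BCD form of byte value b at byte position k, i.e. b * 256**k.
TABLES = [[to_packed_bcd(b << (8 * k)) for b in range(256)] for k in range(4)]


def bcd_add(a, b, digits=10):
    """Digit-by-digit BCD addition with decimal carry."""
    result, carry = 0, 0
    for d in range(digits):
        s = ((a >> (4 * d)) & 0xF) + ((b >> (4 * d)) & 0xF) + carry
        carry = 1 if s > 9 else 0
        if carry:
            s -= 10
        result |= s << (4 * d)
    return result


def bin_to_bcd_lut(value):
    """Look up each input byte and sum the four results in a BCD accumulator."""
    acc = 0
    for k in range(4):
        acc = bcd_add(acc, TABLES[k][(value >> (8 * k)) & 0xFF])
    return acc


if __name__ == "__main__":
    print(format(bin_to_bcd_lut(0x000014C6), "x"))  # 5318
```

The key detail is that the accumulation must be a true BCD add with decimal carry between digits; a plain binary adder on the packed values gives wrong digits as soon as a digit sum passes 9.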

What's a DLX monocycle?

John
 
J.P. Garcia

Jan 1, 1970
Multiple reply... I leave for a few hours and I get this many
responses!!! Thank you very much, everyone, you've been a great
inspiration.

In reply to Jan Panteltje,
Jan said:
I think I have covered it 100% now?

Yes, your explanation and pseudocode were enough for me to implement an
algorithm and a VHDL solution I found on a page.

In reply to Ken Smith,
Ken said:
You can do quite a bit better with the shifting method.

and also in reply to Arlet,
Arlet said:
The simplest hardware implementation is to have a modified shift
register, and perform the conversion serially.

About that algorithm, it only uses 40 cycles to find the answer, with
lots of time to spare! I found it on this page: (
http://www.doulos.com/knowhow/vhdl_designers_guide/models/binary_bcd/
).

In reply to John Larkin,
Yes, that exact idea was my first choice, but I found it to be
unstable when testing with values like 0xAABBCCDD or 0xFFFFFFAA; the lookup
table stops working right after the second least significant digit.

John said:
What's a DLX monocycle?

A DLX monocycle *processor* is a microprocessor based on a MIPS subset
that executes one instruction per clock tick, and definitely not a
clown transportation device (which is what I thought the first time I heard
about it).
More info:
http://www.csee.umbc.edu/courses/undergraduate/411/spring96/dlx.html

Thanks everyone, I hope you don't mind being updated (maybe bothered is
better) with more information (or questions) about this project.
 
John Larkin

Jan 1, 1970
In reply to John Larkin,
Yes, that exact idea you brought was my first choice but I found to be
unstable when testing with like 0xAABBCCDD or 0xFFFFFFAA, the lookup
table stops working right after the second least significant digit.

Unstable? It *has* to work!

John
 
J.P. Garcia

Jan 1, 1970

Would you please give me an example of how to build it? Maybe I didn't
model it correctly, and I'm just wrongly getting a decimal 4578 from
0x000014C6.
 