# way to simplify repeated xor operations?

Discussion in 'Electronic Design' started by [email protected], May 30, 2006.

1. ### Guest

I am trying to implement a 24 bit random bit shift register in a small
microcontroller. The output will be used as white noise and fed into an
IIR pink noise filter also implemented on the MCU (see other posts, yes
its possible). I need an 8 bit random number, so I am shifting the
register 8 bits at a time and XORing the two high bits. This will
probably not be a maximal length register. But 24 bits is huge so I am
guessing it will be long enough (at least 5 seconds worth, which needs
to be 40000 samples times 8 bits * 5 seconds 1.6million clock cycles)
This seems like a very good way to generate lots of random numbers
without repeating for a few seconds.

So, of the 24 bits, the high BYTE, the output byte, is A, the middle
byte is B, and the low byte (with bit 0, the input bit) is C. So the
steps are:

Take A as output, i.e. use it as the random number.
Move B to A
Move C to B
Loop the following 8 times:
XOR high bits of A and put result into first bit of C
Shift C left one bit
Shift A left one bit
Repeat

Problem is, I this takes too many cycles. Is there some clever way to
accomplish the XORing of each bit pair in a byte and put the output in
another byte in say 5 or 6 cycles? The way I have outlined above will
probably take about 40 cycles just for the loop.

2. ### Andrew HolmeGuest

Shift it left one bit and XOR with the original byte.

3. ### Don LancasterGuest

Most pseudorandom sequences that are not optimal are ridiculously short.
And certainly not worth pissing over.

See page 1-1 of http://tinaja.com/glib/atg1.pdf
Detailed code in my Apple Assembly Cookbook.

--
Many thanks,

Don Lancaster voice phone: (928)428-4073
Synergetics 3860 West First Street Box 809 Thatcher, AZ 85552

Please visit my GURU's LAIR web site at http://www.tinaja.com

4. ### LuhanGuest

I have a really fast random number generator that will work in almost
any micro since it does not try to emulate a hardware shift register.
Instead, it uses fast and convenient machine instructions.

http://members.cox.net/berniekm/tricks.html

The code is for 24 bits, but increasing it requires only 2 extra
instructions per extra 8 bits. I've used this method in many
applications.

Luhan

5. ### LuhanGuest

http://members.cox.net/berniekm/tricks.html
(better, faster, random numbers)

Luhan

6. ### LuhanGuest

Google refused to post my first response. When I tried another an hour
later, it posted both of them with the same time stamp. Is this a

Luhan

7. ### GenomeGuest

Was that a proper question?

DNA

8. ### bogaxGuest

I think you can do better

Might start by Googling "Galois lfsr"

If you run eg two PRNGs in parallel and combine them and they have
mutually prime run lengths then the combination will have a run length
that's the product of the two component run lengths.

So if you have, say, one PRNG with a run length of 5 and another with
a run length of 7 and you XOR them together the combination
will have a run length of 5 x 7 = 35 (just a thought)

This is fun to play with:

http://www.jnicolle.com/LFSR/

9. ### Tom BruhnsGuest

One way to generate 8 new bits at a time is to generate a set of 256
"masks" each (for your 3-byte register) 3 bytes long, and store them in
a table. Use the MS byte of your register as an index into that table.
Then:
--take A as the output, and as an index into the table
--fetch B, xor it with the table MS byte, store into A
--fetch C, xor it with the table mid byte, store in B
--store table LS byte in C

If you pick the right polynomial to implement, it's quite possible that
at least the MS byte of the table will be all zeros, and you can ignore
the xor of that byte into the old B. You may also be able to think of
your register as a circular buffer, and not even bother pulling data
out and moving it around: just point to a different head of the buffer
each time. This can become very efficient for longer polynomials when
you pick the right ones with just a couple bits set, near the end.

The trick is understanding the relationship of the masks to the bits
which are set in the maximal length polynomial. You can express the
algorithm to generate a new register value for a single bit shift in
matrix form: x(k+1) = A * x(k). Then obviously x(k+2) = A^2 * x(k),
and x(k+8) = A^8 * x(k). You are pre-computing A^8 and storing it in a
table, realizing that if you start with a simple enough A (a simple
enough polynomial), the table is relatively easy to store. There are
some details I'm leaving out here. For example, you need to realize
also that you can generate the same natural response as the
"traditional" method of xor-ing two bits together and feeding that back
to the end of the shift register, by instead looking at an output bit
and using a mask with an equivalent two bits set and xoring it back
into the register.

Cheers,
Tom

10. ### Ken SmithGuest

I don't see any reason you couldn't make that operation a table look up.
Since it is 8 bits in and 8 bits out, the look up operation should be a
snap.

11. ### Tom BruhnsGuest

Indeed, Ken, with a couple small caveats... First, since he's looking
at the top TWO bits, he'll have to go one bit deeper. After the
shift-by-eight, the result to feed back will depend on the eight bits
shifted out plus the new MSbit, not just the eight shifted out. The
table must be 512 entries long, or else you can make it 256 long (index
= 8 bits shifted out) and xor the remaining MSbit into the lsb of the
table entry. And second, maximal length at 24 bits requires a minimum
of four bits fed back*, so it would be better to use a different
length. 2^24-1 = 241*17*13*7*5*3*3 -- non-maximal-sequences could be
pretty short! But you can get maximal length from 22 bits by feeding
back the xor of the top two bits, which is perhaps convenient. Or turn
it on its head and use the 8 bits to look up a couple of bytes to
xor/load back into the earlier bits.

Another way to do it is to find a maximal length polynomial that
results from feeding back the MSbit xored with the eighth bit back from
that; then you can byte-shift the register, and xor the byte shifted
out with the new MS byte to get the feedback (which, if the register is
not an integral number of bytes long [and it won't be--see above], you
will need to shift over so its ls bit is at the ls bit of the composite
register).

Cheers,
Tom

* It seems that all polynomials whose order is an integer multiple of 8
(registers on byte boundaries), at least up to 21 bytes, require at
least 4 bits of feedback for maximal length. I wonder if there's a