Maker Pro

CAN bus reply problems

Ska

Jan 1, 1970
Hi folks!

We are developing a system that uses the CAN bus as the network
connecting different nodes. We have a PC that needs to request some
data (the node status) from the nodes, which have to answer the request
immediately.
To ask each node for its status, we send a "remote frame"
message on the CAN bus with a specific ID. The addressed node has to
answer with the relevant data using a "data frame" message.
Each node sits in a while loop, reading a buffer and sending back data
when necessary. Usually everything goes well, but sometimes
one of the nodes does not answer the PC's request, even though the
request is sent on the bus (it is seen by another node and it can be
seen with an oscilloscope connected to the CAN bus lines). It
seems the node does not see the message; it misses the interrupt that
updates the buffer...
We usually send a sequence of "remote frame" messages, waiting each
time for the answer: send, wait for answer, send, wait, ... Even
if we insert a sleep between one send and the next, messages are
sometimes missed by a node...
We changed the bit rate (from 500 kbit/s to 20 kbit/s) but the problem
was not solved.
We are using a T89C51CC03 microcontroller from Atmel.

Have you ever experienced this problem? Any suggestions?

Thank you in advance for any help!

Cheers,
Ska
 
Heinz-Jürgen Oertel

Jan 1, 1970
I cannot answer your specific question; in other words, I don't know which
part of your software or hardware is responsible. It could be the
driver, a misconfiguration of the CAN controllers, or the
cabling.
But you should consider switching your node monitoring from the master/slave
principle you are using now to something else.
Your current implementation looks exactly like the _old_ CANopen Node
Guarding mechanism. CANopen switched to Heartbeat years ago, where each
node is an autonomous Heartbeat producer and can be monitored by every
node that wishes to do so. The benefit is more flexibility and reduced
bandwidth for node monitoring.
Still, it can happen that one of the Heartbeat consumers misses a
heartbeat from one of the producers. In that case, increase the rate or
accept that one or more heartbeats are missed.
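
The consumer side of such a scheme can be sketched as below. This is only an illustration of the idea, not the CANopen object dictionary mechanism: the table layout, `MAX_NODES`, and the `hb_*` function names are all made up for the example, and a millisecond tick counter is assumed to exist.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8u  /* hypothetical network size */

/* Per-producer bookkeeping kept by a heartbeat consumer (hypothetical layout). */
typedef struct {
    uint32_t last_seen_ms;  /* time of the most recent heartbeat */
    uint32_t period_ms;     /* expected producer period */
    uint8_t  tolerated;     /* consecutive heartbeats that may be missed */
    bool     seen_once;     /* false until the first heartbeat arrives */
} hb_entry_t;

static hb_entry_t hb_table[MAX_NODES];

/* Register a producer with its period and the number of tolerated misses. */
void hb_configure(uint8_t node, uint32_t period_ms, uint8_t tolerated)
{
    if (node >= MAX_NODES) {
        return;
    }
    hb_table[node].last_seen_ms = 0;
    hb_table[node].period_ms = period_ms;
    hb_table[node].tolerated = tolerated;
    hb_table[node].seen_once = false;
}

/* Call from the receive path whenever a heartbeat frame from `node` arrives. */
void hb_on_heartbeat(uint8_t node, uint32_t now_ms)
{
    if (node >= MAX_NODES) {
        return;
    }
    hb_table[node].last_seen_ms = now_ms;
    hb_table[node].seen_once = true;
}

/* True while the node is considered alive: it has been heard at least once
 * and no more than `tolerated` heartbeats in a row have been missed. */
bool hb_is_alive(uint8_t node, uint32_t now_ms)
{
    if (node >= MAX_NODES || !hb_table[node].seen_once) {
        return false;
    }
    uint32_t limit = hb_table[node].period_ms
                   * ((uint32_t)hb_table[node].tolerated + 1u);
    /* Unsigned subtraction stays correct across tick counter wrap-around. */
    return (uint32_t)(now_ms - hb_table[node].last_seen_ms) <= limit;
}
```

With `hb_configure(3, 100, 1)`, node 3 is still reported alive 200 ms after its last heartbeat and is flagged as lost after that, so a single missed heartbeat does not raise an alarm.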

Regards
Heinz
--

with best regards / mit freundlichen Grüßen

Heinz-Jürgen Oertel
+===================================================================
| Heinz-Jürgen Oertel port GmbH http://www.port.de
| mailto:eek:[email protected]
| phone +49 345 77755-0 fax +49 345 77755-20
| Regensburger Str. 7b, D-06132 Halle/Saale, Germany
| CAN Wiki http://www.CAN-Wiki.info
| Newsletter: http://www.port.de/engl/company/content/abo_form.html
+===================================================================
 
Ska

Jan 1, 1970
Hello Tim, hello Heinz, hello everybody

Thank you for your mails.

What you are both saying is: "No protocol should trust external
nodes 100% to receive something -- you should always have a timeout &
retry mechanism"!
This is exactly what we are doing now, but it is something I don't
like very much... :(
We set a maximum number of retries (say 10), and it sometimes
happens that the retries exceed this threshold! In that case we reset
and restart the CAN bus but, as I said, it is something we don't
like very much...
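
For readers following along, the timeout-and-retry scheme described above might look roughly like this. It is a sketch under assumptions, not the poster's code: the `can_link_t` callbacks stand in for whatever driver calls send the remote frame and wait for the data frame, and the `sim_*` helpers are a simulated node that exists only so the example can run without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical driver hooks: send a remote frame, wait for the data frame. */
typedef bool (*send_fn)(uint16_t id, void *ctx);
typedef bool (*wait_fn)(uint16_t id, uint32_t timeout_ms, void *ctx);

typedef struct {
    send_fn send_request;
    wait_fn wait_reply;
    void   *ctx;
} can_link_t;

/* Returns the attempt number that succeeded (1..max_attempts),
 * or 0 if every attempt timed out. The caller decides what to do on 0,
 * for example resetting the CAN controller. */
uint8_t request_with_retry(const can_link_t *link, uint16_t id,
                           uint32_t timeout_ms, uint8_t max_attempts)
{
    for (unsigned attempt = 1; attempt <= max_attempts; ++attempt) {
        if (!link->send_request(id, link->ctx)) {
            continue;  /* transmission failed, count it as a used attempt */
        }
        if (link->wait_reply(id, timeout_ms, link->ctx)) {
            return (uint8_t)attempt;
        }
    }
    return 0;
}

/* Simulated node for demonstration only: ignores the first `drop` requests. */
typedef struct {
    unsigned drop;
    unsigned sent;
} sim_node_t;

bool sim_send(uint16_t id, void *ctx)
{
    (void)id;
    ((sim_node_t *)ctx)->sent++;
    return true;
}

bool sim_wait(uint16_t id, uint32_t timeout_ms, void *ctx)
{
    (void)id;
    (void)timeout_ms;
    const sim_node_t *node = ctx;
    return node->sent > node->drop;
}
```

Returning the attempt number rather than a plain success flag makes it easy to log how many retries each request needed, which is the failure-rate figure that helps narrow down this kind of problem.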

....mmm...

Regards,
Ska
 
Hans-Bernhard Broeker

Jan 1, 1970
[Note: F'up2 cut down to one group --- should have been done by OP.]

In comp.arch.embedded Ska said:
We set a maximum number of retry messages (say 10) and it sometimes
happens that the trials go over this threshold! In this case we reset
and start again the CAN bus but, as I said, it is something we don't
like so much...

[Massive quote without actual referral snipped. Please don't do that.]

What you're observing appears to be a rate of failure to receive CAN
messages that is quite a lot beyond expectations of the protocol,
unless you were operating in a pathologically noisy environment ---
but you didn't mention anything like that.

What this hints at is a genuine bug in the receiving end, but I'm
afraid you didn't reveal enough of its details for anybody out here to
be able to remote-diagnose it more precisely. So I'll just bombard
you with some questions:

Did you test this with only two nodes on the bus, and check if the
receiving one ACKs the transmission?

What *is* the rate of failure, anyway, i.e. one in how many messages
gets lost? What is the rate of transmissions with CRC or other
failures, on the same network?

Do you have any way of debugging into the receiving CAN controller's
register banks after a failed reception, to distinguish whether the
message actually failed to arrive in the message box, or just failed to
raise the IRQ it's configured for? (There's a bug like that in another
8051 derivative with integrated CAN...)
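
One way to make that distinction in software is sketched below. It assumes the firmware keeps a counter that the CAN receive ISR increments, and that after a timeout it can read the message object's "receive OK" status flag; the struct, enum, and function names are hypothetical and not taken from any datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    RX_REACHED_ISR,     /* ISR ran: look at the software buffer handling */
    RX_IRQ_LOST,        /* frame is in the mailbox but the ISR never ran */
    RX_NOT_IN_MAILBOX   /* controller never stored the frame */
} rx_diag_t;

/* Snapshot taken around one request (hypothetical layout). */
typedef struct {
    bool     mailbox_rx_ok;     /* "receive OK" flag read after the timeout */
    uint16_t isr_count_before;  /* ISR counter when the request was sent */
    uint16_t isr_count_after;   /* ISR counter when the timeout expired */
} rx_snapshot_t;

/* Classify a missed reply from the snapshot. */
rx_diag_t classify_rx(const rx_snapshot_t *s)
{
    if (s->isr_count_after != s->isr_count_before) {
        return RX_REACHED_ISR;
    }
    if (s->mailbox_rx_ok) {
        return RX_IRQ_LOST;
    }
    return RX_NOT_IN_MAILBOX;
}
```

A result of `RX_NOT_IN_MAILBOX` points towards acceptance filtering, message object configuration, or the physical layer, while `RX_IRQ_LOST` points towards interrupt setup or a controller erratum.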

Do you have a storage scope that would let you record the exact
signalling up to the point of failure, so you could look for any
differences between successful and failing transmissions at the
physical level?
 

jdg3

Sep 10, 2010
What do you have for termination? It may be going into a bus-off state after a certain number of error frames.
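
As background on the bus-off suggestion: CAN fault confinement is driven by two error counters per node. A node becomes error passive once its transmit or receive error counter exceeds 127, and it goes bus off once the transmit error counter exceeds 255. A minimal sketch of that classification follows; the function and enum names are made up for the example, and how the counters are read from a given controller is left out.

```c
#include <stdint.h>

typedef enum {
    CAN_ERROR_ACTIVE,   /* normal operation */
    CAN_ERROR_PASSIVE,  /* node may still communicate, with restrictions */
    CAN_BUS_OFF         /* node no longer takes part in bus traffic */
} can_err_state_t;

/* Map transmit (tec) and receive (rec) error counter values to the
 * fault confinement state defined by the CAN protocol. */
can_err_state_t can_error_state(uint16_t tec, uint16_t rec)
{
    if (tec > 255u) {
        return CAN_BUS_OFF;
    }
    if (tec > 127u || rec > 127u) {
        return CAN_ERROR_PASSIVE;
    }
    return CAN_ERROR_ACTIVE;
}
```

Note that only the transmit counter can take a node bus off, so a node that merely listens to a noisy bus ends up error passive at worst. Logging the state on the silent node after a missed request would show whether fault confinement is involved at all.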