Strategies for parsing AT command responses

Discussion:

earth

2008-08-14 09:30:13 UTC

Hello,

I am using a variation of the GSM AT command set to control a device
(sorry, I cannot say which one specifically).

First off, I find there is no consistency in the protocol format when
the device sends data to me. Some responses end with crcrOKcr
(cr = carriage return + newline), some with crcr, some with just cr, and
some with just a carriage return and no newline. The number of cr
also differs if the data was sent unsolicited.

Next, there are responses prefixed with an 'AT' and unsolicited
responses without.

I wondered what people's strategies are for parsing the stream of data
from a device. Currently my parser is built around a state machine.
It compares incoming data with possible strings. When it recognises
a response identifier, it sets the state to waiting for the parameters
of that response. Because I have a different state for each response,
I can look for the data in the format specific to that response, and I
can handle the wide variety of formats that the data is sent in.
However, it's not elegant.

Thanks,
Richard

John Henderson

2008-08-14 21:05:44 UTC

Post by earth
Hello,
I am using a variation of the GSM AT command set to control a
device (sorry, I cannot say which one specifically).
First off, I find there is no consistency in the protocol
format when the device sends data to me. Some responses
end with crcrOKcr (cr = carriage return + newline), some with
crcr, some with just cr, and some with just a carriage return and
no newline. The number of cr also differs if the data was
sent unsolicited.
Next, there are responses prefixed with an 'AT' and unsolicited
responses without.
I wondered what people's strategies are for parsing the stream
of data from a device. Currently my parser is built around a
state machine. It compares incoming data with possible
strings. When it recognises a response identifier, it sets
the state to waiting for the parameters of that response.
Because I have a different state for each response, I can look
for the data in the format specific to that response, and I can
handle the wide variety of formats that the data is sent in.
However, it's not elegant.

My approach is to put everything from the device onto the end of
a string variable, which I'll refer to by its name, "outb".
This variable is dynamic in that it is allowed to grow as
required, and gets trimmed (one way or another) as data is
removed.

When parsing outb, I look for result identifiers, like "+CBM: "
and "+CSQ: ", and store the positions of the first instances of
these into appropriate variables, like this in C:

ccedpos = strstr(outb, "+CCED: ");
cbmpos = strstr(outb, "+CBM: ");
csqpos = strstr(outb, "+CSQ: ");

Then I'll usually find which of these positions is the earliest
(smaller address offset), and process that one first in a
routine specific to its data. Then I'll destroy the identifier
within outb (by changing "+CSQ: " to "-CSQ: " for example).
Then I'll repeat the above piece of code, find the next result,
and process it. When no identifier is to be found, I'll empty
outb:

outb[0] = 0;

This approach implies lingering long enough on reading the input
buffer to wait for results to arrive fully and intact.
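
The find-earliest, process, destroy, repeat cycle described above could be sketched roughly as follows. This is only an illustration, not the actual program: the identifier table, the names earliest_id and drain, and the handle callback are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table of result identifiers; extend as needed. */
static const char *const ids[] = { "+CCED: ", "+CBM: ", "+CSQ: " };
#define NIDS (sizeof ids / sizeof ids[0])

/* Return a pointer to the earliest identifier in buf, or NULL if none.
   On success *which receives the index into ids[]. */
static char *earliest_id(char *buf, size_t *which)
{
    char *best = NULL;
    for (size_t i = 0; i < NIDS; i++) {
        char *p = strstr(buf, ids[i]);
        if (p != NULL && (best == NULL || p < best)) {
            best = p;
            *which = i;
        }
    }
    return best;
}

/* Process every result in buf in arrival order, then empty buf.
   handle is a caller-supplied callback (an assumption of this sketch)
   that receives the identifier index and a pointer to the parameters.
   Returns the number of results found. */
static int drain(char *buf, void (*handle)(size_t id, const char *params))
{
    int count = 0;
    size_t which;
    char *p;

    while ((p = earliest_id(buf, &which)) != NULL) {
        p[0] = '-';                      /* destroy the identifier */
        if (handle != NULL)
            handle(which, p + strlen(ids[which]));
        count++;
    }
    buf[0] = 0;                          /* nothing left to find */
    return count;
}
```

Calling drain(outb, my_handler) once the input has settled would then visit each result in the order it arrived. Keeping the identifiers in a table means a new result type needs only one more entry rather than another strstr variable.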

Part of the routine to handle "+CSQ: " results looks like this:

csqpos[0] = '-';
csqpos += 6;
if (sscanf(csqpos, "%2d%*[,]", &rxlv) != 1) rxlv = 0;
if (rxlv) rxlv = (2 * rxlv) - 113; // convert to dBm
if (rxlv > 0) rxlv = 0;

so I'm not concerned here about trailing <cr> or <lf> characters
at all.
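
For anyone wanting to try the "+CSQ: " handling in isolation, here is a self-contained sketch along the same lines. The function name csq_dbm is made up for the example, and it differs from the snippet above in one respect: an index of 0 is reported as -113 dBm rather than being treated as "no reading". The mapping of index 0..31 to -113..-51 dBm, with 99 meaning unknown, follows the usual +CSQ definition.

```c
#include <stdio.h>
#include <string.h>

/* Find the first "+CSQ: " result in buf and return the signal level in dBm.
   Returns 0 when no usable reading is present (no result found, or an
   index outside 0..31 such as 99 = unknown). */
static int csq_dbm(const char *buf)
{
    const char *p = strstr(buf, "+CSQ: ");
    int rssi;

    if (p == NULL || sscanf(p + 6, "%2d", &rssi) != 1)
        return 0;
    if (rssi < 0 || rssi > 31)
        return 0;                 /* 99 (unknown) and out-of-range values */
    return 2 * rssi - 113;        /* 0..31 maps to -113..-51 dBm */
}
```

Because sscanf stops at the first character that is not part of the number, whatever mix of <cr> and <lf> follows the result makes no difference.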

I find the above works well for most mixes of immediate and
unsolicited-result commands. Minor exceptions include the
software user requesting the device IMEI. This is catered for
like this:

Writecmd(1, 0, okresp, "ATE1\r");          // echo on
Writecmd(2, 0, "AT+CGSN", "AT+CGSN\r");    // expect the echoed command
Writecmd(1, 0, okresp, "ATE0\r");          // echo off again
tmpca1[0] = 0;                             // nothing read yet
if ((tmpp1 = strstr(outb, "AT+CGSN")) != NULL) {
    tmpp1[3] = ' ';                        // destroy the echoed command
    if (sscanf(tmpp1, "%*[^0-9]%15[0-9]%*[\r\nOK]", tmpca1)
        != 1) tmpca1[0] = 0;
}
if (strlen(tmpca1) == 15) cprintf("\rIMEI: %s\r\n", tmpca1);
else cprintf("\rCannot read the IMEI at present.\r");

Here, the four arguments to the "Writecmd" routine are:

seconds to wait for expected response

clear outb first (0 = no, 1 = yes)

the string to expect (okresp = "\r\nOK\r\n")

the command to the device.

This is part of a comprehensive netmonitoring program for a
Wavecom modem, and there are many other twists and turns to the
processing of course.

John

earth

2008-08-15 13:50:08 UTC

Post by John Henderson
My approach is to put everything from the device onto the end of
a string variable, which I'll refer to by its name, "outb".
This variable is dynamic in that it is allowed to grow as
required, and gets trimmed (one way or another) as data is
removed.

Yes I do that as well. Everything that comes off the serial port is
put onto the end of a string and I begin parsing. Once I have
recognised a command my parser goes into the waiting for parameters
state where I actually begin looking for the end delimiter. Once I
get that I take the parameters out and delete the whole command from
the string.

Post by John Henderson
When parsing outb, I look for result identifiers, like "+CBM: "
and "+CSQ: ", and store the positions of the first instances of
these into appropriate variables, like this in C:
ccedpos = strstr(outb, "+CCED: ");
cbmpos = strstr(outb, "+CBM: ");
csqpos = strstr(outb, "+CSQ: ");

Again similar. I look for the identifiers.

Post by John Henderson
Then I'll usually find which of these positions is the earliest
(smaller address offset), and process that one first in a
routine specific to its data.

Sounds familiar.

Post by John Henderson
Then I'll destroy the identifier
within outb (by changing "+CSQ: " to "-CSQ: " for example).
Then I'll repeat the above piece of code, find the next result,
and process it. When no identifier is to be found, I'll empty
outb:
outb[0] = 0;

Mine differs here slightly in that I delete the command from the
string when I have received and processed it.
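
Deleting a processed result from the middle of a NUL-terminated buffer, rather than overwriting its identifier, might look something like the sketch below. The helper name cut is hypothetical and is not taken from either program in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Remove len bytes starting at start from a NUL-terminated string,
   closing the gap so that any later data moves up. */
static void cut(char *start, size_t len)
{
    size_t tail = strlen(start);

    if (len > tail)
        len = tail;                      /* never cut past the terminator */
    memmove(start, start + len, tail - len + 1);   /* +1 moves the NUL too */
}
```

memmove is used rather than strcpy or memcpy because the source and destination overlap. The trade-off against the destroy-the-identifier approach is a copy per result in exchange for a buffer that never holds stale data.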

Post by John Henderson
This approach implies lingering long enough on reading the input
buffer to wait for results to arrive fully and intact.
Part of the routine to handle "+CSQ: " results looks like this:
csqpos[0] = '-';
csqpos += 6;
if (sscanf(csqpos, "%2d%*[,]", &rxlv) != 1) rxlv = 0;
if (rxlv) rxlv = (2 * rxlv) - 113; // convert to dBm
if (rxlv > 0) rxlv = 0;
so I'm not concerned here about trailing <cr> or <lf> characters
at all.

My reason for using the CRLF characters is to find the end of a result
string. And my reason for that is because some of the results have
variable length parameters and some parameters are optional. If the
parameters following the identifier were fixed length then I would not
be interested in an end delimiter for the result string I would just
wait for that number of bytes from the serial port.
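
One way to find the end of a variable-length result without caring which terminator the device used is to stop at the first CR or LF, whichever comes first. The sketch below is an illustration only: the name get_params and its return conventions are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Copy the parameters that follow the identifier id in buf into out
   (at most outsz - 1 characters), stopping at the first CR or LF.
   Returns the number of characters copied, or -1 if id is absent or no
   terminator has arrived yet (the result may still be incomplete). */
static int get_params(const char *buf, const char *id, char *out, size_t outsz)
{
    const char *p = strstr(buf, id);
    size_t n;

    if (p == NULL || outsz == 0)
        return -1;
    p += strlen(id);
    n = strcspn(p, "\r\n");          /* length up to the first CR or LF */
    if (p[n] == '\0')
        return -1;                   /* terminator not received yet */
    if (n >= outsz)
        n = outsz - 1;               /* truncate to fit */
    memcpy(out, p, n);
    out[n] = '\0';
    return (int)n;
}
```

A return of -1 tells the caller to keep reading from the serial port and try again, which covers optional and variable-length parameters without counting how many CR or LF characters follow.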

Post by John Henderson
I find the above works well for most mixes of immediate and
unsolicited-result commands. Minor exceptions include the
software user requesting the device IMEI. This is catered for
like this:
Writecmd(1, 0, okresp, "ATE1\r");
Writecmd(2, 0, "AT+CGSN", "AT+CGSN\r");
Writecmd(1, 0, okresp, "ATE0\r");
if ((tmpp1 = strstr(outb, "AT+CGSN")) != NULL) {
tmpp1[3] = ' ';
if (sscanf(tmpp1, "%*[^0-9]%15[0-9]%*[\r\nOK]", tmpca1)
!= 1) tmpca1[0] = 0;
}
if (strlen(tmpca1) == 15) cprintf("\rIMEI: %s\r\n", tmpca1);
else cprintf("\rCannot read the IMEI at present.\r");
Here, the four arguments to the "Writecmd" routine are:
seconds to wait for expected response
clear outb first (0 = no, 1 = yes)
the string to expect (okresp = "\r\nOK\r\n")
the command to the device.
This is part of a comprehensive netmonitoring program for a
Wavecom modem, and there are many other twists and turns to the
processing of course.
John

Thanks. Your use of sscanf is key there. I think some of the
problems I am experiencing are due to 'rogue responses' that do not
follow the CRLF pattern that others do. I am going to try to
become independent of CRLF.

How about echo? I use HyperTerminal to send commands ad hoc. I have
used it with echo off (but then I cannot see what I am typing), so I
choose either echo on or local echo of key presses. There I don't see
any responses prefixed with AT. However, in my app I do, even though I
specifically send an echo off command. I have a suspicion that
HyperTerminal is not presenting me with exactly what comes from the
serial port. Do you think so?

Thanks again,
Richard

John Henderson

2008-08-15 20:19:17 UTC

Hi Richard,

Post by earth
My reason for using the CRLF characters is to find the end of
a result string. And my reason for that is because some of
the results have variable length parameters and some
parameters are optional. If the parameters following the
identifier were fixed length then I would not be interested in
an end delimiter for the result string I would just wait for
that number of bytes from the serial port.

Likewise, much of the data I get is of unpredictable length.
For example, I read and display device make, model number, and
firmware revision during startup. I wrote that particular
routine before I started using sscanf, so my code is a little
obscure to me (written over 5 years ago). An sscanf reading the
string from the result identifier up to the first <cr>
character would be much more compact, comprehensible and
elegant.
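
Such an sscanf might look something like the sketch below. The function name read_text and the 64-byte buffer size are made up for the illustration, and any identifier passed in (for example "+CGMR: ") is an assumption about what a given device actually sends.

```c
#include <stdio.h>
#include <string.h>

/* Copy the text after identifier id, up to the first CR or LF, into out.
   out must hold at least 64 bytes. Returns 1 on success, 0 otherwise. */
static int read_text(const char *buf, const char *id, char out[64])
{
    const char *p = strstr(buf, id);

    if (p == NULL)
        return 0;
    /* %63[^\r\n] reads at most 63 characters that are neither CR nor LF */
    return sscanf(p + strlen(id), "%63[^\r\n]", out) == 1;
}
```

The width in the format string must stay one less than the buffer size, since sscanf appends the terminating NUL after the characters it reads.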

Post by earth
How about echo? I use HyperTerminal to send commands ad hoc.
I have used it with echo off (but then I cannot see what I am
typing), so I choose either echo on or local echo of key
presses. There I don't see any responses prefixed with AT.
However, in my app I do, even though I specifically send an
echo off command. I have a suspicion that HyperTerminal is not
presenting me with exactly what comes from the serial port.
Do you think so?

I generally switch echo off, for a number of reasons which have
more to do with elegance than necessity.

But some commands result in a data response without a result
identifier. For those, I turn echo on, to give me a handle to
find the data (it's expected just after the echoed command).
Reading the IMEI, as per my last reply, is a case in point.

In the times I used HyperTerminal, I did not think it was
interfering with the data stream.

Normally, I use Linux and minicom instead of HyperTerminal. But
my netmonitoring programs run as DOS programs so that they'll
work from any DOS boot disk (not even Windows) in a vehicle. I
soon modified the program on page 22 of this PDF file, and used
it instead of HyperTerminal:
http://www.beyondlogic.org/serial/serial.pdf

John