While reviewing the dzen
source code I came across a function called
fill_ev_table
, which parses a string of action and event
keywords (see Events and
actions). For example, the following is a default event string for
dzen.
char edef[] = "entertitle=uncollapse,grabkeys;"
"enterslave=grabkeys;leaveslave=collapse,ungrabkeys;"
"button1=menuexec;button2=togglestick;button3=exit:13;"
"button4=scrollup;button5=scrolldown;"
"key_Up=scrollup;key_Down=scrolldown;"
"key_Escape=ungrabkeys,exit";
The parsing is done by calling the strtok_r
function,
which is a member of the C Standard Library.
char* strtok(char * restrict s1, const char *restrict s2);
The original strtok
function accepts two strings, the
first is the string to be parsed and the second is the delimiter. The
first time that strtok
is called with a new string it will
created a static buffer. That static buffer is used to return successive
tokens with each preceding call made to strtok
. The
existence of a static buffer that could be accessed by multiple threads
means this function is neither reentrant, nor thread-safe. This issue is
solved by the strtok_r
function.
char* strtok_r(char * restrict s, const char *restrict sep, char **restrict lasts);
The reentrant version of strtok
forces the caller to
save the state of a string that is being parsed. It does this by adding
a third argument, which is a pointer to the string that is being
manipulated. The first call to strtok_r
will return the
leading token in the string and store address of the buffer in the third
parameter.1 Each following call will return the
next token in the string, unless no more tokens exist, in which case the
function returns NULL.
The process of parsing a string with multiple delimiters can easily lead to terse branching code. This is clearly evidenced by the function I came across in the dzen source code.
void
(char *input) {
fill_ev_tablechar *str1, *str2, *str3, *str4,
*token, *subtoken, *kommatoken, *dptoken;
char *saveptr1=NULL,
*saveptr2=NULL,
*saveptr3=NULL,
*saveptr4=NULL;
int j, i=0, k=0;
long eid=0;
*ah=0;
handlerf
for (j = 1, str1 = input; ; j++, str1 = NULL) {
= strtok_r(str1, ";", &saveptr1);
token if (token == NULL)
break;
for (str2 = token; ; str2 = NULL) {
= strtok_r(str2, "=", &saveptr2);
subtoken if (subtoken == NULL)
break;
if( (str2 == token) && ((eid = get_ev_id(subtoken)) != -1))
;
else if(eid == -1)
break;
for (str3 = subtoken; ; str3 = NULL) {
= strtok_r(str3, ",", &saveptr3);
kommatoken if (kommatoken == NULL)
break;
for (str4 = kommatoken; ; str4 = NULL) {
= strtok_r(str4, ":", &saveptr4);
dptoken if (dptoken == NULL) {
break;
}
if(str4 == kommatoken && str4 != token && eid != -1) {
if((ah = get_action_handler(dptoken)) != NULL) {
(eid);
new_event(eid, i, ah);
add_handler++;
i}
}
else if(str4 != token && eid != -1 && ah) {
(eid, i-1, k, dptoken);
add_option++;
k}
else if(!ah)
break;
}
=0;
k}
(eid);
new_event(eid, i, NULL);
add_handler=0;
i}
}
}
This is clearly written like the example source code provided in the
Linux man-page STRTOK(3), which is very suspect. While searching for an
alternative to strtok
I came across a post by
MrMichaelWill who wrote the following on LinuxQuestions.org:2
Unfortunately the manpage was updated by an amateur. The original author of the functions would certainly disagree.
This is a great example of when to check other sources. An excellent
starting place is the POSIX STRTOK(3P)
man-page. Although I could not find any applicable implementations of
strtok
I decided to design a more elegant algorithm than
the one presented above.
Implementing strtok_r
to parse the aforementioned event
string is surprisingly difficult.
Note that although strtok_r
does not check
the value of the third parameter, it does require a pointer to store the
address. Using NULL as the third argument for the first call↩︎