Simple regular expressions on a list of symbols instead of a string?

Post by George Pet » Tue, 14 Nov 2006 09:53:12


Hi all,

I want to use really simple regular expressions to identify matching
ranges in a Tcl list. For example, given the pattern:

a b+ a

I would like to locate, in the following list,

c c c a b b a c c (where a, b, c can be symbols, i.e. words or other
structures)

the {a b b a} sublist.
How can this be implemented?
I certainly cannot use the regexp command, as it only operates on
strings. I have to somehow convert the regular expression into an
automaton, which will then be applied to the input.

I tried to use grammar::fa from tcllib, but with limited success.
I tried the following:

package require grammar::fa
package require grammar::fa::dexec
grammar::fa fa
fa fromRegex {. {S a} {+ {S b}} {S a}}
fa determinize

But how can I apply the automaton to locate the sublist?
So far, I have tried:

proc executor_callback {op args} {
    puts "$op: $args"; update
    switch $op {
        error {parser reset}
        final {parser reset}
    }
}

::grammar::fa::dexec parser fa -command executor_callback

foreach token {c c c a b b a c c} {
    parser put $token
}

When I run the above code, I can see that there is a match, but I don't
know where. Also, is there a way that I can control the matching of the
symbols?

When I put a symbol (let's say c), the automaton compares it with the
symbols a or b that I defined in the fromRegex command. Is there a way I
can do the comparison myself through a callback?

The reason behind this is that my symbols are Tcl lists. In my regular
expression I can have some constraints for each symbol, which look at
specific elements of the lists that serve as symbols. If I had a
callback with the two elements as input (the symbol pushed into the fa
and what I gave in the "S symbol" statements when building the fa),
then I could perform my application-specific comparisons.
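
As an illustration of the kind of matching described above, here is a minimal sketch that does not use grammar::fa at all. It is a small backtracking matcher in which every pattern element carries its own comparison callback. The proc names matchHere, matchList and isSym, the {quantifier predicate} pattern layout, and the sample data are all assumptions made up for this example, and it needs Tcl 8.5 or later (lassign, {*}).

```tcl
# Hypothetical helpers, not part of tcllib. Requires Tcl 8.5+.
# A pattern is a list of {quantifier predicate} pairs.
#   quantifier: 1 (exactly one), ? (zero or one), * (zero or more), + (one or more)
#   predicate:  command prefix, called with one list element, returns a boolean

# Try to match the pattern starting at index pos.
# Returns the index just past the match, or -1 if there is no match.
proc matchHere {pattern items pos} {
    if {[llength $pattern] == 0} { return $pos }
    lassign [lindex $pattern 0] quant pred
    set rest [lrange $pattern 1 end]
    switch -- $quant {
        1 { set min 1; set max 1 }
        ? { set min 0; set max 1 }
        * { set min 0; set max [llength $items] }
        + { set min 1; set max [llength $items] }
        default { error "unknown quantifier \"$quant\"" }
    }
    # Count how many consecutive elements satisfy the predicate (at most max).
    set n 0
    while {$n < $max && $pos + $n < [llength $items]
           && [{*}$pred [lindex $items [expr {$pos + $n}]]]} {
        incr n
    }
    # Greedy: try the longest run first, then back off one element at a time.
    for {} {$n >= $min} {incr n -1} {
        set end [matchHere $rest $items [expr {$pos + $n}]]
        if {$end >= 0} { return $end }
    }
    return -1
}

# Returns {first last} for the leftmost non-empty match, or an empty list.
proc matchList {pattern items} {
    for {set i 0} {$i < [llength $items]} {incr i} {
        set end [matchHere $pattern $items $i]
        if {$end > $i} { return [list $i [expr {$end - 1}]] }
    }
    return {}
}

# Example predicate: each symbol is itself a list, and only its first
# element is compared (an assumed layout, chosen for the example).
proc isSym {want item} { expr {[lindex $item 0] eq $want} }

set items   {{c 1} {c 2} {c 3} {a 4} {b 5} {b 6} {a 7} {c 8} {c 9}}
set pattern {{1 {isSym a}} {+ {isSym b}} {1 {isSym a}}}
puts [matchList $pattern $items]   ;# prints: 3 6
```

The returned pair can be passed to lrange to extract the sublist. Backtracking can be slow for patterns with many nested repetitions; for a few hundred symbols and simple patterns such as a b+ a it should be adequate, but that has not been measured here.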

Finally, the documentation mentions a package grammar::fa::compiler
that seems to be missing. I also couldn't understand what the packages
grammar::me and grammar::peg do, or how they can be used. Any help in
understanding them would be appreciated... :-)

George

Post by Bruc » Tue, 14 Nov 2006 10:22:11


Or, you could join the list into a string using a char that
can't be in any of your list elements, then also include that
char in your expressions, and go ahead and use regexp.
Then re-split the string back into a list.
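
A rough sketch of this join-and-regexp idea, for the simple case where the symbols are plain words. The proc name findByJoin, the comma separator, and the convention of terminating every element with the separator are assumptions made for the example. It uses regexp -indices to get character offsets and converts them back to list indices by counting separators. Tcl 8.5 or later is assumed.

```tcl
# Hypothetical helper, not a tcllib command. Requires Tcl 8.5+.
# re  : must contain exactly one capture group and must expect every
#       element to be followed by the separator
# sep : a character assumed never to occur inside an element, and
#       assumed not to be a regexp metacharacter
proc findByJoin {items re sep} {
    # Terminate every element with the separator: "c,c,c,a,b,b,a,c,c,"
    set str [join $items $sep]$sep
    if {![regexp -indices -- $re $str -> span]} { return {} }
    lassign $span from to
    # Separators before the match start = index of the first matched element.
    set first [regexp -all -- $sep [string range $str 0 [expr {$from - 1}]]]
    # Separators inside the match = number of matched elements.
    set count [regexp -all -- $sep [string range $str $from $to]]
    return [list $first [expr {$first + $count - 1}]]
}

set items {c c c a b b a c c}
# "a b+ a" with a separator after each symbol; (?:^|,) pins the match
# to an element boundary so that e.g. "xa," cannot match as "a,".
set range [findByJoin $items {(?:^|,)(a,(?:b,)+a,)} ","]
puts $range                        ;# prints: 3 6
puts [lrange $items {*}$range]     ;# prints: a b b a
```

Working with indices avoids having to re-split the matched string, and the original list elements are returned untouched.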

Bruce

Post by George Pet » Tue, 14 Nov 2006 10:32:49

Bruce wrote:


Not a solution for me :-) I have tried it and it was very slow.
My "symbols" are Tcl lists with 7-8 elements, many of which are also
lists, and I have a few hundred of them. Concatenating all this
information into a string (besides the fact that it will be a big
string) needs a very complex regular expression, just to check one or
two parameters. I need a better solution :-(

George