środa, 13 kwietnia 2011

Regular expressions derivatives

Have you ever heard about "regular expression derivatives"? No? Me too. This is very interseting idea, that have pratical usage in pattern matching.

Matthew Might wrote a sample program using the idea: A regular expression matcher in Scheme using derivatives. There is also a detailed paper on this topic: Regular-expression derivatives re-examined by Owens, Reppy and Turon.

Traversing DAGs

If DAG has got one component, then the simplest traversing
method is depth-first-search, which could be easily implemented
recursively (using implicit stack).

struct DAGNode {
        // user data
        bool    visited;        // after construction = 0
}

void DFS_aux(DAGNode* node, const bool val) {
        if (node->visited != val) {
                // visit node

                node->visited = val;
                for (n in node.connected)
                        DFS_aux(n, val)
        }
}

void DFS(DAGNode node) {
        static val = true;
        DFS_aux(node, val);
        val = not val;
}

On every call of DFS() variable val is switched, and visited member is marked alternately with true or false.

There is just one problem — what if traversing method stop execution before visiting all nodes? Of course in such situation we have to visit DAG twice: on first pass reset (possibly many times) visited member to false, and then visit once each node.

But usually bool have at least 8 bits, so numbers could be used instead of boolean values 0 or 1. On each call of DFS() a reference number is incremented, thanks that even if previous call stopped in the middle procedure will work
correctly.

The only moment when visited markers have to be cleared is wrapping a reference numbers to zero. This happen every 256 calls if 8-bit values used; for wider counters (16, 32 bits)
max value is greater.

void DFS(DAGNode node) {
        static unsigned int val = 1;
        if (val == 0) {
                // set visited member of all nodes to 0
                val += 1;
        }
        DFS_aux(node, val);
        val += 1;
}