Tuesday, 29 March 2011

pyahocorasick

A Python module implementing the Aho-Corasick algorithm has been released. Both a C extension (for Py3k) and a pure Python implementation are available.

Monday, 28 March 2011

Internal memory fragmentation

In the previous post I advertised my text about trie representations.

Depending on the particular representation, internal memory fragmentation varies from 25% to 46% (in GNU libc). In other words, if a trie should occupy 100MB, then in the worst case real memory usage is around 200MB. I never expected fragmentation to be so significant.
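The arithmetic behind the 100MB example can be sketched as follows. This assumes the fragmentation percentage is the share of the allocated memory that is wasted; the post does not spell out the exact definition, and the function name real_usage is made up for illustration.

```python
def real_usage(payload_mb: float, fragmentation: float) -> float:
    """Memory actually taken, ASSUMING `fragmentation` is the wasted
    share of that memory (wasted / total), not wasted / payload."""
    if not 0.0 <= fragmentation < 1.0:
        raise ValueError("fragmentation must be in [0, 1)")
    return payload_mb / (1.0 - fragmentation)


# The two extremes quoted above, applied to a trie that needs 100 MB of nodes.
for frac in (0.25, 0.46):
    print(f"{frac:.0%} fragmentation: 100 MB of nodes -> "
          f"{real_usage(100, frac):.0f} MB in memory")
```

Under this reading the worst case comes out at roughly 185MB, which fits the "around 200MB" estimate.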

When quite simple memory pools were used, internal fragmentation was cut down to 1-2%! Impressive.
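Why a pool helps can be illustrated with a small model. It is only a sketch under stated assumptions: a 64-bit glibc-style allocator where every malloc chunk carries an 8-byte header and is rounded up to a multiple of 16 bytes with a 32-byte minimum, a made-up node size of 24 bytes, and a made-up pool block of 64 KiB. It ignores alignment inside the pool and is not the code used for the measurements, so its percentages need not match them.

```python
HEADER = 8       # assumed per-chunk bookkeeping, in bytes
ALIGN = 16       # assumed chunk size granularity
MIN_CHUNK = 32   # assumed smallest chunk


def chunk_size(request: int) -> int:
    """Bytes consumed by one malloc(request) under the assumed model."""
    size = (request + HEADER + ALIGN - 1) // ALIGN * ALIGN
    return max(size, MIN_CHUNK)


def waste_per_node_malloc(sizes) -> float:
    """Wasted share of memory when every node gets its own malloc call."""
    payload = sum(sizes)
    total = sum(chunk_size(s) for s in sizes)
    return (total - payload) / total


def waste_pooled(sizes, block: int = 64 * 1024) -> float:
    """Wasted share when nodes are packed back to back into large blocks.

    Each block is itself one malloc chunk; the unused tail of a block
    counts as waste.
    """
    payload = sum(sizes)
    blocks, used = 0, block          # forces a fresh block for the first node
    for s in sizes:
        if used + s > block:         # node does not fit: start a new block
            blocks += 1
            used = 0
        used += s
    total = blocks * chunk_size(block)
    return (total - payload) / total


sizes = [24] * 100_000               # hypothetical trie: 100k nodes of 24 bytes
print(f"per-node malloc: {waste_per_node_malloc(sizes):.1%}")
print(f"pooled:          {waste_pooled(sizes):.1%}")
```

In this model the waste comes from per-chunk headers and rounding, which a pool pays once per large block instead of once per node.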