poniedziałek, 3 marca 2014

Slow-paths in GNU libc strstr

I've observed that some patterns issued to strstr cause significant slowdown.

Sample program kill-strstr.c executes strstr(data, pattern), where data is a large string (16MB) filled with character ?; patterns are read from command line.

On my machine following times were recorded:

1. searching string 'johndoe'...
        time: 0.032
2. searching string '??????????????????a'...
        time: 0.050
3. searching string '??????????????????????????????a'...
        time: 0.049
4. searching string '???????????????????????????????a'...
        time: 0.274
5. searching string '??????????????????????????????a?'...
        time: 0.356
6. searching string '??????????????????????????????a??????????????????????????????'...
        time: 0.396
  • Slowdown is visible in case 4 (5 times slower than pattern 3). Pattern has 32 characters, and contains '?', except last char.
  • Even bigger slowdown occurs in case 5 (7 times slower). This pattern also contains 32 chars, but position of the single letter 'a' is last but one.
  • Similar slowdown occurs in case 5 (nearly 8 times slower). In this pattern single letter 'a' is surrounded by 30 '?'.