Tuesday, 21 April 2015

The Influence of Malloc Placement on TSX Hardware Transactional Memory

Interesting paper:

We show that the placement policies of dynamic storage allocators -- such as those found in common "malloc" implementations -- can influence the L1 conflict miss rate in the L1. Conflict misses -- sometimes called mapping misses -- arise because of less than ideal associativity and represent imbalanced distribution of active memory blocks over the set of available L1 indices. Under transactional execution conflict misses may manifest as aborts, representing wasted or futile effort instead of a simple stall as would occur in normal execution mode.
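
To make the notion of "L1 indices" concrete, here is a minimal sketch of how an address selects a cache set. It assumes an L1 data cache of 32 KiB, 8 ways and 64-byte lines (hence 64 sets); that geometry and the helper names `l1_set_index` and `max_set_occupancy` are illustrative assumptions, not taken from the paper.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed L1D geometry: 32 KiB, 8-way, 64-byte lines => 64 sets. */
enum { LINE_SIZE = 64, NUM_SETS = 64, NUM_WAYS = 8 };

/* Set index selected by an address: cache line number modulo the set count. */
static unsigned l1_set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/* Largest number of blocks that start in the same set.
   Assumes every block starts on a different cache line. */
static unsigned max_set_occupancy(const uintptr_t *blocks, size_t n)
{
    unsigned count[NUM_SETS] = {0};
    unsigned worst = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned c = ++count[l1_set_index(blocks[i])];
        if (c > worst)
            worst = c;
    }
    return worst;
}
```

Under these assumptions, blocks placed 4096 bytes apart all land in the same set, so more than `NUM_WAYS` of them cannot be resident at once. A stride of 4160 bytes (65 lines) moves each block to the next set instead.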

Sunday, 19 April 2015

Converting numbers to binary ASCII representation: a new method

Recently I've checked different methods of converting numbers to their binary representation, including the use of the new PDEP instruction from the BMI2 extension.

Today I've updated the article with a new SWAR version 2, a tricky use of multiplication. The method is not faster, but I like the approach: under certain conditions multiplication can be seen as a multi-shift/bit-or instruction. I've already used multiplication in this way to emulate the pmovmskb instruction.
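
The "multiplication as multi-shift/bit-or" idea can be sketched as follows. This is an illustration of the general trick, not necessarily the exact code from the article; the function names `byte_to_binary_ascii` and `movemask8` are hypothetical, and the movemask variant handles one 8-byte word rather than the 16 bytes that pmovmskb covers. The trick works when the shifted copies produced by the multiplier never overlap, so adding them behaves like OR-ing them.

```c
#include <stdint.h>

/* Write the 8 bits of x as '0'/'1' characters, most significant bit first.
   out must have room for 9 bytes (8 digits plus the terminator). */
static void byte_to_binary_ascii(uint8_t x, char out[9])
{
    /* The multiplier has bits 0, 9, 18, ..., 63 set, so the product holds
       eight non-overlapping copies of x, each shifted by a multiple of 9. */
    uint64_t copies = (uint64_t)x * 0x8040201008040201ULL;
    /* Bit 7 of byte j in the product is bit (7 - j) of x; move it to bit 0. */
    uint64_t bits = (copies >> 7) & 0x0101010101010101ULL;
    /* Turn each 0/1 byte into the ASCII digit '0' or '1'. */
    uint64_t digits = bits + 0x3030303030303030ULL;
    for (int j = 0; j < 8; j++)
        out[j] = (char)((digits >> (8 * j)) & 0xFF);
    out[8] = '\0';
}

/* Collect the most significant bit of each byte of v into an 8-bit mask;
   bit j of the result is the top bit of byte j. */
static unsigned movemask8(uint64_t v)
{
    uint64_t msbs = (v >> 7) & 0x0101010101010101ULL;
    /* The multiplier shifts the bit of byte j left by 7 * (8 - j),
       gathering all eight bits in the top byte without collisions. */
    return (unsigned)((msbs * 0x0102040810204080ULL) >> 56);
}
```

For example, `byte_to_binary_ascii(0xA5, buf)` fills `buf` with "10100101", and `movemask8(0x8000000000000080)` returns 0x81. Extracting the bytes with shifts rather than a memory copy keeps the sketch independent of byte order.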

Monday, 13 April 2015

Thursday, 9 April 2015

GitHub repositories

I've put the source code for two of my articles on GitHub.
The repositories contain the original code (C99, 32-bit, for GCC with inline assembly) as well as new programs in C++11 using intrinsics, tested in a 64-bit environment.


BTW the article about popcount has gained popularity, and I hope another crazy idea about hacking MPSADBW will spread all over the world.

SIMD-ized searching in unique constant dictionary

The problem: there is an ordered dictionary containing only unique keys. The dictionary is read-only, and the keys are 32-bit (SSE) or 64-bit (AVX2).