Calculating bit parity
It takes a really perverse mind to try to calculate bit parity in Python, right? Well, you've certainly come to the right place! The internet has a couple of places like this gold standard for bit hacks - where you can find highly optimized bit hacks for calculating bit parity. I wanted to know if these algorithms also make a difference in a highly abstract language like Python, where integers can have arbitrary size and people try to stay away from too much bit fiddeling
Setting up a test harness
As usual, our first step is to prepare a testbed. What we want to do is verify that our solution is correct, and the easiest way to verify that is to compare it to a known solution. The correct solution will be provided using brute-force, and all other algorithms have to compare against that.
This is the code:
#! -*- Encoding: Latin-1 -*-
def selftest():
A = [random.randrange(0xFFFFFFFF) for k in range(50000)]
methods = { 'brute-force' : parity_brute_force,
'parity-kr' : parity_kr,
'parallel-swar' : parallel_swar,
'using-lookup' : using_lookup,
'using-modulo' : using_modulo,
'in-parallel' : in_parallel,
}
known_result = None
for method in methods:
t0 = time.time()
result = map(methods[method], A)
print "%20s: %.5f" % (method, time.time() - t0, )
if known_result is None:
known_result = result[:]
else:
assert result == known_result
if __name__ == "__main__":
selftest()
Using brute-force
The brute-force algorithm is very simple
- Count the number of bits set in a value x
- If the number of bits is even, the parity is 0, if it is odd, it is 1. This can be easily calculated by using number-of-bits modulo 2.
A possible implementation of this algorithm reads:
def parity_brute_force(x):
bit = 0
num_bits = 0
while x:
bitmask = 1 << bit
bit += 1
if x & bitmask:
num_bits += 1
x &= ~bitmask
return num_bits % 2
A small improvement is to not clear each bit explicitly, but use a mask:
def parity_kr(x):
bit = 0L
parity = False
while x:
parity = not parity
x = x & (x - 1)
return int(parity)
Solution using hamming weight
Don't get intimidated: I didn't know the term Hamming Weight either before I embarked upon this journey. But here is an algorithm that I had seen before using it:
def parallel_swar(i):
i = i - ((i >> 1) & 0x55555555)
i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
i = (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24
return int(i % 2)
Straight-Forward Python implementations of Bit Hacks
The gold standard for bit hacks contains a couple of suggestions and you should check there for how and why they work, but work they do. Here they are in all their beauty:
parity_lookup = [parallel_swar(i) for i in range(256)]
def using_lookup(v):
v ^= v >> 16
v ^= v >> 8
return parity_lookup[v & 0xff]
def using_modulo(v):
v ^= v >> 1
v ^= v >> 2
v = (v & 0x11111111) * 0x11111111
return (v >> 28) & 1
def in_parallel(v):
v ^= v >> 16
v ^= v >> 8
v ^= v >> 4
v &= 0xf
return (0x6996 >> v) & 1
That wasn't too bad, was it?
Test results
For a 100.000 test numbers, I get these result on my machine:
brute-force: 1.06600 parity-kr: 0.29800 parallel-swar: 0.13600 using-modulo: 0.08000 in-parallel: 0.07400 using-lookup: 0.05100
There are few surprises here, but it is interesting to see that the lookup table - which is mentioned in the very fine Elements of Programming Interviews as probably the fastest one in C/C++ is also the fastest one in Python: using a lookup table. Moral of the story: it does pay to look at bit hacks even in something as remote from bits as Python...