Bit Operations in C/C++

Robert B. Heckendorn
University of Idaho


It is important to note that +, -, *, / bind more tightly than any of the bitwise operators! (see the shift example.)

Types for using with bit operations

Use unsigned types such as unsigned int to avoid issues with sign extension.

In the table below maximum is a predefined constant that contains the maximum value for the given type.

type                    maximum       size
unsigned char           UCHAR_MAX     1 byte
unsigned short          USHRT_MAX     2 bytes
unsigned int            UINT_MAX      4 bytes
unsigned long int       ULONG_MAX     4 or 8 bytes depending on the -m option on the compiler
unsigned long long int  ULLONG_MAX    8 bytes

The suffix for unsigned long int is UL (can be mixed case). For example 496UL is an unsigned long int constant. 6U is an unsigned int and 8192ULL is an unsigned long long int (64 bit) constant.

Expressing bit string constants

Although you can use integer constant notation using hexidecimal (hex) notation allows you to "see" what bits you are setting. Each hex digit translates into 4 bits of binary. Hex constants begin with "0x" as in this hex: 0xfff for the least significant 12 bits. Each f in hex represents 0xf = 15 in decimal = 1111 in binary. 3 f's gives you 12 bits of 1's. For an 32 bit string it would be:

0xfff = 00000000000000000000111111111111

0x1 is the least significant bit.

0x1 = 00000000000000000000000000000001
and
0xaa = 00000000000000000000000010101010

Bitwize And

The bitwize and operator is represented by a single &. This performs a bitwize and. Each bit position in one operand is paired with the same position in the other operand and then and'ed. For example 0xc & 0xa would give 0x8:

   00001100
&  00001010
-----------
   00001000
That is, by "bitwize" we mean the operation is done for each bit position in the string. That is for each bit position k in the input strings we AND the bits in the two input strings in position k. For the logical operators, 1 is treated as true and 0 as false.

This can be used for selecting bits. For example select the least significant bit of x:

x & 0x1

Select the right most 5 bits:

x & 0x1f

The term mask is used to refer to a bitstring that is used for selection in some way. For instance:

x = x & mask
only preserves the bits in x where there is a 1 in the mask. This can, of course, be expressed as:
x &= mask

Bitwize Or

The bitwize or operator is a single vertical bar: | For example 0xc | 0xa would give 0xe:

   00001100
|  00001010
-----------
   00001110

This can be used to set bits in bit strings. For example set the second bit of x to 1:

x = x | 0x2

Or it can be used in conjunction with and to join parts of two bit strings. For example set the lower 3 bits of a 32 bit x to the lower 3 bits of y:

x = (x&0xfffffff8) | (y&0x7)

Set the bits in x that are indicated by the mask:

x |= mask

Bitwize Xor

The bitwize exclusive or operator (also known as xor) is a caret: ^ For example 0xc ^ 0xa would give 0x6:

   00001100
^  00001010
-----------
   00000110

Note that bits are only set if there is a difference in the values of a bit position.

The xor operator can be used to flip bits. For instance if you wanted to flip the last 3 bits of x:

x = x ^ 0x7

More generally: flip the bits that are indicated by the mask:

x ^= mask

Here it is in binary:

   10110101 = x
^  00011111 = mask
-----------
   10101010 = new x

You can also use this to swap two strings in a nontraditional way:

x = x ^ y;
y = y ^ x;
x = x ^ y;

You can use this to swap the parts of a string:

x = x ^ (y & mask);
y = y ^ (x & mask);
x = x ^ (y & mask);

Here is the swap in binary with spaces put in to make it easier to see where the swap happens:

    x = 11011110110 00000011100 1011011011
    y = 01100011011 11100110111 0001110101
 mask = 00000000000 11111111111 0000000000

x = 11011110110 11100110111 1011011011 y = 01100011011 00000011100 0001110101

Bitwize Complement

The bitwize complement operator is a tilde: ~ For example ~0xc would give 0xf3 in an unsigned char

~  00001100
-----------
   11110011

Let's revisit the combing of two strings. Or it can be used in conjunction with and to join parts of two bit strings. For example set the lower 3 bits of a 32 bit x to the lower 3 bits of y:

x = (x & ~0x7) | (y & 0x7)

Bitwize Shift

The bitwize right shift operator is: >> and left shift operator is: << For unsigned types the right shift fills with 0. The left shift always fills with zero.

For example 0xc >> 3 would give 0x1:

   00001100
>> 3
-----------
   00000001

and 0xc << 3 would give 0x60:

   00001100
<< 3
-----------
   01100000

Warning: do not try to shift the same number of bits as there are in the type being shifted!! This is returns different values on different processors and is not reliable and may not issue an error when it does this! e.g. do not do x<<32 for unsigned int x.

Another important idiom is to put a 1 in a particular position. For instance:

1<<13
will put a 1 in position number 13 with the least significant bit position being position 0. That is it will create a 1 followed by 13 zeros! You can use this to flip the bit in position 10 in x:
x = x ^ (1<<10)

(x>>12)<<12
will zero the right most 12 bits of x. This, of course, could be done with a single bitwize and.

You can use shift to generate patches of 1 bits. To get 21 bits of 1 bits right justified in the value i.e. 0x1fffff you can do

(1<<21)-1
This is a popular idiom.

It is important to note that +, -, *, / bind more tightly than any of the bitwise operators! This means that:

1<<21-1 = 1<<20

To extract bits 4-6 inclusive of x assuming the right most bit is numbered 0:

(x>>4)&0x7
this will first position bit number 4 and everything to the left of that starting in position 0 and the mask away all but the right most 3 bits giving you what was bits 4-6 but right justified.

In order to rotate bits left you need to take the bits off the left side and attach them on the right. So for a 32 bit x that you want to rotate by amount r:

(x<<r) | (x>>(32-r))

Binary Reflected Gray Code

Binary is one way to map a string of bits to an integer. But it isn't the only way!! It turns out that another popular way to represent an integer as a string of bits is to use Gray code. When counting in binary, everytime you increment or decrement you can flip some bits. It isn't consistent in binary how many bits you flip. Sometimes it takes a lot of bits flipped to get to the next highest number like moving from 15 in binary which is 00001111 to 16 which is 00010000. Five bits had to be flipped. THis means that single bit flipping does not allow you move easily from one binary number to the next larger binary number. So a single bit flip mutate sometimes can't get at the next larger or smaller number! Gray code is a different mapping from bit strings to integers that is guaranteed to flip only one bit to increment or to decrement. Examine the translation from an integer to binary and to gray code in this table:

Number Binary Gray
0 0000 0000
1 0001 0001
2 0010 0011
3 0011 0010
4 0100 0110
5 0101 0111
6 0110 0101
7 0111 0100
8 1000 1100
9 1001 1101
10 1010 1111
11 1011 1110
12 1100 1010
13 1101 1011
14 1110 1001
15 1111 1000

It turns out that you can easily compute a type of Gray code called the Binary Reflected Gray code of a bitstring x using exclusive or and a shift.

x ^ (x>>1)
Yes, it is that simple.

The degray function is used to decode a number that is in Gray code into a regular number.

degray(gray(x)) == x

and

gray(degray(x)) == x
To degray is also relatively simple although you now need to know how many bits wide x is. For 32 bit x you need to execute these lines in order: (see the bit helper routines I have supplied for the code.)
x^=(x>>1);
x^=(x>>2);
x^=(x>>4);
x^=(x>>8);
x^=(x>>16);

So, if a bit string is assumed to be in Gray code you can use degray to translate that bit string into a binay code string for the same number. For the table above it would mean if you saw 0110 in gray code you could compute degray(0110) which gives 0100 in binary (see table above or use the degray function) which is 4 as an integer.

If you have a number like 0111 in binary and what to know what it is in gray code just take the gray(0111) which is 100.

Using a Random Number Generator and Bitstrings

Random number generators (RNG) generally produce a pseudo-random number as a 32 or a 64 bit number depending on the generator. Most require initialization. There are two qualities of a random number generator that are of interest to us: speed and randomness. There is to some degree a tradeoff between these.

A convenient way to use a random number is to use a call that returns a double between 0 and 1. But this often does more work than is necessary for a particular application of the RNG and is therefore slower than it needs to be. In some applications the number of calls to the RNG is sufficiently large that it can be a major part of time used by the application. Here are some examples of using the raw bits returned to speed things up. This requires a reasonably unpatterned random number generator unlike the old UNIX rand() procedure. I use a small random number generator called rand whose code is included.

unsigned long long int randULL();

If you want to decide between two things as if you are flipping a fair coin then simply mask a bit out of the random bitstring:

if (randULL()&0x1ULL) stuff   // uses the least significant bit
Check to be sure your random number generator doesn't have a pattern in the bit position you use (beware linear congruential RNGs).

If you want to chose between k things do not use the mod (%) operator. It is very slow. If you can convert it to a choice between a power of two number of things then use a mask. Say you want to choose between 6 things then try to make it a choice between 8 things and mask off 3 bits.

If you want to choose something with a probability p and you have the chance to choose a p' that close but is k times a power of two then again you can simply mask bits off the RNG.