Bit Operations in C/C++

Robert B. Heckendorn
University of Idaho

It is important to note that +, -, *, / bind more tightly than any of the bitwise operators! (see the shift example.)

Types for using with bit operations

Use unsigned types such as unsigned int to avoid issues with sign extension.

In the table below maximum is a predefined constant that contains the maximum value for the given type.

type                    maximum       size
unsigned char           UCHAR_MAX     1 byte
unsigned short          USHRT_MAX     2 bytes
unsigned int            UINT_MAX      4 bytes
unsigned long int       ULONG_MAX     4 or 8 bytes depending on the -m option on the compiler
unsigned long long int  ULLONG_MAX    8 bytes

The suffix for unsigned long int is UL (can be mixed case). For example 496UL is an unsigned long int constant. 6U is an unsigned int and 8192ULL is an unsigned long long int (64 bit) constant.

Expressing bit string constants

Although you can use integer constant notation using hex allows you to "see" what bits you are setting. Hex constants begin with "0x" as in 0xfff for the least significant 12 bits. For an 32 bit string it would be:

0xfff = 00000000000000000000111111111111

0x1 is the least significant bit.

 0xaa = 00000000000000000000000010101010

Bitwize And

The bitwize and operator is represented by a single &. This performs a bitwize and. For example 0xc & 0xa would give 0x8:

   00001100
&  00001010
-----------
   00001000

By "bitwize" we mean the operation is done for each bit position in the string. That is for each bit position k in the input strings we AND the bits in the two input strings in position k. For the logical operators, 1 is treated as true and 0 as false.

This can be used for selecting bits. For example select the least significant bit of x:

x & 0x1

Select the right most 5 bits:

x & 0x1f

The term mask is used to refer to a bitstring that is used for selection in some way. For instance:

x = x & mask

only preserves the bits in x where there is a 1 in the mask. This can, of course, be expressed as:

x &= mask

Bitwize Or

The bitwize or operator is a single vertical bar: | For example 0xc | 0xa would give 0xe:

   00001100
|  00001010
-----------
   00001110

This can be used to set bits in bit strings. For example set the second bit of x to 1:

x = x | 0x2

Or it can be used in conjunction with and to join parts of two bit strings. For example set the lower 3 bits of a 32 bit x to the lower 3 bits of y:

x = (x&0xfffffff8) | (y&0x7)

Set the bits in x that are indicated by the mask:

x |= mask

Bitwize Xor

The bitwize exclusive or operator (also known as xor) is a caret: ^ For example 0xc ^ 0xa would give 0x6:

   00001100
^  00001010
-----------
   00000110

Note that bits are only set if there is a difference in the values of a bit position.

The xor operator can be used to flip bits. For instance if you wanted to flip the last 3 bits of x:

x = x ^ 0x7

More generally: flip the bits that are indicated by the mask:

x ^= mask

Here it is in binary:

   10110101 = x
^  00011111 = mask
-----------
   10101010 = new x

You can also use this to swap two strings in a nontraditional way:

x = x ^ y;
y = y ^ x;
x = x ^ y;

You can use this to swap the parts of a string:

x = x ^ (y & mask);
y = y ^ (x & mask);
x = x ^ (y & mask);

Here is the swap in binary with spaces put in to make it easier to see where the swap happens:

x = 11011110110 00000011100 1011011011 y = 01100011011 11100110111 0001110101 mask = 00000000000 11111111111 0000000000

x = 11011110110 11100110111 1011011011 y = 01100011011 00000011100 0001110101

Bitwize Complement

The bitwize complement operator is a tilde: ~ For example ~0xc would give 0xf3 in an unsigned char

~  00001100
-----------
   11110011

Let's revisit the combing of two strings. Or it can be used in conjunction with and to join parts of two bit strings. For example set the lower 3 bits of a 32 bit x to the lower 3 bits of y:

x = (x & ~0x7) | (y & 0x7)

Bitwize Shift

The bitwize right shift operator is: >> and left shift operator is: << For unsigned types the right shift fills with 0. The left shift always fills with zero.

For example 0xc >> 3 would give 0x1:

   00001100
>> 3
-----------
   00000001

and 0xc << 3 would give 0x60:

   00001100
<< 3
-----------
   01100000

Warning: do not try to shift the same number of bits as there are in the type being shifted. This is returns different values on different processors and is not reliable and may not issue an error when it does this! e.g. do not do x<<32 for unsigned int x.

(x>>12)<<12

will zero the right most 12 bits of x. This, of course, could be done with a single bitwize and.

To get 21 bits of 1 right justified in the value i.e. 0x1fffff you can do

(1<<21)-1

This is a popular idiom.

It is important to note that +, -, *, / bind more tightly than any of the bitwise operators! This means that:

1<<21-1 = 1<<22

To extract bits 4-6 inclusive of x assuming the right most bit is numbered 0:

(x>>4)&0x7

this will first position bit number 4 in position 0 and the mask away all but the right most 3 bits giving you what was bits 4-6.

In order to rotate bits left you need to take the bits off the left side and attach them on the right. So for a 32 bit x that you want to rotate by amount r:

(x<>(32-r))

Binary Reflected Gray Code

It turns out that you can easily compute the Binary Reflected Gray code of a bitstring x:

x ^ (x>>1)

Yes, it is that simple.

The degray function is used to decode a number that is in Gray code into a regular number.

degray(gray(x)) == x

and

gray(degray(x)) == x

To degray is also relatively simple although you now need to know how many bits wide x is. For 32 bit x you need:

x^=(x>>1);
x^=(x>>2);
x^=(x>>4);
x^=(x>>8);
x^=(x>>16);

Using a Random Number Generator and Bitstrings

Random number generators (RNG) generally produce a pseudo-random number as a 32 or a 64 bit number depending on the generator. Most require initialization. There are two qualities of a random number generator that are of interest to us: speed and randomness. There is to some degree a tradeoff between these.

A convenient way to use a random number is to use a call that returns a double between 0 and 1. But this often does more work than is necessary for a particular application of the RNG and is therefore slower than it needs to be. In some applications the number of calls to the RNG is sufficiently large that it can be a major part of time used by the application. Here are some examples of using the raw bits returned to speed things up. This requires a reasonably unpatterned random number generator unlike the old UNIX rand() procedure. I use a small random number generator called Flea whose prototype is:

unsigned int flea();

If you want to decide between two things as if you are flipping a fair coin then simply mask a bit out of the random bitstring:

if (flea()&0x1) stuff   // uses the least significant bit

Check to be sure your random number generator doesn't have a pattern in the bit position you use (beware linear congruential RNGs).

If you want to chose between k things do not use the mod (%) operator. It is very slow. If you can convert it to a choice between a power of two number of things then use a mask. Say you want to choose between 6 things then try to make it a choice between 8 things and mask off 3 bits.

If you want to choose something with a probability p and you have the chance to choose a p' that close but is k times a power of two then again you can simply mask bits off the RNG.