
Monday, May 23, 2011

Bitwise tricks to know / Bit manipulation in C

Here we go

Bit Hack #1. Check if the integer is even or odd.
if ((x & 1) == 0) {
    /* x is even */
} else {
    /* x is odd */
}

I am pretty sure everyone has seen this trick. The idea here is that an integer is odd if and only if the least significant bit b0 is 1. It follows from the binary representation of 'x', where bit b0 contributes either 1 or 0 to the value. By AND-ing 'x' with 1 we eliminate all the bits other than b0. If the result after this operation is 0, then 'x' was even because bit b0 was 0. Otherwise 'x' was odd.

Let's look at some examples. Let's take integer 43, which is odd. In binary 43 is 00101011. Notice that the least significant bit b0 is 1. Now let's AND it with 1:

    00101011
  & 00000001    (note: 1 is the same as 00000001)
    --------
    00000001

See how AND-ing erased all the higher order bits b1-b7 but left bit b0 the same it was? The result is thus 1 which tells us that the integer was odd.

Now let's look at -43. Just as a reminder, a quick way to find negative of a given number in two's complement representation is to invert all bits and add one. So -43 is 11010101 in binary. Again notice that the last bit is 1, and the integer is odd. (Note that if we used one's complement it wouldn't be true!)

Now let's take a look at an even integer 98. In binary 98 is 01100010.

    01100010
  & 00000001
    --------
    00000000

After AND-ing the result is 0. It means that the bit b0 of original integer 98 was 0. Thus the given integer is even.

Now the negative -98. It's 10011110. Again, bit b0 is 0, after AND-ing, the result is 0, meaning -98 is even, which indeed is true.
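To tie it together, here is a minimal, self-contained C sketch of this check (the helper name is_odd is my own, not anything standard):

#include <stdio.h>

/* Returns 1 if x is odd, 0 if it is even, by testing bit b0. */
static int is_odd(int x)
{
    return x & 1;    /* works for negatives too, in two's complement */
}

int main(void)
{
    printf("43 is %s\n",  is_odd(43)  ? "odd" : "even");   /* odd  */
    printf("-43 is %s\n", is_odd(-43) ? "odd" : "even");   /* odd  */
    printf("98 is %s\n",  is_odd(98)  ? "odd" : "even");   /* even */
    printf("-98 is %s\n", is_odd(-98) ? "odd" : "even");   /* even */
    return 0;
}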

Bit Hack #2. Test if the n-th bit is set.

if (x & (1<<n)) {
    /* n-th bit is set */
} else {
    /* n-th bit is not set */
}

In the previous bit hack we saw that (x & 1) tests if the lowest bit (bit 0) is set. This bit hack generalizes that result and tests if the n-th bit is set. It does so by shifting that 1-bit n positions to the left and then doing the same AND operation, which eliminates all bits but the n-th.

Here is what happens if you shift 1 several positions to the left:

1        00000001    (same as 1<<0)
1<<1     00000010
1<<2     00000100
1<<3     00001000
1<<4     00010000
1<<5     00100000
1<<6     01000000
1<<7     10000000

Now if we AND 'x' with 1 shifted n positions to the left we effectively eliminate all the bits but n-th bit in 'x'. If the result after AND-ing is 0, then that bit must have been 0, otherwise that bit was set.

Let's look at some examples.

Does 122 have the 3rd bit set? The operation we do to find out is:

122 & (1<<3) 

Now, 122 is 01111010 in binary. And (1<<3) is 00001000.

    01111010
  & 00001000
    --------
    00001000

We see that the result is not 0, so yes, 122 has the 3rd bit set.

Note: In my article bit numeration starts with 0. So it's 0th bit, 1st bit, ..., 7th bit.

What about -33? Does it have the 5th bit set?

    11011111    (-33 in binary)
  & 00100000    (1<<5)
    --------
    00000000

Result is 0, so the 5th bit is not set.
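If you use this test a lot, it can be wrapped in a small helper; here is one possible sketch (the name bit_is_set is mine, and I use an unsigned operand to keep the shift well-behaved):

/* Returns nonzero if bit n (counting from 0) is set in x. */
static int bit_is_set(unsigned int x, int n)
{
    return (x & (1u << n)) != 0;
}

For example, bit_is_set(122, 3) returns 1, matching the worked example above.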

Bit Hack #3. Set the n-th bit.

y = x | (1<<n) 

This bit hack combines the same (1<<n) trick of setting n-th bit by shifting with OR operation. The result of OR-ing a variable with a value that has n-th bit set is turning that n-th bit on. It's because OR-ing any value with 0 leaves the value the same; but OR-ing it with 1 changes it to 1 (if it wasn't already). Let's see how that works in action:

Suppose we have value 120, and we wish to turn on the 2nd bit.

    01111000    (120 in binary)
  | 00000100    (1<<2)
    --------
    01111100

What about -120 and 6th bit?

    10001000    (-120 in binary)
  | 01000000    (1<<6)
    --------
    11001000

Bit Hack #4. Unset the n-th bit.

y = x & ~(1<<n) 

The important part of this bithack is the ~(1<<n) trick. It turns on all the bits except n-th.

Here is how it looks:

~1          11111110    (same as ~(1<<0))
~(1<<1)     11111101
~(1<<2)     11111011
~(1<<3)     11110111
~(1<<4)     11101111
~(1<<5)     11011111
~(1<<6)     10111111
~(1<<7)     01111111

The effect of AND-ing variable 'x' with this quantity is eliminating n-th bit. It does not matter if the n-th bit was 0 or 1, AND-ing it with 0 sets it to 0.

Here is an example. Let's unset 4th bit in 127:

    01111111    (127 in binary)
  & 11101111    (~(1<<4))
    --------
    01101111

Bit Hack #5. Toggle the n-th bit.

y = x ^ (1<<n) 

This bit hack also uses the wonderful "set n-th bit shift hack" but this time it XOR's it with the variable 'x'. The result of XOR-ing something with something else is that if both bits are the same, the result is 0, otherwise it's 1. How does it toggle the n-th bit? Well, if the n-th bit was 1, then XOR-ing it with 1 changes it to 0; conversely, if it was 0, then XOR-ing it with 1 changes it to 1. See, the bit got flipped.

Here is an example. Suppose you want to toggle 5th bit in value 01110101:

    01110101
  ^ 00100000
    --------
    01010101

What about the same value but 5th bit originally 0?

    01010101
  ^ 00100000
    --------
    01110101

Notice something? XOR-ing the same bit twice returned it to its original value. This nifty XOR property is used in calculating parity in RAID arrays and in simple cryptographic ciphers, but more about that in some other article.
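For completeness, Bit Hacks #3, #4 and #5 are often packaged as tiny helpers like these (a sketch with my own names; unsigned again keeps the shifts safe):

static unsigned int set_bit(unsigned int x, int n)    { return x |  (1u << n); }
static unsigned int clear_bit(unsigned int x, int n)  { return x & ~(1u << n); }
static unsigned int toggle_bit(unsigned int x, int n) { return x ^  (1u << n); }

So, for example, clear_bit(127, 4) gives 01101111, matching the Bit Hack #4 example.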

Bit Hack #6. Turn off the rightmost 1-bit.

y = x & (x-1) 

Now it finally gets more interesting!!! Bit hacks #1 - #5 were kind of boring to be honest.

This bit hack turns off the rightmost one-bit. For example, given an integer 00101010 it turns it into 00101000. Or given 00010000 it turns it into 0, as there is just a single 1-bit.

Here are more examples:

    01010111    (x)
  & 01010110    (x-1)
    --------
    01010110

    01011000    (x)
  & 01010111    (x-1)
    --------
    01010000

    10000000    (x = -128)
  & 01111111    (x-1 = 127, with wrap-around)
    --------
    00000000

    11111111    (x = all bits 1)
  & 11111110    (x-1)
    --------
    11111110

    00000000    (x = no 1-bits)
  & 11111111    (x-1, wrapped around)
    --------
    00000000

Why does it work?

If you look at the examples and think for a while, you'll realize that there are two possible scenarios:

1. The value has the rightmost 1 bit. In this case subtracting one from it sets all the lower bits to one and changes that rightmost bit to 0 (so that if you add one now, you get the original value back). This step has masked out the rightmost 1-bit and now AND-ing it with the original value zeroes that rightmost 1-bit out.

2. The value has no rightmost 1 bit (all 0). In this case subtracting one underflows the value (as it's signed) and sets all bits to 1. AND-ing all zeroes with all ones produces 0.
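A classic application of this hack (not covered above, but well known) is counting the 1-bits in a value: each pass of the loop clears exactly one rightmost 1-bit, so the loop runs once per set bit.

/* Counts the number of 1-bits in x. */
static int count_bits(unsigned int x)
{
    int count = 0;
    while (x != 0) {
        x &= x - 1;    /* Bit Hack #6: turn off the rightmost 1-bit */
        count++;
    }
    return count;
}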

Bit Hack #7. Isolate the rightmost 1-bit.

y = x & (-x) 

This bit hack finds the rightmost 1-bit and sets all the other bits to 0. The end result has only that one rightmost 1-bit set. For example, 01010100 gets turned into 00000100.

Here are some more examples:

    10111100    (x)
  & 01000100    (-x)
    --------
    00000100

    01110000    (x)
  & 10010000    (-x)
    --------
    00010000

    00000001    (x)
  & 11111111    (-x)
    --------
    00000001

    10000000    (x = -128)
  & 10000000    (-x = -128)
    --------
    10000000

    11111111    (x = all bits one)
  & 00000001    (-x)
    --------
    00000001

    00000000    (x = all bits 0, no rightmost 1-bit)
  & 00000000    (-x)
    --------
    00000000

This bit hack works because of two's complement. In two's complement system -x is the same as ~x+1. Now let's examine the two possible cases:

1. There is a rightmost 1-bit bi. In this case let's pivot on this bit and divide all other bits into two flanks - bits to the right and bits to the left. Remember that all the bits to the right bi-1, bi-2 ... b0 are 0's (because bi was the rightmost 1-bit). And bits to the left are the way they are. Let's call them bi+1, ..., bn.

Now, when we calculate -x, we first do ~x which turns bit bi into 0, bits bi-1 ... b0 into 1s, and inverts bits bi+1, ..., bn, and then we add 1 to this result.

Since bits bi-1 ... b0 are all 1's, adding one makes them carry this one all the way to bit bi, which is the first zero bit.

If we put it all together, the result of calculating -x is that bits bi+1, ..., bn get inverted, bit bi stays the same, and bits bi-1, ..., b0 are all 0's.

Now, AND-ing x with -x makes bits bi+1, ..., bn all 0, leaves bit bi as is, and sets bits bi-1, ..., b0 to 0. Only one bit is left, it's the bit bi - the rightmost 1-bit.

2. There is no rightmost 1-bit. The value is 0. The negative of 0 in two's complement is also 0. 0&0 = 0. No bits get turned on.

We have proved rigorously that this bithack is correct.
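As a sketch of how this hack gets used (my own illustration, not from the discussion above), here is a loop that visits the set bits of a value one at a time, lowest first, combining Bit Hack #7 to isolate each bit with XOR to clear it:

#include <stdio.h>

/* Prints each set bit of x as a power of two, lowest first. */
static void print_set_bits(unsigned int x)
{
    while (x != 0) {
        unsigned int lsb = x & -x;   /* Bit Hack #7: isolate rightmost 1-bit */
        printf("%u\n", lsb);
        x ^= lsb;                    /* clear that bit and continue */
    }
}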

Bit Hack #8. Right propagate the rightmost 1-bit.

y = x | (x-1) 

This is best understood by an example. Given a value 01010000 it turns it into 01011111. All the 0-bits right to the rightmost 1-bit got turned into ones.

This is not a clean hack, though, as it produces all 1's if x = 0.

Let's look at more examples:

    10111100    (x)
  | 10111011    (x-1)
    --------
    10111111

    01110111    (x)
  | 01110110    (x-1)
    --------
    01110111

    00000001    (x)
  | 00000000    (x-1)
    --------
    00000001

    10000000    (x = -128)
  | 01111111    (x-1 = 127)
    --------
    11111111

    11111111    (x = -1)
  | 11111110    (x-1 = -2)
    --------
    11111111

    00000000    (x)
  | 11111111    (x-1)
    --------
    11111111

Let's prove it, though not as rigorously as in the previous bit hack (that would be too time consuming, and this is not a scientific publication). There are two cases again. Let's start with the easiest.

1. There is no rightmost 1-bit. In that case x = 0 and x-1 is -1. -1 in two's complement is 11111111. OR-ing 0 with 11111111 produces the same 11111111. (Not the desired result, but that's the way it is.)

2. There is a rightmost 1-bit bi. Let's divide all the bits into two groups again (as in the previous example). Calculating x-1 modifies only the bits from bi down, turning bi into 0 and all the lower bits into 1's. Now OR-ing x with x-1 leaves all the higher bits (to the left) the same, leaves bit bi as it was (1), and since the lower bits of x-1 are all 1's, it turns them on. The result is that the rightmost 1-bit got propagated to the lower order bits.

Bit Hack #9. Isolate the rightmost 0-bit.

y = ~x & (x+1) 

This bithack does the opposite of #7. It finds the rightmost 0-bit, turns off all bits, and sets this bit to 1 in the result. For example, it finds the rightmost zero in the number 10101011, producing 00000100.

More examples:

    10111100    (x)
    --------
    01000011    (~x)
  & 10111101    (x+1)
    --------
    00000001

    01110111    (x)
    --------
    10001000    (~x)
  & 01111000    (x+1)
    --------
    00001000

    00000001    (x)
    --------
    11111110    (~x)
  & 00000010    (x+1)
    --------
    00000010

    10000000    (x = -128)
    --------
    01111111    (~x)
  & 10000001    (x+1)
    --------
    00000001

    11111111    (x = no rightmost 0-bit)
    --------
    00000000    (~x)
  & 00000000    (x+1)
    --------
    00000000

    00000000    (x)
    --------
    11111111    (~x)
  & 00000001    (x+1)
    --------
    00000001

Proof: Suppose there is a rightmost 0-bit. Then ~x turns this rightmost 0-bit into a 1, and so does x+1 (because the bits to the right of the rightmost 0-bit are all 1's, so adding one carries into it). Now AND-ing ~x with x+1 wipes out all the higher order bits: they are inverted in ~x but unchanged in x+1, so the AND zeroes them. That leaves the rightmost 0-bit position as the highest order bit set in the result. What about the lower order bits to the right of the rightmost 0-bit? They were 1's in x, so x+1 turned them into 0's and ~x also turned them into 0's; AND-ed together they stay 0.

Bit Hack #10. Turn on the rightmost 0-bit.

y = x | (x+1) 

This hack changes the rightmost 0-bit into 1. For example, given an integer 10100011 it turns it into 10100111.

More examples:

    10111100    (x)
  | 10111101    (x+1)
    --------
    10111101

    01110111    (x)
  | 01111000    (x+1)
    --------
    01111111

    00000001    (x)
  | 00000010    (x+1)
    --------
    00000011

    10000000    (x = -128)
  | 10000001    (x+1)
    --------
    10000001

    11111111    (x = no rightmost 0-bit)
  | 00000000    (x+1)
    --------
    11111111

    00000000    (x)
  | 00000001    (x+1)
    --------
    00000001

Here is the proof as a bunch of true statements. Adding 1 to x turns the rightmost 0-bit into a 1 and zeroes out the 1's to its right; OR-ing with x restores those lower 1's and leaves the higher bits unchanged, so the net effect is just the rightmost 0-bit turning on. If x+1 overflows (x was all 1-bits), then x+1 is 0, the result is x, and indeed there was no 0-bit to turn on.


Friday, May 20, 2011

C bitwise operators

As an old school C programmer, I use all these operators somewhat frequently. I thought I'd write this tutorial to illustrate typical situations where I use them, and provide some sample code snippets.

% MODULO

The modulo operator gives you the remainder after integer division. For example, 16 % 3 is 1. Why? If you divide 16 by 3, you get 5, with a 1 left over. The "left over" part is what the modulo operator gives you. If you use modulo with an ascending series of numbers, it produces a characteristic repeating "stripe" pattern that goes from 0 to N-1, like so:

0 % 3 == 0
1 % 3 == 1
2 % 3 == 2
3 % 3 == 0
4 % 3 == 1
5 % 3 == 2
6 % 3 == 0
7 % 3 == 1
etc.
For positive numbers, the results of modulo are predictable.

Unfortunately, for negative numbers, the results vary, depending on the language. Javascript has the following pattern, which is arithmetically correct:

-1 % 3 == -1
-2 % 3 == -2
-3 % 3 == 0
-4 % 3 == -1
-5 % 3 == -2
-6 % 3 == 0
-7 % 3 == -1
etc.

But other languages, like Perl, have the following pattern, which essentially continues the pattern you see in the positive numbers:

-1 % 3 == 2
-2 % 3 == 1
-3 % 3 == 0
-4 % 3 == 2
-5 % 3 == 1
-6 % 3 == 0
-7 % 3 == 2
etc.

You can make arguments for either case, since they both have their uses. Since I personally have a hard time remembering these kinds of language-specific details (and I program in a variety of different languages), I tend to avoid using Modulo with negative numbers in my work. For now, I'll concentrate on the positive numbers, and then show a trick which avoids the problem with the negative numbers.
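One common way to dodge the issue entirely in C (a sketch; it is the same '+N' idea used for the left-arrow trick further down, just generalized) is to add a multiple of N before taking the remainder, which forces the result into 0..N-1 for any input:

/* Always returns a value in 0..n-1, for any x, assuming n > 0. */
static int posmod(int x, int n)
{
    return ((x % n) + n) % n;
}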

That basic stripe pattern of 0,1,2, 0,1,2, 0,1,2... is a very useful thing, and I tend to employ it when I have software that has right and left-arrow navigation. For example, there might be a user-interface element which contains a "right arrow" which takes you to the "next page". If I'm on the last page, I often want to "wrap around" to the first page. "Wrap around" is an ideal use for the modulo operator.

If the page numbers are numbered 0 thru N-1 (where N is the number of pages), and the current page number is represented by curPage, then pressing the right arrow can invoke a script which does something like this:

curPage = (curPage + 1) % N; 
If you were not familiar with the modulo operator, you might code this like so:
curPage = curPage + 1;
if (curPage == N)
    curPage = 0;

If N is 3, both code snippets produce the following transitions:

0 -> 1
1 -> 2
2 -> 0
... But the modulo version is only one line. This, to me, is a classic use of the modulo operator. Typically, I will internally represent the page number using a zero-based number, to make it easier to use the modulo operator in this way. But even if your page numbers start at 1, you can still use modulo, by changing the formula as follows:

curPage = (((curPage - 1) + 1) % N) + 1;

which simplifies to:

curPage = (curPage % N) + 1; 
If N is three, this produces the following transitions.
1 -> 2
2 -> 3
3 -> 1
Now let's consider how you can handle a left-arrow, which takes you to the previous page, and also loops around from the first page to the last page. If your page numbers are 0 - N-1, you might start with the following formula:
curPage = (curPage - 1) % N; 
But notice there is a problem if curPage is set to 0 (zero), depending on the behavior of the modulo operator. In the Perl implementation of Modulo, this code works (which is probably why Perl implemented Modulo that way), but in the Javascript implementation of Modulo, which is arithmetically correct, it does not work, so you have to special case it:
if (curPage == 0)
    curPage = N - 1;
else
    curPage = (curPage - 1) % N;
Here's a cool trick that avoids the special case:
curPage = (curPage + (N-1)) % N; 
Notice this still has the following transition table for N=3:
0 -> 2
1 -> 0
2 -> 1
If your page numbers start at 1, then convert it to a zero-based number first (by subtracting one), perform the operation, and then convert it back (by adding one):
curPage = ((curPage-1) + (N-1)) % N + 1; 
which has the following transition table for N=3:
1 -> 3
2 -> 1
3 -> 2
You can generalize both of these directions by using a delta variable, which keeps track of the direction of travel. If traveling to the left, delta = -1, if traveling to the right, delta = 1. In this case, you would use the following code for zero-based page numbers (from 0 to N-1):
function gotoNextPage(delta) {
    curPage = (curPage + N + delta) % N;
}
and the following code for one-based page numbers (from 1 to N):
function gotoNextPage(delta) {
    curPage = ((curPage - 1) + N + delta) % N + 1;
}
You can use modulo anytime where that "stripe" effect is useful - even for making stripes in graphics software! I've also used it for game software where a spaceship needs to wrap around the screen - reappearing on the left side after disappearing on the right. This looks something like this:
spaceship.x = (spaceship.x + screenWidth + horizontalSpeed) % screenWidth;
spaceship.y = (spaceship.y + screenHeight + verticalSpeed) % screenHeight;

& Bitwise AND
| Bitwise OR
~ Bitwise NOT

Here are some tables showing the effect of these bitwise logic operators. I've included the ^ XOR operator as well, which I'll discuss further down.
0 & 0 == 0      0 | 0 == 0      0 ^ 0 == 0      ~0 == 1
0 & 1 == 0      0 | 1 == 1      0 ^ 1 == 1      ~1 == 0
1 & 0 == 0      1 | 0 == 1      1 ^ 0 == 1
1 & 1 == 1      1 | 1 == 1      1 ^ 1 == 0
These operators are called "bitwise" because they are applied to every single bit in the operands. So for example, in binary
00101101 & 00011100 == 00001100 
which in decimal comes to
45 & 28 == 12
I most commonly use these bitwise logic operators for maintaining a set of flags, which are stored in a single numeric variable. To make sense of these operators, you have to visualize the binary representation of the numbers involved.

Let's say I have a game which involves different characteristics of animals, which are tracked using flags.

kHasTeeth   = 1;
kHasHair    = 2;
kHasScales  = 4;
kHasLungs   = 8;
kHasGills   = 16;
kHasTusks   = 32;
Note that these flags correspond to single bits in binary numbers:
                    // decimal    binary
kHasTeeth   = 1;    //            00000001
kHasHair    = 2;    //            00000010
kHasScales  = 4;    //            00000100
kHasLungs   = 8;    //            00001000
kHasGills   = 16;   //            00010000
kHasTusks   = 32;   //            00100000
We can use the Bitwise OR (|) operator to combine these flags into a single "characteristics" variable.
if (animal == "sheep")
    characteristics = kHasTeeth | kHasHair | kHasLungs;
else if (animal == "snake")
    characteristics = kHasScales | kHasLungs;
else if (animal == "goldfish")
    characteristics = kHasScales | kHasGills;
else if (animal == "shark")
    characteristics = kHasTeeth | kHasScales | kHasGills;
else if (animal == "elephant")
    characteristics = kHasTusks | kHasLungs | kHasHair;
We can use the Bitwise AND (&) operator to test if a particular flag is turned on.
if (characteristics & kHasTeeth)
    message = "This animal has teeth.";
Here's a more complex example:
if ((characteristics & kHasScales) && (characteristics & kHasLungs))
    message = "This animal is probably a reptile.";
else if ((characteristics & kHasScales) && (characteristics & kHasGills))
    message = "This animal is probably a fish.";
The Bitwise Negation (~) operator gives you the opposite bit pattern of its operand, typically extended up to 32 bits.
X = 1;     // X = 00000000000000000000000000000001
X = ~X;    // X = 11111111111111111111111111111110
This operator is commonly combined with the Bitwise AND (&) operator to clear individual bits from flags. For example, the following code turns off the kHasHair bit in characteristics.
characteristics = (characteristics & ~kHasHair); 
This can be shortened to
characteristics &= ~kHasHair; 
Note that this code will turn the kHasHair bit off, if it is on, and will have no effect otherwise. If you want to TOGGLE the value of that bit, use the XOR (^) operator, without the negation.
characteristics ^= kHasHair; 
More on XOR below...

^ XOR

The bitwise XOR operator is one of my favorites, since it often lies at the center of some interesting arcane trickery.

The effect of XOR

A = A ^ B 
or
A ^= B 
is to toggle any bits in A which are set (turned on) in B. If those bits are turned off in A, they are turned on; if they are turned on in A, they are turned off.

Interestingly, if you apply it twice, you get the same result back. For example, if you execute the following code

myVar = (myVar ^ X) ^ X 
or
myVar ^= X;
myVar ^= X;
Where X is any integer value, then the value of myVar is unchanged.

As described above, the toggle nature of XOR makes it ideal for toggling flags.

myFlags = (myFlags ^ SomeFlag); 
or
myFlags ^= SomeFlag; 
A more obscure use of XOR I've used is to swap the values of two integer variables, without requiring a third temporary variable. Normally, if you wanted to swap two integer variables, you'd do this:
var temp = x;
x = y;
y = temp;
However, back in the days when I cared about such things as minimizing the numbers of registers, I would write code like this:
x ^= y;
y ^= x;
x ^= y;
..which had the same effect. Let's examine what is going on with some real values to get an intuitive understanding of why. Let's say X is equal to 6 and Y is equal to 5. The following tables shows what is happening in binary.
                   X      Y
original values    110    101
x ^= y             011    101    x = x^y
y ^= x             011    110    y = y^(x^y), i.e. y = x
x ^= y             101    110    x = (x^y)^x, i.e. x = y
voila! they've been swapped!

XOR is often used in simple (and easily crackable) encryption algorithms, since you can apply it to a series of character codes to first encrypt, and then decrypt them.
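As a taste of that, here is a toy XOR "cipher" in C (purely illustrative and trivially breakable; the function name is my own). Because XOR-ing twice with the same key restores the original, the same function both encrypts and decrypts:

#include <stddef.h>

/* XORs every byte of buf with key; calling it a second time
   with the same key restores the original contents. */
static void xor_crypt(unsigned char *buf, size_t len, unsigned char key)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] ^= key;
}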

<< Shift Left
>> Shift Right

The shift left operator shifts all the bits in a number to the left. For example:
x = 6;          // x is 110 in binary
x = (x << 1);   // x is now 1100 in binary, or 12 in decimal
Notice that shifting a number to the left by 1 is the same as multiplying it by two. Shifting to the left by N bits is the same as multiplying it by 2 to the N. Similarly, the shift right operator is the same as integer-dividing by 2 to the N.

Back when CPU speed was a major issue, you could get speed gains by substituting left and right shifts for multiplies and divides in this manner. However, in high level languages like Javascript and Actionscript, there is no real savings.

I do often use the shift operator to convert ordinal numbers to flags. For example, in my Sudoku software, I use a single number to indicate which digits (from 1-9) are possibilities for a particular cell.

I can set a digit as a possibility using code like the following:

possibles = (possibles | (1 << digit)); 
or
possibles |= (1 << digit); 
and I can clear a digit as a possibility using code like this:
possibles = (possibles & ~(1 << digit)); 
or
possibles &= ~(1 << digit); 
If there are 9 possible digits (1-9), I can set all the possible digits at once, using this:
N = 9;
possibles = (1 << N) - 1;
Note however, that this code assumes I am using bit-numbers 0 thru N-1, rather than bit-numbers 1 thru N. As an old-school programmer, I pretty much always use zero-based counting systems, when I can, since it simplifies the resulting code.

Another very common way I use the shift and bitwise logic operators is for converting individual R,G and B values into colors.

compositeColor = (red << 16) | (green << 8) | blue; 
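Going the other way, the individual channels can be recovered by shifting back down and masking off everything but the low 8 bits (a sketch; the variable names are mine):

unsigned int red   = (compositeColor >> 16) & 0xFF;
unsigned int green = (compositeColor >> 8)  & 0xFF;
unsigned int blue  =  compositeColor        & 0xFF;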



Thursday, May 19, 2011

Binary Arithmetic

Rules of Binary Addition

  • 0 + 0 = 0
  • 0 + 1 = 1
  • 1 + 0 = 1
  • 1 + 1 = 0, and carry 1 to the next more significant bit

For example,

00011010 + 00001100 = 00100110

        1 1                carries
    0 0 0 1 1 0 1 0    =  26 (base 10)
  + 0 0 0 0 1 1 0 0    =  12 (base 10)
    ---------------
    0 0 1 0 0 1 1 0    =  38 (base 10)


00010011 + 00111110 = 01010001

      1 1 1 1 1            carries
    0 0 0 1 0 0 1 1    =  19 (base 10)
  + 0 0 1 1 1 1 1 0    =  62 (base 10)
    ---------------
    0 1 0 1 0 0 0 1    =  81 (base 10)

Note:  The rules of binary addition (without carries) are the same as the truth table of the XOR gate.
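That observation can be pushed further: XOR gives the sum without carries, AND followed by a left shift gives the carries, and repeating until no carries remain adds two numbers with no + at all. Here is a C sketch of the idea (my own illustration of the note above):

/* Adds a and b using only bitwise operations. */
static unsigned int add_bitwise(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int carry = (a & b) << 1;   /* carries, moved into place */
        a = a ^ b;                           /* sum without the carries */
        b = carry;
    }
    return a;
}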


Rules of Binary Subtraction

  • 0 - 0 = 0
  • 0 - 1 = 1, and borrow 1 from the next more significant bit
  • 1 - 0 = 1
  • 1 - 1 = 0

For example,

00100101 - 00010001 = 00010100

           0                 borrows
  0  0  1 10  0  1  0  1    =  37 (base 10)
- 0  0  0  1  0  0  0  1    =  17 (base 10)
  ----------------------
  0  0  0  1  0  1  0  0    =  20 (base 10)


00110011 - 00010110 = 00011101

           0 10  1           borrows
  0  0  1  1  0 10  1  1    =  51 (base 10)
- 0  0  0  1  0  1  1  0    =  22 (base 10)
  ----------------------
  0  0  0  1  1  1  0  1    =  29 (base 10)


Rules of Binary Multiplication

  • 0 x 0 = 0
  • 0 x 1 = 0
  • 1 x 0 = 0
  • 1 x 1 = 1, and no carry or borrow bits

For example,

00101001 × 00000110 = 11110110

        0 0 1 0 1 0 0 1    =  41 (base 10)
      × 0 0 0 0 0 1 1 0    =   6 (base 10)
        ---------------
        0 0 0 0 0 0 0 0
      0 0 1 0 1 0 0 1
    0 0 1 0 1 0 0 1
    -------------------
    0 0 1 1 1 1 0 1 1 0    = 246 (base 10)


00010111 × 00000011 = 01000101

        0 0 0 1 0 1 1 1    =  23 (base 10)
      × 0 0 0 0 0 0 1 1    =   3 (base 10)
        ---------------
        1 1 1 1 1              carries
        0 0 0 1 0 1 1 1
      0 0 0 1 0 1 1 1
      -----------------
      0 0 1 0 0 0 1 0 1    =  69 (base 10)

Note:  The rules of binary multiplication are the same as the truth table of the AND gate.
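The worked examples above are exactly what a shift-and-add multiply does in code: for each 1-bit of the multiplier, add a correspondingly shifted copy of the multiplicand. Here is a C sketch (illustrative only):

/* Multiplies a by b by summing shifted partial products. */
static unsigned int mul_bitwise(unsigned int a, unsigned int b)
{
    unsigned int result = 0;
    while (b != 0) {
        if (b & 1)
            result += a;   /* this partial product contributes */
        a <<= 1;           /* next partial product is shifted one left */
        b >>= 1;
    }
    return result;
}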

Another Method:  Binary multiplication is the same as repeated binary addition; add the multiplicand to itself the multiplier number of times.

For example,

00001000 × 00000011 = 00011000

          1                carries
    0 0 0 0 1 0 0 0    =   8 (base 10)
    0 0 0 0 1 0 0 0    =   8 (base 10)
  + 0 0 0 0 1 0 0 0    =   8 (base 10)
    ---------------
    0 0 0 1 1 0 0 0    =  24 (base 10)


Binary Division

Binary division is the repeated process of subtraction, just as in decimal division.

For example,

00101010 ÷ 00000110 = 00000111

                 1 1 1        =   7 (base 10)
            ___________
1 1 0  )  1 0 1 0 1 0         =  42 (base 10), divisor = 6 (base 10)
        - 1 1 0
          -----
            1 0 0 1
          -   1 1 0
              -----
              1 1 0
            - 1 1 0
              -----
                  0


10000111 ÷ 00000101 = 00011011

                 1 1 0 1 1    =  27 (base 10)
            _____________
1 0 1  )  1 0 0 0 0 1 1 1     = 135 (base 10), divisor = 5 (base 10)
        - 1 0 1
          -----
            1 1 0
          - 1 0 1
            -----
            0 1 1             (11 < 101: quotient bit is 0; bring down the next bit)
              1 1 1
            - 1 0 1
              -----
                1 0 1
              - 1 0 1
                -----
                    0


Notes

Binary Number System
System Digits:  0 and 1
Bit (short for binary digit):  A single binary digit
LSB (least significant bit):  The rightmost bit
MSB (most significant bit):  The leftmost bit
Upper Byte (or nybble):  The left-hand byte (or nybble) of a pair
Lower Byte (or nybble):  The right-hand byte (or nybble) of a pair
 
Binary Equivalents
1 Nybble (or nibble)  =  4 bits
1 Byte  =  2 nybbles  =  8 bits
1 Kilobyte (KB)  =  1024 bytes
1 Megabyte (MB)  =  1024 kilobytes  =  1,048,576 bytes
1 Gigabyte (GB)  =  1024 megabytes  =  1,073,741,824 bytes

Wednesday, May 18, 2011

What is 2's complement?

       Binary numbers as such do not carry a sign, so 2's complement is used to represent negative numbers. The 2's complement is found in the following way:

2's complement = 1's complement (invert all the bits) + 1

Step I)
If we have the binary number 00001100 (decimal 12), we invert it, replacing all 1s with 0s and all 0s with 1s:
00001100 ---> 11110011

Step II)
Now we add 1 to the number we found in (I):
11110011 + 00000001 = 11110100


So the number 11110100 represents -12.
As the MSB (the first bit) is 1, it indicates that the number is negative.
In binary subtraction, instead of subtracting a number, its 2's complement is added.
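On a two's-complement machine this is easy to check in C (a minimal sketch):

#include <stdio.h>

int main(void)
{
    int x = 12;
    int neg = ~x + 1;           /* invert the bits, then add one */
    printf("%d\n", neg);        /* prints -12 */
    printf("%d\n", neg == -x);  /* prints 1 */
    return 0;
}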

Why is 2's complement better than sign magnitude?

   Because sign magnitude has two representations for 0:

100000000000000000000 (= -0)
000000000000000000000 (= +0)

Clearly -0 equals +0, but because of these two representations, different machines process sign magnitude differently at 0. Two's complement avoids this problem and is therefore used much more commonly.


Tuesday, May 17, 2011

Daemon creation example

/************************************************************************
 * Rules:                                                               *
 * 1. Call 'fork' and have the parent 'exit'                            *
 * 2. Call 'setsid' to create a new session                             *
 * 3. Change cwd to / (or somewhere safe) - else, if the daemon lives   *
 *    on a mounted filesystem, that filesystem cannot be unmounted      *
 * 4. Set file-mode-creation mask to 0, close unneeded file descriptors *
 ************************************************************************/

 

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>  /* for fork, setsid, chdir, close */

static int global;

int main(void)
{
    pid_t pid;

    if ((pid = fork()) < 0)
        return -1;
    else if (pid != 0)
        exit(0);    /* parent exits; the child carries on */

    setsid();       /* this makes the child a session leader */
    chdir("/");     /* so that other file systems can be unmounted */
    umask(0);       /* the file mode creation mask could have been
                       inherited with restrictive permissions; e.g. if
                       the daemon expects files that it creates to have
                       group-read and group-write bits enabled, and an
                       inherited mask has turned them off, that would
                       be a problem */
    close(0);
    close(1);
    close(2);

    while (1) {
        ++global;   /* just to show some processing... */
    }
    return 0;
}

Sunday, May 15, 2011

algorithm for finding a string within a string?

The general problem is string searching; there are many algorithms and approaches depending on the nature of the application.
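As a baseline, here is the naive O(n*m) scan that the fancier algorithms improve on (a sketch; in real C code you would normally just call the standard library's strstr):

#include <string.h>

/* Returns the index of the first occurrence of needle in haystack,
   or -1 if it does not occur. */
static int find_str(const char *haystack, const char *needle)
{
    size_t n = strlen(haystack);
    size_t m = strlen(needle);
    size_t i;

    for (i = 0; m <= n && i <= n - m; i++)
        if (memcmp(haystack + i, needle, m) == 0)
            return (int)i;
    return -1;
}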

Some advanced index data structures are also used in other applications. Suffix trees are used a lot in bioinformatics; here you have one long reference text, and then you have many arbitrary strings that you want to find all occurrences for. Once the index (i.e. the tree) is built, finding patterns can be done quite efficiently.

For an interview answer, I think it's better to show breadth as well. Knowing about all these different algorithms and what specific purposes they serve best is probably better than knowing just one algorithm by heart.



Monday, May 9, 2011

is main() function required for C program?

No, the ISO C standard states that a main function is only required for a hosted environment (one with an underlying OS).

For a freestanding environment like an embedded system (or an operating system itself), it's implementation defined. From C99 5.1.2:

Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment.

In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined.

As to how Linux itself starts, the entry point for the Linux kernel is start_kernel, though for a more complete picture you would need to follow the entire boot process, starting from the bootloader.

and

The operating system's loader has to call a single entry point; with the GNU compiler, the entry point is defined in the crt0.o linked object file, whose source is the assembler file crt0.s. It invokes main() after performing various run-time start-up tasks (such as establishing a stack and doing static initialisation). So when building an executable that links the default crt0.o, you must have a main(), otherwise you will get a linker error, since in crt0.o main() is an unresolved symbol.

It would be possible (if somewhat perverse and unnecessary) to modify crt0.s to call a different entry point. Just make sure that you make such an object file specific to your project rather than modifying the default version, or you will break every build on that machine.

The OS itself has its own C runtime start-up (which will be called from the bootloader) so it can call any entry point it wishes. I have not looked at the Linux source, but I imagine it has its own crt0.s that calls whatever the C entry point is.

Is recursion faster than a loop, and what is tail recursion?



In any realistic system, no, creating a stack frame will always be more expensive than an INC and a JMP. That's why really good compilers automatically transform tail recursion into a call to the same frame, i.e. without the overhead, so you get the more readable source version and the more efficient compiled version. A really, really good compiler should even be able to transform normal recursion into tail recursion where that is possible.

This depends on the language being used. You wrote 'language-agnostic', so I'll give some examples.

In Java, C, and Python, recursion is fairly expensive compared to iteration (in general) because it requires the allocation of a new stack frame. In some C compilers, one can use a compiler flag to eliminate this overhead, which transforms certain types of recursion (actually, certain types of tail calls) into jumps instead of function calls.
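To make the tail-call idea concrete, here is a C sketch. In the first version nothing happens after the recursive call (the pending multiplication is carried in an accumulator argument), so an optimizing compiler can turn the call into a plain jump, making it equivalent to the loop below:

/* Tail-recursive factorial: the recursive call is the last thing
   that happens, so it can be compiled as a jump. */
static unsigned long fact_tail(unsigned long n, unsigned long acc)
{
    if (n <= 1)
        return acc;
    return fact_tail(n - 1, acc * n);   /* tail position */
}

/* The equivalent iterative version. */
static unsigned long fact_iter(unsigned long n)
{
    unsigned long acc = 1;
    while (n > 1)
        acc *= n--;
    return acc;
}

Called as fact_tail(n, 1), the two compute the same result.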

In functional programming language implementations, sometimes, iteration can be very expensive and recursion can be very cheap. In many, recursion is transformed into a simple jump, but changing the loop variable (which is mutable) sometimes requires some relatively heavy operations, especially on implementations which support multiple threads of execution. Mutation is expensive in some of these environments because of the interaction between the mutator and the garbage collector, if both might be running at the same time.

I know that in some Scheme implementations, recursion will generally be faster than looping.

In short, the answer depends on the code and the implementation. Use whatever style you prefer. If you're using a functional language, recursion might be faster. If you're using an imperative language, iteration is probably faster. In some environments, both methods will result in the same assembly being generated (put that in your pipe and smoke it).

Addendum: In some environments, the best alternative is neither recursion nor iteration but instead higher order functions. These include "map", "filter", and "reduce" (which is also called "fold"). Not only are these the preferred style, not only are they often cleaner, but in some environments these functions are the first (or only) to get a boost from automatic parallelization — so they can be significantly faster than either iteration or recursion. Data Parallel Haskell is an example of such an environment.

List comprehensions are another alternative, but these are usually just syntactic sugar for iteration, recursion, or higher order functions.

Where are constant variables stored in C?

How they are stored is an implementation detail (depends on the compiler).

For example, in the GCC compiler, on most machines, read-only variables, constants, and jump tables are placed in the text section.

Consider the code:

const int i = 0;
static const int k = 99;

int function(void)
{
    const int j = 37;
    totherfunc(&j);
    totherfunc(&i);
    //totherfunc(&k);
    return (j + 3);
}

Generally, i can be stored in the text segment (it's a read-only variable with a fixed value). If it is not in the text segment, it will be stored beside the global variables. Given that it is initialized to zero, it might be in the 'bss' (block started by symbol) section, where zeroed variables are usually allocated, or in the 'data' section, where initialized variables are usually allocated.

If the compiler is convinced that k is unused (which it could be, since k is local to a single file), it might not appear in the object code at all. If the call to totherfunc() that references k were not commented out, then k would have to be allocated an address somewhere - it would likely be in the same segment as i.

The constant (if it is a constant, is it still a variable?) j will most probably appear on the stack of a conventional C implementation. (If you were asking in the comp.std.c news group, someone would mention that the standard doesn't say that automatic variables appear on the stack; fortunately, SO isn't comp.std.c!)

Note that I forced the variables to appear because I passed them by reference - presumably to a function expecting a pointer to a constant integer. If the addresses were never taken, then j and k could be optimized out of the code altogether. To remove i, the compiler would have to know all the source code for the entire program - it is accessible in other translation units (source files), and so cannot as readily be removed. Doubly not if the program indulges in dynamic loading of shared libraries - one of those libraries might rely on that global variable.

(Stylistically - the variables i and j should have longer, more meaningful names; this is only an example!)