Wednesday, October 17, 2007

How to convert an integer to little endian or big endian

As discussed in my previous post "Little endian vs Big endian", there are some scenarios where we need to take care of the endianness of the system. Consider an example where you want to send data over a network to some remote systems. The systems can be of any endianness. All interested parties agree that the data transmitted over the network will be in big endian format. The system receiving the data will then convert it to its local endian format.
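The agreement described above can be sketched for a 32 bit value as follows. This is only a minimal sketch, not code from the post: the names putBE32, getBE32 and demoWireFormat are hypothetical, and 8 bit bytes are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender-side helper: store v in buf in big endian order,
   whatever the byte order of the sending machine is. */
void
putBE32 (unsigned char *buf, uint32_t v)
{
  buf[0] = (unsigned char) ((v >> 24) & 0xFF);
  buf[1] = (unsigned char) ((v >> 16) & 0xFF);
  buf[2] = (unsigned char) ((v >> 8) & 0xFF);
  buf[3] = (unsigned char) (v & 0xFF);
}

/* Hypothetical receiver-side helper: rebuild the value from big endian
   bytes. The shifts work on values, so the result is in local format. */
uint32_t
getBE32 (const unsigned char *buf)
{
  return ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16)
         | ((uint32_t) buf[2] << 8) | (uint32_t) buf[3];
}

/* Small self-check: the wire bytes are fixed and the value survives. */
void
demoWireFormat (void)
{
  unsigned char wire[4];
  putBE32 (wire, UINT32_C (0x12345678));
  assert (wire[0] == 0x12 && wire[1] == 0x34);
  assert (wire[2] == 0x56 && wire[3] == 0x78);
  assert (getBE32 (wire) == UINT32_C (0x12345678));
}
```

Because both helpers use shifts instead of reading memory directly, neither side needs to know its own endianness. Calling demoWireFormat () from main is enough to try it.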

Let's write functions to write and read bytes from a 64 bit integer. I am using a 64 bit integer because an endian-related C program written with int may work fine on a 32 bit platform but fail on a 64 bit platform. So, we will be using strict (fixed-width) data types. More on C data types related issues here.

1. Extract bytes in big endian sequence from uint64_t irrespective of the execution platform.


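#include <stddef.h>
#include <stdint.h>

/* Write the low `bytes` bytes of num into arr, most significant byte first. */
void
uint64ToByteArray (uint64_t num, size_t bytes, unsigned char *arr)
{
  size_t i;
  unsigned char ch;
  for (i = 0; i < bytes; i++)
    {
      /* (i & 7) << 3 is the bit offset of byte i: 0, 8, ..., 56. */
      ch = (num >> ((i & 7) << 3)) & 0xFF;
      arr[bytes - i - 1] = ch;
    }
}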

For UINT64_C (0xabcdef1234567890), we will get the following sequence of bytes irrespective of platform.
ab cd ef 12 34 56 78 90
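To check that byte sequence, something like the following could be used. It is a sketch only: bytesToHex and demoBigEndianBytes are hypothetical helpers that are not part of the post, and the conversion function is repeated so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same logic as the function above, with the loop increment written out. */
void
uint64ToByteArray (uint64_t num, size_t bytes, unsigned char *arr)
{
  size_t i;
  for (i = 0; i < bytes; i++)
    arr[bytes - i - 1] = (unsigned char) ((num >> ((i & 7) << 3)) & 0xFF);
}

/* Hypothetical helper: format n bytes as space separated hex into out.
   out must have room for 3 * n characters (and at least 1). */
void
bytesToHex (const unsigned char *arr, size_t n, char *out)
{
  size_t i;
  out[0] = '\0';
  for (i = 0; i < n; i++)
    {
      sprintf (out + 3 * i, "%02x", (unsigned) arr[i]);
      /* Separator between bytes, terminator after the last one. */
      out[3 * i + 2] = (i + 1 < n) ? ' ' : '\0';
    }
}

/* Prints the byte sequence and checks it against the expected text. */
void
demoBigEndianBytes (void)
{
  unsigned char arr[8];
  char text[24];
  uint64ToByteArray (UINT64_C (0xabcdef1234567890), sizeof arr, arr);
  bytesToHex (arr, sizeof arr, text);
  assert (strcmp (text, "ab cd ef 12 34 56 78 90") == 0);
  puts (text);
}
```

Calling demoBigEndianBytes () from main should print the same line on little endian and big endian machines, since the function never looks at how num is laid out in memory.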

In fact, we can make this function more generic by adding a type parameter that specifies the sequence in which we need the bytes. Let's support little and big endian.

#define LITTLE 0
#define BIG 1

/* Write the low `bytes` bytes of num into arr in the requested order. */
void
uint64ToByteArray (uint64_t num, size_t bytes, unsigned char *arr, int type)
{
  size_t i;
  unsigned char ch;
  for (i = 0; i < bytes; i++)
    {
      ch = (num >> ((i & 7) << 3)) & 0xFF;
      if (type == LITTLE)
        arr[i] = ch;                /* least significant byte first */
      else if (type == BIG)
        arr[bytes - i - 1] = ch;    /* most significant byte first */
    }
}

So, if type == 0, byte sequence will be 90 78 56 34 12 ef cd ab
and if type == 1, byte sequence will be ab cd ef 12 34 56 78 90
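One way to see what the type parameter means is to compare the output with the way the running machine stores the number. The sketch below is an illustration, not code from the post: hostByteOrder, matchesHostLayout and demoHostLayout are hypothetical names, and it assumes the host is either plain little endian or plain big endian (no mixed layouts).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LITTLE 0
#define BIG 1

void
uint64ToByteArray (uint64_t num, size_t bytes, unsigned char *arr, int type)
{
  size_t i;
  unsigned char ch;
  for (i = 0; i < bytes; i++)
    {
      ch = (num >> ((i & 7) << 3)) & 0xFF;
      if (type == LITTLE)
        arr[i] = ch;
      else if (type == BIG)
        arr[bytes - i - 1] = ch;
    }
}

/* Hypothetical helper: report the byte order of the running machine by
   looking at the first byte of a known 16 bit value. */
int
hostByteOrder (void)
{
  uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1 ? LITTLE : BIG;
}

/* Returns 1 when asking for the host's own order reproduces the bytes
   exactly as the host keeps them in memory. */
int
matchesHostLayout (uint64_t num)
{
  unsigned char raw[8], out[8];
  memcpy (raw, &num, sizeof raw);
  uint64ToByteArray (num, sizeof out, out, hostByteOrder ());
  return memcmp (raw, out, sizeof raw) == 0;
}

void
demoHostLayout (void)
{
  assert (matchesHostLayout (UINT64_C (0xabcdef1234567890)));
}
```

The conversion itself never calls hostByteOrder; the probe is used only to check the result against memory.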

2. Reconstruct uint64_t from bytes in little/big endian sequence irrespective of the execution platform.

/* Rebuild a value from `bytes` bytes stored in arr in the given order. */
uint64_t
byteArrayToUInt64 (unsigned char *arr, size_t bytes, int type)
{
  uint64_t num = UINT64_C (0);
  uint64_t tmp;

  size_t i;
  for (i = 0; i < bytes; i++)
    {
      tmp = UINT64_C (0);
      if (type == LITTLE)
        tmp = arr[i];
      else if (type == BIG)
        tmp = arr[bytes - i - 1];

      /* tmp is byte i counted from the least significant end. */
      num |= (tmp << ((i & 7) << 3));
    }
  return num;
}

So, if byte sequence is 90 78 56 34 12 ef cd ab and type == 0, then uint64_t will be 0xabcdef1234567890
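The two functions are meant to be inverses of each other, which can be checked with a round trip. The following is a sketch under that assumption: roundTripOk and demoRoundTrip are hypothetical helpers, both conversion functions are repeated so the snippet is self-contained, and only 1 to 8 bytes are considered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LITTLE 0
#define BIG 1

void
uint64ToByteArray (uint64_t num, size_t bytes, unsigned char *arr, int type)
{
  size_t i;
  for (i = 0; i < bytes; i++)
    {
      unsigned char ch = (unsigned char) ((num >> ((i & 7) << 3)) & 0xFF);
      if (type == LITTLE)
        arr[i] = ch;
      else if (type == BIG)
        arr[bytes - i - 1] = ch;
    }
}

uint64_t
byteArrayToUInt64 (unsigned char *arr, size_t bytes, int type)
{
  uint64_t num = UINT64_C (0);
  size_t i;
  for (i = 0; i < bytes; i++)
    {
      uint64_t tmp = UINT64_C (0);
      if (type == LITTLE)
        tmp = arr[i];
      else if (type == BIG)
        tmp = arr[bytes - i - 1];
      num |= (tmp << ((i & 7) << 3));
    }
  return num;
}

/* Hypothetical check: returns 1 when the low `bytes` bytes of num come
   back unchanged after writing and reading them in the given order. */
int
roundTripOk (uint64_t num, size_t bytes, int type)
{
  unsigned char arr[8];
  uint64_t mask;
  if (bytes == 0 || bytes > sizeof arr)
    return 0;
  mask = (bytes == 8) ? UINT64_MAX : (UINT64_C (1) << (bytes * 8)) - 1;
  uint64ToByteArray (num, bytes, arr, type);
  return byteArrayToUInt64 (arr, bytes, type) == (num & mask);
}

void
demoRoundTrip (void)
{
  unsigned char le[8] = { 0x90, 0x78, 0x56, 0x34, 0x12, 0xef, 0xcd, 0xab };
  assert (byteArrayToUInt64 (le, 8, LITTLE) == UINT64_C (0xabcdef1234567890));
  assert (roundTripOk (UINT64_C (0xabcdef1234567890), 8, BIG));
  assert (roundTripOk (UINT64_C (0xabcdef1234567890), 4, LITTLE));
}
```

With fewer than 8 bytes only the low bytes are written, so the check compares against the masked value rather than the full number.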

10 comments:

Anonymous said...

fyi, u have a bug in ur FOR loop...
'i' never increments.. should be ++i right?

Anonymous said...

above post,you should have a look at bit shifting,I think...

Anonymous said...

Nope, there is definitely a bug in your for loop. I've compiled and run it and it never exits as i never changes from 0.

Anonymous said...

agreed w/ above commenter. the loop goes on infinitely w/o some way for the gap between i and bytes to disappear. i++ or ++i is necessary.

khalid said...

i&7 will be always equal to i, whenever i is less than or equal to 7.

So u can replace the i&7 to i.

Thanks and Regards
Khalid

M_D_K said...

for (i = 0; i < bytes; i )

should be:

for (i = 0; i < bytes; i++)

nikhil gaikwad said...

M_D_K IS RIGHT
there should b i++

Andre said...

There are x86 asm instructions dedicated to this operation. Wouldn't it be more efficient to use those instead of doing it like this?

Anonymous said...

I had a similar problem under MS VC++ when converting byte order for colors from GDI to GDI+. I also chose inline assembler and bswap http://msdn.microsoft.com/en-us/library/ms536255%28v=VS.85%29

Anonymous said...

maybe, that's the part, where you have to think yourselves :)