Search in shivacherukuri.tech@blogger.com

Wednesday, January 19, 2011

Basic Hash example

Basic hash example

#include < stdio.h>
#include < stdlib.h>

#define HASHSIZE    1000
#define MAXLINE     1024

typedef struct tnode {
 char *data;
 struct tnode *next;
} node;

void htable_init(node *hashtable);                        // fire up hashtable
void htable_insert(node *hashtable, char *str);           // insert data into hashtable
void htable_resolve(node *hashtable, int loc, char *str); // resolve collisions in hashtable
void htable_display(node *hashtable);                     // display hashtable
int  htable_delete(node *hashtable, char *str);           // delete an entry from hashtable
int  htable_hash(char *str);                              // hash data for hashtable

int main(void) {
 char line[MAXLINE];
 node *hashtable;

 hashtable = (node *)malloc(HASHSIZE * sizeof(node));

 htable_init(hashtable);

 while((fgets(line, MAXLINE, stdin)) != NULL)
  htable_insert(hashtable, line);

 htable_display(hashtable);

 return 0;
}

/* fire up hashtable */
void htable_init(node *hashtable) {
 int i = 0;

 for(i = 0; i < HASHSIZE; i++)
  hashtable[i].data = NULL, hashtable[i].next = NULL;
}

/* insert data into hashtable */
void htable_insert(node *hashtable, char *str) {
 int index = 0;
 
 /*
 // determine hash function
 */
 index = htable_hash(str);
 if(hashtable[index].data != NULL) {
  /*
  // collision occurs - resolve by chaining
  */
  htable_resolve(hashtable, index, str);
 } else {
  hashtable[index].data = calloc(strlen(str) + 1, sizeof(char));
  strcpy(hashtable[index].data, str);
 }
}

/* hash data for hashtable */
int htable_hash(char *str) {
 int index = 0;
 char *tmp = NULL;

 tmp = calloc(strlen(str) + 1, sizeof(char));
 strcpy(tmp, str);

 while(*tmp) {
  index += *tmp;
  tmp++;
 }

 index = index % HASHSIZE;
 return index;
}

/* resolve collisions in hashtable */
void htable_resolve(node *hashtable, int loc, char *str) {
 node *tmp;
 tmp = hashtable + loc;

 while(tmp->next != NULL)
  tmp = tmp->next;
  tmp->next = (node *)malloc(sizeof(node));
  tmp->next->data = calloc(strlen(str) + 1, sizeof(char));
  strcpy(tmp->next->data, str);
  tmp->next->next = NULL;
}

/* display hashtable */
void htable_display(node *hashtable) {
 int i = 0;
 node *target;

 for(i = 0; i < HASHSIZE; i++) {
  if(hashtable[i].data != NULL) {
   target = hashtable + i;
   while(target)
    /* printf("location: %d, data: %s", i, target->data), target = target->next; */
    printf("%s", target->data), target = target->next;
  } /* if */
 } /* for */
}

/* delete an entry from hashtable */
int htable_delete(node *hashtable, char *str) {
 node *bla;
 node *blb;
 char *tmp = NULL;
 int index = 0;

 index = htable_hash(str);

 /* no item at this location */
 if(hashtable[index].data == NULL)
  return 1;

 /* only one item at this location */
 if(hashtable[index].next == NULL) {
  if(strcmp(hashtable[index].data, str) == 0) {
   /* item found */
   tmp = hashtable[index].data;
   hashtable[index].data = NULL;
   free(tmp);
  }
 } else {
  /* there is a chaining case */
  bla = hashtable + index;
  /* linked list similar */
  while(bla->next != NULL) {
   if(strcmp(bla->next->data, str) == 0) {
    blb = bla->next;
    if(bla->next->next)
     bla->next = bla->next->next;
    else
     bla->next = NULL;

    free(blb);
   } /* if */
  } /* while */
 } /* else */

 return 0;
}

Multicast datagrams example

Sending Multicast Datagrams

Sending a multicast datagram is easy: The sender simply specifies a multicast address as the destination of an ordinary sendto(2) system call.


Receiving Multicast Datagrams

To receive multicast packets, an application must first request that the host join a particular multicast group. This is done using another call to setsockopt(2):

	struct ip_mreq mreq;

setsockopt(sock,IPPROTO_IP,IP_ADD_MEMBERSHIP,&mreq,sizeof(mreq));

The definition of struct ip_mreq is as follows:

	struct ip_mreq {
struct in_addr imr_multiaddr; /* multicast group to join */
struct in_addr imr_interface; /* interface to join on */
}

For now, imr_interface can be safely set to INADDR_ANY to join on the default multicast network interface.

In addition to the host part of an IP multicast address, there is also a port number, as in TCP or UDP sockets. This port number information is used by the kernel to decide which port on the local machine to route packets to. However, unlike in TCP or UDP, there can be many sockets which receive IP multicast packets off a single local port. Binding the socket to a local port is done with bind(2), and is done in the same manner as binding to a UDP address.

After the bind(2) has been performed and we have used setsockopt(2) to join a multicast group, we are ready to receive. An ordinary recvfrom(2) call may be used to read datagrams off of the receive socket.


An Example of Multicast Programming: "Hello, World!"

Here is an example which illustrates all the facilities described here for using multicast. It consists of two programs: sender and listener. The listener program simply echoes everything it receives to its standard out, and the sender multicasts "hello, world!" to everyone else in the multicast group.



=====

Multicast Example Programs


Sender Program

/*
* sender.c -- multicasts "hello, world!" to a multicast group once a second
*
* Antony Courtney, 25/11/94
*/

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <time.h>
#include <string.h>
#include <stdio.h>


#define HELLO_PORT 12345
#define HELLO_GROUP "225.0.0.37"

main(int argc, char *argv[])
{
struct sockaddr_in addr;
int fd, cnt;
struct ip_mreq mreq;
char *message="Hello, World!";

/* create what looks like an ordinary UDP socket */
if ((fd=socket(AF_INET,SOCK_DGRAM,0)) < 0) {
perror("socket");
exit(1);
}

/* set up destination address */
memset(&addr,0,sizeof(addr));
addr.sin_family=AF_INET;
addr.sin_addr.s_addr=inet_addr(HELLO_GROUP);
addr.sin_port=htons(HELLO_PORT);

/* now just sendto() our destination! */
while (1) {
if (sendto(fd,message,sizeof(message),0,(struct sockaddr *) &addr,
sizeof(addr)) < 0) {
perror("sendto");
exit(1);
}
sleep(1);
}
}

Listener Program

/*
* listener.c -- joins a multicast group and echoes all data it receives from
* the group to its stdout...
*
* Antony Courtney, 25/11/94
* Modified by: Frédéric Bastien (25/03/04)
* to compile without warning and work correctly
*/

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <time.h>
#include <string.h>
#include <stdio.h>


#define HELLO_PORT 12345
#define HELLO_GROUP "225.0.0.37"
#define MSGBUFSIZE 256

main(int argc, char *argv[])
{
struct sockaddr_in addr;
int fd, nbytes,addrlen;
struct ip_mreq mreq;
char msgbuf[MSGBUFSIZE];

u_int yes=1; /*** MODIFICATION TO ORIGINAL */

/* create what looks like an ordinary UDP socket */
if ((fd=socket(AF_INET,SOCK_DGRAM,0)) < 0) {
perror("socket");
exit(1);
}


/**** MODIFICATION TO ORIGINAL */
/* allow multiple sockets to use the same PORT number */
if (setsockopt(fd,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(yes)) < 0) {
perror("Reusing ADDR failed");
exit(1);
}
/*** END OF MODIFICATION TO ORIGINAL */

/* set up destination address */
memset(&addr,0,sizeof(addr));
addr.sin_family=AF_INET;
addr.sin_addr.s_addr=htonl(INADDR_ANY); /* N.B.: differs from sender */
addr.sin_port=htons(HELLO_PORT);

/* bind to receive address */
if (bind(fd,(struct sockaddr *) &addr,sizeof(addr)) < 0) {
perror("bind");
exit(1);
}

/* use setsockopt() to request that the kernel join a multicast group */
mreq.imr_multiaddr.s_addr=inet_addr(HELLO_GROUP);
mreq.imr_interface.s_addr=htonl(INADDR_ANY);
if (setsockopt(fd,IPPROTO_IP,IP_ADD_MEMBERSHIP,&mreq,sizeof(mreq)) < 0) {
perror("setsockopt");
exit(1);
}

/* now just enter a read-print loop */
while (1) {
addrlen=sizeof(addr);
if ((nbytes=recvfrom(fd,msgbuf,MSGBUFSIZE,0,
(struct sockaddr *) &addr,&addrlen)) < 0) {
perror("recvfrom");
exit(1);
}
puts(message);
}
}

Friday, January 7, 2011

About DUP()

Open files

First, read section 3.10 in the book. This discusses the various ways you can share open files. Basically, what is says is the following:

The operating system (OS) has three kinds of data structures for files:

  • ``File table entry'' -- it has one of these for each file descriptor for a user. It contains information like the current lseek pointer, and a pointer to a ``vnode'' (see below). So, when you start your program, the OS has three file table entries for you -- one each for stdin, stdout, stderr. Each time you call open(), a new file table entry is created for you in the OS.
  • ``vnode'' -- There is one of these for each physical file that has been opened. It contains a pointer to the file's inode, the file's size, etc.
  • ``inode'' -- There is one of these for each file on disk. It contains all the information returned by stat().
The difference between a vnode and an inode is where it's located and when it's valid. Inodes are located on disk and are always valid because they contain information that is always needed such as ownership and protection. Vnodes are located in the operating system's memory, and only exist when a file is opened. However, just one vnode exists for every physical file that is opened.

So, look at the following program:

main()
{
int fd1, fd2;

fd1 = open("file1", O_WRONLY | O_CREAT | O_TRUNC, 0644);
fd2 = open("file1", O_WRONLY);

}
Now, what has happened? The OS has created two file table entries, one for each open() call, but only one vnode. This is because there is only one file. Both file table entries point to the same vnode, but they each have different seek pointers. Thus, if we expand the above program into: (This is file fs1.c)
main()
{
int fd1, fd2;

fd1 = open("file1", O_WRONLY | O_CREAT | O_TRUNC, 0644);
fd2 = open("file1", O_WRONLY);

write(fd1, "Jim\n", strlen("Jim\n"));
write(fd2, "Plank\n", strlen("Plank\n"));

close(fd1);
close(fd2);
}
Then what will happen? Well, the first write() call will write the string "Jim\n" into file1. Then the second write() call will overwrite it with "Plank\n". This is because each fd points to its own file table entry, which has its own lseek pointer, and thus the first write() does not update the lseek pointer of the fd2.

To make this more clear, fs1a.c prints out the values of each fd's seek pointer at each step of the program. As you can see, even though the two fd's are for the same file, since they each have their own file table entry, they each have their own seek pointer:

UNIX> fs1
UNIX> cat file1
Plank
UNIX> fs1a
Before writing Jim: lseek(fd1, 0, 1): 0. lseek(fd2, 0, 1): 0
Before writing Plank: lseek(fd1, 0, 1): 4. lseek(fd2, 0, 1): 0
After writing Plank: lseek(fd1, 0, 1): 4. lseek(fd2, 0, 1): 6
UNIX> cat file1
Plank
UNIX>

Dup()

Now, the system call dup(int fd) duplicates a file descriptor fd. What this does is return a second file descriptor that points to the same file table entry as fd does. So now you can treat the two file descriptors as identical.

Look at an alteration of fs1.c (in fs2.c). Instead of calling open() to initialize fd2, it calls dup(fd1). Thus, after the first write, the lseek pointer of fd2 has been updated to reflect the write to fd1 -- this is because the two file descriptors point to the same file table entry.

Thus, after running fs2.c, the file "file2" should say "Jim\nPlank\n". Like fs1a.c, fs2a.c prints out the lseek pointers of fd1 and fd2 at each step. As you can see, the write() to fd1 updates the lseek pointer for fd2:

UNIX> fs2
UNIX> cat file2
Jim
Plank
UNIX> fs2a
Before writing Jim: lseek(fd1, 0, 1): 0. lseek(fd2, 0, 1): 0
Before writing Plank: lseek(fd1, 0, 1): 4. lseek(fd2, 0, 1): 4
After writing Plank: lseek(fd1, 0, 1): 10. lseek(fd2, 0, 1): 10
UNIX> cat file2
Jim
Plank
UNIX>

Now, when fork() is called, ALL FILE DESCRIPTORS ARE DUPLICATED, AS IF dup() WERE CALLED. Thus, look at the following program (fs3.c):

main()
{
char s[1000];
int i, fd;

fd = open("file3", O_WRONLY | O_CREAT | O_TRUNC, 0644);

i = fork();
sprintf(s, "fork() = %d.\n", i);
write(fd, s, strlen(s));
}
What should happen? Well, whichever process gets control of the CPU first after the fork() will write s to file3. Then the other process will append its string s to file3. For example:
UNIX> fs3
UNIX> cat file3
fork() = 0.
fork() = 22107.
UNIX> fs3
UNIX> cat file3
fork() = 0.
fork() = 22110.
UNIX> fs3
UNIX> cat file3
fork() = 22113.
fork() = 0.
UNIX>
Now, this is because the file descriptor fd is duplicated across fork() calls. Were it not duplicated, but instead re-opened, then one write() would overwrite the other.

Perhaps you're thinking, ``He opened a file and then called fork(). Does he have to worry about that buffer copying problem in the last lab?'' The answer is no, because I'm using write(), which is a system call, and there is no buffering. You have to worry about the buffering problem when the standard I/O library is being used, and the buffer is not empty when fork() is called. For example, look at fs3a.c and fs3b.c. They use fprintf() instead of write(). When I call them, I get the following:

UNIX> fs3a
UNIX> cat file3
fork() = 0.
fork() = 3716.
UNIX> fs3b
UNIX> cat file3
This is file3
fork() = 3719.
This is file3
fork() = 0.
UNIX>
Do you see where the copied buffer is a problem? Make sure you can explain this phenomenon.

Dup2()

Dup2() is a system call that dups an open file descriptor so that the result is a desired file descriptor.
int dup2(int fd1, int fd2)

With dup2(), fd2 specifies the desired value of the new
descriptor. If descriptor fd2 is already in use, it is
first deallocated as if it were closed by close(2V).
Dup2() is most often used so that you can redirect standard input or output. When you call dup2(fd, 0) and the dup2() call is successful, then whenever your program reads from standard input, it will read from fd. Similarly, when you call dup2(fd, 1) and the dup2() call is successful, then whenever your program writes to standard output, it will write to fd.

For example, look at dup2ex.c. This opens the file file4 for writing, and then uses dup2 to redirect standard output to that file. When it's done, you'll see that everything has gone intto file4:

UNIX> dup2ex
UNIX> cat file4
Standard output now goes to file4
It goes even after we closed file descriptor 3
putchar works
And fwrite
And write
UNIX>
Why did I make the fflush() call in dup2ex.c? Take it out and see. Make sure that you can explain this.

Now, suppose you want to execute, for example, "cat < f1 > f2" by calling fork(), exec() and dup2() instead of doing it from the shell. You can do this in catf1f2.c. This opens f1 for reading on stdin (fd 0), and f2 for writing on stdout (fd 1).

Study this program closely, because you will find it greatly helpful in the jsh lab.