Feb 4, 2020

Strings in C

          
          Unit IV  (Contd…..)

          Strings and string manipulation functions.

         C Strings
 
A string in C is a one-dimensional array of characters terminated by a null character '\0'.
Note that '\0' and '0' are not the same: the ASCII value of '\0' is 0, whereas the ASCII value of '0' is 48. The elements of the character array are stored in contiguous memory locations.
The terminating null ('\0') is important because it is the only way the functions that work with a string can know where the string ends. In fact, a string not terminated by a '\0' is not really a string, but merely a collection of characters.
Thus a null-terminated string contains the characters that comprise the string followed by a null.

The following declaration and initialization create a string consisting of the word "Hello".

To hold the null character at the end of the array, the size of the character array containing the string is one more than the number of characters in the word "Hello".

char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
If you follow the rule of array initialization then you can write the above statement as follows:
char greeting[] = "Hello";

The memory representation of the string defined above is as follows:

Index       0     1     2     3     4     5
Content     H     e     l     l     o     \0

You do not need to write the null character at the end of a string constant; the C compiler automatically places the '\0' at the end of the string when it initializes the array. Let us print the string defined above:

#include <stdio.h>

int main(void)
{
    char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
    printf("Greeting message: %s\n", greeting);
    return 0;
}

When the above code is compiled and executed, it produces the following result:

Greeting message: Hello
Although there are many ways to refer to the elements of a character array, they are rarely needed for printing, because the printf( ) function has a simple way of doing it, as shown below. Note that printf( ) doesn't print the '\0'.
#include <stdio.h>

int main ( void )
{
    char name[ ] = "Hello World!!" ;
    printf ( "%s", name ) ;
    return 0 ;
}

          Output:
                   Hello World!!
The %s used in printf( ) is a format specification for printing out a string. The same specification can be used to receive a string from the keyboard, as shown below.
#include <stdio.h>

int main ( void )
{
    char name[25] ;
    printf ( "Enter your name " ) ;
    scanf ( "%24s", name ) ;   /* at most 24 characters, leaving room for '\0' */
    printf ( "Hello %s!", name ) ;
    return 0 ;
}
And here is a sample run of the program...
Enter your name Yashwant
Hello Yashwant!

Note that the declaration char name[25] sets aside 25 bytes for the array name[ ], and the scanf( ) function stores the characters typed at the keyboard in this array until it meets whitespace (a blank, a tab or the enter key). It then places a '\0' in the array. Since the name of an array evaluates to its base address, we pass name to scanf( ) without the & operator.

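While entering the string using scanf( ) we must be cautious about two things:
·        The length of the string should not exceed the dimension of the character array, which must also have room for the '\0'. A field width, as in %24s for a 25-byte array, makes scanf( ) stop before the array overflows.
·        scanf( ) with %s is not capable of receiving multi-word strings, because it stops at the first blank. Therefore a name such as 'Debashish Roy' would be cut off after 'Debashish'. The traditional way around this limitation was the function gets( ), but gets( ) cannot check the size of the array and was removed from the language in C11; fgets( ) is its safe replacement.

The usage of fgets( ) and of puts( ), the output counterpart of gets( ), is shown below.

#include <stdio.h>
#include <string.h>

int main ( void )
{
    char name[25] ;
    printf ( "Enter your full name " ) ;
    /* fgets( ) never stores more than sizeof name bytes, unlike gets( ) */
    if ( fgets ( name, sizeof name, stdin ) != NULL )
    {
        name[ strcspn ( name, "\n" ) ] = '\0' ;   /* drop the newline fgets( ) keeps */
        puts ( "Hello!" ) ;
        puts ( name ) ;
    }
    return 0 ;
}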

And here is the output...
Enter your full name Yashwant Kanetkar
Hello!
Yashwant Kanetkar

The program and the output are self-explanatory except for the fact that puts( ) can display only one string at a time (hence the use of two puts( ) calls in the program above). Also, unlike printf( ), puts( ) places the cursor on the next line after displaying a string. A line-input function such as gets( ) or its bounded replacement fgets( ) receives only one string at a time, but its plus point is that the string can be a multi-word one.
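If we are prepared to take the trouble we can make scanf( ) accept multi-word strings by writing it in this manner:
char name[25] ;
printf ( "Enter your full name " ) ;
scanf ( "%24[^\n]", name ) ;
Here [^\n] is a scanset that accepts every character except the newline, and 24 limits the input to what the array can hold. No s follows the brackets, since %[ is a conversion in its own right.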

Standard Library String Functions
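Every C compiler provides a large set of useful string handling library functions. The list below gives the more commonly used functions along with their purpose. Note that strlwr, strupr, strcmpi, stricmp, strnicmp, strset, strnset and strrev are compiler-specific extensions (found, for example, in Turbo C and in Microsoft's library) rather than part of standard C, and that strdup, long available through POSIX, entered the C standard only with C23.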
Function    Use
strlen      Finds length of a string
strlwr      Converts a string to lowercase
strupr      Converts a string to uppercase
strcat      Appends one string at the end of another
strncat     Appends first n characters of a string at the end of another
strcpy      Copies a string into another
strncpy     Copies first n characters of one string into another
strcmp      Compares two strings
strncmp     Compares first n characters of two strings
strcmpi     Compares two strings without regard to case ("i" denotes that this function ignores case)
stricmp     Compares two strings without regard to case (identical to strcmpi)
strnicmp    Compares first n characters of two strings without regard to case
strdup      Duplicates a string
strchr      Finds first occurrence of a given character in a string
strrchr     Finds last occurrence of a given character in a string
strstr      Finds first occurrence of a given string in another string
strset      Sets all characters of string to a given character
strnset     Sets first n characters of a string to a given character
strrev      Reverses string

Out of the above list we shall discuss the functions strlen( ), strcpy( ), strcat( ) and strcmp( ), since these are the most commonly used functions. This will also illustrate how the library functions in general handle strings. Let us study these functions one by one.

strlen( )
This function counts the number of characters present in a string. Its usage is illustrated in the following program.

#include <stdio.h>
#include <string.h>

int main ( void )
{
    char arr[ ] = "Bamboozled" ;
    size_t len1, len2 ;   /* strlen( ) returns a size_t */
    len1 = strlen ( arr ) ;
    len2 = strlen ( "Humpty Dumpty" ) ;
    printf ( "\nstring = %s length = %zu", arr, len1 ) ;
    printf ( "\nstring = %s length = %zu", "Humpty Dumpty", len2 ) ;
    return 0 ;
}

The output would be...
string = Bamboozled length = 10
string = Humpty Dumpty length = 13
Note that in the first call to the function strlen( ), we are passing the base address of the string, and the function in turn returns the length of the string. While calculating the length it doesn't count the '\0'. Even in the second call,
len2 = strlen ( "Humpty Dumpty" ) ;
what gets passed to strlen( ) is the address of the string and not the string itself.
strcpy( )
This function copies the contents of one string into another. The base addresses of the source and target strings should be supplied to this function. Here is an example of strcpy( ) in action...
#include <stdio.h>
#include <string.h>

int main ( void )
{
    char source[ ] = "Sayonara" ;
    char target[20] ;
    strcpy ( target, source ) ;
    printf ( "\nsource string = %s", source ) ;
    printf ( "\ntarget string = %s", target ) ;
    return 0 ;
}
And here is the output...
source string = Sayonara
target string = Sayonara

Given the base addresses, strcpy( ) goes on copying the characters of the source string into the target string until it encounters the end of the source string ('\0'), which it copies as well. It is our responsibility to see to it that the target string's dimension is big enough to hold the string being copied into it, including the '\0'. Thus, a string gets copied into another piecemeal, character by character. There is no shortcut for this; in particular, one array cannot be assigned to another with the = operator.

strcat( )
This function concatenates the source string at the end of the target string. For example, “Bombay” and “Nagpur” on concatenation would result in the string “BombayNagpur”. Here is an example of strcat( ) at work.
#include <stdio.h>
#include <string.h>

int main ( void )
{
    char source[ ] = "Folks!" ;
    char target[30] = "Hello" ;
    strcat ( target, source ) ;
    printf ( "\nsource string = %s", source ) ;
    printf ( "\ntarget string = %s", target ) ;
    return 0 ;
}
And here is the output...
source string = Folks!
target string = HelloFolks!

Note that the target array (30 bytes) has been made big enough to hold the final string and its '\0'. strcat( ) itself performs no such check.
strcmp( )
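This function compares two strings to find out whether they are the same or different. The two strings are compared character by character until there is a mismatch or the end of one of the strings is reached, whichever occurs first. If the two strings are identical, strcmp( ) returns zero. If they are not, it returns a negative or a positive value according to whether the first mismatched character of the first string is smaller or larger than that of the second. The C standard guarantees only the sign of this value; many implementations, including the one used for the output below, return the numeric difference between the ASCII values of the first non-matching pair of characters. Here is a program which puts strcmp( ) in action.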


#include <stdio.h>
#include <string.h>

int main ( void )
{
    char string1[ ] = "Jerry" ;
    char string2[ ] = "Ferry" ;
    int i, j, k ;
    i = strcmp ( string1, "Jerry" ) ;
    j = strcmp ( string1, string2 ) ;
    k = strcmp ( string1, "Jerry boy" ) ;
    printf ( "\n%d %d %d", i, j, k ) ;
    return 0 ;
}

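And here is the output on an implementation that returns the character difference (others may print different non-zero values with the same signs)...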
0 4 -32
In the first call to strcmp( ), the two strings are identical (“Jerry” and “Jerry”) and the value returned by strcmp( ) is zero.
In the second call, the first character of “Jerry” doesn't match the first character of “Ferry” and the result is 4, which is the numeric difference between the ASCII value of 'J' (74) and the ASCII value of 'F' (70).
In the third call to strcmp( ), “Jerry” doesn't match “Jerry boy”, because the null character at the end of “Jerry” doesn't match the blank in “Jerry boy”. The value returned is -32, which is the value of the null character minus the ASCII value of the space, i.e., '\0' minus ' ', or 0 minus 32.
The exact value of the mismatch will rarely concern us. All we usually want to know is whether the first string comes before or after the second in character-code order, which matches alphabetical order when both strings use the same case. If it comes before, a negative value is returned; if it comes after, a positive value is returned. Any non-zero value means there is a mismatch.

