Characters and strings

From Applied Science

Latest revision as of 00:56, 22 January 2025

It is common practice to teach characters only after teaching how to work with numbers and variables, because characters are represented by numeric codes. One fundamental concept lies behind this: deep down, the computer works by performing elementary logic operations. Depending on the person, that is either very hard or very easy to grasp, since it is unnatural to think of words in terms of logical operations.

The algorithms that deal with characters are the same ones that deal with numbers; the way of thinking does not change. It is therefore recommended to study and practise numeric algorithms with care. To work with words, that is, with more than one character at a time, the concept of arrays is required. Consequently, a difficulty with strings may really be a difficulty with understanding arrays.

There are compatibility issues between programs running under different operating systems and different languages, but that is not something to worry about in this course. That sort of problem is more technical than theoretical, because it depends on the language used and the libraries available.

The basic logic is: inside the computer, characters are nothing more than numeric codes, so working with them means working with numbers.

Errors of logic:

  • If there are errors with the characters, the error is in the algorithm's arithmetic operations.


What comes below requires knowledge about arrays


  • A simple program that makes use of characters:
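#include <stdio.h>

int main() {
 char here;           /* plain char: may be signed or unsigned */
 unsigned char there; /* its codes are never negative */
 int number, number2;

 printf("Type two characters: ");
 /* The blank spaces in the format skip blanks and line breaks. */
 if (scanf(" %c %c", &here, &there) != 2) { return 1; }

 /* Assigning a char to an int copies its numeric code. */
 number = here;
 number2 = there;

 /* Each code is printed next to the character it belongs to. */
 printf("\nYou typed %c (code %d, stored as char) and %c (code %d, stored as unsigned char)\n", here, number, there, number2);
 return 0;
}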

Arithmetic operations. It does not make sense to add or subtract characters as such. Arithmetic operations done with characters are, in fact, operations done with the numeric codes that represent those characters. If the result of the operation is a numeric code outside the ASCII table, the output is a code that does not represent any ASCII character.

Signed char vs Unsigned char. The difference lies in the range of numeric codes, not in the ASCII table itself, which only uses the codes 0 to 127. With 8-bit chars, a signed char holds values from -128 to 127 and an unsigned char holds values from 0 to 255. The difference is not important in this course. It's just a matter of numeric codes.

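The blank space before the %c. That is a technical detail of the scanf() function: a blank space in the format tells scanf() to skip any blanks, tabs and line breaks before reading the next character. Without it, %c would store the line break left behind by the Enter key.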


  • A program that counts how many times a letter shows up in a phrase and how many words the phrase has:
#include <stdio.h>

int main() {
 char phrase[100], letter;
 int i, count=0, count2=0;

 printf("Type a phrase: ");

 /* Read at most 99 characters, so the null char still fits in the array. */
 if (scanf("%99[^\n]", phrase) != 1) { return 1; }
 printf("\nPhrase: %s", phrase);

 printf("\nChoose a character to count: ");
 if (scanf(" %c", &letter) != 1) { return 1; }

 for(i=0; phrase[i]!=0; i++) {
  /* Count the occurrences of the chosen character. */
  if (phrase[i] == letter) { count++; }
  /* A word ends where the next position is a blank space or the null char. */
  if (phrase[i+1] == ' ' || phrase[i+1] == '\0') { count2++; }
 }

 printf("The character '%c' has been found %d times. The phrase contains %d words.\n", letter, count, count2);
 return 0;
}

Characters and word count. Notice that the algorithm compares the ASCII codes stored in the array with the one stored in the variable. Both have the same type, char. The word counter relies on every word being separated from the next by exactly one blank space. The last word of the phrase ends with the null char, which also indicates the end of the sequence.

The notation %[^\n] in the scanf() function. That is a technical detail of scanf() called a scanset. It resembles the bracket notation of regular expressions, so searching for "regular expressions" and checking the ASCII table helps to understand it. What the expression says is "store every character, spaces included, until the end of the line". Writing a maximum width, as in %99[^\n], prevents the input from overflowing the array. Since the emphasis of an introductory course is on algorithms and not on the language, the teacher can omit that sort of technical detail in favour of the logic.

Sequence ends with 0. The array must have an extra position, because the last one must hold the null char. Great care must be taken not to confuse the null char, whose ASCII code is 0 and whose notation between single quotation marks is '\0', with the digit 0. The ASCII code of the digit 0, written '0', is 48. Printing the null char produces nothing on screen.