C Tutorial - Strings and Text Handling

By , About.com Guide

Finishing Off Example 5
At the end of this function, head points to the newly created struct which is populated with a copy of the word, a count of 1 and a null pointer marking the new end of the list.

Finally, the DumpList() function outputs all words that occur ten or more times. You can see all of the words by commenting out the line with the if statement. This function also frees all allocated memory, both the words and the structures, and returns a count of how many unique words were found.
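A minimal sketch of a function that behaves as described above may help. It is not the tutorial's own code: the struct layout, its field names (word, count, next), the name DumpListSketch() and the threshold parameter are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node layout; the tutorial's real struct may differ. */
struct wordnode {
    char *word;             /* heap-allocated copy of the word */
    int count;              /* number of times the word was seen */
    struct wordnode *next;  /* next node, or NULL at the end of the list */
};

/* Print every word seen at least `threshold` times, free each node and
   its word, and return the number of nodes (unique words) in the list. */
int DumpListSketch(struct wordnode *head, int threshold)
{
    int unique = 0;

    while (head != NULL) {
        struct wordnode *next = head->next;  /* save the link before freeing */

        if (head->count >= threshold)        /* comment this line out to print every word */
            printf("%s occurs %d times\n", head->word, head->count);

        free(head->word);                    /* free the word first... */
        free(head);                          /* ...then the struct that held it */
        head = next;
        unique++;
    }
    return unique;
}
```

The point to notice is the order of operations: the next pointer is saved before the node is freed, because reading head->next after free(head) would be undefined behaviour.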

Using the changelog.txt file from Tortoise (the Subversion GUI) as input produced these results in under a second. It built up a linked list with 2,106 structs, calling addword() 38,815 times.

File size =132097 bytes
Number of Words found = 38815
Number of unique words = 2106

Other Str Functions

In these definitions, string is a synonym for char *.
  • strchr() - Find the first occurrence of a character in a string.
  • strrchr() - Find the last occurrence of a character in a string, searching from the end.
  • strcspn() - Count the characters at the start of a string that do not appear in another string.
  • strpbrk() - Find the first character in a string that also appears in another string.
  • strspn() - Count the characters at the start of a string that all appear in another string. The result is the index of the first element that doesn't match.
  • strstr() - Find a string in another.
  • strtok() - Split a string into tokens separated by delimiter characters. It remembers its position between calls, so it can be called repeatedly.
  • strxfrm() - Transform a string according to the locale.
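To make the differences concrete, here is a short sketch that wraps some of the functions listed above in small helpers. The helper names (leading_digits, field_length, base_name, first_vowel, count_occurrences) are invented for this example; only the str functions themselves come from the standard library.

```c
#include <stddef.h>
#include <string.h>

/* strspn: length of the leading run of decimal digits in s. */
size_t leading_digits(const char *s)
{
    return strspn(s, "0123456789");
}

/* strcspn: length of s up to, but not including, the first ',' or ';'.
   If neither occurs, this is the length of the whole string. */
size_t field_length(const char *s)
{
    return strcspn(s, ",;");
}

/* strrchr: pointer to the part of a path after the last '/'. */
const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* strpbrk: offset of the first vowel in s, or -1 if there is none. */
long first_vowel(const char *s)
{
    const char *p = strpbrk(s, "aeiouAEIOU");
    return p ? (long)(p - s) : -1;
}

/* strstr: number of non-overlapping occurrences of needle in haystack. */
int count_occurrences(const char *haystack, const char *needle)
{
    size_t len = strlen(needle);
    int n = 0;

    if (len == 0)
        return 0;   /* an empty needle would otherwise match at every position */
    for (const char *p = strstr(haystack, needle); p != NULL;
         p = strstr(p + len, needle))
        n++;
    return n;
}
```

Note that strspn() and strcspn() return a length, while strrchr(), strpbrk() and strstr() return a pointer into the original string (or NULL when nothing is found), so the result always has to be checked before use.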
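Why so many string search functions? Each one searches in a slightly different way: for a single character, for any character from a set, or for a whole substring. Some, such as strchr() and strrchr(), include the terminating '\0' in the searched text, though strstr() doesn't. So far I've never written anything that needed any of these, but the most useful is possibly strtok() if you have to do repeated searches through one string, such as splitting it into words.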
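As an illustration of how strtok() carries its position from one call to the next, here is a small sketch that splits a line into words. The function name split_words() and its parameters are assumptions for this example, not part of the tutorial's code.

```c
#include <string.h>

/* Split `line` in place on spaces, tabs and newlines. Pointers to at most
   `max` tokens are stored in `out`; the number stored is returned.
   strtok() overwrites delimiters in `line` with '\0', so `line` must be
   a writable array, not a string literal. */
int split_words(char *line, char *out[], int max)
{
    const char *delims = " \t\n";
    int n = 0;
    char *tok = strtok(line, delims);   /* first call: pass the string itself */

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, delims);     /* later calls: NULL means "carry on" */
    }
    return n;
}
```

Calling it with a buffer declared as char line[] = "find  the\tnext word\n"; stores four pointers: "find", "the", "next" and "word". Runs of delimiters are treated as one, so the double space produces no empty token. Because strtok() keeps its state in a hidden static variable, two tokenizing loops cannot be interleaved, and it is not safe to use from several threads at once.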

Conclusion

C is probably not the best programming language for text processing, as strings are handled through pointers and buffers that you have to manage yourself. It is all too easy to introduce bugs, such as writing past the end of a buffer, if you're not careful!

That completes this tutorial. The next one is on file handling.

©2012 About.com. All rights reserved.

A part of The New York Times Company.