Understanding fgets

Hi,

I’m trying to understand how fgets works to read words from a file. Say we have this dict.txt file:

hotel
crane
zymic

and I try to read the words from the file as such, while keeping track of the number of words using a counter:

int counter = 0;
char* word = malloc(6);
FILE* file = fopen("dict.txt", "r");
while (fgets(word, 6, file) != NULL) {
    printf("%s", word);
    counter++;
}
printf("%d", counter);

All words are printed correctly, but for some reason counter = 5, even though the file contains only 3 words. In general, it seems to be counter = 2*(number of words in file) - 1. Hence, somehow fgets reads each word twice except for one word. Does it perhaps read the newline characters \n separately, which each word has except for the last one? How could I resolve that issue? I tried replacing '\n' with '\0' within the while-loop but that didn’t help. When the words are listed in the file as hotelcranezymic (i.e. without newlines), the counter is correct.

Based on the documentation on what fgets does, it should read 6 characters into “word” (including the terminator). Each word in the file has 6 characters: the 5 letters plus a terminator (a newline for the first two words and a zero terminator for the last word, or does the last word have no terminator?). I’m not sure why fgets isn’t able to read those 6 characters or how I can find out what it’s doing.

If I change fgets(word, 6, file) to fgets(word, 7, file), then the counter is correct. However, then I would have an overflow since I only allocated 6 “space units” to word.

Hi,

Yes. fgets reads at most your given number of chars - 1 (6-1=5) and adds a '\0' after the last char it read. It stops early if it reaches a '\n' (which it also stores in the buffer) or the end of the file. Since your buffer only has room for exactly the letters of your word, fgets stops after the 5 letters, and the remaining '\n' is still in the file and gets read on its own in the next iteration. That is why you count 5 reads: "hotel", "\n", "crane", "\n", "zymic".

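'\n' does not terminate a string; fgets only uses it to know where the line ends. A string in C always ends with a '\0', but that '\0' is not stored in the file: fgets adds it to the buffer after the chars it read. If the last word in your file has no '\n' after it, it has no terminator in the file at all, and fgets simply stops at the end of the file.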

Well, if you want to fix it fast you could just allocate 7 “space units” instead and pass 7 to fgets, so that the 5 letters, the '\n' and the '\0' all fit. But make sure that you don’t save the '\n' as the last character of the word in the trie.
