Is using the predefined CORRECT_UNICODE etc. intended in printFeedback()?

Hey,

I have realized that I could use the predefined CORRECT_UNICODE etc. for printing the feedback.
Now, these are technically strings, so I would need to store them in a char* arr[] and later print them with %s. Does that even work, or will I have to worry about some strange conversions?
Also: Is this even intended? Right now I do not see where else these might be useful…?

Doing it like this, I hope to avoid having to include wchar.h and whatnot.

Thanks in advance!

Yes, that should work, unless you do something weird with these strings.

It seems to me that this makes weird encoding errors much harder to run into than using wchar.h would.
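For what it's worth, a minimal sketch of that approach (the #define lines are hypothetical stand-ins — the real definitions come from the provided header — and this assumes the compiler's execution charset is UTF-8, which is the gcc/clang default):

#include <stdio.h>

/* Hypothetical stand-ins for the provided macros: */
#define CORRECT_UNICODE "\U0001F7E9" /* green square */
#define WRONG_UNICODE   "\U00002B1B" /* black square */

int main(void) {
    const char *feedback[] = { CORRECT_UNICODE, WRONG_UNICODE, CORRECT_UNICODE };
    for (int i = 0; i < 3; i++)
        printf("%s", feedback[i]); /* %s just copies the raw UTF-8 bytes to stdout */
    printf("\n");
    return 0;
}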

1 Like

Yes indeed; still, storing the Unicode chars in an array and printing them to stdout with printf() does not work for me. It gives me funny test results and splits each Unicode char into several separate characters on stdout.

Am I supposed to be using something like #include <locale.h> with setlocale(LC_ALL, ""); and later printing with printf(L"%c", arr[i]); ?

Or, a more general question: Is there a solution that doesn't require fancy includes? If it is a single function, I think finding it would be like looking for a needle in a haystack. But I doubt that.

An answer to this would be great, because I honestly do not want to spend too much time on the tests for printing results.^^

Thanks in advance!

EDIT: I will try this once more with the format specifier "%c"; maybe "%s" leads to the splitting while "%c" would keep it intact. I also imagine that there might be other format specifiers for stdout which are more suitable, like "%lc". This might even replace the usage of locale.h with printf(L"%c", somechar);
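For reference, here is my understanding of what the wide-character route would look like (untested sketch; as far as I can tell, printf() cannot take a wide format string at all — it would have to be wprintf() with %lc, plus a setlocale() call):

#include <locale.h>
#include <wchar.h>

int main(void) {
    setlocale(LC_ALL, "");          /* pick up the terminal's encoding from the environment */
    wchar_t square = L'\U0001F7E8'; /* yellow square; assumes a 32-bit wchar_t as on Linux */
    wprintf(L"%lc\n", square);      /* %lc converts the wide char back to multibyte output */
    return 0;
}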

Not really, no. UTF-8 support is basically nonexistent in C, since support for most anything is nonexistent in C. We require you to print these emojis UTF-8-encoded because this is the “standard” (by convention) way of exchanging Unicode text, and it is backward-compatible with ASCII.

However, Unicode defines so many characters that 16 bits are not even sufficient. To represent a Unicode character (the technical name for this is “code point”), you need at least 21 bits. UTF-8 then encodes such a 21-bit value into one to four 8-bit chunks. See UTF-8 - Wikipedia for more info.

In principle, you can write a C function that UTF-8-encodes a code point into a char sequence. It’s not even that difficult 🙂
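A sketch of such an encoder, just to illustrate the bit fiddling (not needed for the assignment, and it glosses over details like rejecting surrogate code points):

#include <stdint.h>
#include <stdio.h>

/* Encode one code point as UTF-8 into buf (at least 5 bytes), NUL-terminated.
   Returns the number of bytes written, or 0 for an out-of-range code point. */
static int utf8_encode(uint32_t cp, char buf[5]) {
    if (cp <= 0x7F) {            /* 1 byte:  0xxxxxxx */
        buf[0] = (char)cp;
        buf[1] = '\0';
        return 1;
    }
    if (cp <= 0x7FF) {           /* 2 bytes: 110xxxxx 10xxxxxx */
        buf[0] = (char)(0xC0 | (cp >> 6));
        buf[1] = (char)(0x80 | (cp & 0x3F));
        buf[2] = '\0';
        return 2;
    }
    if (cp <= 0xFFFF) {          /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        buf[0] = (char)(0xE0 | (cp >> 12));
        buf[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (char)(0x80 | (cp & 0x3F));
        buf[3] = '\0';
        return 3;
    }
    if (cp <= 0x10FFFF) {        /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        buf[0] = (char)(0xF0 | (cp >> 18));
        buf[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (char)(0x80 | (cp & 0x3F));
        buf[4] = '\0';
        return 4;
    }
    return 0;
}

int main(void) {
    char buf[5];
    utf8_encode(0x1F7E8, buf); /* U+1F7E8, the yellow square */
    printf("%s\n", buf);
    return 0;
}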

In practice, you only have 5 different Unicode chars, so we have already given you these strings as the macros CORRECT_UNICODE and so on.

Hope this helps!

1 Like

Yes, this also avoids getting into trouble by setting a different locale and then not finding the way back to the original one. I will therefore skip my findings on setting and resetting locales, and just share a simple script which displays the yellow square in the terminal:
[screenshot: unicode_stuff_simplified]
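In case the screenshot does not render, this is roughly what it boils down to (the hex bytes F0 9F 9F A8 are the UTF-8 encoding of U+1F7E8, taken from the link below):

#include <stdio.h>

int main(void) {
    /* the four UTF-8 bytes of U+1F7E8, written out as hex escapes */
    const char yellow_square[] = "\xF0\x9F\x9F\xA8";
    printf("%s\n", yellow_square);
    return 0;
}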

This is the hex representation of the yellow square. One can also find the hex values here, under “UTF-8 encoding”: „🟨“ U+1F7E8 large yellow square Unicode character (yellow square).
Note: From reading several posts, I gather that this approach might still not be supported on Windows 10; Windows has its own direct Unicode support in its terminal, which is very likely quite different from the support that the test server has?

I will go for the provided macros, but if the tests fail, I’ll switch to hex and report here whether it worked.
Thanks for the directions, Johannes.

Your program should output UTF-8. Whether the Windows terminal is able to handle it or not does not affect your program’s correctness. Also, don’t use Windows lol. We provide the VM for a reason.

1 Like

Just informing some people who do. 😃

Hey,

In short:
printf("%s", WRONG_UNICODE) or printf("\U00002B1B")
do the job.
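Both rely on the compiler emitting UTF-8 for the string literal, which gcc and clang do by default. A complete toy example (the #define is a hypothetical stand-in for the macro from the provided header):

#include <stdio.h>

/* hypothetical stand-in for the provided macro */
#define WRONG_UNICODE "\U00002B1B"

int main(void) {
    printf("%s\n", WRONG_UNICODE); /* prints the macro's UTF-8 bytes */
    printf("\U00002B1B\n");        /* same bytes via a universal character name */
    return 0;
}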

Best,
Tim

2 Likes