
Char type and Unicode behavior in C Sharp (C#) - Step-by-Step Execution

Concept Flow - Char type and Unicode behavior
1. Declare char variable
2. Assign Unicode character
3. Store UTF-16 code unit
4. Use char in operations
5. Output char or code
6. Handle surrogate pairs if needed
7. End
This flow shows how a char variable holds exactly one UTF-16 code unit. A character in the Basic Multilingual Plane (BMP) fits in a single char, while a character outside the BMP needs two chars, a high and a low surrogate, that together form a surrogate pair.
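The "store UTF-16 code unit" step can be made visible by converting a char to its numeric value. This is a minimal sketch; the class name CharBasics and the helper CodeUnit are hypothetical names chosen for illustration, not part of the original sample.

```csharp
using System;

// Hypothetical demo class, not part of the original sample.
public static class CharBasics
{
    // A char converts implicitly to int, which exposes its UTF-16 code unit.
    public static int CodeUnit(char c) => c;

    public static void Main()
    {
        char letter = 'A';
        Console.WriteLine(sizeof(char));                // 2 (bytes, i.e. 16 bits)
        Console.WriteLine(CodeUnit(letter));            // 65
        Console.WriteLine($"U+{CodeUnit(letter):X4}");  // U+0041
        Console.WriteLine(char.IsSurrogate('\uD83D'));  // True
    }
}
```

The last line uses char.IsSurrogate to check whether a code unit is one half of a surrogate pair rather than a complete character.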
Execution Sample
C Sharp (C#)
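using System;

char letter = 'A';
char emoji = '\uD83D';   // high surrogate of U+1F600
char emoji2 = '\uDE00';  // low surrogate of U+1F600
Console.WriteLine(letter);
Console.WriteLine(emoji);
Console.WriteLine(emoji2);
// Join both halves into a string; emoji + emoji2 would add their numeric values instead.
string emojiFull = new string(new[] { emoji, emoji2 });
Console.WriteLine(emojiFull);
This code assigns a letter and the two surrogate halves of an emoji to char variables, prints each one, then joins the halves into a string and prints the complete emoji.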
Execution Table
| Step | Action | Variable | Value | Output |
|------|--------|----------|-------|--------|
| 1 | Declare char letter and assign 'A' | letter | 'A' (U+0041) | |
| 2 | Declare char emoji and assign high surrogate '\uD83D' | emoji | '\uD83D' (U+D83D) | |
| 3 | Declare char emoji2 and assign low surrogate '\uDE00' | emoji2 | '\uDE00' (U+DE00) | |
| 4 | Print letter | letter | 'A' (U+0041) | A |
| 5 | Print emoji | emoji | '\uD83D' (U+D83D) | � (may show as a replacement character such as � or ?) |
| 6 | Print emoji2 | emoji2 | '\uDE00' (U+DE00) | � (may show as a replacement character such as � or ?) |
| 7 | Join emoji and emoji2 into a string (not with +, which adds two chars as integers) | emojiFull | "\uD83D\uDE00" | |
| 8 | Print emojiFull | emojiFull | "\uD83D\uDE00" | 😀 |
💡 End of program after printing characters and combined emoji string.
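The step that joins the two surrogate halves deserves care, because applying + to two char values performs integer addition rather than concatenation. The following sketch contrasts the two; the class name SurrogateCombine and its methods are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical demo class, not part of the original sample.
public static class SurrogateCombine
{
    // char + char is integer addition in C#, so the result is a number, not text.
    public static int AddAsNumbers(char high, char low) => high + low;

    // Builds a string of two code units from the surrogate halves.
    public static string Combine(char high, char low) => new string(new[] { high, low });

    public static void Main()
    {
        char emoji = '\uD83D';
        char emoji2 = '\uDE00';

        Console.WriteLine(AddAsNumbers(emoji, emoji2));   // 112189 (55357 + 56832)

        string emojiFull = Combine(emoji, emoji2);
        Console.WriteLine(emojiFull.Length);              // 2
        Console.WriteLine(emojiFull == "\uD83D\uDE00");   // True
    }
}
```

Other ways to build the string, such as string interpolation or calling ToString() on the first char before using +, give the same result.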
Variable Tracker
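| Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 7 | Final |
|----------|-------|--------------|--------------|--------------|--------------|-------|
| letter | undefined | 'A' (U+0041) | 'A' (U+0041) | 'A' (U+0041) | 'A' (U+0041) | 'A' (U+0041) |
| emoji | undefined | undefined | '\uD83D' (U+D83D) | '\uD83D' (U+D83D) | '\uD83D' (U+D83D) | '\uD83D' (U+D83D) |
| emoji2 | undefined | undefined | undefined | '\uDE00' (U+DE00) | '\uDE00' (U+DE00) | '\uDE00' (U+DE00) |
| emojiFull | undefined | undefined | undefined | undefined | "\uD83D\uDE00" | "\uD83D\uDE00" |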
Key Moments - 3 Insights
Why does printing the individual surrogate chars show strange symbols?
Each char holds only one UTF-16 code unit. A single surrogate half (rows 5 and 6) is not a valid character on its own, so the console typically substitutes a replacement symbol when it is printed.
How do we print an emoji that requires two chars?
Combine the high and low surrogate chars into a string (row 7) and print that string (row 8) to display the full emoji correctly.
Is a C# char a full Unicode character?
No, a char is a 16-bit UTF-16 code unit. Characters outside the Basic Multilingual Plane (BMP) need two chars (a surrogate pair) to represent one Unicode character.
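The relationship between a surrogate pair and the single Unicode code point it encodes can be checked with static methods on System.Char. This is a sketch under the assumption that returning -1 for an invalid pair is acceptable; the class name CodePoints and the helper ToCodePoint are hypothetical.

```csharp
using System;

// Hypothetical demo class, not part of the original sample.
public static class CodePoints
{
    // Returns the code point encoded by a valid surrogate pair,
    // or -1 (an illustrative convention) if the two chars do not form one.
    public static int ToCodePoint(char high, char low) =>
        char.IsSurrogatePair(high, low) ? char.ConvertToUtf32(high, low) : -1;

    public static void Main()
    {
        Console.WriteLine(char.IsHighSurrogate('\uD83D'));  // True
        Console.WriteLine(char.IsLowSurrogate('\uDE00'));   // True

        int codePoint = ToCodePoint('\uD83D', '\uDE00');
        Console.WriteLine($"U+{codePoint:X}");              // U+1F600

        // The reverse direction: one code point becomes a string of two chars.
        string fromCodePoint = char.ConvertFromUtf32(0x1F600);
        Console.WriteLine(fromCodePoint.Length);            // 2
    }
}
```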
Visual Quiz - 3 Questions
Test your understanding
1. In the execution table, what is the value of 'letter' after step 1?
A. 'A' (U+0041)
B. '\uD83D' (U+D83D)
C. '\uDE00' (U+DE00)
D. undefined
💡 Hint: Check the 'Value' column for 'letter' at step 1 in the execution table.
2. At which step does the program print the full emoji correctly?
A. Step 5
B. Step 6
C. Step 8
D. Step 4
💡 Hint: Look for the step where 'emojiFull' is printed in the execution table.
3. If we print only 'emoji' (high surrogate) without 'emoji2', what happens?
A. It prints the full emoji
B. It prints a replacement character or strange symbol
C. It causes a compile error
D. It prints nothing
💡 Hint: See the output at step 5 in the execution table for printing 'emoji' alone.
Concept Snapshot
A char in C# holds a single UTF-16 code unit (16 bits).
Unicode characters outside the Basic Multilingual Plane (BMP) need two chars (a surrogate pair).
Printing a single surrogate char on its own typically shows a replacement symbol.
Combine both surrogate halves in a string to display the full Unicode character.
Use '\uXXXX' syntax to assign a Unicode code unit to a char.
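One practical consequence of these points is that a string's Length counts code units, not Unicode characters. The sketch below counts code points by treating each valid surrogate pair as one; the class name TextLength and the method CountCodePoints are hypothetical, and the loop is only one possible way to do this.

```csharp
using System;

// Hypothetical demo class, not part of the original sample.
public static class TextLength
{
    // Counts Unicode code points: a valid surrogate pair counts once,
    // and every other char (including a lone surrogate) counts once.
    public static int CountCodePoints(string text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            // When a valid pair starts at i, skip its low half.
            if (char.IsSurrogatePair(text, i))
            {
                i++;
            }
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string text = "A\uD83D\uDE00";                // 'A' followed by U+1F600
        Console.WriteLine(text.Length);               // 3 (code units)
        Console.WriteLine(CountCodePoints(text));     // 2 (code points)
    }
}
```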
Full Transcript
In C#, the char type stores a single 16-bit UTF-16 code unit. Characters in the Basic Multilingual Plane, such as 'A', fit in one char, but characters outside it, including most emoji, need two chars that together form a surrogate pair. Printing a single surrogate half may show a replacement symbol because one half is not a complete character. To display the full emoji, combine the two surrogate chars into a string and print that string. This example declares chars for 'A' and for an emoji's two surrogate halves, then prints them individually and combined. Remember that a char is a 16-bit code unit, not necessarily a full Unicode character.