
Union basics - Deep Dive

Overview - Union basics
What is it?
A union in C is a special data type that allows storing different data types in the same memory location. Unlike a struct, where each member has its own space, a union shares the same space for all its members. This means only one member can hold a value at a time. Unions help save memory when you need to work with different types but never at the same time.
Why it matters
Unions exist to save memory and allow flexible data handling in low-level programming. Without unions, programs would waste memory by allocating space for all possible data types even if only one is used at a time. This is especially important in embedded systems or performance-critical applications where memory is limited.
Where it fits
Before learning unions, you should understand basic C data types and structs. After unions, you can explore advanced memory management, bit fields, and type punning techniques.
Mental Model
Core Idea
A union is like a single box that can hold one of many different items, but only one item at a time, sharing the same space.
Think of it like...
Imagine a single parking spot that can fit either a car, a motorcycle, or a bicycle, but only one vehicle can park there at once. The spot is reused depending on the vehicle type.
┌───────────────┐
│   Union Box   │
│ ┌───────────┐ │
│ │ Member A  │ │
│ │ (int)     │ │
│ ├───────────┤ │
│ │ Member B  │ │
│ │ (float)   │ │
│ ├───────────┤ │
│ │ Member C  │ │
│ │ (char[4]) │ │
│ └───────────┘ │
└───────────────┘
All members share this same box space.
Build-Up - 7 Steps
1
Foundation: What is a Union in C
Concept: Introduces the union keyword and basic syntax.
In C, a union is declared using the keyword union followed by a name and curly braces containing member declarations. For example:
    union Data { int i; float f; char str[4]; };
This defines a union named Data with three members: an int, a float, and a char array.
Result
You can now declare variables of type union Data that can hold an int, float, or char array, but only one at a time.
Understanding the syntax is the first step to using unions effectively.
2
Foundation: Memory Sharing in Unions
Concept: Explains how all members share the same memory space.
Unlike structs, where each member has its own memory, all union members share the same memory location. The size of the union is at least the size of its largest member (the compiler may round it up for alignment). For example, if int is 4 bytes, float is 4 bytes, and char[4] is 4 bytes, the union size is 4 bytes. This means writing to one member overwrites the others.
Result
The union variable uses memory equal to its largest member, saving space compared to structs.
Knowing that union members overlap in memory helps avoid bugs when reading or writing values.
3
Intermediate: Accessing Union Members Safely
Before reading on: If you write to one union member and then read another, do you expect the original value or something else? Commit to your answer.
Concept: Shows how writing to one member affects others and the importance of reading the correct member.
When you assign a value to one union member, the bits in the shared memory change. Reading a different member interprets those bits according to that member's type, often leading to unexpected results. Example:
    union Data d;
    d.i = 65;
    printf("%c", d.str[0]); // Prints 'A' on a little-endian ASCII system, where the low byte (65) is stored first
Reading a member other than the one last written is called type punning; it can be used intentionally or cause bugs.
Result
Reading a different member than the one last written can produce surprising outputs.
Understanding this behavior is crucial for using unions correctly and safely.
4
Intermediate: Using Unions for Type Punning
Before reading on: Do you think unions can be used to convert data types without extra functions? Commit to yes or no.
Concept: Demonstrates how unions can reinterpret data bits as different types.
Type punning means treating the same bits of memory as different types. Unions allow this by writing to one member and reading from another. Example (assuming a 32-bit int and IEEE 754 single-precision float):
    union { int i; float f; } u;
    u.i = 0x3f800000; // Bit pattern for float 1.0
    printf("%f", u.f); // Prints 1.000000
This technique is used in low-level programming for performance or hardware access.
Result
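You can reinterpret the bits of one type as another by sharing memory, without copying or casting. This reinterprets the representation rather than converting the value: storing the int 1 does not make the float member read as 1.0.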
Knowing unions enable type punning reveals their power and risks in systems programming.
5
Intermediate: Unions Inside Structs
Concept: Shows how unions can be combined with structs for flexible data layouts.
You can place a union inside a struct to create a data structure that holds different types but also other fixed fields. Example:
    struct Packet { int type; union { int i; float f; } data; };
This allows the packet to store a type indicator and a value that can be int or float.
Result
Combining unions and structs creates versatile data containers.
Understanding this pattern helps design efficient and clear data models.
6
Advanced: Alignment and Padding in Unions
Before reading on: Do you think union size is always exactly the largest member size? Commit to yes or no.
Concept: Explains how memory alignment and padding affect union size.
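The size of a union is at least the size of its largest member, but may be larger due to alignment requirements. Some types need to be stored at specific memory addresses (aligned), and a union must be aligned as strictly as its most strictly aligned member, so its size is rounded up to a multiple of that alignment and padding bytes may be added. Example: union U { char s[9]; double d; }; The largest member, s, is 9 bytes, but on many systems double requires 8-byte alignment, so sizeof(union U) is typically 16. By contrast, a union of just char c and double d is 8 bytes on such systems, which is simply the size of double and involves no extra padding.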
Result
Union size can be larger than the largest member due to alignment rules.
Knowing alignment effects prevents memory layout bugs and helps optimize data structures.
7
Expert: Unions and Undefined Behavior Risks
Before reading on: Is it always safe to read a different union member than the one last written? Commit to yes or no.
Concept: Discusses the risks of undefined behavior when accessing inactive union members.
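In C (C99 with its later corrections, and C11 onward), reading a union member other than the one most recently written is not undefined by itself: the stored bytes are reinterpreted as the type of the member being read. The result still depends on the implementation's sizes, byte order, and number formats. The reinterpreted bytes may form a trap representation for the new type, in which case the behavior is undefined, and any bytes beyond the size of the last-written member have unspecified values. In C++, reading an inactive member is undefined behavior, so code shared between the two languages cannot rely on union punning. Both languages make a special guarantee for inspecting the common initial sequence of structs that share a union. Relying on type punning via unions is therefore risky and non-portable; safer alternatives include memcpy or standard conversion functions.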
Result
Misusing unions can cause unpredictable program behavior and bugs.
Understanding the language rules about unions helps write safe, portable code and avoid subtle errors.
Under the Hood
At runtime, a union allocates a single memory block sized to fit its largest member. All members share this block, so writing to one member overwrites the bits of others. The compiler manages this memory layout and alignment. Accessing a member reads the bits in that shared space, interpreting them according to the member's type.
Why designed this way?
Unions were designed to save memory and allow flexible data representation in low-level programming. Early computers had limited memory, so sharing space for mutually exclusive data was efficient. Alternatives like structs waste memory by allocating space for all members. The tradeoff is that only one member can be valid at a time, requiring careful use.
┌─────────────────────────────┐
│          Union Memory        │
│ ┌───────────────┐           │
│ │ Shared Memory  │<---------┤
│ │ (largest size) │           │
│ └───────────────┘           │
│  ↑           ↑             │
│  │           │             │
│ Member A   Member B         │
│ (int)     (float)           │
│  │           │             │
│  └───────────┘             │
│ All members overlap here    │
└─────────────────────────────┘
Myth Busters - 3 Common Misconceptions
Quick: Does writing to one union member and reading another always give the original value? Commit yes or no.
Common Belief: People often believe that writing to one union member and reading another will give meaningful or original data.
Reality: Reading a different member than the one last written interprets the same bits differently, often producing garbage or unexpected values.
Why it matters: This misconception leads to bugs and crashes when programmers assume union members hold independent values.
Quick: Is the size of a union always the sum of its members' sizes? Commit yes or no.
Common Belief: Some think the union size is the total of all member sizes added together.
Reality: The union size is the size of its largest member, possibly rounded up for alignment, not the sum, because all members share the same memory space.
Why it matters: Misunderstanding size leads to incorrect memory allocation and buffer overflows.
Quick: Is it always safe and portable to use unions for type punning? Commit yes or no.
Common Belief: Many believe unions are a safe, standard way to convert between types by reading different members.
Reality: Modern C defines the read as a reinterpretation of the stored bytes, but the result depends on sizes, byte order, and number formats, and it can be a trap representation. In C++ the same code is undefined behavior.
Why it matters: Relying on this can cause subtle bugs that appear only on some systems or compiler versions.
Expert Zone
1
Some compilers, GCC among them, document type punning via unions as supported even where the language standard in use does not guarantee it (for example in C++), but relying on such extensions reduces portability.
2
Unions can be used with anonymous members in modern C to simplify syntax and improve code clarity.
3
Using unions with bit fields allows compact representation of flags and small data, but requires careful attention to alignment and endianness.
When NOT to use
Avoid unions when you need to store multiple values simultaneously; use structs instead. For safe type conversions, prefer memcpy or standard conversion functions. In high-level code, unions can reduce readability and increase maintenance complexity.
Production Patterns
Unions are widely used in embedded systems for hardware register access, protocol parsing where data formats vary, and memory-efficient variant types. They often appear inside structs with a tag field indicating the active member, implementing tagged unions or discriminated unions.
Connections
Tagged Unions (Discriminated Unions)
Builds-on
Tagged unions combine a union with a tag field to safely track which member is active, preventing misuse and undefined behavior.
Memory Management
Same pattern
Unions illustrate memory reuse and efficient allocation, key ideas in managing limited resources in programming and systems design.
Quantum Superposition (Physics)
Analogous concept
Like a union holding multiple possible states but only one actual at a time, quantum superposition holds multiple possibilities until measured, showing how one system can represent multiple states in different ways.
Common Pitfalls
#1 Reading a union member different from the one last written, expecting meaningful data.
Wrong approach:
    union Data d;
    d.i = 42;
    printf("%f", d.f); // Wrong: reading float after writing int
Correct approach:
    union Data d;
    d.i = 42;
    printf("%d", d.i); // Correct: read the same member written
Root cause: Not realizing that union members share memory, so the float read reinterprets the int's bits.
#2 Assuming union size is the sum of all members, leading to wrong memory allocation.
Wrong approach:
    union Data { int i; double d; };
    // Assuming size is sizeof(int) + sizeof(double)
    char buffer[sizeof(int) + sizeof(double)]; // Wrong
Correct approach:
    union Data { int i; double d; };
    char buffer[sizeof(union Data)]; // Correct: sizeof covers the largest member plus any padding
Root cause: Confusing union memory layout with struct layout.
#3 Using unions for type punning without considering portability and the language rules.
Wrong approach:
    union { int i; float f; } u;
    u.i = 0x3f800000;
    printf("%f", u.f); // Risky: depends on int size and float format, and is undefined behavior in C++
Correct approach:
    int i = 0x3f800000;
    float f;
    memcpy(&f, &i, sizeof(f)); // Requires <string.h> and sizeof(int) == sizeof(float)
    printf("%f", f); // Well-defined in both C and C++
Root cause: Ignoring the language rules and representation assumptions behind accessing inactive union members.
Key Takeaways
Unions allow different data types to share the same memory space, saving memory when only one type is needed at a time.
Only one union member can hold a valid value at once; writing to one member overwrites others.
Reading a different member than the one last written reinterprets the stored bytes, which can give unexpected, implementation-dependent results, and is undefined behavior in C++.
Union size equals the largest member size, possibly increased by alignment requirements.
Unions are powerful for low-level programming but require careful use to avoid bugs and maintain portability.