Data Structures Theoryknowledge~6 mins

Minimum spanning tree (Prim's) in Data Structures Theory - Full Explanation

Choose your learning style9 modes available

Learn Why Deep Visual Practice Challenge Project Recall Time

Introduction

Imagine you want to connect several cities with roads so that all cities are reachable, but you want to spend the least amount of money building these roads. Finding the cheapest way to connect all points without any loops is the problem that minimum spanning trees solve.

Explanation

Graph and Spanning Tree

A graph is a collection of points called vertices connected by lines called edges. A spanning tree is a subset of these edges that connects all vertices without forming any loops. It ensures every point is reachable from any other point.

A spanning tree connects all points with no cycles.

Minimum Spanning Tree (MST)

Among all possible spanning trees, the minimum spanning tree is the one where the total cost or weight of the edges is as small as possible. This means it uses the cheapest connections to link all points.

MST is the cheapest way to connect all points without loops.

Prim's Algorithm Overview

Prim's algorithm builds the MST by starting from any vertex and growing the tree one edge at a time. At each step, it adds the cheapest edge that connects a new vertex to the growing tree until all vertices are included.

Prim's grows the MST by adding the cheapest edge to new vertices step-by-step.

How Prim's Algorithm Works

Begin with a single vertex and mark it as part of the MST. Look at all edges from this vertex and pick the one with the smallest weight that connects to a vertex not yet in the MST. Add that vertex and edge to the MST. Repeat this process until all vertices are included.

Prim's repeatedly adds the lowest-cost edge connecting the MST to a new vertex.

Data Structures Used

Prim's algorithm often uses a priority queue to efficiently select the next smallest edge. It keeps track of which vertices are in the MST and updates the queue with edges leading to vertices not yet included.

A priority queue helps Prim's quickly find the cheapest edge to add.

Real World Analogy

Imagine building a network of water pipes to connect several houses. You start from one house and always lay the shortest pipe to a new house not yet connected, making sure you don't create loops. This way, you use the least amount of pipe to connect all houses.

Graph and Spanning Tree → Houses as points and pipes as connections between them

Minimum Spanning Tree (MST) → The cheapest way to connect all houses with pipes without any loops

Prim's Algorithm Overview → Starting from one house and adding the shortest pipe to a new house each time

How Prim's Algorithm Works → Choosing the shortest pipe available that connects to a house not yet connected

Data Structures Used → Using a priority queue of shortest pipes to decide which pipe to lay next

Diagram

┌───────────────┐
│ Start at any  │
│ vertex (A)    │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Select edge   │
│ with smallest │
│ weight to new │
│ vertex       │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Add vertex and│
│ edge to MST   │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Repeat until  │
│ all vertices  │
│ included      │
└───────────────┘

Flowchart showing Prim's algorithm steps from starting vertex to building the MST by adding edges.

Key Facts

Graph → A set of points connected by lines called edges.

Spanning Tree → A subset of edges connecting all vertices without any cycles.

Minimum Spanning Tree (MST) → A spanning tree with the smallest total edge weight.

Prim's Algorithm → An algorithm that builds the MST by adding the cheapest edge connecting new vertices.

Priority Queue → A data structure that efficiently selects the smallest element.

Code Example

Data Structures Theory

import heapq

def prim_mst(graph, start=0):
    mst = []
    visited = set([start])
    edges = [(cost, start, to) for to, cost in graph[start]]
    heapq.heapify(edges)

    while edges:
        cost, frm, to = heapq.heappop(edges)
        if to not in visited:
            visited.add(to)
            mst.append((frm, to, cost))
            for to_next, cost_next in graph[to]:
                if to_next not in visited:
                    heapq.heappush(edges, (cost_next, to, to_next))
    return mst

# Example graph as adjacency list: vertex -> list of (neighbor, cost)
graph = {
    0: [(1, 2), (3, 6)],
    1: [(0, 2), (2, 3), (3, 8), (4, 5)],
    2: [(1, 3), (4, 7)],
    3: [(0, 6), (1, 8), (4, 9)],
    4: [(1, 5), (2, 7), (3, 9)]
}

mst = prim_mst(graph)
for edge in mst:
    print(edge)

OutputSuccess

Common Confusions

Believing Prim's algorithm can start from any vertex but must always pick edges connected only to the first vertex.

Believing Prim's algorithm can start from any vertex but must always pick edges connected only to the first vertex. Prim's algorithm starts from any vertex but expands by adding the cheapest edge connecting any vertex already in the MST to a new vertex, not just edges from the starting vertex.

Thinking the MST includes all edges in the graph.

Thinking the MST includes all edges in the graph. The MST includes only enough edges to connect all vertices without cycles, so it has fewer edges than the full graph.

Summary

Minimum spanning trees connect all points with the least total cost and no loops.

Prim's algorithm builds the MST by starting from one vertex and adding the cheapest edge to new vertices step-by-step.

Using a priority queue helps Prim's algorithm efficiently find the next edge to add.