Exploring Time Complexity of Prim’s Algorithm in Python

This article is a complementary resource to the DSA with Python course.

Exploring Time Complexity of Prim’s Algorithm in Python

The Prim's algorithm is used to find the minimum spanning tree of a connected, undirected, and weighted graph.

In this article, we will analyze the time complexity of the Prim's algorithm.


How Does Prim's Algorithm Work?

Prim's algorithm:

  1. Chooses a starting vertex and add it to the MST.
  2. Adds the edge with the least weight that connects a vertex in MST with the remaining vertices.
  3. Repeats the previous step until all vertices are included in the MST.
def prim(graph, start):
    # Initialize a dictionary to keep track
    # of vertices included in the MST
    in_tree = {vertex: False for vertex in graph}
    in_tree[start] = True

    # Start with an empty tree
    MST = []

    # Continue until all vertices are included in the MST
    while not all(in_tree.values()):
        # Get the shortest edge that connects a vertex
        # in the MST to a vertex outside the MST
        edges = [(start, end, weight) for start, neighbors in graph.items() for end, weight in neighbors.items() if in_tree[start] and not in_tree[end]]
        sorted_edges = sorted(edges, key=lambda edge: edge[2])
        shortest_edge = sorted_edges[0]

        # Add the shortest edge to the MST and
        # mark the connected vertex as included
        start, end, weight = shortest_edge
        MST.append((start, end, weight))
        in_tree[end] = True

    return MST

Time Complexity (Unoptimized Solution)

The time complexity of the above program can broken down into following steps:

1. Initialization

The algorithm initializes a dictionary to keep track of vertices included in the MST. This operation takes O(V) time, where V is the number of vertices in the graph.

2. Finding Shortest Edge

In each iteration, the algorithm

  • Generates edges using list comprehension, which iterates over all the edges in the graph. For E edges, it has the time complexity of O(E).
  • Sorts the edges by their weight using the sorted() function. The time complexity of sorting a list of E elements is O(E logE).
  • Selects the shortest edge once the edges are sorted. It does this by taking the first element of the sorted list. This operation takes O(1) time.

So, combining these above steps, the total complexity for finding the shortest edge is O(E logE). It's because the sorting step dominates the other two steps.

3. Adding Edge to MST

Once the shortest edge is found, it is added to the MST. This operation takes O(1) time but we are performing this operation per vertex.

So, for V vertices, the total time complexity is O(V).

Overall Time Complexity

The algorithm iterates until all vertices are included in the MST, which takes V iterations.

In each iteration, the algorithm finds the shortest edge, adds it to the MST, and marks the connected vertex as included.

Thus, the overall time complexity of the algorithm is O(V * E logE).

However, this is not the optimal solution. We can improve our implementation by using a heap data structure. Let's see how.


Optimized Solution

import heapq

def prim(graph, start):
    # Initialize a dictionary to keep track
    # of vertices included in the MST
    in_tree = {vertex: False for vertex in graph}
    in_tree[start] = True

    # Initialize a priority queue to store the edges
    edges = []
    for end, weight in graph[start].items():
        heapq.heappush(edges, (weight, start, end))

        # Initialize the MST
        mst = []

    # Continue until all vertices are included in the MST
    while edges:
        # Extract the shortest edge from the priority queue
        weight, start, end = heapq.heappop(edges)

        # If the end vertex is not in the MST,
        # add the edge to the MST and
        # mark the end vertex as included
        if not in_tree[end]:
            mst.append((start, end, weight))
            in_tree[end] = True

            # Add the edges of the end vertex to the priority queue
            for neighbor, neighbor_weight in graph[end].items():
                if not in_tree[neighbor]:
                    heapq.heappush(edges, (neighbor_weight, end, neighbor))

    return mst

Time Complexity of Optimized Solution

The time complexity of the optimized program can be split into following steps:

1. Initialization

The algorithm initializes a dictionary to keep track of vertices included in the MST. This operation takes O(V) time, where V is the number of vertices in the graph.

2. Adding Edges to the Priority Queue

The algorithm adds the edges of the starting vertex to the priority queue. This operation takes O(E logE) time, where E is the number of edges in the graph.

This is because the heap operation takes O(logE) time, and we perform this operation E times.

3. Extracting the Shortest Edge From the Priority Queue

The algorithm extracts the shortest edge from the priority queue, which has time complexity of O(logE) time.

4. Adding the End Vertex to the MST

The algorithm adds the end vertex to the MST and marks it as included in the dictionary. This operation takes O(1) time.

5. Adding the Edges of the End Vertex to the Priority Queue

The algorithm adds the edges of the end vertex to the priority queue. This operation takes O(E logE) time, where E is the number of edges in the graph.

This is because the heap push operation takes O(logE) time, and we perform this operation E times.

Overall Time Complexity

Here, step 5 (adding the edges of the end vertex to the priority queue) is performed V times. But it's only performed for the edges of the end vertex, which is a subset of the total edges in the graph.

This step dominates the overall time complexity because it's performed V times and involves a relatively slow operation (adding edges to the priority queue).

Thus the overall time complexity is O(E logE) because that's the time complexity of the dominant step.

Overall Time Complexity: O(E logE)


Factors Affecting the Time Complexity

1. Dense Graph

In a dense graph, the number of edges E is close to the square of the number of vertices V.

Therefore, the overall time complexity is <katex>O(V^{2})</katex>.

A Dense Graph
A Dense Graph

2. Sparse Graph

In a sparse graph, the number of edges E is close to the number of vertices V.

Therefore, the overall time complexity is O(V logV).