Factors Affecting Time Complexity

This article is a complementary resource to the Complexity Calculation course.

Time complexity measures the efficiency of an algorithm. It depends on many factors, the primary ones being:

  • Input Size and Nature
  • Programming Logic (Nested Loops, Design Choice)
  • Choice of Data Structure

Let's explore each of these one by one.


Input Size and Nature

The input to an algorithm significantly impacts its performance. If the input is predictable, algorithms can be optimized to run more efficiently.

Consider the following algorithm.

Example: Bubble sort

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                # Swap if elements are in the wrong order
                arr[j], arr[j+1] = arr[j+1], arr[j]
                swapped = True
        if not swapped:
            # Stop early if no elements were swapped in the inner loop
            break
    return arr

This is the Bubble Sort algorithm. For now, you don't need to understand how it works in detail; just know that it sorts the elements of an array.

If the input is already sorted:

4, 16, 17, 18

In the case of sorted data, no swaps are made, so the swapped flag stays False and the outer loop terminates after a single pass. This results in the best-case time complexity of O(n).

When the input is in reverse order:

18, 17, 16, 4

In this case, each element is farthest from its correct position. So, two nested loops perform pairwise comparisons and swaps for each element, resulting in a time complexity of <katex>O(n^{2})</katex>.

Therefore, if input patterns are predictable, choosing the right algorithm based on these patterns can drastically improve performance.
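
To see this concretely, here is a minimal sketch that instruments the bubble sort above with a comparison counter (the counter is our addition for illustration, not part of the original code):

def count_comparisons(arr):
    # Bubble sort with a counter for inner-loop comparisons
    arr = arr[:]  # Work on a copy so the input is unchanged
    n = len(arr)
    comparisons = 0
    for i in range(n):
        swapped = False
        for j in range(0, n-i-1):
            comparisons += 1
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
                swapped = True
        if not swapped:
            break
    return comparisons

print(count_comparisons([4, 16, 17, 18]))  # 3: one pass, best case O(n)
print(count_comparisons([18, 17, 16, 4]))  # 6: n(n-1)/2 comparisons, worst case O(n^2)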

Note: The size of the input (n) is the most significant factor determining the time an algorithm takes to execute. Algorithms generally take longer to execute with larger inputs unless the time complexity of the algorithm is O(1).


Programming Logic

Elements of programming logic, such as loops, recursion, and the overall algorithmic design, can all affect an algorithm's time complexity.

Nested Loops

As the depth of nested loops increases, the time complexity becomes worse.

For example, two nested loops generally have a time complexity of <katex>O(n^{2})</katex>. However, a single loop generally has a time complexity of O(n).

Fortunately, many DSA problems allow us to reduce the number of nested loops.

Let's take a simple example where we have to find all the pairs of numbers with a sum equal to a target value.

def find_pairs_nested_loops(arr, target):
    pairs = []  

    # Iterate through each array element
    # as the first element of the pair
    for i in range(len(arr)):

        # Iterate through the elements after
        # the current element to avoid duplicates
        for j in range(i + 1, len(arr)):

            # Check if the current pair sums up to the target
            if arr[i] + arr[j] == target:
                pairs.append((arr[i], arr[j])) 

    return pairs

arr = [1, 2, 3, 4, 5, 6]
target = 7
result = find_pairs_nested_loops(arr, target)
print("Pairs that sum to", target, ":", result)

Output

Pairs that sum to 7 : [(1, 6), (2, 5), (3, 4)]

Here, we have solved the problem with a nested structure of two loops. Hence, the time complexity is <katex>O(n^{2})</katex>.

However, we can also solve the above problem using only one loop.

Using a Single Loop

def find_pairs_one_pass(arr, target):

    seen = set()  
    pairs = []   

    # Iterate through each number in the array
    for num in arr:
        # Calculate the required complement to reach the target
        complement = target - num  
        
        # Check if the complement has already been seen
        if complement in seen:
            # If found, add the pair (complement, num) to the results
            pairs.append((complement, num))
        
        # Else add the current number to the set of seen numbers
        seen.add(num)
    
    return pairs  

arr = [1, 2, 3, 4, 5, 6]
target = 7
result = find_pairs_one_pass(arr, target)
print("Pairs that sum to", target, ":", result)

Output

Pairs that sum to 7 : [(1, 6), (2, 5), (3, 4)]

This solution has a time complexity of O(n), since checking membership in a set takes O(1) on average. Hence, this solution is more efficient.

Therefore, to make our algorithms more efficient, we should minimize loop nesting wherever possible.


Algorithmic Design Choice

The approach taken to solve a particular problem, or simply the design choice of the algorithm, can affect the time complexity.

Let's take a look at the Maximum Subarray problem.

Problem Definition:

Given an array of integers, you need to find the maximum sum of any contiguous subarray.

For example, for the array [-2, 1, -3, 4, -1, 2, 1, -5, 4], the subarray [4, -1, 2, 1] has the largest sum of 6.

We can solve this using the brute force approach. The brute force approach involves checking all possible subarrays, calculating their sums, and returning the maximum.

def max_subarray_brute_force(arr):
    n = len(arr)
    max_sum = float('-inf')  # Initialize to negative infinity
    for i in range(n):
        for j in range(i, n):
            # Calculate the sum of the subarray from index i to j
            current_sum = sum(arr[i:j+1])
            max_sum = max(max_sum, current_sum)
    return max_sum

# Example usage:
arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print("Brute force max subarray sum:", max_subarray_brute_force(arr))

The time complexity of this approach is <katex>O(n^{3})</katex>. This is because of the following factors:

  • The two nested loops for generating all possible subarrays, which contribute a time complexity of <katex>O(n^{2})</katex>.
  • Since the sum() call is nested inside the loops, its O(n) complexity is multiplied by the <katex>O(n^{2})</katex> complexity of the loops, resulting in a final complexity of <katex>O(n^{3})</katex> (keeping a running sum instead, as sketched below, removes this extra factor).
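
As a brief aside, here is a minimal running-sum sketch (the function name is ours, for illustration) that updates the sum in O(1) as the subarray grows, bringing the complexity down to <katex>O(n^{2})</katex>:

def max_subarray_running_sum(arr):
    n = len(arr)
    max_sum = float('-inf')
    for i in range(n):
        current_sum = 0
        for j in range(i, n):
            # Extend the subarray by one element, updating the sum in O(1)
            current_sum += arr[j]
            max_sum = max(max_sum, current_sum)
    return max_sum

arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print("Running sum max subarray sum:", max_subarray_running_sum(arr))  # 6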

However, we can also solve this problem using another method called divide and conquer.

Divide and Conquer

The divide and conquer approach splits the array into two halves, recursively finding the maximum sum in the left half, the right half, and the cross sum (the sum that spans the middle of the array).

def max_crossing_sum(arr, left, right, mid):
    # Find the maximum sum of the subarray crossing the middle point
    left_sum = float('-inf')
    right_sum = float('-inf')

    # Find maximum sum on left of mid
    total = 0
    for i in range(mid, left - 1, -1):
        total += arr[i]
        left_sum = max(left_sum, total)

    # Find maximum sum on right of mid
    total = 0
    for i in range(mid + 1, right + 1):
        total += arr[i]
        right_sum = max(right_sum, total)

    return left_sum + right_sum

def max_subarray_divide_conquer(arr, left, right):
    if left == right:
        return arr[left]

    mid = (left + right) // 2
    left_sum = max_subarray_divide_conquer(arr, left, mid)
    right_sum = max_subarray_divide_conquer(arr, mid + 1, right)
    cross_sum = max_crossing_sum(arr, left, right, mid)

    return max(left_sum, right_sum, cross_sum)

# Example usage:
arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print("Divide and conquer max subarray sum:", max_subarray_divide_conquer(arr, 0, len(arr)-1))

Here, we split the array into two parts to calculate the sum recursively.

mid = (left + right) // 2
left_sum = max_subarray_divide_conquer(arr, left, mid)
right_sum = max_subarray_divide_conquer(arr, mid + 1, right)
cross_sum = max_crossing_sum(arr, left, right, mid)

This halving gives the recurrence <katex>T(n) = 2T(n/2) + O(n)</katex>: two recursive calls on half-sized arrays plus a linear pass for the crossing sum. Solving it yields an overall time complexity of <katex>O(n \log n)</katex>.

Note: Logarithmic time complexity occurs when the problem size is halved or reduced by a constant factor with each iteration or recursive call, such as in binary search or in divide-and-conquer algorithms.
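
As an illustration of this halving (an example for this note, not part of the course code), here is a minimal binary search sketch:

def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        # Each iteration halves the search range, giving O(log n)
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1  # Target not found

print(binary_search([4, 16, 17, 18], 17))  # Output: 2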


Data Structures Used

The choice of data structure used can significantly impact the time complexity of the algorithm.

Suppose we need to implement a data structure with key-value pairs in Python.

We can do so using a list of tuples:

[(key1, value1), (key2, value2), (key3, value3), ...]

The code implementation is shown below:

# Initialize an empty list to store key-value pairs
store = []

# Insert key-value pair
def insert(key, value):
    for i, (k, v) in enumerate(store):
        if k == key:
            store[i] = (key, value)  # Update value if key exists
            return
    store.append((key, value))  # Add new key-value pair if key doesn't exist

# Get value by key
def get(key):
    for k, v in store:
        if k == key:
            return v
    return None  # Return None if key doesn't exist

# Remove key-value pair
def remove(key):
    for i, (k, v) in enumerate(store):
        if k == key:
            del store[i]
            return
    print(f"Key '{key}' not found.")  # Key not found

# Display all key-value pairs
def display():
    for k, v in store:
        print(f"{k}: {v}")


# Example usage
insert("name", "Alice")
insert("age", 25)
insert("city", "Wonderland")

print(get("name"))  # Output: Alice
remove("age")
display()  # Output: name: Alice, city: Wonderland

This implementation uses the store list to store key-value pairs.

Lists in Python don't have built-in indexing by key. To find, update, or delete a key, the entire list has to be traversed.

Therefore, the average-case time complexity of insertion, deletion, retrieval, and update for this implementation is O(n).

Improving the Time Complexity

Fortunately, Python has a built-in data structure called a dictionary for this purpose. The Python dictionary uses a hashmap to store data in key-value pairs.

{
    key1: value1,
    key2: value2,
    key3: value3,
    ...
}

The code implementation is shown below:

# Initialize an empty dictionary to store key-value pairs
store = {}

# Insert key-value pair
def insert(key, value):
    store[key] = value  # Adds or updates the key-value pair

# Get value by key
def get(key):
    return store.get(key, None)  # Returns value or None if key doesn't exist

# Remove key-value pair
def remove(key):
    if key in store:
        del store[key]
    else:
        print(f"Key '{key}' not found.")  # Key not found

# Display all key-value pairs
def display():
    for k, v in store.items():
        print(f"{k}: {v}")


# Example usage
insert("name", "Alice")
insert("age", 25)
insert("city", "Wonderland")

print(get("name"))  # Output: Alice
remove("age")
display()  # Output: name: Alice, city: Wonderland

Unlike our previous implementation, the Python dictionary has an average time complexity of O(1) for insertion, deletion, retrieval, and update.

This is because the Python dictionary is implemented as a hashmap, which locates a key directly from its hash instead of scanning for it, giving constant time complexity on average.
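
To verify the difference, here is a rough benchmark sketch using Python's built-in timeit module (the sizes and iteration counts are arbitrary, and exact timings will vary from machine to machine):

import timeit

n = 10000
list_store = [(i, i) for i in range(n)]
dict_store = {i: i for i in range(n)}

def list_get(key):
    # O(n): scans the list until the key is found
    for k, v in list_store:
        if k == key:
            return v
    return None

# Look up the last key 1000 times with each structure
list_time = timeit.timeit(lambda: list_get(n - 1), number=1000)
dict_time = timeit.timeit(lambda: dict_store.get(n - 1), number=1000)

print(f"List of tuples: {list_time:.4f} s")
print(f"Dictionary:     {dict_time:.4f} s")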

Therefore, you need to choose the best data structure for the given algorithm to optimize time complexity.

For example,

  • A hashmap is ideal for quick lookups and insertions.
  • A balanced tree structure is preferable for ordered data (a rough stand-in is sketched below).
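
Python's standard library has no built-in balanced tree, but as a rough stand-in for ordered data, the bisect module over a sorted list offers O(log n) searches (at the cost of O(n) inserts); a minimal sketch:

import bisect

keys = []
for k in [17, 4, 18, 16]:
    bisect.insort(keys, k)  # Keeps the list sorted (O(n) insert)

print(keys)  # Output: [4, 16, 17, 18]

# O(log n) search for the smallest key >= 10
i = bisect.bisect_left(keys, 10)
print(keys[i])  # Output: 16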