Disjoint set

1 answer

A disjoint set, also known as a union-find or merge-find data structure, is a collection of non-overlapping (disjoint) sets that supports two primary operations efficiently:

  1. Union: Join two sets together.
  2. Find: Determine which set a particular element belongs to.

Disjoint sets are commonly used in various applications, such as network connectivity, image processing, clustering, and Kruskal's algorithm for finding the minimum spanning tree in a graph.

Key Concepts

  • Set Representation: Each element points to a "parent" element, with the root element representing the set.
  • Path Compression: A technique used in the find operation to flatten the structure of the tree whenever find is called, making subsequent queries faster.
  • Union by Rank: A technique used in the union operation to ensure that the smaller tree is always added under the root of the larger tree, thereby keeping the structure balanced.

Basic Operations

  • Find(x):

    • Returns the representative (or root) of the set containing x. If x is not the root, the function recursively finds the root of x and applies path compression to speed up future queries.
  • Union(x, y):

    • Merges the sets that contain x and y. It finds the roots of each set and attaches one root to the other based on rank (or size).

Example Implementation (Python)

Here's a simple implementation of a disjoint set:

class DisjointSet:
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [1] * size

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x, y):
        rootX = self.find(x)
        rootY = self.find(y)

        if rootX != rootY:
            # Union by rank
            if self.rank[rootX] > self.rank[rootY]:
                self.parent[rootY] = rootX
            elif self.rank[rootX] < self.rank[rootY]:
                self.parent[rootX] = rootY
            else:
                self.parent[rootY] = rootX
                self.rank[rootX] += 1

# Example usage:
ds = DisjointSet(10)
ds.union(1, 2)
ds.union(2, 3)
print(ds.find(1))  # Output: 3 (or 1 depending on union structure)
print(ds.find(2))  # Output: 3
print(ds.find(3))  # Output: 3

Complexity

  • Find: Nearly constant time (O(α(n)), where α is the inverse Ackermann function)
  • Union: Nearly constant time

Applications

  • Network connectivity
  • Kruskal's algorithm for minimum spanning trees
  • Image segmentation
  • Keeping track of connected components in dynamic graphs

The disjoint set structure is very efficient for handling dynamic connectivity queries and has become a fundamental algorithmic tool in computer science.