DSU/Union Find Fundamentals - Advanced Data Structures / Disjoint Set Union | Union Find

algo.monster · December 19, 2021, 7:57am

kv3911 · December 19, 2021, 7:57am

another implementation that tracks counts of disconnected component, size of each component and does path compression

@dataclass
class Counter:
key: int = 0

def union_find(graph: Dict[int, List[int]]):
def find(x: int) → int:
while parent[x] != x:
parent[x] = parent[parent[x]] # 1. path compression
x = parent[x]
return x

def union(x: int, y: int, counter: Counter):
    root_x = find(x)
    root_y = find(y)
    if root_x == root_y:
        return

    # 2. connect smaller subtree to bigger tree
    if size[root_x] > size[root_y]:
        parent[root_y] = root_x
        size[root_x] += size[root_y]
    else:
        parent[root_x] = root_y
        size[root_y] += size[root_x]

    counter.key -= 1

count = Counter()  # maintains count of disconnected component
parent, size = {}, {}  # size ~~ "weight"
for v in graph:
    parent[v] = v
    size[v] = 1
    count.key += 1

# run UF over each edge
for v1, v1_edges in graph.items():
    logger.debug(f"{v1} -> {v1_edges}")
    for v2 in v1_edges:
        union(v1, v2, count)

kopi22 · July 28, 2022, 6:08pm

For the first implementation of UnionFind, it says that: “The following code accomplishes a Union operation in O(1)”. But how does it do it in O(1) if it uses “find” which is O(n) in the worst case?

null1 · May 1, 2023, 5:34pm

Within the rank choice, why do we continuously call find(x) and find(y) for each if statement. Wouldn’t we just call it once and then do an analysis through the if conditions?

Uzair · May 31, 2023, 4:48pm

You can do that too. But since the find() is being called a constant number of times, the overall complexity will be O( constant * log(n)) which is the same as log(n). Hence it doesn’t matter whether you call find() once, or a constant number of times.

Uzair · May 31, 2023, 4:52pm

When we talk about the merge operation or insert operations in general, we assume that the location or the index at which the value is to be inserted is already known. So any complexity spent to find that insertion index is not considered in the insert operation. Insertion in a list is O(n) because even after we find the index to insert the value at beforehand, we still need to move the rest of the elements in the list by 1 to make an empty spot for the new value. In union find, there is no need to move the elements as we are simply mapping the node to a different parent which is a simple assignment operation that can be done in O(1) time.

null51 · September 4, 2023, 1:26am

this should specify that a quick union implementation is used first over a quick find.

Gangyi_Li · December 20, 2023, 6:15pm

 self.id[self.find(y)] = self.find(x)
28            # if rank is the same then we update x rank and increment by 1
29            # we make y's parent equal to x's parent, so x has increased depth
30            if self.rank[self.find(x)] == self.rank[self.find(y)]:

is this right? line 30 will always return true, because self.id[self.find(y)] = self.find(x), which means y and x are pointing to the same parent already. Do i miss something?

Daniel1 · January 7, 2024, 5:35pm

Typo in Implementation section:

How do we construct the tree strcture to maintain disjoint sets?

Bogdan_Ponomarenko · March 11, 2024, 6:31pm

wrong implementation DSU for javascript

class UnionFind {
  constructor() {
    this.id = new Map();
  }

  find(x) {
        let y = this.id.has(x) ? this.id.get(x) : x;
        // check if the current node is a Set ID node
        if (y !== x) {
            y = this.find(y);
        }
        return y;
    }

  union(x,y) {
    this.id.set(this.find(x), this.find(y))
  }
}