Minimum Spanning Tree | Forests - Graph / Minimum Spanning Tree

https://algo.monster/problems/mst_forest

number of edges in the final minimum spanning graph = n - number of connected components
Incase of a tree, (fully connected; number of connected conponents = 1)
number of edges = n-1

It is said in problem that “The fence needs to be set up such that every tree in the area is connected to the fence”, and there is no requirement that every fence must be connected to form one set (if it’s possible by tuples list). So in some cases solution could be more optimized and have lesser fence length. For example in Test#2 the expected result is 20, but we could get 15 if last fence between tree 1 and 4 with length 5 is omitted cause tree 1 is already fenced with trees 10, 2 and tree 4 is fenced with tree 9 therefore no need to take last fence 1 | 4

You should specify in the find that you’re employing path compression in the find() function and why this optimization matters.

nvm

+1 on this.

The test 2 actually can have the fence with length 11, if you also include in the priority the fact that an fence with 2 not yet connected trees has the double value (i.e. half cost). The priority gets much trickier though, and you need to recalculate it after every edge picked. This is the solution:
[
5 8 1
7 10 1
6 5 1
4 9 2
2 3 4
1 10 2
]

Also, there are a few things that are not in the description:

  1. The test 2 contains edges that are only possible in the directed graph:
    [

    8 5 2,
    5 8 1,

5 6 2,
6 5 1
]

This makes no sense with the description on the task, and introduces a hideen layer of complexity of the task.

  1. The test 4 contains edges pointing to the tree itself such as
    [
    6 6 2,
    7 7 3,
    8 8 2
    ]

Not sure which is the idea behind that, also makes little sense with the description to me. But this can be a tricky one, if you have a test with an isolated tree that only has such an edge, so we are obligued to include these to solve the problem.

  1. I cannot see, how the Kruskal helps in a problem such as:
    5
    4
    1 3 6
    1 2 1
    2 4 6
    3 4 7

Here, the shortest solution is [{1 2 1}, {3 4 7}] with the cost of 8. However, kruskal will include other two edges while constructing the MST and in the end ignore the last one, arriving to the cost of 13. The only possible solution to this situation I see is to run backtracking with permuting all the edges,. That is too slow however, since it is O(n!) time.

  1. Consider this similar case to the described in the previous point:
    5
    4
    1 3 3
    1 2 2
    2 4 3
    3 4 5

Here, the shortest solution is [{1 3 3}, {2 4 3}] with the cost of 6, ignoring the cheapest edge. I cannot see it possible doing with the kruskal algorithm, since we may need to ignore one of the cheaper edges picked k times before.

Clarification: The sequences of numbers in my previous comment are custom test inputs. Since the platform keeps messing up with the spaces, tabulations and new line characters, I will introduce those characters manually here:
5 4 1 3 6 1 2 1 2 4 6 3 4 7 => 5\n 4\n 1 3 6\n 1 2 1\n 2 4 6\n 3 4 7\n
5 4 1 3 3 1 2 2 2 4 3 3 4 5 => 5\n 4\n 1 3 3\n 1 2 2\n 2 4 3\n 3 4 5\n

P.S: Dear authors, this problem is not well specified nor the solution is well explained. Just throwing the code does little help. I hope you elaborate it better (hope, the described cases in my previous comments help) and how they should be solved. Or find a different one, if this problem turns out to bee too complex for this topic.

Thank you

My code for Prim algo:

from typing import List
from collections import defaultdict
from heapq import *
def mst_forest(trees: int, pairs: List[List[int]]) → int:
seen=set()
adj_list = defaultdict(list)
for pair in pairs:
adj_list[pair[0]].append((pair[2],pair[1]))
adj_list[pair[1]].append((pair[2],pair[0]))

def bfs(node):
    fence=0
    heap=[]
    heappush(heap,(0,node))
    while heap:
        distance,node = heappop(heap)
        if node in seen:
            continue
        fence+=distance
        
        for neighbor in adj_list[node]:
            heappush(heap,neighbor)
        seen.add(node)
       
    return fence

total=0
for i in range(trees):
    if i not in seen:
        total+=bfs(i)
        

return total

this one can never be solved by Prim. Only Kruskal can solve it.

As indicated by the AlgoMonster solution description, it is possible to solve using Prim’s: “One can also run prim’s but make sure that every single node is calculated. Every connected component needs to have prim’s algorithm run on it.”

Prim’s won’t generate the MST for disconnected graphs - a possibility in this problem - which is why we have to run Prim’s on every node. My solution keeps a list of visited nodes like a normal Prim’s implementation, but clears the priority queue for each iteration of Prim’s over the list of nodes. This way we ensure that every node is visited but we don’t traverse the same edge multiple times.

Here is my Python implementation following the logic above:

from typing import List
from heapq import heappush, heappop


class Edge:
    def __init__(self, weight, src, dst):
        self.weight = weight
        self.src = src
        self.dst = dst
    def __lt__(self, other):
        return self.weight < other.weight
    
    
def prims(num_edges, graph):
    visited = set()
    distance = 0
    for edge_src in iter(graph):
        priority_q = []
        visited.add(edge_src)
        adj_edges = graph[edge_src]
        for adj_edge in adj_edges:
            heappush(priority_q, adj_edge)
        while len(priority_q) > 0: 
            edge = heappop(priority_q)
            if edge.dst in visited:
                continue 
            distance += edge.weight
            visited.add(edge.dst)
            adj_edges = graph[edge.dst]
            for adj_edge in adj_edges:
                heappush(priority_q, adj_edge)
    return distance


def mst_forest(trees: int, pairs: List[List[int]]) -> int:
    graph = {}
    for pair in pairs:
        src, dst, weight = pair
        if graph.get(src):
            graph[src].append(Edge(weight, src, dst))
        else:
            graph[src] = [Edge(weight, src, dst)]
        if graph.get(dst):
            graph[dst].append(Edge(weight, dst, src))
        else:
            graph[dst] = [Edge(weight, dst, src)]
    return prims(trees, graph)