Task Scheduling 2 - Graph / Directed Graph / Topological Sort

https://algo.monster/problems/task_scheduling_2

Test Case #3 the answer does not look right

Can you elaborate?

Test case #3 should be 102 right? 100 + 1 + 1. Abbreviate cardinals dextrous and green can be done first and green has maxcost of 100, then bricks is done next with cost of 1, then lastly fibre with cost of 1. The answer is 101, which is saying the tasks can be done in two steps, how is this possible?

Case #3 is wrong. it should be 102.
Here is a python code on how I did it:

from typing import List
import collections
def task_scheduling_2(tasks: List[str], times: List[int], requirements: List[List[str]]) → int:
# WRITE YOUR BRILLIANT CODE HERE
inbound = collections.defaultdict(int)
graph = collections.defaultdict(list)

#build the graph
for req in requirements:
    graph[req[0]].append(req[1])
    inbound[req[1]]+=1
queue = collections.deque([])

for task in tasks:
    if inbound[task] == 0:
        queue.append(task)
timeSum = 0        
while queue:
    maxTime = 0
    for _ in range(len(queue)):
        task = queue.popleft()
        #print('task is:', task)
        time = times[tasks.index(task)]
        maxTime = max(time, maxTime)
        
        for neighbour in graph[task]:
            inbound[neighbour]-=1
            if inbound[neighbour] == 0:
                queue.append(neighbour)
                del inbound[neighbour]
    timeSum +=maxTime

return timeSum

This test case is correct, why is the max value of 102? task g takes time of 100 plus time of 1 from f for a total of 101. As stated in the question you can start any number of tasks as long as you completed the requirements so you do not need to wait for g to complete before starting b.
For a timeline of sorts, at time 0 we start tasks a,c,d,g, at time 1 tasks a,c,d complete and we can start task b. At time 2 task b completes. Then at time 100 task g completes and we can start task f. Then we finish all tasks at time 101.

Case #3 looks to be wrong and should be 102. Drawing the graph out

Graph:
abbreviate → bricks → fiber
cardinals → bricks
dextrous → bricks
green → fiber
height

For the topological sort, we must first process the parents with no nodes which is [abbreviate, cardinals, dextrous, green , height]. Since the maximum time for this is green with time of 100, this gives us the total time needed for the first level to be 100

total time so far : 100
Graph:
bricks → fiber

Now we have to process the next level which is just [bricks]. Bricks has a time of 1, so we add that to our total time

total time so far: 101
Graph:
fiber

Now we have the last node to process which is [fiber] with a time of 1. so we add that to the total time

total time: 102

task g takes time of 100 plus time of 1 from f for a total of 101. As stated in the question you can start any number of tasks as long as you completed the requirements so you do not need to wait for g to complete before starting b. For a timeline of sorts, at time 0 we start tasks a,c,d,g, at time 1 tasks a,c,d complete and we can start task b. At time 2 task b completes. Then at time 100 task g completes and we can start task f. Then we finish all tasks at time 101.

My Python solution:

from typing import List
from collections import deque
def task_scheduling_2(tasks: List[str], times: List[int], requirements: List[List[str]]) → int:
# WRITE YOUR BRILLIANT CODE HERE
graph = {n:[] for n in tasks}
parent_count = {n:0 for n in tasks}
for parent, child in requirements:
graph[parent].append(child)
parent_count[child] += 1
times_dict = {task:time for task, time in zip(tasks, times)}
dist = {}
q = deque()
for node in parent_count:
if parent_count[node] == 0:
dist[node] = times_dict[node]
q.append(node)
ans = 0
while len(q) >0:
cur = q.popleft()
ans = max(ans, dist[cur])
for child in graph[cur]:
if not child in dist:
dist[child] = dist[cur]+times_dict[child]
else:
dist[child] = max(dist[cur]+times_dict[child], dist[child])
parent_count[child] -= 1
if parent_count[child] == 0:
q.append(child)
return ans

Great way to construct the times_dict!

Can you upload a python version solution

done. sorry we had some issues and the site was showing older commits which didn’t have it.

Appreciate it!

Solution in JS:

const taskToTime = {};
for(let i = 0; i < tasks.length; i++) {
    taskToTime[tasks[i]] = times[i];
}
const graph = generateGraph(tasks, requirements);
const counts = generateCounts(tasks, graph);
const completionTimes = {};
for(let task of tasks){
    completionTimes[task] = 0;
}

let totalTime = 0;
const q = []; // a, c
for(let task in counts){
    if(counts[task] === 0){
        q.push(task);
        completionTimes[task] = taskToTime[task];
    }
}
while(q.length){
    const dq = q.shift();
    for(let child of graph[dq]){
        counts[child]--;
        
        const curr = completionTimes[child];
        const newPathTime = completionTimes[dq] + taskToTime[child];
        const maxTime = Math.max(curr, newPathTime);
        
        completionTimes[child] = maxTime;
        totalTime = Math.max(totalTime, maxTime);
        
        if(counts[child] === 0)
            q.push(child);
    }
}
return totalTime;

Why dis[child] = max(dis[child], dis[node] + task_times[child]) instead of dis[child] = dis[node] + task_times[child] in the Python answer? I couldn’t figure out when it would be needed and removed it, and it passed test cases

The child might be updated by multiple parents and you will need the max distance.

should be 102 as BOTH b and g need to be completed before start task f .

Arrgh - you’re right, I ran into the same bug in my code that Michael did, thanks for the clarification

I found DFS approach simpler and more natural. It is as follows: perform DFS on an inverted graph (so instead of A → B, we have B → A) for a max time it takes to complete task and its dependencies. To avoid the explosion of possible paths, we simply add memoization. Since we memoize the max times for all tasks, we do not need to start search from “end” nodes and can simply get the max time out of all tasks (which naturally will be some end node). Here is my solution:

def task_scheduling_2(tasks: List[str], times: List[int], requirements: List[List[str]]) → int: \

graph = {node: [time, []] for node, time in zip(tasks, times)}
for req in requirements:
    graph[req[1]][1].append(req[0])
    
def find_max_time(start, memo):
    if start in memo:
        return memo[start]
    
    time = graph[start][0]
    max_time = 0
    for neighbour in graph[start][1]:
        max_time = max(max_time, find_max_time(neighbour, memo))
                   
    time += max_time
    memo[start] = time
    return time
       
max_time = 0 
memo = dict()
for task in tasks:
    max_time = max(max_time, find_max_time(task, memo))
return max_time

“task g takes time of 100 plus time of 1” is wrong. task g and tasks a,c,d can be performed together because you have unlimited friends and they all have 0 in-degree. Total time taken to finish a,c,d,g is max(1,1,1,100) = 100

“You have also invited all your friends to help complete your tasks so you can work on any amount of tasks at a given time”

and obviously a,c,d completes before g because they take 1 minute to complete and latter takes 100 minutes (assume minutes for the sake of argument) The constraints are satisfied