Topological sort

Topological sort

posted 2 min read

What is topological sort ?

  • We need to schedule a series of tasks, such as classes or construction jobs, where we cannot start one task until after its prerequisites are completed.
  • We want to organize such tasks into a linear order that allows us to complete it one at a time, without violating any prerequisites.
  • This problem can be modeled with DAG, directed acyclic graph
  • Graph is directed since one task is dependent on another, so that means the relation between tasks is directed.
  • Acyclic - We cannot have cycles in the graph, since that would mean conflicting series of prerequisite tasks that could not be completed without violating any prerequisites.
  • The process of laying out the vertices of a DAG in a linear order to meet the prerequisite rule is called topological sort.

How to do it ?

2 ways to implement a topological sort

  1. BFS approach (Kahn's algorithm)
  2. DFS approach

BFS approach:

  • In BFS approach we visit every node and find in-degree for each node (in-degree -> no of edges/ how many nodes are connected to that node)
  • Finding in-degree in case of DAG is simply loop through all the nodes and increase the in-degree count for each adjacent node by 1.
  • We keep the nodes with in-degree 0 into a queue, and start processing the queue.

DFS approach:

  • In DFS approach we visit each node and its adjacent nodes, we only push the node to a stack once all the dependencies are visited first, so we naturally we are visiting all the dependencies first and then the node itself, so we end up ordered items naturally.

Working code

BFS approach

const graph = {
  7: [11, 8],
  5: [11],
  3: [10, 8],
  11: [2, 9, 10],
  8: [9],
  2: [],
  9: [],
  10: [],
};

function toposortBFS(graph) {
  const order = [];
  const indgree = {};

  /*Set indegrees for all node to 0*/
  for (let node of Object.keys(graph)) {
    indgree[node] = 0;
  }

  /*Increment in-degrees of all adjacent nodes of all nodes in graph -> get in-degree for all the nodes in a graph*/
  for (let node of Object.keys(graph)) {
    for (let adjNode of graph[node]) {
      /*IMP: (?? 0) is necessary in case of connected components in  the graph*/
      indgree[adjNode] = (indgree[adjNode] ?? 0) + 1;
    }
  }

  const queue = [];
  /*Push all nodes with 0 in-degree to a queue*/
  for (let node of Object.keys(indgree)) {
    if (indgree[node] === 0) {
      queue.push(node);
    }
  }

  /*Process queue such that we keep pushing adj node with 0 in-degree,  decrement indegrees of adj node, dequeued elements are in sorted order*/
  while (queue.length > 0) {
    const currentNode = queue.shift();
    order.push(currentNode);

    for (let adjNode of graph[currentNode]) {
      indgree[adjNode] -= 1;
      if (indgree[adjNode] === 0) {
        queue.push(adjNode);
      }
    }
  }

  /*We need to check if all the items are processed from the graph - if not, we have a cycle in the graph - this could be used to detect cycles as well*/
  return order.length === Object.keys(graph).length ? order : [];
}

DFS approach

const graph = {
    7:[11,8],
    5:[11],
    3:[10,8],
    11:[2,9,10],
    8:[9],
    2:[],
    9:[],
    10:[]
};

function toposortDFS (graph) {
    const order = [];
    const visited = {};
    
    for (let node of Object.keys(graph)) {
        visited[node] = 0;
    }
    
    function dfs(node) {
        if(visited[node]){ return; }
        visited[node] = 1;
        for(let adjNode of graph[node]) {
            dfs(adjNode);
        }
        order.push(node);
    }
    
    for(let node of Object.keys(graph)) {
        if(!visited[node]) {
            dfs(node);
        }
    }

    return order.reverse();
}

Applications

  • Representing course pre-requisites
  • Detecting dead locks
  • Pipeline of computing jobs
  • Checking for symbolic link loops
  • Evaluating formulae in spreadsheets
If you read this far, tweet to the author to show them you care. Tweet a Thanks
chinmay awesome breakdown!  Which approach do you think scales better for huge datasets—BFS or DFS?
Depends on the problem we are trying to solve, but in general BFS will scale well since it relies on queue based iteration and process the node in layers and this could be implemented as an parallel operation for really large graphs

One more fun part about BFS is that it will visit each node only once, whereas in DFS it might happen that a node is "revisited" but since we are tracking if a node is visited or not, it does not re-run a recursion again, but there are a bunch of extra function calls that happen...
I believe in-degree refers to the number of incoming edges to a node.

There is error on spelling such as perquisites.

I like this article as it presents possible solutions to a problem.

Thanks for reading my article !

fixed the typo

More Posts

JavaScript to Python Mastery Guide: Syntax, Data Structures, OOP, Modules, Async...

Hossam Gouda - Mar 30

"Dive Deep, Code Smart—Master the Waves of DSA!"

Veeranki Phani Sirisha - Jan 16

Build a Telegram bot with Phi-3 and Qdrant and chat with your PDFs!

Astra Bertelli - May 9, 2024

Python Dictionaries: A Comprehensive Guide

Muhammad Sameer Khan - Mar 24, 2024

Python Lists

Muhammad Sameer Khan - Mar 6, 2024
chevron_left