Weighted Directed Graph

December 14, 2021 12 minute read

Weighted Directed Graph

Project as part of the object-oriented course.

The above project deals with the construction of the data structure - an Directed and Weighted graph, using other data structures.

In this project you can find algorithms that deal with solving various problems:

The shortest path between two nodes
Finding the Connected components
Deserialization and Serialization

The classes of the project :

DiGraph which extends the abstract class : GraphInterface

Data members:	Description
vertices	representing by Dictionary
adjacency	representing by Dictionary
edges	representing the edges of the graph - list of tuples (source,destination,weight)
v	representing the number of nodes
e	representing the number of edges
mc	representing number of operations performed in the graph

Methods:	Description
add_node	adding a node in the graph
get_node	return the node that associated with the initial key
add_edge	adding an edge between two nodes that associated with the initial keys
getEdge	return the weight of the edge that associated with the initial keys
getEdge	return the weight of the edge that associated with the initial keys
all_in_edges_of_node	return a dictionary that represents all the nodes that pointing of the initial key
all_out_edges_of_node	return a dictionary that represents all the nodes that pointed by the initial key
remove_edge	remove the edge between two nodes in the graph
removeNode	remove a node from the graph
graph_transpose	return the directed weighted graph transpose

1.2 DiNode

Data members	Description
key	represent the key of each node
position	represent the location of the node (tuple (x,y,z))
info	represent the info of each node
weight	represent the weight of each node

Methods:	Description
getKey	return the key of the node
getInfo	return the info of the node
setInfo	set the info of the node
getWeight	return the weight of the node
setWeight	set the weight of the node
getPosition	get the position of the node (tuple)
setPosition	set the position of the node (tuple)

2 .GraphAlgo which extends the abstract class : GraphInterface

Data members	Description
graph	an object (DiGraph) that represents a graph

Methods:	Description
get_graph	return a graph object
shortest_path	return tuple of two objects (length of the path, list of nodes that are in the path)
connected_component	return a list that represent the strongly component that associated with the initial key
connected_components	return a list of lists that representing the strongly components in the graph
plot_graph	making a graphic user interface by using the library mathplotlib
save_to_json	graph object serialization
load_from_json	graph object deserialization

Visualization of given graphs

Visualization of random graphs

Clone our project: $ git clone https://github.com/d0lev/Weighted-Directed-Graph.git

Computer Specifications :

CPU : intel(R) core(TM) i7-8550U CPU @ 1.80GHz Ghz 1.99

RAM : 16.0 GB

Operation system : Windows 10 x64

For more information, click on the following buttons:

| | | | – | – |

Java and Python have many similarities.
Both languages have strong cross-platform support and extensive standard libraries.
They both treat (nearly) everything as objects.
Both languages compile to bytecode, but Python is (usually) compiled at runtime.
In this project we compared algorithms running time between Python, Java and NetworkX.
We measured time with Analysis.py class we created for loading 6 different graphs and running the following methods in each one of the graphs:

load()
shortest_path()
connected_component()
connected_components()

We measured the time for java algorithms by implementing a new method to DWGraph_Algo.java (this file is included in data folder).
And here are our results:

Graph	Method	Python	Java	networkX
G_10_80_1.json	shortest path	0.0	0.004	0.0
G_10_80_1.json	strongly connected component	0.0009975433349609375	0.004	-
G_10_80_1.json	strongly connected components	0.0009965896606445312	0.001	0.0009982585906982422
G_10_80_1.json	Load	0.0009965896606445312	0.252	0.0009970664978027344
G_100_800_1.json	shortest path	0.004984855651855469	0.008	0.0
G_100_800_1.json	strongly connected component	0.009975433349609375	0.004	-
G_100_800_1.json	strongly connected components	0.007986783981323242	0.005	0.0019931793212890625
G_100_800_1.json	Load	0.005985260009765625	0.088	0.005988597869873047
G_1000_8000_1.json	shortest path	0.045725345611572266	0.456	0.008982419967651367
G_1000_8000_1.json	strongly connected component	0.07812619209289551	0.399	-
G_1000_8000_1.json	strongly connected components	0.0927581787109375	0.144	0.02414393424987793
G_1000_8000_1.json	Load	0.04288625717163086	0.421	0.06652331352233887
G_10000_80000_1.json	shortest path	0.47168731689453125	9.031	0.04133033752441406
G_10000_80000_1.json	strongly connected component	0.7371416091918945	4.043	-
G_10000_80000_1.json	strongly connected components	0.6745309829711914	4.446	0.10427618026733398
G_10000_80000_1.json	Load	0.5050013065338135	8.123	0.34337925910949707
G_20000_160000_1.json	shortest path	0.6866710186004639	19.78	0.17187857627868652
G_20000_160000_1.json	strongly connected component	1.28702974319458	18.902	-
G_20000_160000_1.json	strongly connected components	1.2448785305023193	18.46	0.37194395065307617
G_20000_160000_1.json	Load	0.7047510147094727	16.996	0.9114229679107666
G_30000_240000_1.json	shortest path	1.0656912326812744	54.657	0.49542903900146484
G_30000_240000_1.json	strongly connected component	1.818232774734497	53.72	-
G_30000_240000_1.json	strongly connected components	2.0527021884918213	42.955	0.4524245262145996
G_30000_240000_1.json	Load	1.062131643295288	33.112	1.1488621234893799

GraphAlgo.py offers methods and algorithms for a weighted directed graph in Python3.

Algorithms:

Algorithm	Details	Time Complexity
`dijkstra()`	This method implements the Dijkstra algorithm . and also keep on each node the shortest path from the source node. These nodes enters a `PriorityQueue()` and the nodes that poll from the queue will be the nodes with the shortest distance priority [ a tuple(weight, node) ]and also they will be marked as “visited” . Variant of Queue that retrieves open entries in priority order (lowest first). Entries are typically tuples of the form: ( priority number (weight / distance) , data (node) ). it follows that the destination node will keep the shortest distance from the source node. dijkstra() is using the `graph.all_in_edges_of_node()` instead of transposing the graph.
`kosaraju()`	This method implements the ‘Kosaraju algorithm’. The Python interpreter limits the depths of recursion to help you avoid infinite recursions, resulting in stack overflows.This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python. That’s why instead of using `sys.setrecursionlimit` (which isn’t permissioned in this task) we implemented this method iterative way by using stacks ds to act like a recursion. The first DFS `dfs_inner()` is to find all the vertices u that are reachable from vertex v. The second DFS `dfs_reverse()` is to check the reverse, i.e if all u can reach v. The reverse check on the second DFS is made by transposing and getting the graph via `graph_transpose()`. Instead of testing each vertex u ( which are reachable from v) and can reach v back, the second DFS on the transpose equivalently tests, if v can reach all u and in the end returning the specific strongly component.
`dfs_inner()`	This method is getting each unvisited vertex from `kosaraju()` first loop of vertices and fill the main `stack` of the components traversal order by using `stack_like_rec` to act like a recursion (because of the restrictions detailed in `kosaraju()` doc). This method is implementing the DFS algorithm iterative way. This is similar to BFS, the only difference is queue is replaced by stack. Created a `stack_dfs` of nodes and visited array -> insert the `vertex` in the `stack_dfs` -> run a while-loop till the `stack_dfs` is not empty -> pop the element from the `stack_dfs` -> for every neighbour and unvisited node of current node, mark `visited` the node and insert it in the `stack_dfs` -> insert it to `stack_like_rec` -> in the end pop all `stack_like_rec` to the main `stack` (like a recursion).
`dfs_reverse()`	This method is also using the DFS algorithm (iterative) but this time, it will traverse the transposed graph, Every call to this method is given with a new empty component that will be filled with the nodes which from the given vertex to its SSC.

Methods:

Method	Details
`get_graph()`	return: the underlying graph of which this class works.
`save_to_json()`	A method that performs graph object serialization. it encode it to a JSON file, and save it in the given path.
`load_from_json()`	Loads a graph from a json file. decode it from a JSON file, by loading it from the given path.
`shortest_path()`	return: a tuple contains the distance (float) between source to destination and a list of the shortest path from node source to node destination using Dijkstra’s Algorithm
`connected_component()`	Finds the Strongly Connected Component(SCC) that node key is a part of. this method is using the `connected_components()` to get and return the specific SCC.
`connected_components()`	Finds all the Strongly Connected Component(SCC) in the graph. this method is using `kosaraju()` - more details check its docs.
`plot_graph()`	Plots the graph. If the nodes have a position, the nodes will be placed there. otherwise, they will be placed in a random but elegant manner. This method is using `matplotlib.pyplot` which is a state-based interface to matplotlib. it provides a MATLAB-like way of plotting. pyplot is mainly intended for interactive plots and simple cases of programmatic plot generation.

graph_transpose() in Di_Graph.py return the directed weighted graph transpose, it used especially for the kosaraju() method.

Algorithms

import random
from typing import List
from src import GraphInterface
from src.GraphAlgoInterface import GraphAlgoInterface
from src.DiGraph import DiGraph
import json
import matplotlib.pyplot as plt
from queue import *
import sys
import math


class GraphAlgo(GraphAlgoInterface):

    def __init__(self, g: DiGraph = None):
        """
        Init the graph on which this set of algorithms operates on.
        :param g: a directed graph
        """
        self.graph = g

    def get_graph(self) -> GraphInterface:
        """
        :return: the underlying graph of which this class works.
        """
        return self.graph

    def save_to_json(self, file_name: str) -> bool:
        """
        A method that performs graph object serialization.
        It serialize it to a JSON file, and save it in the given path.
        :param file_name: The path to the out file
        :return: True if the save was successful, False o.w.
        """
        graph_json = {"Nodes": [], "Edges": []}

        try:
            with open(file_name, mode='w') as my_file:
                for vertex in self.graph.get_all_v():
                    v = self.graph.get_node(vertex)
                    if v.getPosition() is None:
                        id = v.getKey()
                        vertex_dict = {"id": id}
                        graph_json["Nodes"].append(vertex_dict)
                    else:
                        pos = str(v.getPosition()[0]) + "," + str(v.getPosition()[1]) + "," + str(v.getPosition()[2])
                        id = v.getKey()
                        vertex_dict = {'pos': pos, "id": id}
                        graph_json["Nodes"].append(vertex_dict)

                for edge in self.graph.edges:
                    src = edge[0]
                    dest = edge[1]
                    weight = edge[2]
                    edge_dict = {"src": src, "dest": dest, "w": weight}
                    graph_json["Edges"].append(edge_dict)

                graph_json_str = json.dumps(graph_json)
                my_file.write(graph_json_str)
                return True

        except IOError:
            return False

    def load_from_json(self, file_name: str) -> bool:
        """
        Loads a graph from a json file.
        It deserialize it from a JSON file, by loading it from the given path.
        :param file_name: The path of the file
        :return: True if the loading was successful, False o.w.
        """
        graph_dis = DiGraph()

        try:
            with open(file_name, mode='r') as my_file:
                json_str = my_file.read()
                graph_from_json = json.loads(json_str)

            for vertex in graph_from_json['Nodes']:
                pos = vertex.get('pos')
                if pos is not None:
                    pos = tuple(map(float, vertex['pos'].split(',')))
                key = vertex['id']
                graph_dis.add_node(key, pos)

            for edge in graph_from_json['Edges']:
                source = int(edge['src'])
                destination = int(edge['dest'])
                weight = float(edge['w'])
                graph_dis.add_edge(source, destination, weight)

            self.graph = graph_dis
        except IOError:
            return False

        if self.graph is not None:
            return True

        return False

    def shortest_path(self, source: int, destination: int) -> (float, list):
        """
        :param source: The start node id
        :param destination: The end node id
        :return: a tuple contains the distance (float) between source to destination
                 and a list of the shortest path from node source to node destination using Dijkstra's Algorithm
        """
        tuple_path = self.dijkstra(source, destination)
        if tuple_path is None:
            return 'inf', None
        return tuple_path

    def dijkstra(self, source: int, destination: int) -> (float, list):
        """
        This method implements the Dijkstra algorithm . and also keep on each node the shortest path from the source
        node. These nodes enters a PriorityQueue() and the nodes that poll from the queue will be the nodes with the
        shortest distance priority [ a tuple(weight, node) ] and also they will be marked as "visited" . Variant of
        Queue that retrieves open entries in priority order (lowest first). Entries are typically tuples of the form:
        ( priority number (weight / distance) , data (node) ). it follows that the destination node will keep the
        shortest distance from the source node. dijkstra() is using the 'graph.all_in_edges_of_node()' instead of
        transposing the graph.
        :param source: the source of this path.
        :param destination: the destination of this path.
        :return:a tuple contains the distance (float) between source to destination and a list of the shortest
        path from node source to node destination.
        """
        if (source in self.graph.vertices and destination in self.graph.vertices
                and source != destination):
            self.get_graph().Reset()
            pqueue = PriorityQueue()
            src = self.graph.get_node(source)
            src.setWeight(0)
            pqueue.put((0, src))
            while not pqueue.empty():
                vertx = pqueue.get()[1]
                vertx.setInfo("visited")
                for key, weight in self.graph.all_out_edges_of_node(vertx.key).items():
                    neighbour = self.graph.get_node(key)
                    if neighbour.getInfo() == "unvisited":
                        temp_weight = vertx.weight + weight
                        if temp_weight < neighbour.weight:
                            pqueue.put((temp_weight, neighbour))
                            neighbour.setWeight(temp_weight)

            squeue = Queue()
            path = []
            current = self.graph.get_node(destination)
            if current.getWeight() == sys.maxsize:
                return None
            else:
                path.append(current)
                while current is not src:
                    for key, weight in self.graph.all_in_edges_of_node(current.key).items():
                        neighbour = self.graph.get_node(key)
                        if math.isclose(current.weight - weight, neighbour.weight, rel_tol=1e-5):
                            path.append(neighbour)
                            squeue.put(neighbour)

                    current = squeue.get()

            dest = self.graph.get_node(destination).getWeight()
            path.reverse()

            return dest, path
        return None

    def connected_component(self, key: int) -> list:
        """
        Finds the Strongly Connected Component(SCC) that node key is a part of.
        This method is using the 'connected_components()' to get and return the specific SCC.
        :param key: The node id
        :return: The list of nodes in the SCC
        """
        if key in self.graph.vertices.keys():
            list_components = self.kosaraju()
            for component in list_components:
                if key in component:
                    return component

    def connected_components(self) -> List[list]:
        """
        Finds all the Strongly Connected Component(SCC) in the graph.
        This method is using 'kosaraju()' - more details check its docs.
        :return: The list of all SCC
        """
        if len(self.graph.vertices) > 0:
            return self.kosaraju()

    def kosaraju(self):
        """
        This method implements the 'Kosaraju algorithm'.
        The Python interpreter limits the depths of recursion to help you avoid infinite recursions,
        resulting in stack overflows.This limit prevents infinite recursion from causing an overflow of
        the C stack and crashing Python.
        That's why instead of using 'sys.setrecursionlimit' (which isn't permissioned in this task)
        we implemented this method iterative way by using stacks ds to act like a recursion.
        The first DFS 'dfs_inner()' is to find all the vertices u that are reachable from vertex v.
        The second DFS 'dfs_reverse()' is to check the reverse, i.e if all u can reach v.
        The reverse check on the second DFS is made by transposing and getting the graph via 'graph_transpose()'.
        Instead of testing each vertex u ( which are reachable from v) and can reach v back,<br>
        the second DFS on the transpose equivalently tests, if v can reach all u and in the end returning the
        specific strongly component.
        :return: The list of all SCC
        """
        self.graph.Reset()
        stack = LifoQueue()
        for key in self.graph.vertices:
            vertex = self.graph.get_node(key)
            if vertex.getInfo() == "unvisited":
                self.dfs_inner(vertex, stack)

        components = []
        graph_transpose = self.graph.graph_transpose()
        while not stack.empty():
            vertex = graph_transpose.get_node(stack.get().key)
            if vertex.getInfo() == "unvisited":
                component = []
                components.append(component)
                self.dfs_reverse(vertex, component, graph_transpose)

        return components

    def dfs_inner(self, vertex, stack):
        """
        This method is getting each unvisited vertex from kosaraju() first loop of vertices and fill the main 'stack' of
        the components traversal order by using 'stack_like_rec' to act like a recursion (because of the restrictions
        detailed in kosaraju() doc). This method is implementing the DFS algorithm iterative way. This is similar to BFS,
        the only difference is queue is replaced by stack. Created a stack_dfs of nodes and visited array -> insert
        the 'vertex' in the stack_dfs -> -> run a while-loop till the stack_dfs is not empty -> pop the element from
        the stack_dfs -> -> for every neighbour and unvisited node of current node, mark 'visited' the node and
        insert it in the stack_dfs -> -> insert it to stack_like_rec -> in the end pop all stack_like_rec to the main
        'stack' (like a recursion).
        :param vertex: a given node for this component.
        :param stack: the main 'stack' of the components traversal order.
        """
        stack_like_rec = LifoQueue()
        stack_dfs = LifoQueue()
        stack_like_rec.put(vertex)
        stack_dfs.put(vertex)
        while not stack_dfs.empty():
            current = stack_dfs.get()
            current.setInfo("visited")
            for neighbour in self.graph.all_out_edges_of_node(current.key).keys():
                w = self.graph.get_node(neighbour)
                if w.getInfo() == "unvisited":
                    w.setInfo("visited")
                    stack_dfs.put(w)
                    stack_like_rec.put(w)

        while not stack_like_rec.empty():
            stack.put(stack_like_rec.get())

    @staticmethod
    def dfs_reverse(vertex, component, graph_t):
        """
        This method is also using the DFS algorithm (iterative) but this time, it will traverse the transposed graph,
        Every call to this method is given with a new empty component that will be filled with the nodes which from
        the given vertex to its SSC.
        :param vertex: a node (DiNode).
        :param component: a new list to be filled with nodes which from vertex SSC.
        :param graph_t: the transposed graph.
        """
        stack_dfs = LifoQueue()
        stack_dfs.put(vertex)
        vertex.setInfo("visited")
        while not stack_dfs.empty():
            current = stack_dfs.get()
            component.append(current.key)
            for neighbour, weight in graph_t.all_out_edges_of_node(current.key).items():
                v = graph_t.get_node(neighbour)
                if v.getInfo() == "unvisited":
                    v.setInfo("visited")
                    stack_dfs.put(v)

    def plot_graph(self) -> None:
        """
        Plots the graph.
        If the nodes have a position, the nodes will be placed there.
        Otherwise, they will be placed in a random but elegant manner.
        This method is using `matplotlib.pyplot` which is a state-based interface to matplotlib.
        It provides a MATLAB-like way of plotting.
        pyplot is mainly intended for interactive plots and simple cases of
        programmatic plot generation.
        """
        plt.grid(color='grey', linestyle=':', linewidth=0.5)
        for edge in self.get_graph().edges:
            source = self.get_graph().get_node(edge[0])
            destination = self.get_graph().get_node(edge[1])

            if source.getPosition() is None:
                source.setPosition(random.uniform(0, 40), random.uniform(0, 40), 0)
            if destination.getPosition() is None:
                destination.setPosition(random.uniform(0, 40), random.uniform(0, 40), 0)

            x_list = [source.getPosition()[0], destination.getPosition()[0]]
            y_list = [source.getPosition()[1], destination.getPosition()[1]]
            plt.plot(x_list, y_list, color="purple")

        for key, vertex in self.get_graph().get_all_v().items():
            plt.annotate(str(key), (vertex.getPosition()[0] - 0.0002, vertex.getPosition()[1] + 0.00013), color='green')
            plt.plot(vertex.getPosition()[0], vertex.getPosition()[1], ".", color='black', markersize=14)

        plt.title("Weighted Directed Graph Visualization")
        plt.xlabel("The x axis")
        plt.ylabel("The y axis")
        plt.show()

Share on

Twitter Facebook LinkedIn

Dor Azaria

Weighted Directed Graph

Weighted Directed Graph

Visualization of given graphs

Visualization of random graphs

Computer Specifications :

Algorithms:

Methods:

Algorithms

Share on

Leave a comment

You may also enjoy

Spotify Popularity Prediction - Machine Learning

Decision Tree Algorithm

Pandas Tutorial - Python

Dynamic Huffman Coding