12 minute read

neural-network-shutterstock.jpg

Weighted Directed Graph

Project as part of the object-oriented course.

The above project deals with the construction of the data structure - an Directed and Weighted graph, using other data structures.

In this project you can find algorithms that deal with solving various problems:

  1. The shortest path between two nodes
  2. Finding the Connected components
  3. Deserialization and Serialization

 

The classes of the project :

  1. DiGraph which extends the abstract class : GraphInterface
Data members:Description
verticesrepresenting by Dictionary
adjacencyrepresenting by Dictionary
edgesrepresenting the edges of the graph - list of tuples (source,destination,weight)
vrepresenting the number of nodes
erepresenting the number of edges
mcrepresenting number of operations performed in the graph
Methods:Description
add_nodeadding a node in the graph
get_nodereturn the node that associated with the initial key
add_edgeadding an edge between two nodes that associated with the initial keys
getEdgereturn the weight of the edge that associated with the initial keys
getEdgereturn the weight of the edge that associated with the initial keys
all_in_edges_of_nodereturn a dictionary that represents all the nodes that pointing of the initial key
all_out_edges_of_nodereturn a dictionary that represents all the nodes that pointed by the initial key
remove_edgeremove the edge between two nodes in the graph
removeNoderemove a node from the graph
graph_transposereturn the directed weighted graph transpose

1.2 DiNode

Data membersDescription
keyrepresent the key of each node
positionrepresent the location of the node (tuple (x,y,z))
inforepresent the info of each node
weightrepresent the weight of each node
Methods:Description
getKeyreturn the key of the node
getInforeturn the info of the node
setInfoset the info of the node
getWeightreturn the weight of the node
setWeightset the weight of the node
getPositionget the position of the node (tuple)
setPositionset the position of the node (tuple)

2 .GraphAlgo which extends the abstract class : GraphInterface

Data membersDescription
graphan object (DiGraph) that represents a graph
Methods:Description
get_graphreturn a graph object
shortest_pathreturn tuple of two objects (length of the path, list of nodes that are in the path)
connected_componentreturn a list that represent the strongly component that associated with the initial key
connected_componentsreturn a list of lists that representing the strongly components in the graph
plot_graphmaking a graphic user interface by using the library mathplotlib
save_to_jsongraph object serialization
load_from_jsongraph object deserialization

 

Visualization of given graphs

image-20210111170240724

Visualization of random graphs

image-20210111170633382<p> </p>

Clone our project: $ git clone https://github.com/d0lev/Weighted-Directed-Graph.git

wiki

Computer Specifications :

CPU : intel(R) core(TM) i7-8550U CPU @ 1.80GHz Ghz 1.99

RAM : 16.0 GB

Operation system : Windows 10 x64

For more information, click on the following buttons:

|wiki | wiki| | – | – |

Java and Python have many similarities.
Both languages have strong cross-platform support and extensive standard libraries.
They both treat (nearly) everything as objects.
Both languages compile to bytecode, but Python is (usually) compiled at runtime.
In this project we compared algorithms running time between Python, Java and NetworkX.
We measured time with Analysis.py class we created for loading 6 different graphs and running the following methods in each one of the graphs:

  • load()
  • shortest_path()
  • connected_component()
  • connected_components()

We measured the time for java algorithms by implementing a new method to DWGraph_Algo.java (this file is included in data folder).
And here are our results:

 

 

GraphMethodPythonJavanetworkX
G_10_80_1.jsonshortest path0.00.0040.0
G_10_80_1.jsonstrongly connected component0.00099754333496093750.004-
G_10_80_1.jsonstrongly connected components0.00099658966064453120.0010.0009982585906982422
G_10_80_1.jsonLoad0.00099658966064453120.2520.0009970664978027344
G_100_800_1.jsonshortest path0.0049848556518554690.0080.0
G_100_800_1.jsonstrongly connected component0.0099754333496093750.004-
G_100_800_1.jsonstrongly connected components0.0079867839813232420.0050.0019931793212890625
G_100_800_1.jsonLoad0.0059852600097656250.0880.005988597869873047
G_1000_8000_1.jsonshortest path0.0457253456115722660.4560.008982419967651367
G_1000_8000_1.jsonstrongly connected component0.078126192092895510.399-
G_1000_8000_1.jsonstrongly connected components0.09275817871093750.1440.02414393424987793
G_1000_8000_1.jsonLoad0.042886257171630860.4210.06652331352233887
G_10000_80000_1.jsonshortest path0.471687316894531259.0310.04133033752441406
G_10000_80000_1.jsonstrongly connected component0.73714160919189454.043-
G_10000_80000_1.jsonstrongly connected components0.67453098297119144.4460.10427618026733398
G_10000_80000_1.jsonLoad0.50500130653381358.1230.34337925910949707
G_20000_160000_1.jsonshortest path0.686671018600463919.780.17187857627868652
G_20000_160000_1.jsonstrongly connected component1.2870297431945818.902-
G_20000_160000_1.jsonstrongly connected components1.244878530502319318.460.37194395065307617
G_20000_160000_1.jsonLoad0.704751014709472716.9960.9114229679107666
G_30000_240000_1.jsonshortest path1.065691232681274454.6570.49542903900146484
G_30000_240000_1.jsonstrongly connected component1.81823277473449753.72-
G_30000_240000_1.jsonstrongly connected components2.052702188491821342.9550.4524245262145996
G_30000_240000_1.jsonLoad1.06213164329528833.1121.1488621234893799

GraphAlgo.py offers methods and algorithms for a weighted directed graph in Python3.

Algorithms:

Algorithm Details Time Complexity
dijkstra() This method implements the Dijkstra algorithm .
and also keep on each node the shortest path from the source node.
These nodes enters a PriorityQueue() and the nodes that poll from the queue will be the nodes with the shortest distance priority
[ a tuple(weight, node) ]and also they will be marked as “visited” .
Variant of Queue that retrieves open entries in priority order (lowest first).
Entries are typically tuples of the form: ( priority number (weight / distance) , data (node) ).
it follows that the destination node will keep the shortest distance from the source node. dijkstra() is using the graph.all_in_edges_of_node() instead of transposing the graph.
img
kosaraju() This method implements the ‘Kosaraju algorithm’.
The Python interpreter limits the depths of recursion to help you avoid infinite recursions,
resulting in stack overflows.This limit prevents infinite recursion from causing an overflow of
the C stack and crashing Python.
That’s why instead of using sys.setrecursionlimit (which isn’t permissioned in this task)
we implemented this method iterative way by using stacks ds to act like a recursion.
The first DFS dfs_inner() is to find all the vertices u that are reachable from vertex v.
The second DFS dfs_reverse() is to check the reverse, i.e if all u can reach v.
The reverse check on the second DFS is made by transposing and getting the graph via graph_transpose().
Instead of testing each vertex u ( which are reachable from v) and can reach v back,
the second DFS on the transpose equivalently tests, if v can reach all u and in the end returning the
specific strongly component.
img
dfs_inner() This method is getting each unvisited vertex from kosaraju() first loop of vertices and fill the main stack of
the components traversal order by using stack_like_rec to act like a recursion (because of the restrictions
detailed in kosaraju() doc). This method is implementing the DFS algorithm iterative way. This is similar to BFS,
the only difference is queue is replaced by stack. Created a stack_dfs of nodes and visited array -> insert
the vertex in the stack_dfs -> run a while-loop till the stack_dfs is not empty -> pop the element from
the stack_dfs -> for every neighbour and unvisited node of current node, mark visited the node and
insert it in the stack_dfs -> insert it to stack_like_rec -> in the end pop all stack_like_rec to the main
stack (like a recursion).
img
dfs_reverse() This method is also using the DFS algorithm (iterative) but this time, it will traverse the transposed graph,
Every call to this method is given with a new empty component that will be filled with the nodes which from
the given vertex to its SSC.
img

Methods:

Method Details
get_graph() return: the underlying graph of which this class works.
save_to_json() A method that performs graph object serialization. it encode it to a JSON file, and save it in the given path.
load_from_json() Loads a graph from a json file. decode it from a JSON file, by loading it from the given path.
shortest_path() return: a tuple contains the distance (float) between source to destination and a list of the shortest path from node source to node destination using Dijkstra’s Algorithm
connected_component() Finds the Strongly Connected Component(SCC) that node key is a part of. this method is using the connected_components() to get and return the specific SCC.
connected_components() Finds all the Strongly Connected Component(SCC) in the graph. this method is using kosaraju() - more details check its docs.
plot_graph() Plots the graph.
If the nodes have a position, the nodes will be placed there. otherwise, they will be placed in a random but elegant manner.
This method is using matplotlib.pyplot which is a state-based interface to matplotlib. it provides a MATLAB-like way of plotting.
pyplot is mainly intended for interactive plots and simple cases of programmatic plot generation.
  • graph_transpose() in Di_Graph.py return the directed weighted graph transpose, it used especially for the kosaraju() method.

Algorithms

import random
from typing import List
from src import GraphInterface
from src.GraphAlgoInterface import GraphAlgoInterface
from src.DiGraph import DiGraph
import json
import matplotlib.pyplot as plt
from queue import *
import sys
import math


class GraphAlgo(GraphAlgoInterface):

    def __init__(self, g: DiGraph = None):
        """
        Init the graph on which this set of algorithms operates on.
        :param g: a directed graph
        """
        self.graph = g

    def get_graph(self) -> GraphInterface:
        """
        :return: the underlying graph of which this class works.
        """
        return self.graph

    def save_to_json(self, file_name: str) -> bool:
        """
        A method that performs graph object serialization.
        It serialize it to a JSON file, and save it in the given path.
        :param file_name: The path to the out file
        :return: True if the save was successful, False o.w.
        """
        graph_json = {"Nodes": [], "Edges": []}

        try:
            with open(file_name, mode='w') as my_file:
                for vertex in self.graph.get_all_v():
                    v = self.graph.get_node(vertex)
                    if v.getPosition() is None:
                        id = v.getKey()
                        vertex_dict = {"id": id}
                        graph_json["Nodes"].append(vertex_dict)
                    else:
                        pos = str(v.getPosition()[0]) + "," + str(v.getPosition()[1]) + "," + str(v.getPosition()[2])
                        id = v.getKey()
                        vertex_dict = {'pos': pos, "id": id}
                        graph_json["Nodes"].append(vertex_dict)

                for edge in self.graph.edges:
                    src = edge[0]
                    dest = edge[1]
                    weight = edge[2]
                    edge_dict = {"src": src, "dest": dest, "w": weight}
                    graph_json["Edges"].append(edge_dict)

                graph_json_str = json.dumps(graph_json)
                my_file.write(graph_json_str)
                return True

        except IOError:
            return False

    def load_from_json(self, file_name: str) -> bool:
        """
        Loads a graph from a json file.
        It deserialize it from a JSON file, by loading it from the given path.
        :param file_name: The path of the file
        :return: True if the loading was successful, False o.w.
        """
        graph_dis = DiGraph()

        try:
            with open(file_name, mode='r') as my_file:
                json_str = my_file.read()
                graph_from_json = json.loads(json_str)

            for vertex in graph_from_json['Nodes']:
                pos = vertex.get('pos')
                if pos is not None:
                    pos = tuple(map(float, vertex['pos'].split(',')))
                key = vertex['id']
                graph_dis.add_node(key, pos)

            for edge in graph_from_json['Edges']:
                source = int(edge['src'])
                destination = int(edge['dest'])
                weight = float(edge['w'])
                graph_dis.add_edge(source, destination, weight)

            self.graph = graph_dis
        except IOError:
            return False

        if self.graph is not None:
            return True

        return False

    def shortest_path(self, source: int, destination: int) -> (float, list):
        """
        :param source: The start node id
        :param destination: The end node id
        :return: a tuple contains the distance (float) between source to destination
                 and a list of the shortest path from node source to node destination using Dijkstra's Algorithm
        """
        tuple_path = self.dijkstra(source, destination)
        if tuple_path is None:
            return 'inf', None
        return tuple_path

    def dijkstra(self, source: int, destination: int) -> (float, list):
        """
        This method implements the Dijkstra algorithm . and also keep on each node the shortest path from the source
        node. These nodes enters a PriorityQueue() and the nodes that poll from the queue will be the nodes with the
        shortest distance priority [ a tuple(weight, node) ] and also they will be marked as "visited" . Variant of
        Queue that retrieves open entries in priority order (lowest first). Entries are typically tuples of the form:
        ( priority number (weight / distance) , data (node) ). it follows that the destination node will keep the
        shortest distance from the source node. dijkstra() is using the 'graph.all_in_edges_of_node()' instead of
        transposing the graph.
        :param source: the source of this path.
        :param destination: the destination of this path.
        :return:a tuple contains the distance (float) between source to destination and a list of the shortest
        path from node source to node destination.
        """
        if (source in self.graph.vertices and destination in self.graph.vertices
                and source != destination):
            self.get_graph().Reset()
            pqueue = PriorityQueue()
            src = self.graph.get_node(source)
            src.setWeight(0)
            pqueue.put((0, src))
            while not pqueue.empty():
                vertx = pqueue.get()[1]
                vertx.setInfo("visited")
                for key, weight in self.graph.all_out_edges_of_node(vertx.key).items():
                    neighbour = self.graph.get_node(key)
                    if neighbour.getInfo() == "unvisited":
                        temp_weight = vertx.weight + weight
                        if temp_weight < neighbour.weight:
                            pqueue.put((temp_weight, neighbour))
                            neighbour.setWeight(temp_weight)

            squeue = Queue()
            path = []
            current = self.graph.get_node(destination)
            if current.getWeight() == sys.maxsize:
                return None
            else:
                path.append(current)
                while current is not src:
                    for key, weight in self.graph.all_in_edges_of_node(current.key).items():
                        neighbour = self.graph.get_node(key)
                        if math.isclose(current.weight - weight, neighbour.weight, rel_tol=1e-5):
                            path.append(neighbour)
                            squeue.put(neighbour)

                    current = squeue.get()

            dest = self.graph.get_node(destination).getWeight()
            path.reverse()

            return dest, path
        return None

    def connected_component(self, key: int) -> list:
        """
        Finds the Strongly Connected Component(SCC) that node key is a part of.
        This method is using the 'connected_components()' to get and return the specific SCC.
        :param key: The node id
        :return: The list of nodes in the SCC
        """
        if key in self.graph.vertices.keys():
            list_components = self.kosaraju()
            for component in list_components:
                if key in component:
                    return component

    def connected_components(self) -> List[list]:
        """
        Finds all the Strongly Connected Component(SCC) in the graph.
        This method is using 'kosaraju()' - more details check its docs.
        :return: The list of all SCC
        """
        if len(self.graph.vertices) > 0:
            return self.kosaraju()

    def kosaraju(self):
        """
        This method implements the 'Kosaraju algorithm'.
        The Python interpreter limits the depths of recursion to help you avoid infinite recursions,
        resulting in stack overflows.This limit prevents infinite recursion from causing an overflow of
        the C stack and crashing Python.
        That's why instead of using 'sys.setrecursionlimit' (which isn't permissioned in this task)
        we implemented this method iterative way by using stacks ds to act like a recursion.
        The first DFS 'dfs_inner()' is to find all the vertices u that are reachable from vertex v.
        The second DFS 'dfs_reverse()' is to check the reverse, i.e if all u can reach v.
        The reverse check on the second DFS is made by transposing and getting the graph via 'graph_transpose()'.
        Instead of testing each vertex u ( which are reachable from v) and can reach v back,<br>
        the second DFS on the transpose equivalently tests, if v can reach all u and in the end returning the
        specific strongly component.
        :return: The list of all SCC
        """
        self.graph.Reset()
        stack = LifoQueue()
        for key in self.graph.vertices:
            vertex = self.graph.get_node(key)
            if vertex.getInfo() == "unvisited":
                self.dfs_inner(vertex, stack)

        components = []
        graph_transpose = self.graph.graph_transpose()
        while not stack.empty():
            vertex = graph_transpose.get_node(stack.get().key)
            if vertex.getInfo() == "unvisited":
                component = []
                components.append(component)
                self.dfs_reverse(vertex, component, graph_transpose)

        return components

    def dfs_inner(self, vertex, stack):
        """
        This method is getting each unvisited vertex from kosaraju() first loop of vertices and fill the main 'stack' of
        the components traversal order by using 'stack_like_rec' to act like a recursion (because of the restrictions
        detailed in kosaraju() doc). This method is implementing the DFS algorithm iterative way. This is similar to BFS,
        the only difference is queue is replaced by stack. Created a stack_dfs of nodes and visited array -> insert
        the 'vertex' in the stack_dfs -> -> run a while-loop till the stack_dfs is not empty -> pop the element from
        the stack_dfs -> -> for every neighbour and unvisited node of current node, mark 'visited' the node and
        insert it in the stack_dfs -> -> insert it to stack_like_rec -> in the end pop all stack_like_rec to the main
        'stack' (like a recursion).
        :param vertex: a given node for this component.
        :param stack: the main 'stack' of the components traversal order.
        """
        stack_like_rec = LifoQueue()
        stack_dfs = LifoQueue()
        stack_like_rec.put(vertex)
        stack_dfs.put(vertex)
        while not stack_dfs.empty():
            current = stack_dfs.get()
            current.setInfo("visited")
            for neighbour in self.graph.all_out_edges_of_node(current.key).keys():
                w = self.graph.get_node(neighbour)
                if w.getInfo() == "unvisited":
                    w.setInfo("visited")
                    stack_dfs.put(w)
                    stack_like_rec.put(w)

        while not stack_like_rec.empty():
            stack.put(stack_like_rec.get())

    @staticmethod
    def dfs_reverse(vertex, component, graph_t):
        """
        This method is also using the DFS algorithm (iterative) but this time, it will traverse the transposed graph,
        Every call to this method is given with a new empty component that will be filled with the nodes which from
        the given vertex to its SSC.
        :param vertex: a node (DiNode).
        :param component: a new list to be filled with nodes which from vertex SSC.
        :param graph_t: the transposed graph.
        """
        stack_dfs = LifoQueue()
        stack_dfs.put(vertex)
        vertex.setInfo("visited")
        while not stack_dfs.empty():
            current = stack_dfs.get()
            component.append(current.key)
            for neighbour, weight in graph_t.all_out_edges_of_node(current.key).items():
                v = graph_t.get_node(neighbour)
                if v.getInfo() == "unvisited":
                    v.setInfo("visited")
                    stack_dfs.put(v)

    def plot_graph(self) -> None:
        """
        Plots the graph.
        If the nodes have a position, the nodes will be placed there.
        Otherwise, they will be placed in a random but elegant manner.
        This method is using `matplotlib.pyplot` which is a state-based interface to matplotlib.
        It provides a MATLAB-like way of plotting.
        pyplot is mainly intended for interactive plots and simple cases of
        programmatic plot generation.
        """
        plt.grid(color='grey', linestyle=':', linewidth=0.5)
        for edge in self.get_graph().edges:
            source = self.get_graph().get_node(edge[0])
            destination = self.get_graph().get_node(edge[1])

            if source.getPosition() is None:
                source.setPosition(random.uniform(0, 40), random.uniform(0, 40), 0)
            if destination.getPosition() is None:
                destination.setPosition(random.uniform(0, 40), random.uniform(0, 40), 0)

            x_list = [source.getPosition()[0], destination.getPosition()[0]]
            y_list = [source.getPosition()[1], destination.getPosition()[1]]
            plt.plot(x_list, y_list, color="purple")

        for key, vertex in self.get_graph().get_all_v().items():
            plt.annotate(str(key), (vertex.getPosition()[0] - 0.0002, vertex.getPosition()[1] + 0.00013), color='green')
            plt.plot(vertex.getPosition()[0], vertex.getPosition()[1], ".", color='black', markersize=14)

        plt.title("Weighted Directed Graph Visualization")
        plt.xlabel("The x axis")
        plt.ylabel("The y axis")
        plt.show()

Leave a comment