lrpopeyou/Evaluation_Framework_For_Graph_Embedding

 
 

GEVI : Graph Embedding Visual Inspector

Deep Neural Networks Based Approaches for Graph Embeddings

By: Firas BEN HASSAN

Supervisors:
Mr. Jörg Schlötterer
Prof. Ferdaous CHAABENE
Dr. Harald Kosch


1- Abstract

Graphs, such as social networks, word co-occurrence networks, and communication networks, occur naturally in various real-world applications. Analyzing them yields insight into the structure of society, language, and different patterns of communication. Many approaches have been proposed to perform the analysis. Recently, methods which use the representation of graph nodes in vector space have gained traction from the research community.

2- Tasks

Node Classification :

Often in networks, a fraction of nodes are labeled. In social networks, labels may indicate interests, beliefs, or demographics. In language networks, a document may be labeled with topics or keywords, whereas the labels of entities in biology networks may be based on functionality. Due to various factors, labels may be unknown for large fractions of nodes. For example, in social networks many users do not provide their demographic information due to privacy concerns. Missing labels can be inferred using the labeled nodes and the links in the network. The task of predicting these missing labels is also known as node classification.
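As a rough illustration of inferring missing labels from labeled nodes and links, the sketch below assigns each unlabeled node the most common label among its labeled neighbors. It is a simple neighbor-voting baseline, not one of the embedding approaches listed in this README; the function name `infer_labels`, the toy graph and the label names are all assumptions made for the example.

```python
from collections import Counter


def infer_labels(adjacency, known_labels, max_rounds=10):
    """Give each unlabeled node the most common label among its labeled neighbors."""
    labels = dict(known_labels)
    for _ in range(max_rounds):
        updates = {}
        for node in sorted(adjacency):
            if node in labels:
                continue
            votes = Counter(labels[n] for n in adjacency[node] if n in labels)
            if votes:
                # Highest vote count wins; ties are broken by label name.
                best = min(votes.items(), key=lambda kv: (-kv[1], str(kv[0])))
                updates[node] = best[0]
        if not updates:
            break
        # Apply the round's decisions together so node order does not matter.
        labels.update(updates)
    return labels


# Toy undirected graph (hypothetical): two triangles joined by the edge c-d.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}
known_labels = {"a": "sports", "b": "sports", "e": "music", "f": "music"}
predicted = infer_labels(adjacency, known_labels)
```

In an embedding-based pipeline, the voting step would typically be replaced by a classifier trained on the node vectors of the labeled nodes.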

Link Prediction :

Networks are constructed from the observed interactions between entities, which may be incomplete or inaccurate. The challenge often lies in identifying spurious interactions and predicting missing information. Link prediction refers to the task of predicting either missing interactions or links that may appear in the future in an evolving network.
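To make the task concrete, the sketch below ranks unconnected node pairs by the Jaccard overlap of their neighborhoods, a classic non-embedding heuristic for predicting missing links. It is only an illustration under assumed inputs: `jaccard_scores` and the toy graph are hypothetical and are not taken from this repository.

```python
from itertools import combinations


def jaccard_scores(adjacency):
    """Score every unconnected node pair by the overlap of their neighborhoods."""
    neighbors = {node: set(adj) for node, adj in adjacency.items()}
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        union = neighbors[u] | neighbors[v]
        if union:
            scores[(u, v)] = len(neighbors[u] & neighbors[v]) / len(union)
    return scores


# Toy undirected graph (hypothetical).
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
scores = jaccard_scores(adjacency)
# Candidate links, most likely first; ties are ordered by node pair.
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

With node embeddings, the Jaccard score would usually be swapped for a similarity between the two node vectors, while the ranking step stays the same.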

3- Datasets

Social Networks Datasets :

-KARATE :

Zachary’s karate club network is a well-known social network of friendships between the 34 members of a karate club at a US university in the 1970s.

-BLOGCATALOG :

This is a network of social relationships of the bloggers listed on the BlogCatalog website. The labels represent blogger interests inferred through the metadata provided by the bloggers. The network has 10,312 nodes, 333,983 edges and 39 different labels.

-LiveJournal:

LiveJournal is a free online blogging community where users declare friendship with each other. LiveJournal also allows users to form groups which other members can then join. We consider such user-defined groups as ground-truth communities. We provide the LiveJournal friendship social network and ground-truth communities.

| Name | Type | Nodes | Edges | Link |
| --- | --- | --- | --- | --- |
| Karate | Undirected, Unweighted, Static | 34 | 78 | click here to download |
| BlogCatalog | Undirected, Unweighted | 10312 | 333983 | click here to download |
| LiveJournal | Undirected, Unweighted | 3997962 | 34681189 | click here to download |
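The node and edge counts above can be checked against a downloaded file with a few lines of code. The sketch below assumes a whitespace-separated edge list with one `u v` pair per line and `#` comment lines; the actual file formats behind the download links may differ, and `read_edge_list` is a hypothetical helper, not part of this repository.

```python
import io


def read_edge_list(handle):
    """Read 'u v' pairs, one per line, into a node set and an undirected edge set."""
    nodes, edges = set(), set()
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        u, v = line.split()[:2]
        nodes.update((u, v))
        if u != v:
            # frozenset makes (u, v) and (v, u) count as the same edge.
            edges.add(frozenset((u, v)))
    return nodes, edges


# Tiny in-memory sample standing in for a real dataset file.
sample = io.StringIO("# toy graph\n1 2\n2 3\n3 1\n2 1\n")
nodes, edges = read_edge_list(sample)
print(len(nodes), len(edges))
```

For a real dataset, pass an open file instead of the `io.StringIO` sample, for example `with open(path) as handle: ...`.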

Collaboration Networks Datasets :

-Cora:

The Cora dataset consists of Machine Learning papers. The papers were selected in such a way that every paper in the final corpus cites or is cited by at least one other paper. There are 2,708 papers in the whole corpus.

-Wiki:

Wiki contains 2,405 documents from 19 classes and 17,981 links between them.

-Citeseer:

Citeseer contains 3,312 publications from six classes and 4,732 links between them. Similar to Cora, the links are citation relationships between the documents, and each paper is described by a binary vector of 3,703 dimensions.

| Name | Type | Nodes | Edges | Link |
| --- | --- | --- | --- | --- |
| Cora | Undirected, Unweighted | 2708 | 5429 | click here to download |
| Wiki | Undirected, Unweighted | 2405 | 17981 | click here to download |
| Citeseer | Undirected, Unweighted | 3312 | 4732 | click here to download |

Biology Networks Dataset :

-PROTEIN-PROTEIN INTERACTIONS (PPI) :

This is a network of biological interactions between proteins in humans. This network has 3,890 nodes and 38,739 edges.

| Name | Type | Nodes | Edges | Link |
| --- | --- | --- | --- | --- |
| PPI | Undirected, Unweighted | 3890 | 38739 | click here to download |

4- Paper References with the implementation(s) and the dataset(s)

Node2vec

node2vec (Network-only): Scalable Feature Learning for Networks,

[arxiv] [Python] [Python] [Python],

datasets(Cora, Zachary’s Karate Club, BlogCatalog, Wikipedia, PPI)

DeepWalk

DeepWalk (Network-only): Online Learning of Social Representations,

[arxiv] [Python] [Python],

datasets(Cora, Zachary’s Karate Club, BlogCatalog, Wikipedia, PPI)

LINE

LINE (Network-only): Large-scale Information Network Embedding,

[arxiv] [C++] [Python],

datasets(Cora, Zachary’s Karate Club, BlogCatalog, Wikipedia, PPI)

Doc2vec

Doc2vec (content-only): Distributed Representations of Sentences and Documents,

[arxiv] [Python],

dataset (Cora)

Paper2vec

Paper2vec (combined): Combining Graph and Text Information for Scientific Paper Representation,

[arxiv] [Python],

dataset (Cora)

GloVe

GloVe (content-only): Global Vectors for Word Representation,

[Python]

GraRep

GraRep: Learning Graph Representations with Global Structural Information,

[Matlab], [Datasets]

TADW

TADW (combined): Network Representation Learning with Rich Text Information,

[paper] [Matlab],

Datasets (Cora, Citeseer, Wikipedia)

Planetoid

Planetoid (Network-only): Revisiting Semi-supervised Learning with Graph Embeddings,

[arxiv] [Python],

Datasets (Cora, Citeseer, Wikipedia)

DNGR

DNGR (Network-only): Deep Neural Networks for Learning Graph Representations,

[Matlab] [Python Keras], [Datasets]

ComplEx

ComplEx (Network-only): Complex Embeddings for Simple Link Prediction,

[arxiv] [Python], [Datasets]

5- Evaluation Framework for Graph Embeddings Approaches


Requirement specification

The rationale behind our framework is to provide developers, end users and researchers with easy-to-use interfaces for agile, fine-grained and uniform evaluation of graph embedding approaches on multiple tasks and multiple datasets.

By these means, we aim to ensure that both tool developers and end users can derive meaningful insights into the extension, integration and use of graph embedding algorithms.

In particular, the evaluation framework provides comparable results, so that tool developers can easily discover the strengths and weaknesses of their implementations with respect to the state of the art. It also shows where tools should be further refined, which lets developers create an informed agenda for extensions and lets end users find the right tools for their purposes.

The evaluation framework should be open-source and extensible, and should allow evaluating tools against 11 different approaches on 2 different tasks with 7 different datasets.
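One way to picture a uniform evaluation over approaches, tasks and datasets is a plain grid loop over three registries. The sketch below is only an assumption about how such a loop could look: `evaluate_grid`, the two placeholder embedding functions and the dummy `embedding_coverage` metric are hypothetical stand-ins, not the interfaces of this framework.

```python
def evaluate_grid(approaches, tasks, datasets):
    """Run every approach on every dataset and score it on every task."""
    results = []
    for data_name, graph in datasets.items():
        for approach_name, embed in approaches.items():
            embedding = embed(graph)  # computed once per (dataset, approach)
            for task_name, score in tasks.items():
                results.append({
                    "dataset": data_name,
                    "approach": approach_name,
                    "task": task_name,
                    "score": score(graph, embedding),
                })
    return results


# Placeholder "approaches": each maps a graph to one vector per node.
def degree_embedding(graph):
    return {node: [float(len(nbrs))] for node, nbrs in graph.items()}


def constant_embedding(graph):
    return {node: [1.0] for node in graph}


# Placeholder "task": a dummy metric, the fraction of nodes that got a vector.
def embedding_coverage(graph, embedding):
    return sum(1 for node in graph if node in embedding) / len(graph)


datasets = {"toy": {"a": ["b"], "b": ["a", "c"], "c": ["b"]}}
approaches = {"degree": degree_embedding, "constant": constant_embedding}
tasks = {"coverage": embedding_coverage}
results = evaluate_grid(approaches, tasks, datasets)
```

Keeping approaches, tasks and datasets in separate dictionaries means a new entry in any one of them is automatically evaluated against all entries in the other two, which is the kind of extensibility the requirement describes.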

