NEAR Recommender System

Index

Documentation
NEAR Social
- Introduction
- Task
- Result
Technical details
Methodologies
- SQL queries
- Notebooks
Widget
Visualization
Development
Authors

This repository contains the files used for the Capstone Project "NEAR Social Recommender - A recommender system for an on-chain social network" of the Data Science Bootcamp, Batch 03/2023, at Constructor.

This project was done in collaboration with Pagoda, the software development company behind the NEAR Blockchain Operating System.

Documentation

For more detailed information about the codebase, please refer to the Documentation.

NEAR Social

Introduction

NEAR Social is a blockchain-based social network where users log in with their NEAR wallet address. All user actions, such as posting, following, liking, and updating their profile, are recorded on the public ledger as blockchain transactions. Users own their data, and developers can create permissionless open-source apps, known as widgets, to expand the platform's capabilities.

Task

Our objective was to develop a user recommendation system that fosters network growth by connecting users with similar interests. To achieve this, we designed a system that utilizes on-chain data for each user. We employed four distinct recommendation algorithms, as illustrated in the architectural overview below:

Top trending users
Friends of friends
Tag similarity
Post similarity
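
As an illustration of the "Friends of friends" idea, here is a minimal sketch that ranks accounts by how many of a user's followed accounts also follow them. It is a plain counting heuristic written for this README, not the model used in the project; the function name, the `follows` mapping and the account names are all hypothetical.

```python
from collections import Counter


def friends_of_friends(follows, user, top_n=5):
    """Rank accounts followed by the accounts that `user` follows.

    `follows` maps an account name to the set of accounts it follows
    (a hypothetical in-memory structure, not the project's schema).
    """
    direct = follows.get(user, set())
    counts = Counter()
    for friend in direct:
        for candidate in follows.get(friend, set()):
            # Skip the user themselves and accounts they already follow.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    # Sort by count (descending), then by name for deterministic output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


# Made-up example data.
follows = {
    "alice.near": {"bob.near", "carol.near"},
    "bob.near": {"dave.near", "erin.near"},
    "carol.near": {"dave.near", "alice.near"},
}
print(friends_of_friends(follows, "alice.near"))
```

In this toy graph, dave.near is followed by two of alice.near's followed accounts and erin.near by one, so dave.near is ranked first.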

Result

This recommender system is available through a widget on near.org.

Technical details

This project used the on-chain data on the NEAR blockchain via the Databricks instance of Pagoda. We created SQL queries and tables as well as Data Science Notebooks.

Methodologies

Among others, we explored the given datasets with the following methods:

Friends of friends
- XGBoost
- RandomForest
Trending users
- NetworkX
- Louvain community detection
Tag/Post Similarity
- Natural Language Processing, Cosine Similarity
- Pooled word embeddings on Large Transformer Model, Cosine Similarity
Hyperlink-Induced Topic Search (HITS) algorithm
- Graphs for visualization and exploration
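
To make the cosine similarity step in "Tag/Post Similarity" more concrete, the sketch below compares users by the tags attached to them, treating each tag list as a bag of words. It uses only the standard library and made-up data; the `user_tags` mapping and both function names are assumptions for illustration, and the project notebooks may compute the vectors differently (for example from embeddings).

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty tag list is similar to nothing.
    return dot / (norm_a * norm_b)


def most_similar(user_tags, user, top_n=3):
    """Return (account, score) pairs for the users closest to `user`."""
    target = Counter(user_tags[user])
    scores = [
        (other, cosine_similarity(target, Counter(tags)))
        for other, tags in user_tags.items()
        if other != user
    ]
    # Highest score first, ties broken by account name.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Made-up example data.
user_tags = {
    "alice.near": ["rust", "defi"],
    "bob.near": ["rust", "defi"],
    "carol.near": ["art"],
}
print(most_similar(user_tags, "alice.near"))
```

The same `cosine_similarity` function would work for post similarity if the Counters were replaced by dense embedding vectors, at the cost of a different dot-product implementation.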

SQL queries

We built our own SQL tables on top of existing parsed tables to shape the data to our needs. These tables include:

near_social_txs_clean: transactions within the social.near contract without duplicates
graph_follows: table showing users and follows in the form of graph edges
users_agg_metrics: account and social network metrics by user

These tables can be found in the sit schema inside Databricks.
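
Because graph_follows stores follows as graph edges, its rows can feed graph algorithms such as HITS, which is listed under Methodologies. The sketch below runs a basic HITS power iteration over a list of (follower, followed) pairs. The edge layout, the function name and the sample accounts are assumptions for illustration; the actual column names in graph_follows are not shown in this README, and the project may rely on a library implementation instead.

```python
import math


def hits(edges, iterations=50):
    """Compute HITS hub and authority scores for directed edges.

    `edges` is a list of (follower, followed) pairs, a hypothetical
    in-memory export of a follow graph.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hubs = {node: 1.0 for node in nodes}
    auths = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of an account's followers.
        auths = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auths[dst] += hubs[src]
        # Hub score: sum of the authority scores of the accounts followed.
        new_hubs = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hubs[src] += auths[dst]
        hubs = new_hubs
        # Normalise both score vectors to unit length to avoid overflow.
        for scores in (auths, hubs):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for node in scores:
                scores[node] /= norm
    return hubs, auths


# Made-up example data.
edges = [
    ("alice.near", "carol.near"),
    ("bob.near", "carol.near"),
    ("dave.near", "carol.near"),
    ("carol.near", "alice.near"),
]
hubs, auths = hits(edges)
print(max(auths, key=auths.get))
```

Accounts with a high authority score are followed by many good hubs, which makes the authority ranking one possible signal for trending users.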

Notebooks

Several notebooks inside and outside Databricks have been created to implement the different recommender algorithms. These can be found under near_recommender/notebooks inside this repository.

Widget

The recommender system is going to be implemented as a widget.

Visualization

Several iterations of visual interfaces revealed the network's connections and community clusters. They gave us a comprehensive understanding of user relationships, which informed the trending user recommendations and helped fine-tune the models.

Development

1. Package Management

This package is managed using Poetry, a Python package management tool. You can find more information in the Poetry documentation.

To interact with Poetry's interface, make sure you have it installed.

Basic commands:

poetry shell

Activates a virtualenv for this project.

poetry install

Installs the requirements into the virtualenv.

poetry add/remove <package>

Installs/removes packages. Poetry automatically handles dependency version management. It is recommended to use these commands instead of manually changing versions in pyproject.toml.

Specific versions of a package can be installed by adding a version constraint to the package name. Refer to the Poetry documentation for more details.

poetry update

Updates the project's dependencies to the latest versions allowed by pyproject.toml and refreshes poetry.lock.

poetry build

Builds a wheel from the package. This can be uploaded and installed in the designated runtime environment.

For more commands, consult the documentation provided by Poetry.

2. Python Version

We rely on Databricks LTS support for the Python version. Please refer to the pyproject.toml file for further information.

Compiling the Documentation

The documentation is hosted on GitHub Pages from the docs branch, located in the /docs folder. To ensure smooth integration with GitHub, make sure to include an empty .nojekyll file in the compiled docs directory (project_root)/docs.
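
The empty .nojekyll file can be created by hand, or with a small helper like the one below. This is only a convenience sketch; the function name is hypothetical and the path passed to it should be the compiled docs directory described above.

```python
from pathlib import Path


def ensure_nojekyll(docs_dir):
    """Create an empty .nojekyll marker inside `docs_dir` if it is missing."""
    docs_path = Path(docs_dir)
    # Create the directory first in case the docs have not been built yet.
    docs_path.mkdir(parents=True, exist_ok=True)
    marker = docs_path / ".nojekyll"
    marker.touch(exist_ok=True)
    return marker
```

For example, calling `ensure_nojekyll("docs")` from the project root leaves an existing marker untouched and creates one otherwise.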

To build the documentation, use the provided

     make html

command in the documentation source directory, near_recommender/docs/.

To rebuild the documentation, you will need a Java runtime on your local machine and the Poetry virtual environment activated.

Authors

Agustin Rojo Serrano

Christian Kühner

Daniel Herrmann
