Getting Started

This repository contains the dataset and sample code for the Getting Started section of the Pilosa documentation.

The Dataset

The sample dataset contains stargazer and language data for GitHub projects retrieved with the search keyword "Go". To create other datasets, see the Generating the Dataset section below.

languages.txt: Maps language names to language IDs. The line number corresponds to the languageID.
language.csv: languageID, projectID
stargazer.csv: stargazerID, projectID, timestamp (when the project was starred)
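
As a rough illustration of how the three files fit together, the following Python sketch parses rows in the formats listed above. The function names are hypothetical and are not part of this repository. Whether the first line of languages.txt maps to ID 0 or 1 is an assumption, exposed here as the `first_id` parameter. Timestamps are kept as plain strings because their exact format is not described in this README.

```python
import csv
import io


def load_language_names(lines, first_id=0):
    """Map languageID -> language name by line position.

    first_id is an assumption: adjust it if the IDs start at 1.
    """
    return {first_id + i: line.strip() for i, line in enumerate(lines)}


def read_language_rows(text):
    """Yield (languageID, projectID) pairs from language.csv content."""
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            yield int(row[0]), int(row[1])


def read_stargazer_rows(text):
    """Yield (stargazerID, projectID, timestamp) from stargazer.csv content."""
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            # The timestamp is left as a string (format not assumed).
            yield int(row[0]), int(row[1]), row[2].strip()
```

With real files, the content would come from `open("language.csv").read()` and similar calls; the helpers take text so they can be tried on small made-up samples first.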

Usage

Docker

Run the Pilosa Docker image with Getting Started data using:

docker run -it --rm -p 10101:10101 pilosa/getting-started:latest
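
The container may take a moment to accept connections. A minimal sketch for waiting until the published port is open, using only the Python standard library, could look like the following. The helper name `wait_for_port` is hypothetical; the port 10101 comes from the `-p 10101:10101` mapping above. A successful TCP connection only shows that the port is open, not that the data has finished loading.

```python
import socket
import time


def wait_for_port(host="localhost", port=10101, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` with the defaults checks `localhost:10101` for up to 30 seconds.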

Continue with Getting Started: Make Some Queries.

Without Docker

The Pilosa server should be running: Starting Pilosa
The appropriate schema should be initialized: Create the Schema
Finally, the data can be imported: Import Some Data

Continue with Getting Started: Make Some Queries.

Sample Projects

Generating the Dataset

Using a GitHub token is strongly recommended to avoid throttling. If you don't already have a token for the GitHub API, see Creating a personal access token for the command line.

A recent version of Python is required. We test the script with 2.7 and 3.5.

Follow these steps to set up the environment:

Create a virtual env:
- Using Python 2.7: virtualenv getting-started
- Using Python 3.5: python3 -m venv getting-started
Activate the virtual env:
- On Linux, macOS, other UNIX: source getting-started/bin/activate
- On Windows: getting-started\Scripts\activate
Install requirements: pip install -r requirements.txt
If you have a GitHub token, save it in a file named token in the root directory of the project.
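
To illustrate how a script might use the token file, here is a minimal sketch that builds GitHub API request headers and adds authentication only when the file exists. This is not taken from fetch.py: the function name `github_headers` is hypothetical, and the `token_path` default simply mirrors the file name mentioned above.

```python
import os


def github_headers(token_path="token"):
    """Build request headers, adding Authorization only if a token file exists."""
    headers = {"Accept": "application/vnd.github+json"}
    if os.path.isfile(token_path):
        with open(token_path) as f:
            token = f.read().strip()  # drop the trailing newline, if any
        if token:
            headers["Authorization"] = "token " + token
    return headers
```

Without a token file the headers carry no credentials, so requests are subject to the lower unauthenticated rate limit.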

To generate csv files:

The fetch.py script searches GitHub for a given keyword and creates the dataset described in The Dataset section.

Run the script: python fetch.py KEYWORD. KEYWORD is the search term to use for searching repository names.
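
For readers curious about the kind of GitHub API requests such a script could make, the sketch below builds the URLs for a repository-name search and for listing a repository's stargazers. It does not reproduce fetch.py; the helper names, the `per_page` value, and the choice of endpoints are assumptions made for illustration.

```python
from urllib.parse import urlencode

GITHUB_API = "https://api.github.com"

# Media type that asks GitHub to include the starred_at timestamp
# in stargazer listings.
STAR_ACCEPT = "application/vnd.github.star+json"


def search_url(keyword, page=1, per_page=100):
    """URL for searching repositories whose name matches keyword."""
    query = urlencode({"q": keyword + " in:name", "page": page, "per_page": per_page})
    return GITHUB_API + "/search/repositories?" + query


def stargazers_url(full_name, page=1, per_page=100):
    """URL for listing a repository's stargazers; full_name is 'owner/repo'."""
    query = urlencode({"page": page, "per_page": per_page})
    return GITHUB_API + "/repos/" + full_name + "/stargazers?" + query
```

For example, `search_url("Go")` corresponds to running the script with KEYWORD set to Go, under the assumption that the search is restricted to repository names via the `in:name` qualifier.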

Creating the Docker Image

make docker VERSION=some-version

