Keep external repositories' statistics (i.e., clients, tools and modules) up to date #1030
Conversation
Using this script:

```python
import datetime
import requests
import json
import re
import os

GITHUB_TOKEN = os.environ['GITHUB_TOKEN']
REDIS_DOC_PATH = os.environ['REDIS_DOC_PATH']

headers = {'Authorization': 'Bearer {}'.format(GITHUB_TOKEN)}


def run_query(query, variables):
    # POST the query to GitHub's GraphQL endpoint.
    request = requests.post('https://api.github.com/graphql',
                            json={'query': query, 'variables': variables},
                            headers=headers)
    if request.status_code == 200:
        return request.json()
    else:
        raise Exception('GitHub GraphQL query failed to run - status code: {}. {}'
                        .format(request.status_code, query))


def get_repo(owner, name, days=6 * 30):
    # Fetch a repository's vitals, including commit activity over the
    # last `days` days (six months by default).
    query = """
    query ($owner: String!, $name: String!, $since: GitTimestamp!) {
      repository(owner: $owner, name: $name) {
        nameWithOwner
        description
        isArchived
        createdAt
        forkCount
        watchers { totalCount }
        stargazers { totalCount }
        openIssues: issues(states: OPEN) { totalCount }
        closedIssues: issues(states: CLOSED) { totalCount }
        openPullRequests: pullRequests(states: OPEN) { totalCount }
        closedPullRequests: pullRequests(states: CLOSED) { totalCount }
        mergedPullRequests: pullRequests(states: MERGED) { totalCount }
        licenseInfo { name }
        languages(first: 1, orderBy: {field: SIZE, direction: DESC}) {
          nodes { name }
        }
        periodCommits: defaultBranchRef {
          target {
            ... on Commit {
              history(since: $since) { totalCount }
            }
          }
        }
        lastCommit: defaultBranchRef {
          target {
            ... on Commit {
              history(first: 1) {
                edges { node { author { date } } }
              }
            }
          }
        }
      }
      rateLimit { limit cost remaining resetAt }
    }
    """
    since = datetime.datetime.today() - datetime.timedelta(days=days)
    # requests serializes the variables dict into the JSON payload.
    variables = {
        'owner': owner,
        'name': name,
        'since': since.isoformat(),
    }
    reply = run_query(query, variables)
    repository = reply['data']['repository']
    if repository is None:
        return None
    return {
        'isArchived': repository['isArchived'],
        'createdAt': repository['createdAt'],
        'periodCommits': repository['periodCommits']['target']['history']['totalCount'],
        'committedAt': repository['lastCommit']['target']['history']['edges'][0]['node']['author']['date'],
        'fetchedAt': datetime.datetime.utcnow().replace(microsecond=0).isoformat(),
        'forks': repository['forkCount'],
        'watchers': repository['watchers']['totalCount'],
        'stargazers': repository['stargazers']['totalCount'],
        'openPullRequests': repository['openPullRequests']['totalCount'],
        'openIssues': repository['openIssues']['totalCount'],
    }


if __name__ == '__main__':
    jsons = ['clients.json', 'tools.json', 'modules.json']
    for jfile in jsons:
        # Skip files that were already generated by a previous run.
        if os.path.exists(jfile):
            continue
        with open('{}/{}'.format(REDIS_DOC_PATH, jfile)) as f:
            elems = json.load(f)
        ghpat = re.compile(r'^(https://github\.com/|git@github\.com:)(.*)/(.*)$')
        for el in elems:
            if 'repository' not in el:
                continue
            repository = str(el['repository'])
            mat = ghpat.search(repository)
            if mat:
                stats = get_repo(mat.group(2), mat.group(3))
                if stats is not None:
                    el['stats'] = stats
                    if 'active' in el:
                        el['active'] = stats['periodCommits'] > 0
                    if 'stars' in el:
                        el['stars'] = stats['stargazers']
                    print('touched {} {}'.format(jfile, repository))
        with open(jfile, 'w') as f:
            json.dump(elems, f, indent=4)
```

Signed-off-by: Itamar Haber <[email protected]>
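For reference, the script expects a GitHub API token and a local redis-doc checkout via the two environment variables, e.g. `GITHUB_TOKEN=<token> REDIS_DOC_PATH=~/redis-doc python update_stats.py` (the file name `update_stats.py` is just an illustration, not something defined in this PR). It writes the updated `clients.json`, `tools.json` and `modules.json` to the working directory and skips any that are already present from a previous run.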
@itamarhaber A suggestion: run it daily using GitHub Actions. With GitHub Actions, we can open a PR with the changes, or push directly to master if we trust ourselves with the generated code. :)
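For illustration, a daily schedule along these lines could look roughly like the workflow below. This is only a sketch under assumptions: the workflow and script file names, the cron time, and the commit-and-push step are all made up here, not something this PR ships.

```yaml
# .github/workflows/update-stats.yml (hypothetical file name)
name: Update repository stats
on:
  schedule:
    - cron: '0 3 * * *'   # once a day at 03:00 UTC
  workflow_dispatch:       # allow manual runs as well
jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.x'
      - run: pip install requests
      - name: Regenerate stats
        run: python update_stats.py   # hypothetical script name
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REDIS_DOC_PATH: .
      - name: Commit and push if anything changed
        run: |
          git config user.name "github-actions"
          git config user.email "actions@github.com"
          git add clients.json tools.json modules.json
          git diff --cached --quiet || (git commit -m "Update external repo stats" && git push)
```

The PR-instead-of-push variant would replace the last step with a PR-creating action such as peter-evans/create-pull-request.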
That's a great idea for the daily update - thanks!
The commit history would grow very fast in this way. The stars don't need to be committed to the repo. Why not let the website itself fetch the data and cache it in Redis with a TTL of 24 hours or so? Alternatively, let them be updated when redis.io is deployed.
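To make the caching idea concrete, here is a minimal sketch in Python with redis-py, reusing the `get_repo()` helper from the script above; the key naming and the function name are illustrative, not existing redis-io code.

```python
import json
import redis

r = redis.Redis()
CACHE_TTL = 24 * 60 * 60  # 24 hours, per the suggestion above


def get_repo_stats_cached(owner, name):
    # Serve from Redis when possible; fall back to the GitHub API.
    key = 'stats:{}/{}'.format(owner, name)
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    stats = get_repo(owner, name)  # the GraphQL fetcher defined earlier
    if stats is not None:
        # SETEX stores the value with a TTL, so stale entries simply
        # expire and the next request triggers a refresh.
        r.setex(key, CACHE_TTL, json.dumps(stats))
    return stats
```

With this approach the stars and activity flags never touch the git history; they live in Redis and refresh at most once per day per repository.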
Agreed, fetching the data is better.
Great, @huangz1990. I think the fetching from the GitHub API and the caching could go into
100% agree with @zuiderkwast, let's focus on the other PR.
@zuiderkwast, OK to close this PR?
Yeah, this is built into the new website, right? I don't know enough about the new website since I wasn't involved much.
@itamarhaber When are we open-sourcing the new website?
@antirez please let me know your thoughts before I implement the changes to the UI in redis-io (and probably port the script to Ruby). Possible triggers: manual, every push, daily...