Grasp is a lightweight AI toolkit for Python, with tools for data mining, natural language processing (NLP), machine learning (ML) and network analysis. It has 300+ fast, essential algorithms (~25 lines of code per function) with self-explanatory names and no dependencies, bundled into one well-documented file: grasp.py (250KB). Or install it with pip, language models (25MB) included:
$ pip install git+https://github.com/textgain/grasp
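All examples below call Grasp's functions by their bare names. Since grasp is a single module, a star import makes them available (a minimal setup; importing grasp and prefixing calls works just as well):
from grasp import *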
Download stuff with download(url) (or dl), with built-in caching and logging:
src = dl('https://www.textgain.com', cached=True)
Parse HTML with dom(html) into an Element tree and search it with CSS selectors:
for e in dom(src)('a[href^="http"]'): # external links
    print(e.href)
Strip HTML with plain(Element) to get a plain-text string:
for word, count in wc(plain(dom(src))).items():
    print(word, count)
Find articles with wikipedia(str), in HTML:
for e in dom(wikipedia('cat', language='en'))('p'):
    print(plain(e))
Find opinions with twitter.search(str):
for tweet in first(10, twitter.search('from:textgain')): # latest 10
    print(tweet.id, tweet.text, tweet.date)
Deploy APIs with App. Works with WSGI and Nginx:
app = App()
@app.route('/')
def index(*path, **query):
    return 'Hi! %s %s' % (path, query)
app.run('127.0.0.1', 8080, debug=True)
Once the app is up, check http://127.0.0.1:8080/app?q=cat.
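To fetch that URL from another script, a plain request with the standard library will do (a minimal sketch; urllib is just one of many ways):
from urllib.request import urlopen
# Prints whatever the handler above returns, i.e. the echoed path and query:
print(urlopen('http://127.0.0.1:8080/app?q=cat').read().decode('utf-8'))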
Get language with lang(str), for 40+ languages with ~92.5% accuracy:
print(lang('The cat sat on the mat.')) # {'en': 0.99}
Get locations with loc(str), for 25K+ EU cities:
print(loc('The cat lives in Catena.')) # {('Catena', 'IT', 43.8, 11.0): 1}
Get words & sentences with tok(str) (tokenize), at ~125K words/sec:
print(tok("Mr. etc. aren't sentence breaks! ;) This is:.", language='en'))
Get word polarity with pov(str) (point-of-view). Is it a positive or negative opinion?
print(pov(tok('Nice!', language='en'))) # +0.6
print(pov(tok('Dumb.', language='en'))) # -0.4
- For de, en, es, fr, nl, with ~75% accuracy.
- You'll need the language models in grasp/lm.
Tag word types with tag(str) in 10+ languages, using robust ML models from Universal Dependencies (UD):
for word, pos in tag(tok('The cat sat on the mat.'), language='en'):
    print(word, pos)
- Parts-of-speech include NOUN, VERB, ADJ, ADV, DET, PRON, PREP, ...
- For ar, da, de, en, es, fr, it, nl, no, pl, pt, ru, sv, tr, with ~95% accuracy.
- You'll need the language models in grasp/lm.
Tag keywords with trie, a compiled dict that scans ~250K words/sec:
t = trie({'cat*': 1, 'mat' : 2})
for i, j, k, v in t.search('Cats love catnip.', etc='*'):
    print(i, j, k, v)
Get answers with gpt(). You'll need an OpenAI API key.
print(gpt("Why do cats sit on mats? (you're a psychologist)", key='...'))
Machine Learning (ML) algorithms learn by example. If you show them 10K spam and 10K real emails (i.e., train a model), they can predict whether other emails are also spam or not.
Each training example is a {feature: weight} dict with a label. For text, the features could be words, the weights could be word counts, and the label might be real or spam.
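For instance, two hand-built training examples could look like this (an illustrative sketch; the features, weights and labels are made up):
spam = ({'viagra': 2, 'free': 1, 'cash': 1}, 'spam') # ({feature: weight}, label)
real = ({'meeting': 1, 'tomorrow': 1, 'thanks': 1}, 'real')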
Quantify text with vec(str) (vectorize) into a {feature: weight} dict:
v1 = vec('I love cats! 😀', features=('c3', 'w1'))
v2 = vec('I hate cats! 😡', features=('c3', 'w1'))
- c1, c2, c3 count consecutive characters. For c2, cats → 1x ca, 1x at, 1x ts.
- w1, w2, w3 count consecutive words.
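To see the character bigrams from the example above (a quick sketch; the exact keys in the returned dict depend on how vec names its features):
print(vec('cats', features=('c2',))) # 1x ca, 1x at, 1x ts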
Train models with fit(examples), save as JSON, predict labels:
m = fit([(v1, '+'), (v2, '-')], model=Perceptron) # DecisionTree, KNN, ...
m.save('opinion.json')
m = fit(open('opinion.json'))
print(m.predict(vec('She hates dogs.'))) # {'+': 0.4, '-': 0.6}
Once trained, Model.predict(vector) returns a dict with label probabilities (0.0–1.0).
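To pick the winning label, take the key with the highest probability (plain Python, nothing Grasp-specific):
p = m.predict(vec('She hates dogs.'))
print(max(p, key=p.get)) # '-'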
Map networks with Graph, a {node1: {node2: weight}} dict subclass:
g = Graph(directed=True)
g.add('a', 'b') # a → b
g.add('b', 'c') # b → c
g.add('b', 'd') # b → d
g.add('c', 'd') # c → d
print(g.sp('a', 'd')) # shortest path: a → b → d
print(top(pagerank(g))) # strongest node: d, 0.8
See networks with viz(graph):
with open('g.html', 'w') as f:
    f.write(viz(g, src='graph.js'))
You'll need to set src to the grasp/graph.js lib.
Easy date handling with date(v), where v is an int, a str, or another date:
print(date('Mon Jan 31 10:00:00 +0000 2000', format='%Y-%m-%d'))
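An int works too; assuming it is read as a Unix timestamp, this prints the same kind of formatted string:
print(date(946684800, format='%Y-%m-%d')) # 2000-01-01, if ints are Unix timestamps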
Easy path handling with cd(...), which always points to the script's folder:
print(cd('kb', 'en-loc.csv'))
Easy CSV handling with csv([path]), a list of lists of values:
for code, country, _, _, _, _, _ in csv(cd('kb', 'en-loc.csv')):
    print(code, country)
data = csv()
data.append(('cat', 'Kitty'))
data.append(('cat', 'Simba'))
data.save(cd('cats.csv'))
A challenge in AI is bias introduced by human trainers. Remember the Model trained earlier? Grasp has tools to explain how & why it makes decisions:
print(explain(vec('She hates dogs.'), m)) # why so negative?
In the returned dict, the model's explanation is: “you wrote hat + ate (hate)”.