Commit b023107

USE QnA API (#490)

* add use-qna code
* add test for use qna model
* added qna demo and update the model input for qna
* updated the model source to tfhub
* updated README and versions in package.json
* remove yalc link
* fix tests
* address comments
* fix lint
* address comments
* updated README

1 parent f4a22ea · commit b023107

16 files changed: +679 −187 lines

universal-sentence-encoder/README.md

Lines changed: 82 additions & 0 deletions

@@ -18,6 +18,17 @@ The sentences (taken from the [TensorFlow Hub USE lite colab](https://colab.sand
5. An apple a day, keeps the doctors away.
6. Eating strawberries is healthy.

# Universal Sentence Encoder For Question Answering

The Universal Sentence Encoder for question answering (USE QnA) is a model that encodes question and answer texts into 100-dimensional embeddings. The dot product of these embeddings measures how well the answer fits the question. The model can also be used in other applications, such as text classification and clustering.

This module is a lightweight TensorFlow.js [`GraphModel`](https://js.tensorflow.org/api/latest/#loadGraphModel). The model is based on the Transformer ([Vaswani et al., 2017](https://arxiv.org/pdf/1706.03762.pdf)) architecture and uses an 8k SentencePiece [vocabulary](https://tfhub.dev/google/tfjs-model/universal-sentence-encoder-qa-ondevice/1/vocab.json?tfjs-format=file). It is trained on a variety of data sources, with the goal of learning text representations that are useful out of the box for retrieving an answer given a question.

In [this demo](./demo/index.js) we embed a question and three answers with the USE QnA model and render their scores:

![QnA scores](./images/qna_score.png)

*The scores show how well each answer fits the question.*

## Installation

Using `yarn`:

@@ -68,3 +79,74 @@ use.loadTokenizer().then(tokenizer => {
  tokenizer.encode('Hello, how are you?'); // [341, 4125, 8, 140, 31, 19, 54]
});
```

To use the QnA dual encoder:

```js
// Load the model.
use.loadQnA().then(model => {
  // Embed a dictionary of a query and responses. The input to the embed method
  // needs to be in the following format:
  // {
  //   queries: string[];
  //   responses: Response[];
  // }
  // queries is an array of question strings.
  // responses is an array of objects with the following structure:
  // {
  //   response: string;
  //   context?: string;
  // }
  // context is optional; it provides the context string of the answer.

  const input = {
    queries: ['How are you feeling today?', 'What is the capital of China?'],
    responses: [
      'I\'m not feeling very well.',
      'Beijing is the capital of China.',
      'You have five fingers on your hand.'
    ]
  };
  const scores = [];
  model.embed(input).then(embeddings => {
    /*
     * The output of the embed method is an object with two keys:
     * {
     *   queryEmbedding: tf.Tensor;
     *   responseEmbedding: tf.Tensor;
     * }
     * queryEmbedding is a tensor containing embeddings for all queries.
     * responseEmbedding is a tensor containing embeddings for all answers.
     * You can call `arraySync()` to retrieve the values of the tensor.
     * In this example, embed_query[0] is the embedding for the query
     * 'How are you feeling today?', and embed_responses[0] is the embedding
     * for the answer 'I\'m not feeling very well.'
     */
    const embed_query = embeddings['queryEmbedding'].arraySync();
    const embed_responses = embeddings['responseEmbedding'].arraySync();
    // Compute the dot product of each query and response pair.
    for (let i = 0; i < input['queries'].length; i++) {
      for (let j = 0; j < input['responses'].length; j++) {
        scores.push(dotProduct(embed_query[i], embed_responses[j]));
      }
    }
  });
});

// Calculate the dot product of two vector arrays.
const dotProduct = (xs, ys) => {
  const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;

  return xs.length === ys.length ?
      sum(zipWith((a, b) => a * b, xs, ys)) :
      undefined;
};

// zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
const zipWith = (f, xs, ys) => {
  const ny = ys.length;
  return (xs.length <= ny ? xs : xs.slice(0, ny))
      .map((x, i) => f(x, ys[i]));
};
```
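The scoring step in the README above can be exercised without loading the model at all. The sketch below uses made-up 3-dimensional vectors standing in for the rows of `queryEmbedding`/`responseEmbedding` (the real model produces 100-dimensional embeddings, and these numbers are purely illustrative); the helpers mirror the `dotProduct`/`zipWith` pattern shown above.

```javascript
// zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
const zipWith = (f, xs, ys) => {
  const ny = ys.length;
  return (xs.length <= ny ? xs : xs.slice(0, ny)).map((x, i) => f(x, ys[i]));
};

// Dot product of two equal-length vectors; undefined on length mismatch.
const dotProduct = (xs, ys) =>
    xs.length === ys.length ?
        zipWith((a, b) => a * b, xs, ys).reduce((a, b) => a + b, 0) :
        undefined;

// Hypothetical embeddings, standing in for model output.
const queryEmbedding = [0.1, 0.9, 0.2];   // 'How are you feeling today?'
const responseEmbeddings = [
  [0.1, 0.8, 0.3],   // 'I\'m not feeling very well.'
  [0.9, 0.1, 0.05],  // 'Beijing is the capital of China.'
  [0.4, 0.2, 0.6]    // 'You have five fingers on your hand.'
];

// One score per response; the highest dot product is the best-fitting answer.
const scores = responseEmbeddings.map(r => dotProduct(queryEmbedding, r));
console.log(scores);
```

With these toy vectors the first response scores highest, matching the intuition that it answers the question best.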

universal-sentence-encoder/demo/index.html

Lines changed: 97 additions & 69 deletions

@@ -16,81 +16,85 @@

<html>

<head>
  <title>TensorFlow.js Universal Sentence Encoder lite demo</title>
  <style>
    h1 {
      margin-bottom: 35px;
    }

    #main {
      padding-top: 30px;
      font-family: Helvetica, sans-serif;
      max-width: 960px;
      min-width: 600px;
      width: 60vw;
      margin-left: auto;
      margin-right: auto;
    }

    #sentences-container {
      flex: 1 1 auto;
    }

    #sentences-container>div {
      margin-bottom: 10px;
    }

    #container {
      display: flex;
      flex-direction: row;
    }

    #self-similarity-matrix {
      position: relative;
    }

    .labels {
      position: absolute;
    }

    .x-axis {
      bottom: 100%;
      width: 100%;
      height: 20px;
    }

    .x-axis>div {
      transform: translateX(-50%);
    }

    .y-axis {
      right: 100%;
      height: 100%;
      width: 20px;
    }

    .y-axis>div {
      transform: translateY(-50%);
    }

    .labels>div {
      position: absolute;
    }

    #description {
      margin-bottom: 50px;
      line-height: 1.6;
    }
  </style>
  <meta name="viewport" content="width=device-width, initial-scale=1">
</head>

<body>
  <div id='main'>
    <h1>Universal Sentence Encoder lite demo</h1>
    <div id="description">This demo is taken from the <a target="_blank" href="https://colab.sandbox.google.com/github/tensorflow/hub/blob/master/examples/colab/semantic_similarity_with_tf_hub_universal_encoder_lite.ipynb#scrollTo=_GSCW5QIBKVe">TensorFlow Hub Universal Sentence Encoder lite colab</a>. It shows the model's ability to group sentences by semantic similarity using their embeddings. The matrix on the right shows self-similarity scores (dot products) between the embeddings for the sentences on the left. The redder the cell, the higher the similarity score.</div>
    <div id="loading">
      Loading the model...
    </div>
    <div id="container">
      <div id="sentences-container"></div>

@@ -101,6 +105,30 @@ <h1>Universal Sentence Encoder lite demo</h1>
      </div>
    </div>
  </div>
  <div id='main'>
    <h1>Universal Sentence Encoder QnA demo</h1>
    <div id="loadingQnA">
      Loading the model...
    </div>
    <div id="description">
      <h2>Encode Question and Answers</h2>
      <h3>Question</h3>
      <div>How are you feeling today?</div><br />
      <h3>Answer 1</h3>
      <div>I'm not feeling very well.</div>
      <div>Score: <span id="answer_1"></span></div><br />
      <h3>Answer 2</h3>
      <div>Beijing is the capital of China.</div>
      <div>Score: <span id="answer_2"></span></div><br />
      <h3>Answer 3</h3>
      <div>You have five fingers on your hand.</div>
      <div>Score: <span id="answer_3"></span></div><br />
    </div>
  </div>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-cpu"></script>
  <script src="index.js"></script>
</body>

universal-sentence-encoder/demo/index.js

Lines changed: 35 additions & 3 deletions

@@ -26,12 +26,10 @@ const sentences = [

const init = async () => {
  const model = await use.load();
  document.querySelector('#loading').style.display = 'none';
  renderSentences();

  const embeddings = await model.embed(sentences);
  const matrixSize = 250;
  const cellSize = matrixSize / sentences.length;
  const canvas = document.querySelector('canvas');

@@ -70,8 +68,42 @@ const init = async () => {
    }
  }
};

const initQnA = async () => {
  const input = {
    queries: ['How are you feeling today?'],
    responses: [
      'I\'m not feeling very well.', 'Beijing is the capital of China.',
      'You have five fingers on your hand.'
    ]
  };
  const model = await use.loadQnA();
  document.querySelector('#loadingQnA').style.display = 'none';
  const result = await model.embed(input);
  const query = result['queryEmbedding'].arraySync();
  const answers = result['responseEmbedding'].arraySync();
  for (let i = 0; i < answers.length; i++) {
    document.getElementById(`answer_${i + 1}`).textContent =
        `${dotProduct(query[0], answers[i])}`;
  }
};
init();
initQnA();

// zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
const zipWith = (f, xs, ys) => {
  const ny = ys.length;
  return (xs.length <= ny ? xs : xs.slice(0, ny))
      .map((x, i) => f(x, ys[i]));
};

// dotProduct :: [Int] -> [Int] -> Int
const dotProduct = (xs, ys) => {
  const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;

  return xs.length === ys.length ?
      sum(zipWith((a, b) => a * b, xs, ys)) :
      undefined;
};

const renderSentences = () => {
  sentences.forEach((sentence, i) => {
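The demo above renders raw dot products, which are unbounded and depend on vector magnitude. If normalized scores in a fixed range were preferred, a cosine-similarity variant of the helper could be swapped in. This is an alternative sketch, not part of this commit; `dot`, `magnitude`, and `cosineSimilarity` are illustrative names.

```javascript
// Dot product via reduce over paired elements (assumes equal lengths).
const dot = (xs, ys) => xs.reduce((acc, x, i) => acc + x * ys[i], 0);

// Euclidean length of a vector.
const magnitude = xs => Math.sqrt(dot(xs, xs));

// Cosine similarity: dot product divided by the product of the magnitudes.
// The result lies in [-1, 1] regardless of how large the vectors are.
const cosineSimilarity = (xs, ys) =>
    dot(xs, ys) / (magnitude(xs) * magnitude(ys));

console.log(cosineSimilarity([3, 4], [3, 4]));  // 1 (identical direction)
console.log(cosineSimilarity([1, 0], [0, 1]));  // 0 (orthogonal)
```

Because the score is scale-invariant, answers embedded with different norms become directly comparable.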

universal-sentence-encoder/demo/package.json

Lines changed: 1 addition & 2 deletions

@@ -9,8 +9,7 @@
    "node": ">=8.9.0"
  },
  "dependencies": {
-    "@tensorflow-models/universal-sentence-encoder": "^1.2.2",
-    "@tensorflow/tfjs": "^1.2.9",
+    "@tensorflow/tfjs": "^2.0.1",
    "d3-scale-chromatic": "^1.3.3"
  },
  "scripts": {
