Commit 75ba186

Add Object detection Model (#69)
* added the object detection model and demo
* update the readme
* update the tfjs version deps to 0.12.7
* addressed the review comments
* updated readme and renamed the test file
1 parent f776892 commit 75ba186

34 files changed, +14097 -0 lines changed

coco-ssd/.gitignore

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
node_modules/
2+
coverage/
3+
package-lock.json
4+
npm-debug.log
5+
yarn-error.log
6+
.DS_Store
7+
dist/
8+
.idea/
9+
.vscode/
10+
*.tgz
11+
.cache
12+
13+
bazel-*
14+
15+
*.pyc

coco-ssd/.npmignore

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
.vscode/
2+
.rpt2_cache/
3+
demo/
4+
scripts/
5+
src/
6+
training/
7+
coverage/
8+
node_modules/
9+
karma.conf.js
10+
*.tgz
11+
dist/**/*.js.map
12+
.travis.yml
13+
.npmignore
14+
tslint.json
15+
yarn.lock

coco-ssd/README.md

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
1+
# Object Detection (coco-ssd)
2+
3+
This object detection model localizes and identifies multiple objects in a single image.
4+
5+
This model is a TensorFlow.js port of the SSD-COCO model. For more information about the TensorFlow Object Detection API, see the README in
6+
[tensorflow/object_detection](https://github.com/tensorflow/models/blob/master/research/object_detection/README.md).
7+
8+
This model detects objects defined in the COCO dataset, a large-scale object detection, segmentation, and captioning dataset. You can find more information [here](http://cocodataset.org/#home). The model is capable of detecting [90 classes of objects](./src/classes.ts). SSD stands for Single Shot MultiBox Detection.
9+
10+
This TensorFlow.js model does not require you to know about machine learning.
11+
It can take as input any browser-based image elements (`<img>`, `<video>`, `<canvas>`
12+
elements, for example) and returns an array of bounding boxes with class name and confidence level.
13+
14+
## Usage
15+
16+
There are two main ways to get this model in your JavaScript project: via script tags, or by installing it from npm and using a build tool like Parcel, webpack, or Rollup.
17+
18+
### via Script Tag
19+
20+
```html
21+
<!-- Load TensorFlow.js. This is required to use the object detection model. -->
22+
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]"> </script>
23+
<!-- Load the object detection model. -->
24+
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/[email protected]"> </script>
25+
26+
<!-- Replace this with your image. Make sure CORS settings allow reading the image! -->
27+
<img id="img" src="cat.jpg"/>
28+
29+
<!-- Place your code in the script tag below. You can also use an external .js file -->
30+
<script>
31+
// Notice there is no 'import' statement. 'objectDetection' and 'tf' are
32+
// available on the page because of the script tags above.
33+
34+
const img = document.getElementById('img');
35+
36+
// Load the model.
37+
objectDetection.load().then(model => {
38+
// Detect objects in the image.
39+
model.detect(img).then(predictions => {
40+
console.log('Predictions: ', predictions);
41+
});
42+
});
43+
</script>
44+
```
45+
46+
### via NPM
47+
48+
```js
49+
// Note: you do not need to import @tensorflow/tfjs here.
50+
51+
import * as objectDetection from '@tensorflow-models/object-detection';
52+
53+
const img = document.getElementById('img');
54+
55+
// Load the model.
56+
const model = await objectDetection.load();
57+
58+
// Detect objects in the image.
59+
const predictions = await model.detect(img);
60+
61+
console.log('Predictions: ');
62+
console.log(predictions);
63+
```
64+
65+
You can also take a look at the [demo app](./demo).
66+
67+
## API
68+
69+
#### Loading the model
70+
`objectDetection` is the global name that becomes available when you load the package with the `<script src>` method. When using ES6 imports, `objectDetection` is the name the module is imported under.
71+
72+
```ts
73+
objectDetection.load(
74+
base?: 'ssd_mobilenet_v1' | 'ssd_mobilenet_v2' | 'ssdlite_mobilenet_v2'
75+
)
76+
```
77+
78+
Args:
79+
**base:** Controls the base CNN model. Can be 'ssd_mobilenet_v1', 'ssd_mobilenet_v2' or 'ssdlite_mobilenet_v2'. Defaults to 'ssdlite_mobilenet_v2'.
80+
ssdlite_mobilenet_v2 is smallest in size, and fastest in inference speed.
81+
ssd_mobilenet_v2 has the highest classification accuracy.
82+
83+
Returns a `model` object.
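
Because `base` accepts only three string values, it can help to validate a user-supplied choice before passing it to `objectDetection.load`. The following is a minimal sketch, not part of the package: `SUPPORTED_BASES` and `resolveBase` are hypothetical names introduced here for illustration, and the default mirrors the one documented above.

```js
// Hypothetical helper, not part of the package API.
// The list mirrors the three base values documented above.
const SUPPORTED_BASES = [
  'ssd_mobilenet_v1',
  'ssd_mobilenet_v2',
  'ssdlite_mobilenet_v2',
];

// Returns a valid base name, falling back to the documented default
// when no value is given, and throwing on an unknown value.
function resolveBase(base = 'ssdlite_mobilenet_v2') {
  if (!SUPPORTED_BASES.includes(base)) {
    throw new Error(
        `Unknown base "${base}". Expected one of: ${SUPPORTED_BASES.join(', ')}`);
  }
  return base;
}

console.log(resolveBase());
console.log(resolveBase('ssd_mobilenet_v1'));
```

The result could then be passed along as `objectDetection.load(resolveBase(userChoice))`, where `userChoice` is whatever value your UI provides.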
84+
85+
#### Detecting the objects
86+
87+
You can detect objects with the model without needing to create a Tensor.
88+
`model.detect` takes an input image element and returns an array of bounding boxes with class name and confidence level.
89+
90+
This method exists on the model that is loaded from `objectDetection.load`.
91+
92+
```ts
93+
model.detect(
94+
img: tf.Tensor3D | ImageData | HTMLImageElement |
95+
HTMLCanvasElement | HTMLVideoElement, maxNumBoxes: number
96+
)
97+
```
98+
99+
Args:
100+
101+
- **img:** A Tensor or an image element to make a detection on.
102+
- **maxNumBoxes:** The maximum number of bounding boxes of detected objects. There can be multiple objects of the same class, but at different locations. Defaults to 20.
103+
104+
Returns an array of detected objects, each with a bounding box, class name, and score, that looks like:
105+
106+
```js
107+
[{
108+
bbox: [x, y, width, height],
109+
class: "person",
110+
score: 0.8380282521247864
111+
}, {
112+
bbox: [x, y, width, height],
113+
class: "kite",
114+
score: 0.74644153267145157
115+
}]
116+
```
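
Since each `bbox` is given as `[x, y, width, height]`, a common follow-up step is to drop low-confidence detections and convert the boxes to corner coordinates. The sketch below assumes the result shape shown above; `toCornerBoxes`, its `minScore` default of 0.5, and the sample values are illustrative assumptions, not part of the package.

```js
// Hypothetical helper: keeps detections whose score is at least `minScore`
// and derives corner coordinates from bbox = [x, y, width, height].
function toCornerBoxes(predictions, minScore = 0.5) {
  return predictions
      .filter((p) => p.score >= minScore)
      .map((p) => {
        const [x, y, width, height] = p.bbox;
        return {
          class: p.class,
          score: p.score,
          left: x,
          top: y,
          right: x + width,
          bottom: y + height,
        };
      });
}

// Sample input shaped like the output above (the numbers are made up).
const sample = [
  {bbox: [10, 20, 100, 50], class: 'person', score: 0.84},
  {bbox: [200, 40, 30, 30], class: 'kite', score: 0.3},
];

console.log(toCornerBoxes(sample));
```

With the sample input, only the `person` detection passes the 0.5 threshold.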
117+
118+
### Technical details for advanced users
119+
120+
This model is based on the TensorFlow Object Detection API. You can download the original models from [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models). We applied the following optimizations to improve performance for browser execution:
121+
122+
1. Removed the post-processing graph from the original model.
123+
2. Used single-class NonMaxSuppression instead of the original multi-class NonMaxSuppression, which is faster with similar accuracy.
124+
3. Executed NonMaxSuppression operations on the CPU backend instead of WebGL to avoid delays from texture downloads.
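
To illustrate what single-class NonMaxSuppression means, here is a plain JavaScript sketch of greedy, class-agnostic suppression over `[x, y, width, height]` boxes. It is not the package's implementation, which relies on TensorFlow.js ops. The names `iou` and `singleClassNms` and the IoU threshold of 0.5 are assumptions made for illustration; the default of 20 boxes follows the `maxNumBoxes` default documented above.

```js
// Intersection over union of two boxes given as [x, y, width, height].
function iou(a, b) {
  const x1 = Math.max(a[0], b[0]);
  const y1 = Math.max(a[1], b[1]);
  const x2 = Math.min(a[0] + a[2], b[0] + b[2]);
  const y2 = Math.min(a[1] + a[3], b[1] + b[3]);
  const inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  const union = a[2] * a[3] + b[2] * b[3] - inter;
  return union > 0 ? inter / union : 0;
}

// Greedy, class-agnostic NMS: visit detections by descending score and keep
// one only if it does not overlap an already kept box by more than
// `iouThreshold`. Classes are ignored, so suppression runs once over all
// boxes rather than once per class.
function singleClassNms(detections, maxNumBoxes = 20, iouThreshold = 0.5) {
  const sorted = [...detections].sort((a, b) => b.score - a.score);
  const kept = [];
  for (const det of sorted) {
    if (kept.length >= maxNumBoxes) break;
    if (kept.every((k) => iou(k.bbox, det.bbox) <= iouThreshold)) {
      kept.push(det);
    }
  }
  return kept;
}

// Made-up detections: the first two overlap heavily, the third is far away.
const candidates = [
  {bbox: [0, 0, 10, 10], class: 'person', score: 0.9},
  {bbox: [1, 1, 10, 10], class: 'kite', score: 0.8},
  {bbox: [50, 50, 10, 10], class: 'kite', score: 0.7},
];

console.log(singleClassNms(candidates).map((d) => d.class));
```

A multi-class variant would run the same loop separately for every class, which is where the extra cost comes from.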
125+
126+
Here is the converter command for removing the post-processing graph:
127+
128+
```sh
129+
tensorflowjs_converter --input_format=tf_saved_model \
130+
--output_node_names='Postprocessor/ExpandDims_1,Postprocessor/Slice' \
131+
--saved_model_tags=serve \
132+
./saved_model \
133+
./web_model
134+
```

coco-ssd/demo/.babelrc

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
{
2+
"presets": [
3+
[
4+
"env",
5+
{
6+
"esmodules": false,
7+
"targets": {
8+
"browsers": [
9+
"> 3%"
10+
]
11+
}
12+
}
13+
]
14+
],
15+
"plugins": [
16+
"transform-runtime"
17+
]
18+
}

coco-ssd/demo/image1.jpg

291 KB

coco-ssd/demo/image2.jpg

52.7 KB

coco-ssd/demo/index.html

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
<!-- Copyright 2018 Google LLC. All Rights Reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License");
4+
you may not use this file except in compliance with the License.
5+
You may obtain a copy of the License at
6+
7+
http://www.apache.org/licenses/LICENSE-2.0
8+
9+
Unless required by applicable law or agreed to in writing, software
10+
distributed under the License is distributed on an "AS IS" BASIS,
11+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
See the License for the specific language governing permissions and
13+
limitations under the License.
14+
==============================================================================-->
15+
<!doctype html>
16+
<html>
17+
18+
<body>
19+
<h1>TensorFlow.js Object Detection</h1>
20+
<select id='base_model'>
21+
<option value="ssdlite_mobilenet_v2">SSD Lite Mobilenet V2</option>
22+
<option value="ssd_mobilenet_v1">SSD Mobilenet v1</option>
23+
<option value="ssd_mobilenet_v2">SSD Mobilenet v2</option>
24+
</select>
25+
<button type="button" id="run">Run</button>
26+
<button type="button" id="toggle">Toggle Image</button>
27+
<div>
28+
<img id="image"/>
29+
<canvas id="canvas" width="600" height="399"></canvas>
30+
</div>
31+
</body>
32+
33+
<script src="index.js"></script>
34+
</html>

coco-ssd/demo/index.js

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
1+
/**
2+
* @license
3+
* Copyright 2018 Google LLC. All Rights Reserved.
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
* =============================================================================
16+
*/
17+
18+
// TODO(ping): Switch to package import when the npm is published.
19+
import * as objectDetection from '../src';
20+
21+
import imageURL from './image1.jpg';
22+
import image2URL from './image2.jpg';
23+
24+
let modelPromise;
25+
let baseModel = 'ssdlite_mobilenet_v2';
26+
27+
window.onload = () => modelPromise = objectDetection.load();
28+
29+
const button = document.getElementById('toggle');
30+
button.onclick = () => {
31+
image.src = image.src.endsWith(imageURL) ? image2URL : imageURL;
32+
};
33+
34+
const select = document.getElementById('base_model');
35+
select.onchange = async (event) => {
36+
const model = await modelPromise;
37+
model.dispose();
38+
modelPromise = objectDetection.load(
39+
event.target.value);
40+
};
41+
42+
const image = document.getElementById('image');
43+
image.src = imageURL;
44+
45+
const runButton = document.getElementById('run');
46+
runButton.onclick = async () => {
47+
const model = await modelPromise;
48+
console.log('model loaded');
49+
console.time('predict1');
50+
const result = await model.detect(image);
51+
console.timeEnd('predict1');
52+
53+
54+
const c = document.getElementById('canvas');
55+
const context = c.getContext('2d');
56+
context.drawImage(image, 0, 0);
57+
context.font = '10px Arial';
58+
59+
console.log('number of detections: ', result.length);
60+
for (let i = 0; i < result.length; i++) {
61+
context.beginPath();
62+
context.rect(...result[i].bbox);
63+
context.lineWidth = 1;
64+
context.strokeStyle = 'green';
65+
context.fillStyle = 'green';
66+
context.stroke();
67+
context.fillText(
68+
result[i].score.toFixed(3) + ' ' + result[i].class, result[i].bbox[0],
69+
result[i].bbox[1] > 10 ? result[i].bbox[1] - 5 : 10);
70+
}
71+
};

coco-ssd/demo/package.json

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
{
2+
"name": "tfjs-object-detection-demo",
3+
"version": "0.0.1",
4+
"description": "",
5+
"main": "index.js",
6+
"license": "Apache-2.0",
7+
"private": true,
8+
"engines": {
9+
"node": ">=8.9.0"
10+
},
11+
"dependencies": {
12+
"@tensorflow/tfjs": "0.12.7",
13+
"stats.js": "^0.17.0"
14+
},
15+
"scripts": {
16+
"watch": "NODE_ENV=development parcel --no-hmr --open index.html ",
17+
"build": "NODE_ENV=production parcel build index.html --no-minify --public-url ./",
18+
"lint": "eslint .",
19+
"link-local": "yalc link @tensorflow-models/object-detection"
20+
},
21+
"devDependencies": {
22+
"babel-plugin-transform-runtime": "~6.23.0",
23+
"babel-runtime": "6.26.0",
24+
"babel-polyfill": "~6.26.0",
25+
"babel-preset-env": "~1.6.1",
26+
"babel-preset-es2017": "^6.24.1",
27+
"clang-format": "~1.2.2",
28+
"dat.gui": "^0.7.1",
29+
"eslint": "^4.19.1",
30+
"eslint-config-google": "^0.9.1",
31+
"parcel-bundler": "~1.6.2",
32+
"yalc": "~1.0.0-pre.21"
33+
},
34+
"eslintConfig": {
35+
"extends": "google",
36+
"rules": {
37+
"require-jsdoc": 0,
38+
"valid-jsdoc": 0
39+
},
40+
"env": {
41+
"es6": true
42+
},
43+
"parserOptions": {
44+
"ecmaVersion": 8,
45+
"sourceType": "module"
46+
}
47+
},
48+
"eslintIgnore": [
49+
"dist/"
50+
]
51+
}
