Final project for Northwestern University MSDS 462 - Computer Vision
This project combines object detection and image classification on an Android edge device. In short, users can take a photo of a Tarot spread and the app will detect and classify up to 5 individual cards.
- The awesome Augmentor Python package helped a ton with expanding the training dataset.
- Google AutoML Vision was used to train and test the model.
- The model is a TensorFlow Lite multi-class classifier built for edge device deployment.
- The app itself is modified from Google's ML-Kit demo app.
The dataset is custom. While many different Tarot decks exist, by far the most popular is the Rider-Waite-Smith deck, so I opted to go with that. I used a Google Pixel 3a to photograph the cards.
Standard Tarot decks have 78 cards, each of which has a different interpretation when displayed right-side up or upside-down, for 156 possible classes in total. Unfortunately, building a dataset for, and training a custom object detection model on, 156 classes was a monster task, so I opted to focus solely on the 22 major arcana. Hopefully this can be extended in the future to include the whole enchilada!
This task lent itself well to image augmentation. The majority of variability within each class is likely to come from different camera sensors, lighting, or photo angle (as opposed to natural variability within the class). I used the awesome Augmentor Python package to augment the size of the training dataset enormously.
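A minimal sketch of what such an Augmentor pipeline might look like. The operations, probabilities, and magnitudes are assumptions chosen to mimic the sources of variability named above (photo angle, lighting, framing); they are not the settings used for this project, and `build_tarot_pipeline` and the folder layout are hypothetical.

```python
def build_tarot_pipeline(source_dir, output_dir="augmented"):
    """Return an Augmentor pipeline that simulates angle and lighting changes.

    Assumes source_dir holds one sub-folder of photos per card class.
    """
    # Imported here so that defining the function does not require Augmentor.
    import Augmentor

    pipeline = Augmentor.Pipeline(source_directory=source_dir,
                                  output_directory=output_dir)

    # Photo angle: small rotations and perspective skews.
    # No vertical flips, since an upside-down card has a different interpretation.
    pipeline.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
    pipeline.skew(probability=0.5, magnitude=0.3)

    # Lighting and sensor differences: brightness and contrast jitter.
    pipeline.random_brightness(probability=0.5, min_factor=0.7, max_factor=1.3)
    pipeline.random_contrast(probability=0.5, min_factor=0.7, max_factor=1.3)

    # Framing: mild zoom, as if the photo was taken from closer in.
    pipeline.zoom(probability=0.3, min_factor=1.05, max_factor=1.2)

    return pipeline


# Example (hypothetical folder name), writing 1,000 augmented images:
# build_tarot_pipeline("photos/major_arcana").sample(1000)
```

Each operation fires independently with its own probability, so a single source photo can yield many distinct variants.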

I used Google's AutoML Vision service to train and evaluate the models. It's an awesome service that makes machine learning model building so straightforward. It was a simple matter of uploading the photos to a Google Cloud storage bucket, generating a CSV to direct the model to each photo, then choosing the model output type I desired.
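The CSV step can be sketched roughly as follows. This assumes the headerless `ML_USE,GCS_PATH,LABEL` row layout that AutoML Vision accepts for single-label classification imports (worth checking against the current docs). The bucket name, file names, labels, and the `build_rows` and `write_csv` helpers are all hypothetical.

```python
import csv
import io


def build_rows(bucket, photos_by_label, ml_use="UNASSIGNED"):
    """Yield (ml_use, gcs_uri, label) rows, one per photo.

    Assumes photos were uploaded to gs://<bucket>/<label>/<filename>.
    """
    for label, filenames in sorted(photos_by_label.items()):
        for name in sorted(filenames):
            yield (ml_use, f"gs://{bucket}/{label}/{name}", label)


def write_csv(rows, handle):
    """Write the rows to an open text handle, with no header line."""
    csv.writer(handle, lineterminator="\n").writerows(rows)


# Illustrative data only; a real run would list the uploaded files instead.
photos = {
    "the_fool": ["IMG_0001.jpg", "IMG_0002.jpg"],
    "the_magician": ["IMG_0003.jpg"],
}
buffer = io.StringIO()
write_csv(build_rows("my-tarot-bucket", photos), buffer)
print(buffer.getvalue())
```

Leaving the first column as `UNASSIGNED` is meant to let the service choose the train/validation/test split itself; the argument can be changed to pin rows to a specific set.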
Unfortunately, I wasted quite a bit of time (and GCP credit!) by first training an object detection model. It turns out that ML-Kit has its own built-in object detector, which relies on a user-supplied classification model to perform the actual classification, so a custom detection model was unnecessary. In the end, I trained two models.
That said, the second multi-class classification model I trained ended up being quite a bit more accurate and less computationally expensive than the initial object detection model.

An edge device needs a model that is compact and quick enough to store and run, and TensorFlow Lite is the model format that ML-Kit is built to work with. Thankfully, I was able to simply download the TensorFlow Lite model from AutoML instead of having to generate it myself with the notoriously difficult TOCO command line tool.
If you want to visualize the architecture of the Tarot-Bot TFLite model (or any neural net model!), upload it to Netron.
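Alongside Netron, the model's input and output tensors can be inspected from Python. This is a rough sketch that assumes TensorFlow is installed and uses its `tf.lite.Interpreter`; the `model.tflite` path and the helper names are placeholders, not files or functions from this repo.

```python
def describe_tensor(detail):
    """Format one entry from get_input_details() or get_output_details()."""
    shape = [int(dim) for dim in detail["shape"]]
    dtype = detail["dtype"]
    # NumPy dtypes expose __name__; fall back to str() for anything else.
    dtype_name = getattr(dtype, "__name__", str(dtype))
    return f"{detail['name']}: shape={shape}, dtype={dtype_name}"


def summarize_model(model_path):
    """Return one description line per input and output tensor."""
    # Imported lazily so describe_tensor can be used without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    details = interpreter.get_input_details() + interpreter.get_output_details()
    return [describe_tensor(d) for d in details]


# Example with a placeholder path:
# for line in summarize_model("model.tflite"):
#     print(line)
```

For a classifier, the output tensor's last dimension should match the number of classes the model was trained on, which makes this a quick sanity check before dropping the file into the app.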
Google's ML-Kit demo app made deployment of the model on an edge device pretty straightforward. The demo app has a ton of optional code to allow for demonstration of several different computer vision scenarios in both Java and Kotlin. However, getting a custom model deployed was a matter of editing a single file to add a custom detector option, then dropping the .tflite model into the assets folder within the app directory.
For those curious, I added lines 74 and 393-408 in the StillImageActivity.java file according to the ML-Kit object detection docs in order to implement my custom Tarot detector option. No, I am not very good with Java.
- Clone this repo.
- Install Android Studio.
- Open Android Studio and choose the option to Open an Existing Project. Select the `mlkit` folder within the repo. Approve any Gradle or APK updates.
- Once Gradle finishes syncing, click the green play button on the top right to run the app on an Android emulator.
- Upload photos to the downloads folder of the emulator (there are already a few in `assets/test_photos` within this repo).
- Launch the MLKit App, choose Run the ML Kit quickstart written in Java, then StillImageActivity.
- Select Custom Object Detection (Tarot) from the bottom right menu, then select an image.
The system does a fairly good job of detecting objects and classifying cards, although the limit of 5 detected objects per image leaves much to be desired for reading an actual Tarot spread, since many spreads use more than 5 cards.
When using the app via the Android emulator in Android Studio (shown below), the only way to get photos into the emulator is to drag and drop into the downloads folder. When running the app on an actual Android phone, it's possible to take a photo of a card spread using the device camera.

