Serif Imagine is a tool that generates an ASCII storybook video from a passage of text.
To set up the client, install the Node dependencies and start the local development server.
npm i
npm run start
To set up the server, navigate to /backend and install the Python requirements. From there, you can start the server.
cd backend
pip install -r requirements.txt
flask run
When you enter a query in the client, the text is printed to the console.
We use NLTK to parse out all nouns from the text. We then filter out common stopwords and noise and return the three most frequent nouns to capture the overarching topic of a piece.
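The frequency step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `tag_text`, `top_nouns`, and `NOISE` are hypothetical, and the noise list is a stand-in for whatever stopword filtering the backend really uses. The counting function works on (word, tag) pairs in the Penn Treebank style that NLTK's tagger returns, so it can be tried without NLTK installed.

```python
from collections import Counter

# Illustrative noise list only (assumption); the project's real stopword
# filtering is not shown in this README.
NOISE = frozenset({"thing", "way", "time", "lot"})


def tag_text(text):
    """Tokenize and POS-tag text with NLTK.

    NLTK is imported lazily; its tokenizer and tagger data must already
    have been fetched with nltk.download.
    """
    import nltk

    return nltk.pos_tag(nltk.word_tokenize(text))


def top_nouns(tagged, k=3, noise=NOISE):
    """Return the k most frequent nouns from a list of (word, tag) pairs."""
    nouns = [
        word.lower()
        for word, tag in tagged
        # Penn Treebank noun tags all start with "NN" (NN, NNS, NNP, NNPS).
        if tag.startswith("NN") and word.isalpha() and word.lower() not in noise
    ]
    return [word for word, _ in Counter(nouns).most_common(k)]
```

In use, `top_nouns(tag_text(passage))` would yield up to three nouns for the image prompt. Ties are broken by first appearance in the text, which is how `Counter.most_common` orders equal counts.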
We are working on refining this extraction with a couple of other experimental approaches:
- Extracting the nouns that are least common relative to a large reference dataset of words, to capture objects or concepts unique to the story.
- Extracting adjective-noun bigrams and groupings of nouns, to generate imagery more specific to the story itself.
Both of these approaches are in development but can be tested by modifying the synthesize function within backend/src/text_to_ascii_background_generator.py to use the second output of extract nouns or extract_interesting_bigrams.
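To make the two experimental approaches more concrete, here is a rough sketch of each. It is an illustration under assumptions, not the code in the backend: `rarest_nouns`, `descriptive_bigrams`, and the `reference_counts` mapping are hypothetical names, and the reference word counts would have to come from whatever large dataset is chosen. Both functions take (word, tag) pairs with Penn Treebank tags, as produced by NLTK's tagger.

```python
from collections import Counter


def rarest_nouns(tagged, reference_counts, k=3):
    """Return the k nouns that are least frequent in a reference corpus.

    reference_counts maps a lowercase word to its count in the large
    dataset (hypothetical input). Words missing from it count as 0,
    so they rank as the most unusual.
    """
    nouns = [w.lower() for w, t in tagged if t.startswith("NN") and w.isalpha()]
    unique = list(dict.fromkeys(nouns))  # de-duplicate, keep first-seen order
    unique.sort(key=lambda word: reference_counts.get(word, 0))
    return unique[:k]


def descriptive_bigrams(tagged, k=3):
    """Return the k most frequent adjective-noun or noun-noun bigrams."""
    pairs = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        # First word is an adjective (JJ*) or noun (NN*); second is a noun.
        if (t1.startswith("JJ") or t1.startswith("NN")) and t2.startswith("NN"):
            pairs.append(f"{w1.lower()} {w2.lower()}")
    return [pair for pair, _ in Counter(pairs).most_common(k)]
```

The bigram version only looks at adjacent tokens, so a phrase such as "castle of stone" would not be captured; longer noun groupings would need a chunker or a wider window.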
We feed the output of the noun extraction directly to DALL-E. From there, we strip the background from the generated image and convert it to ASCII text with an image-to-ASCII processor.
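The conversion step can be pictured with a small sketch like the one below. It is not the processor the project uses: the function name `pixels_to_ascii`, the character ramp, and the input layout are all assumptions made for illustration. It expects pixels that have already been loaded and downscaled by an image library, given as rows of (gray, alpha) pairs in the range 0-255, where an alpha of 0 marks background that was stripped.

```python
# Characters ordered from lightest to densest (illustrative choice).
RAMP = " .:-=+*#%@"


def pixels_to_ascii(rows, ramp=RAMP):
    """Convert rows of (gray, alpha) pixel pairs into ASCII art.

    Fully transparent pixels (removed background) become spaces;
    darker visible pixels map to denser characters.
    """
    lines = []
    for row in rows:
        chars = []
        for gray, alpha in row:
            if alpha == 0:
                chars.append(" ")
                continue
            darkness = 255 - gray
            # Scale darkness 0..255 onto the ramp indices 0..len(ramp)-1.
            index = darkness * (len(ramp) - 1) // 255
            chars.append(ramp[index])
        lines.append("".join(chars))
    return "\n".join(lines)
```

Because terminal characters are taller than they are wide, a real converter would usually sample fewer rows than columns to keep the picture's proportions.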
These images are sent back to the client and rendered in a storybook-like format. Right now the image generation process takes around 20-30 seconds. We are exploring affordable alternatives like Stable Diffusion to reduce the load time.