Blini

Chatbot module based on second-order Markov chains.

Features

  • Text generation with two words of context (or fewer if the learned vocabulary is too small)
  • Japanese tokenization using TinySegmenter, including mixed Japanese/alphabetical strings
  • Index of image URLs (e.g. for overlaying text output on top of learned images)
  • Tagging support for tokens (e.g. for generating output based on metadata gathered during learning)
  • Optional persistent storage via node-persist, or any similar package that provides a setItem() function based on the Web Storage API

Usage

Basic text generation

const blini = new (require('blini')).Blini();
blini.processInput('hello world and hello blini');
console.log(blini.generateOutput('')); // Generates output without context
console.log(blini.generateOutput('hello')); // Generates output with a single word of context
console.log(blini.generateOutput('hello world')); // Generates output with two words of context
console.log(blini.generateOutput('goodbye world')); // Generates output with only the last word for context, because "goodbye" is not part of the learned vocabulary
console.log(blini.generateOutput('goodbye and hello')); // Generates output with context taken from the last two words of the input

Example output:

hello world and hello blini
hello blini
hello world and hello blini
goodbye world and hello blini
goodbye and hello blini

Of course, the more natural-language text you provide for learning, the more varied and interesting the output will be.
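For example, a larger corpus can be fed in line by line (a sketch only; `corpus.txt` is a hypothetical file of one sentence per line, and the splitting logic is shown as a plain helper):

```javascript
// Hypothetical helper: split a corpus into non-empty lines, each of
// which could then be passed to blini.processInput() in turn.
function corpusLines(text) {
    return text.split('\n').map(l => l.trim()).filter(l => l.length > 0);
}

const lines = corpusLines('hello world\n\ngoodbye world\n');
console.log(lines); // → [ 'hello world', 'goodbye world' ]
// for (const line of lines) blini.processInput(line);
```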

Setting up persistent storage using node-persist

const blini = new (require('blini')).Blini();
const storage = require('node-persist');

storage.init().then(_ => {
    blini.storage = storage;
    storage.getItem('bliniDictionary').then(function(data) {
        if (data) blini.dictionary = JSON.parse(data);
        console.info(`Loaded Blini dictionary (${Object.keys(blini.dictionary).length} entries)!`);
    });
    storage.getItem('bliniImages').then(function(data) {
        if (data) blini.images = JSON.parse(data);
        console.info(`Loaded Blini images (${Object.keys(blini.images).length} entries)!`);
    });
});
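Because Blini only requires a Web-Storage-style setItem() (plus getItem() for loading, as above), any compatible object can stand in for node-persist. A minimal in-memory adapter, as an illustrative sketch:

```javascript
// Minimal in-memory stand-in for node-persist: any object exposing
// promise-based Web-Storage-style setItem()/getItem() can be assigned
// to blini.storage. Useful for tests or disk-less environments.
// Illustrative sketch; not part of Blini itself.
function createMemoryStorage() {
    const store = {};
    return {
        setItem: (key, value) => { store[key] = value; return Promise.resolve(); },
        getItem: (key) => Promise.resolve(store[key]),
    };
}

const storage = createMemoryStorage();
storage.setItem('bliniDictionary', JSON.stringify({ hello: ['world'] }))
    .then(() => storage.getItem('bliniDictionary'))
    .then(data => console.log(JSON.parse(data).hello)); // → [ 'world' ]
```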

Image generation

There are some special requirements to enable image generation.

  • The module needs filesystem write access to the img/.attachments path (relative to your application's working directory), where generated images are stored temporarily and deleted after your callback function completes.
  • Its imageFonts property must be set to a list of fonts used for overlaying text on images. You can specify multiple fonts to cover a larger range of character sets, for example Latin characters, Chinese characters, emoji, etc.
    • Fonts must be objects with path and family properties. Paths may be relative to your application's working directory; copy the fonts you need manually for portability.

To ensure that overlaid text has enough space to be readable, images must have an aspect ratio no higher than 2.5 in either direction.
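The aspect-ratio constraint can be checked before handing an image to the module (a hypothetical helper, not part of Blini's API):

```javascript
// Hypothetical pre-check (not part of Blini's API): an image is usable
// for text overlay only if neither width/height nor height/width
// exceeds 2.5.
function hasUsableAspectRatio(width, height) {
    return Math.max(width / height, height / width) <= 2.5;
}

console.log(hasUsableAspectRatio(985, 656));   // → true  (ratio ≈ 1.5)
console.log(hasUsableAspectRatio(3000, 1000)); // → false (ratio 3.0)
```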

const fs = require('fs');
const blini = new (require('blini')).Blini();
blini.imageFonts.push({ path: '.fonts/NotoSans-Bold.ttf', family: 'Noto Sans' });

blini.processInput('hello world and hello blini');
blini.processImage({ width: 985, height: 656, url: 'https://media.xiph.org/tdaede/a.png' });
blini.generateImage('hello world', async function(imagePath) {
    if (!imagePath) {
        throw new Error('could not generate an image');
    }

    // Copy the generated image, as the temporary file will be deleted
    // after this function returns.
    await fs.promises.copyFile(imagePath, 'out.png');
    console.log('Generated an image at out.png!');
});

Example output:

Sample image with generated text overlaid on top

Tagging

It's possible to filter the vocabulary based on arbitrary metadata gathered at input time. One simple example is to tag chat messages by author, and generate output based on input from a specific author:

const blini = new (require('blini')).Blini();
blini.processInput('hello world and hello blini', { author: 'rzumer' });
blini.processInput('hello alfred and goodbye blini', { author: 'someguy' });
console.log(blini.generateOutput('hello')); // Generates output without tag filtering
console.log(blini.generateOutput('hello', 'author', 'rzumer')); // Generates output with tag filtering
console.log(blini.generateOutput('goodbye', 'author', 'rzumer')); // Generates output with tag filtering, but context missing from the tagged author's vocabulary

Example output:

hello alfred and goodbye blini
hello world and hello blini
goodbye?
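The tag mechanism can be pictured as attaching the metadata supplied at input time to each learned entry and filtering on it at generation time (an illustrative sketch, not Blini's actual internals):

```javascript
// Illustrative sketch of tag filtering (not Blini's internals): each
// learned entry carries the tags given at input time, and generation
// only considers entries whose tags match the requested filter.
function filterByTag(entries, tagKey, tagValue) {
    return entries.filter(e => e.tags && e.tags[tagKey] === tagValue);
}

const entries = [
    { word: 'world', tags: { author: 'rzumer' } },
    { word: 'alfred', tags: { author: 'someguy' } },
];
console.log(filterByTag(entries, 'author', 'rzumer').map(e => e.word)); // → [ 'world' ]
```

When the filtered vocabulary lacks the requested context, output degrades gracefully, as the "goodbye?" example above shows.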
