Typecraft/casetagger

Latest commit: b311f33 · May 10, 2017 (96 commits)

Case Tagger

Part-of-speech and morphological tagger employing a simple case-based algorithm.

Overview

The case tagger is a polyglot part-of-speech and morphological gloss-tagger. The tag-set used is the Typecraft tag-set.

The tagger uses simple case-based learning from a large corpus to create a large database of different cases for each language.

When tagging a phrase, the tagger fetches any relevant case for each word, and then 'merges' the cases.
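
The fetch-and-merge idea described above can be illustrated with a toy sketch. This is not the casetagger implementation: the feature set (word form, last character, previous word), the merge rule (summing counts), the tiny corpus and the tag labels are all assumptions chosen to keep the example short.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each phrase is a list of (word, tag) pairs.
# The tags are illustrative and not taken from the Typecraft tag-set docs.
CORPUS = [
    [("the", "DET"), ("dog", "N"), ("barks", "V")],
    [("the", "DET"), ("cat", "N"), ("sleeps", "V")],
    [("a", "DET"), ("dog", "N"), ("sleeps", "V")],
]


def features(words, i):
    """Describe position i by a few simple contexts (an assumed feature set)."""
    prev_word = words[i - 1] if i > 0 else "<s>"
    return [
        ("word", words[i]),
        ("suffix", words[i][-1:]),
        ("prev", prev_word),
    ]


def train(corpus):
    """Build the case database: feature -> counts of tags seen with it."""
    cases = defaultdict(Counter)
    for phrase in corpus:
        words = [word for word, _ in phrase]
        for i, (_, tag_label) in enumerate(phrase):
            for feat in features(words, i):
                cases[feat][tag_label] += 1
    return cases


def tag(cases, words):
    """Fetch the relevant cases for each word and merge them by summing counts."""
    result = []
    for i, word in enumerate(words):
        merged = Counter()
        for feat in features(words, i):
            merged.update(cases.get(feat, {}))
        # None marks a word for which no case matched at all.
        best = merged.most_common(1)[0][0] if merged else None
        result.append((word, best))
    return result


cases = train(CORPUS)
print(tag(cases, ["the", "bird", "sleeps"]))
# → [('the', 'DET'), ('bird', 'N'), ('sleeps', 'V')]
```

The unseen word "bird" still receives a tag because the case keyed on the previous word ("the") matches; this is the kind of behaviour that merging several cases is meant to provide.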

Installation

or

Usage

After installation, the casetagger command is available.

The three different subcommands are tag, train and test.

Usage: casetagger test [OPTIONS] [FILES]...

Options:
  --language TEXT
  --raw-text
  --output-raw-text
  --print-test-details
  --help                Show this message and exit.

Usage: casetagger train [OPTIONS] [FILES]...

Options:
  --language TEXT
  --help           Show this message and exit.

Each command takes files as arguments. Each file is expected to be a TC-XML file. All output is written to stdout.
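
For scripted use, the commands can be driven from Python. The sketch below only assembles argument lists from the options shown in the usage text above; the file names `corpus.xml` and `held_out.xml` and the language value `nob` are placeholders, and the format that `--language` expects is an assumption. The helper names `build_command` and `run` are hypothetical and not part of casetagger.

```python
import subprocess


def build_command(subcommand, files, language=None, flags=()):
    """Assemble an argument list for the casetagger command line."""
    if subcommand not in ("tag", "train", "test"):
        raise ValueError("unknown subcommand: %s" % subcommand)
    command = ["casetagger", subcommand]
    if language is not None:
        command += ["--language", language]
    command += list(flags)
    command += list(files)
    return command


def run(command):
    """Run the command and return what it wrote to stdout.

    Requires casetagger to be installed; raises CalledProcessError on failure.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        check=True,
        universal_newlines=True,
    )
    return completed.stdout


# Placeholder file names and language value (assumptions, see above).
train_cmd = build_command("train", ["corpus.xml"], language="nob")
test_cmd = build_command(
    "test", ["held_out.xml"], language="nob", flags=["--print-test-details"]
)
print(" ".join(train_cmd))
```

Because all output goes to stdout, `run(test_cmd)` would return the test report as a string that can be saved or parsed by the caller.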

Configuration

TODO

Features

  • TODO

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.