+2. [X] Uses Maven (`pom.xml` exists)
+3. [X] Is available as OSGi bundle (has `MANIFEST.MF`)
+4. [X] Is available from a p2 repository:
+
+#### Feature matrix
+
+
+
+| | Avail. | Multiple Options | Funct. docs | Trainable | Training docs | Input | Output |
+| ------------------------------ | :----: | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------- | :-------: | ------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Tokenization/<br>segmentation | X | [- Abstract Tokenizer]()<br>[- CHBT Tokenizer]()<br>[- Lexer Tokenizer]()<br>[- Negra Penn Tokenizer]()<br>[- Penn Treebank Tokenizer]()<br>[- PTB Tokenizer]()<br>[- Robust Tokenizer]()<br>[- WhitespaceTokenizer]() | [Tokenizer Documentation]() | | | - string of document text<br>- CoreNLP's `CoreDocument` (instantiated with the document text string) | - list of strings<br>- list of character offset begin indices<br>- list of character offset end indices<br>- `CoreDocument` with previous annotation properties |
+| Sentencing | X | | [Sentencer Documentation]() | | | tokenized `CoreDocument` | `CoreDocument` with list of sentences as property |
+| POS-tagging | X | | [POS Tag Documentation]() | X | | tokenized and sentence-split `CoreDocument` | `CoreDocument` with string list of POS tags as property |
+| Constituency parsing | X | [Viterbi Parser]()<br>[Shift reduce Parser]()<br>[Iterative CKYPCFG Parser]()<br>[Fast Factored Parser]()<br>[Exhaustive PCFG Parser]() | [Constituency Parser Documentation]() | | | tokenized, sentence-split (and for some models POS-tagged) `CoreDocument` | `CoreDocument` with TreeAnnotation (exact form depends on chosen parser) |
+| Dependency parsing | X | [BiLexPCFGParser]()<br>[Exhaustive Dependency Parser]() | [Dep Parse Documentation]() | X | [Train own Model]() | tokenized, sentence-split and POS-tagged `CoreDocument` | `CoreDocument` with DependencyAnnotation (exact form depends on chosen parser) |
+| Named Entity Recognition | X | [NER Classifier Combiner]()<br>[- Regex NER Annotator]() | [NER Documentation]() | X | [NER Training docs]() | tokenized, sentence-split, POS-tagged, (lemmatized) `CoreDocument` | `CoreDocument` with `Named Entity Tag Annotation` or `Normalized Named Entity Tag Annotation` |
+| Functionalities extensible | X | [Custom annotator]() | | | | | |
+| Can consume own models | X | | [Example of including own (caseless) model]() | | | | |
+
+
+
+
+### Talismane
+
+> [Talismane website]()
+
+1. [X] Implemented in Java
+ 1. [ ] Not Java, but API can be addressed from Java
+ - Can be addressed as follows: to the best of our knowledge it's only accessible via common ways to integrate Python scripts in Java (e.g. JEPP, `PythonInterpreter`, `Runtime.exec()`, ...)
+2. [X] Uses Maven (`pom.xml` exists)
+3. [ ] Is available as OSGi bundle (has `MANIFEST.MF`)
+4. [ ] Is available from a p2 repository:
+
+#### Feature matrix
+
+
+
+| | Avail. | Multiple Options | Funct. docs | Trainable | Training docs | Input | Output |
+| ------------------------------ | :----: | ------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------- | :-------: | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ | ------------ |
+| Tokenization/<br>segmentation | X | - Simple tokenizer<br>- Pattern tokenizer | [Tokenization Documentation](<>) | X | [Tokenizer Training docs]() | string of raw text | CoNLL format |
+| Sentencing | X | | | X | [Sentence Splitter Training docs]() | string of raw text | CoNLL format |
+| POS-tagging | X | | | X | [POS Tagger Training docs]() | string of raw text | CoNLL format |
+| Constituency parsing | | | | | | | |
+| Dependency parsing | X | | [DepParser Documentation (under construction)]() | X | [DepParser Training docs]() | string of raw text | CoNLL format |
+| Named Entity Recognition | | | | | | | |
+| Functionalities extensible | X | | [Advanced Usage]() | | | | |
+| Can consume own models | | | | | | | |
+
+
+
+
+
+
+### TextBlob
+
+> [TextBlob website]()
+
+1. [ ] Implemented in Java
+ 1. [ ] Not Java, but API can be addressed from Java
+ - Can be addressed as follows: to the best of our knowledge it's only accessible via common ways to integrate Python scripts in Java (e.g. JEPP, `PythonInterpreter`, `Runtime.exec()`, ...)
2. [ ] Uses Maven (`pom.xml` exists)
3. [ ] Is available as OSGi bundle (has `MANIFEST.MF`)
-4. [ ] Is available from a p2 repository: n/a
+4. [ ] Is available from a p2 repository:
#### Feature matrix
-
+
+
+| | Avail. | Multiple Options | Funct. docs | Trainable | Training docs | Input | Output |
+| ------------------------------ | :----: | -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-------: | ------------- | ------------------ | -------------------------------------------------------------------------------------------- |
+| Tokenization/<br>segmentation | X | | [Tokenization Tutorial]()<br>[Advanced Tokenization Documentation]() | | | string of raw text | `TextBlob` data type - access through `words` property: WordList of word strings |
+| Sentencing | X | | [Sentence Splitting Tutorial]()<br>[Advanced Sentence Splitting Documentation]() | | | string of raw text | `TextBlob` data type - access through `sentences` property: list of sentence objects |
+| POS-tagging | X | - PatternTagger<br>- NLTKTagger | [POS Tagger Tutorial]()<br>[POS Tagger Advanced Usage]() | | | string of raw text | `TextBlob` data type - access through `tags` property: list of (word, tag) string tuples |
+| Constituency parsing | X | | [Parser Tutorial]()<br>[Parser Advanced Usage]() | | | string of raw text | `TextBlob` data type - access through `parse()` method: TaggedString |
+| Dependency parsing | X | | [Parser Tutorial]()<br>[Parser Advanced Usage]() | | | string of raw text | `TextBlob` data type - access through `parse()` method: TaggedString |
+| Named Entity Recognition | | | | | | | |
+| Functionalities extensible | | | | | | | |
+| Can consume own models | X | | [Passing models into the Pipeline]()<br>[Training own data]() | | | | |
-| | Has functionality | Functionality extensible | Functionality documentation | Extension documentation | Input data | Output data |
-|---------------------------|--------------------------|--------------------------|-----------------------------|------------------------------|--------------------------------------|-------------|
-| Tokenization/segmentation | | | | | | |
-| Sentencing | | | | | | |
-| POS-tagging | | | | | | |
-| Constituency parsing | | | | | | |
-| Dependency parsing | | | | | | |
-| Trainable models | | | | | | |
-| Can consume own models | | | | | | |
----
+
-[^1]: The survey was carried out by Prashant Dangwal.
\ No newline at end of file
+[^1]: The survey was carried out by Clara Lachenmaier.