HfstTransducerExtractPaths

eaxelson edited this page Aug 29, 2017 · 2 revisions

HfstTransducer.extract_paths(self, **kwargs)

Extract paths that are recognized by the transducer.

Parameters:

  • kwargs Arguments recognized are filter_flags, max_cycles, max_number, obey_flags, output, random.
  • filter_flags Whether flag diacritics are filtered out from the result (default True).
  • max_cycles Indicates how many times a cycle will be followed, with negative numbers indicating unlimited (default -1 i.e. unlimited).
  • max_number The total number of resulting strings is capped at this value, with 0 or negative indicating unlimited (default -1 i.e. unlimited).
  • obey_flags Whether flag diacritics are validated (default True).
  • output Output format. Values recognized: 'text', 'raw', 'dict' (the default). 'text' returns a string where paths are separated by newlines and each path is represented as input_string + ":" + output_string + "\t" + weight. 'raw' yields a tuple of all paths, where each path is a 2-tuple consisting of a weight and a tuple of all transition symbol pairs, each symbol pair being a 2-tuple of an input symbol and an output symbol. 'dict' gives a dictionary that maps each input string to a list of possible outputs, each output being a 2-tuple of an output string and a weight.
  • random Whether result strings are fetched randomly (default False).
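
The three output formats carry the same information in different shapes. The sketch below builds the 'dict' and 'text' shapes from a 'raw'-shaped value in plain Python, following only the descriptions above. The helper names raw_to_dict and raw_to_text and the SAMPLE_RAW data are hypothetical and hand-written for illustration; they are not part of HFST and not real extract_paths output. The weight formatting and the absence of a trailing newline are assumptions, and special symbols are not treated in any special way.

```python
# Hand-written sample in the 'raw' shape described above (an assumption, not
# real HFST output): a tuple of paths, each path being (weight, symbol_pairs).
SAMPLE_RAW = (
    (0.0, (("a", "b"),)),
    (0.0, (("a", "b"), ("a", "c"))),
)


def split_path(symbol_pairs):
    """Concatenate the input symbols and the output symbols of one path."""
    input_string = "".join(i_sym for i_sym, _ in symbol_pairs)
    output_string = "".join(o_sym for _, o_sym in symbol_pairs)
    return input_string, output_string


def raw_to_dict(raw_paths):
    """Map each input string to a list of (output string, weight) 2-tuples."""
    result = {}
    for weight, symbol_pairs in raw_paths:
        input_string, output_string = split_path(symbol_pairs)
        result.setdefault(input_string, []).append((output_string, weight))
    return result


def raw_to_text(raw_paths):
    """Render one path per line as input:output<TAB>weight."""
    lines = []
    for weight, symbol_pairs in raw_paths:
        input_string, output_string = split_path(symbol_pairs)
        # The ':g' format is an assumption; it renders 0.0 as '0'.
        lines.append(f"{input_string}:{output_string}\t{weight:g}")
    return "\n".join(lines)
```

'dict' groups outputs by input string, which suits lookups, while 'raw' keeps the individual symbol pairs of each path.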

Returns

The extracted paths, in the representation selected by output.

Preconditions

If both max_number and max_cycles are unlimited, the transducer must be acyclic. Calling extract_paths on a cyclic transducer in that case throws hfst.exceptions.TransducerIsCyclicException.
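
The precondition can be read as a simple test on the two limits. The following is a minimal sketch of that test, using only the parameter descriptions above; needs_acyclic_transducer is a hypothetical helper, not an HFST function.

```python
def needs_acyclic_transducer(max_cycles=-1, max_number=-1):
    """Return True when neither limit bounds the result.

    Hypothetical helper: in that case the transducer must be acyclic,
    according to the precondition described above.
    """
    cycles_unlimited = max_cycles < 0    # negative means unlimited
    number_unlimited = max_number <= 0   # 0 or negative means unlimited
    return cycles_unlimited and number_unlimited
```

With the defaults both limits are unlimited, so passing either max_cycles or a positive max_number is enough to make extraction from a cyclic transducer permissible.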

More information

An example:

>>> tr = hfst.regex('a:b+ (a:c+)')
>>> print(tr)
0       1       a       b       0.000000
1       1       a       b       0.000000
1       2       a       c       0.000000
1       0.000000
2       2       a       c       0.000000
2       0.000000

>>> print(tr.extract_paths(max_cycles=1, output='text'))
a:b     0
aa:bb   0
aaa:bbc 0
aaaa:bbcc       0
aa:bc   0
aaa:bcc 0

>>> print(tr.extract_paths(max_number=4, output='text'))
a:b     0
aa:bc   0
aaa:bcc 0
aaaa:bccc       0

>>> print(tr.extract_paths(max_cycles=1, max_number=4, output='text'))
a:b     0
aa:bb   0
aa:bc   0
aaa:bcc 0
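
The max_cycles=1 result above can be reproduced by hand with a depth-first walk over the transition table shown by print(tr). The following is a simplified model for illustration only, not HFST's implementation. The names enumerate_paths and max_revisits are hypothetical, and reading max_cycles as "a state may be re-entered at most that many times along one path" is an assumption that happens to match this example. The sketch ignores weights and max_number, and it does not model the order in which paths are returned when max_number is set.

```python
from collections import Counter

# Transition table copied from the print(tr) output above:
# (source state, target state, input symbol, output symbol)
TRANSITIONS = [
    (0, 1, "a", "b"),
    (1, 1, "a", "b"),
    (1, 2, "a", "c"),
    (2, 2, "a", "c"),
]
FINAL_STATES = {1, 2}


def enumerate_paths(max_revisits, start=0):
    """List accepted paths depth-first as 'input:output' strings.

    Assumption: a state may be re-entered at most max_revisits times
    along a single path.
    """
    results = []

    def visit(state, inp, out, visits):
        if state in FINAL_STATES:
            results.append(f"{inp}:{out}")
        for src, dst, i_sym, o_sym in TRANSITIONS:
            if src != state:
                continue
            # Skip if dst is already on this path max_revisits + 1 times.
            if visits[dst] > max_revisits:
                continue
            visits[dst] += 1
            visit(dst, inp + i_sym, out + o_sym, visits)
            visits[dst] -= 1

    visit(start, "", "", Counter({start: 1}))
    return results


print("\n".join(enumerate_paths(max_revisits=1)))
```

Under these assumptions the walk lists the same six input:output pairs as the max_cycles=1 call, in the same order.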

Throws

hfst.exceptions.TransducerIsCyclicException

See also

hfst.HfstTransducer.n_best

Notes

Special symbols are printed as such.

Todo

A link to flag diacritics.