HFST - Helsinki Finite-State Transducer Technology API
version 3.7.1
After <a href="InstallHfst.html">installing</a> HFST on your computer, include the file HfstTransducer.h at the beginning of your program file and link against the HFST library. For example, if you have SFST installed on your computer, the following simple program, named test.cc,
    #include <cstdio>
    #include "HfstTransducer.h"

    using namespace hfst;

    int main()
    {
      HfstTransducer tr1("foo", "bar", SFST_TYPE);
      HfstTransducer tr2("bar", "baz", SFST_TYPE);
      tr1.compose(tr2);
      tr1.write_in_att_format(stdout);
    }
compiled with the command (this may vary on different computers)
    g++ test.cc -lhfst -o test
should print the following text to standard output when run:
    0      1      foo    baz
    1
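To make concrete what the composition in test.cc computes, here is a minimal sketch that does not use HFST at all. The names ToyFst, Arc, toy_compose and to_att are hypothetical and exist only for this illustration. The sketch assumes unweighted, epsilon-free transducers whose start state is 0, and it assumes a tab-separated text layout with one line per arc followed by one line per final state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration only; NOT part of the HFST API.
struct Arc {
    int source;
    int target;
    std::string input;
    std::string output;
};

struct ToyFst {
    std::vector<Arc> arcs;
    std::set<int> finals;  // state 0 is assumed to be the start state
};

// Product construction: an arc x in a and an arc y in b are combined
// whenever the output symbol of x equals the input symbol of y.
ToyFst toy_compose(const ToyFst& a, const ToyFst& b) {
    ToyFst result;
    std::map<std::pair<int, int>, int> ids;    // state pair -> result state
    std::vector<std::pair<int, int> > queue;   // pairs still to be expanded
    ids[std::make_pair(0, 0)] = 0;
    queue.push_back(std::make_pair(0, 0));
    for (std::size_t i = 0; i < queue.size(); ++i) {
        const std::pair<int, int> p = queue[i];  // copy: queue may grow below
        const int src = ids[p];
        if (a.finals.count(p.first) && b.finals.count(p.second))
            result.finals.insert(src);
        for (const Arc& x : a.arcs) {
            if (x.source != p.first) continue;
            // This sketch has no epsilon handling, so symbols must be non-empty.
            assert(!x.input.empty() && !x.output.empty());
            for (const Arc& y : b.arcs) {
                if (y.source != p.second || y.input != x.output) continue;
                const std::pair<int, int> q(x.target, y.target);
                std::map<std::pair<int, int>, int>::iterator it = ids.find(q);
                if (it == ids.end()) {
                    const int fresh = static_cast<int>(ids.size());
                    it = ids.insert(std::make_pair(q, fresh)).first;
                    queue.push_back(q);
                }
                result.arcs.push_back(Arc{src, it->second, x.input, y.output});
            }
        }
    }
    return result;
}

// Assumed text layout: "source<TAB>target<TAB>input<TAB>output" per arc,
// then one line per final state.
std::string to_att(const ToyFst& t) {
    std::ostringstream out;
    for (const Arc& arc : t.arcs)
        out << arc.source << '\t' << arc.target << '\t'
            << arc.input << '\t' << arc.output << '\n';
    for (int f : t.finals) out << f << '\n';
    return out.str();
}
```

Composing a toy transducer for foo:bar with one for bar:baz in this sketch yields a single arc foo:baz from state 0 to the final state 1, which mirrors the shape of the output shown above. The real library additionally handles weights, epsilons and several back-end formats, none of which are modelled here.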
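The HFST API is written in the namespace hfst, which contains the classes used in the examples below (such as HfstTransducer, HfstTokenizer, HfstInputStream, HfstOutputStream and HfstException) as well as nested namespaces (such as implementations and rules).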
An example of creating a simple transducer from scratch, converting between transducer formats, testing transducer properties, and handling exceptions:
    using namespace hfst;
    using implementations::HfstBasicTransducer;
    using implementations::HfstBasicTransition;

    /* Create a HFST basic transducer [a:b] with transition weight 0.3
       and final weight 0.5. */
    HfstBasicTransducer t;
    t.add_state(1);
    t.add_transition(0, HfstBasicTransition(1, "a", "b", 0.3));
    t.set_final_weight(1, 0.5);

    /* Convert to tropical OpenFst format and push weights
       toward final state. */
    HfstTransducer T(t, TROPICAL_OFST_TYPE);
    T.push_weights(TO_FINAL_STATE);

    /* Convert back to HFST basic transducer. */
    HfstBasicTransducer tc(T);
    try {
      /* Rounding might affect the precision, so test against
         a small interval around 0.8 instead of an exact value. */
      if (0.79 < tc.get_final_weight(1) &&
          tc.get_final_weight(1) < 0.81) {
        fprintf(stderr, "TEST OK\n");
        exit(0);
      }
      fprintf(stderr, "TEST FAILED\n");
      exit(1);
    }
    /* If the state does not exist or is not final.
       Catch by const reference to avoid copying the exception. */
    catch (const HfstException & e) {
      fprintf(stderr, "TEST FAILED: An exception thrown.\n");
      exit(1);
    }
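The interval test in the example above follows from how weights behave in the tropical semiring: weights along a path are added, so pushing the transition weight 0.3 onto the final weight 0.5 gives approximately 0.8. The following sketch shows that arithmetic on a single path only. ToyPath, total_weight, push_to_final and approx_equal are hypothetical names, the single-precision float type is an assumption, and real weight pushing on a general transducer is considerably more involved than this.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical toy model of one accepting path; NOT an HFST type.
struct ToyPath {
    std::vector<float> arc_weights;  // tropical weights along the path
    float final_weight;              // weight of the final state
};

// In the tropical semiring the weight of a path is the sum of its
// arc weights plus the final weight.
float total_weight(const ToyPath& p) {
    float sum = p.final_weight;
    for (float w : p.arc_weights) sum += w;
    return sum;
}

// Move all weight onto the final state. The total weight of the path
// must stay the same; only its distribution changes.
void push_to_final(ToyPath& p) {
    const float before = total_weight(p);
    for (float& w : p.arc_weights) {
        p.final_weight += w;
        w = 0.0f;
    }
    assert(std::fabs(total_weight(p) - before) < 1e-5f);
    (void)before;  // silence the unused-variable warning when NDEBUG is set
}

// Compare with a tolerance rather than ==, because 0.3f + 0.5f is not
// guaranteed to be exactly the float nearest to 0.8.
bool approx_equal(float a, float b, float eps = 0.01f) {
    return std::fabs(a - b) < eps;
}
```

With one arc of weight 0.3 and a final weight of 0.5, push_to_final leaves the arc at 0 and the final weight within the interval (0.79, 0.81), which is the same condition the example checks.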
An example of creating transducers from strings, applying rules to them, and printing the string pairs recognized by the resulting transducer:
    using namespace hfst;

    ImplementationType type=FOMA_TYPE;

    /* Create a simple lexicon transducer
       [[foo bar foo] | [foo bar baz]]. */
    HfstTokenizer tok;
    tok.add_multichar_symbol("foo");
    tok.add_multichar_symbol("bar");
    tok.add_multichar_symbol("baz");

    HfstTransducer words("foobarfoo", tok, type);
    HfstTransducer t("foobarbaz", tok, type);
    words.disjunct(t);

    /* Create a rule transducer that optionally replaces
       "bar" with "baz" between "foo" and "foo". */
    HfstTransducerPair context
      (HfstTransducer("foo", type),
       HfstTransducer("foo", type) );
    HfstTransducer mapping
      ("bar", "baz", type);
    bool optional=true;

    StringPairSet alphabet;
    alphabet.insert(StringPair("foo", "foo"));
    alphabet.insert(StringPair("bar", "bar"));
    alphabet.insert(StringPair("baz", "baz"));

    HfstTransducer rule = rules::replace_up
      (context, mapping, optional, alphabet);

    /* Apply the rule transducer to the lexicon. */
    words.compose(rule).minimize();

    /* Extract all string pairs from the result and print
       them to stdout. */
    HfstTwoLevelPaths results;
    try {
      words.extract_paths(results);
    }
    catch (const TransducerIsCyclicException & e)
    {
      /* This should not happen because the transducer is not cyclic. */
      fprintf(stderr, "TEST FAILED\n");
      exit(1);
    }

    /* Go through all paths. */
    for (HfstTwoLevelPaths::const_iterator it = results.begin();
         it != results.end(); ++it)
    {
      /* Go through each symbol pair of the path and collect
         the input and output strings. */
      const StringPairVector & spv = it->second;
      std::string istring("");
      std::string ostring("");

      for (StringPairVector::const_iterator IT = spv.begin();
           IT != spv.end(); ++IT)
      {
        istring.append(IT->first);
        ostring.append(IT->second);
      }
      /* %s expects a C string, so pass c_str() and not the
         std::string object itself. */
      fprintf(stdout, "%s : %s\n", istring.c_str(), ostring.c_str());
    }
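The two ideas in the example above, splitting a string into multicharacter symbols and optionally rewriting a symbol in a given context, can be sketched on plain symbol vectors without HFST. The functions tokenize, optional_replace and format_pair are hypothetical names for this illustration. The longest-match strategy, the byte-wise fallback for unknown characters, and matching the context on the input side are assumptions of the sketch, not statements about how the library is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Symbols;

// Split text into symbols, preferring the longest matching multicharacter
// symbol and falling back to a single byte (assumption of this sketch).
Symbols tokenize(const std::string& text, const Symbols& multichar) {
    Symbols tokens;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::string best(1, text[pos]);
        for (const std::string& sym : multichar)
            if (sym.size() > best.size() &&
                text.compare(pos, sym.size(), sym) == 0)
                best = sym;
        tokens.push_back(best);
        pos += best.size();
    }
    return tokens;
}

// Optional replacement of `from` by `to` between `left` and `right`:
// every matching site yields both the unchanged and the rewritten output.
std::vector<Symbols> optional_replace(const Symbols& tokens,
                                      const std::string& from,
                                      const std::string& to,
                                      const std::string& left,
                                      const std::string& right) {
    assert(!from.empty() && !to.empty());
    std::vector<Symbols> results(1);  // start with one empty output
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        const bool in_context = tokens[i] == from &&
            i > 0 && tokens[i - 1] == left &&
            i + 1 < tokens.size() && tokens[i + 1] == right;
        const std::size_t count = results.size();
        for (std::size_t r = 0; r < count; ++r) {
            if (in_context) {
                Symbols replaced = results[r];  // fork: rewritten variant
                replaced.push_back(to);
                results.push_back(replaced);
            }
            results[r].push_back(tokens[i]);    // unchanged variant
        }
    }
    return results;
}

// Render one input/output pair in the "input : output" style used above.
std::string format_pair(const Symbols& in, const Symbols& out) {
    std::string i, o;
    for (const std::string& s : in) i += s;
    for (const std::string& s : out) o += s;
    return i + " : " + o;
}
```

In this sketch, "foobarfoo" has one site where "bar" stands between "foo" and "foo", so it produces two pairs (unchanged and rewritten), while "foobarbaz" has no such site and produces only the unchanged pair.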
An example of reading binary transducers from standard input, converting them to SFST format and writing them to stdout and in AT&T format to file "testfile.att":
    HfstInputStream in;
    HfstOutputStream out(SFST_TYPE);
    FILE * file = fopen("testfile.att", "wb");
    bool first_transducer=true;

    while (!in.is_eof())
    {
      if (!first_transducer)
        fprintf(file, "--\n"); /* AT&T format separator. */
      HfstTransducer t(in);
      HfstTransducer tc(t, SFST_TYPE);
      out << tc;
      tc.write_in_att_format(file);
      first_transducer=false;
    }
    in.close();
    out.close();
    fclose(file);
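The first_transducer flag in the loop above ensures that the "--" separator is written between transducers and not before the first one. The sketch below isolates that convention on plain strings and adds the inverse operation. The functions join_att and split_att are hypothetical helpers, not HFST calls, and they assume that each text ends with a newline and that a separator is a line consisting only of "--".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; NOT part of the HFST API.

// Concatenate AT&T-format texts, writing "--\n" before every text
// except the first one.
std::string join_att(const std::vector<std::string>& texts) {
    std::string out;
    bool first = true;
    for (const std::string& text : texts) {
        // Each text must end with a newline, otherwise the separator
        // would be glued onto its last line.
        assert(text.empty() || text.back() == '\n');
        if (!first) out += "--\n";
        out += text;
        first = false;
    }
    return out;
}

// Split such a stream back into individual texts at "--" lines.
// An empty stream yields a single empty text.
std::vector<std::string> split_att(const std::string& stream) {
    std::vector<std::string> texts(1);
    std::size_t pos = 0;
    while (pos < stream.size()) {
        const std::size_t nl = stream.find('\n', pos);
        const std::size_t end =
            (nl == std::string::npos) ? stream.size() : nl + 1;
        const std::string line = stream.substr(pos, end - pos);
        if (line == "--\n" || line == "--")
            texts.push_back(std::string());  // start the next transducer
        else
            texts.back() += line;
        pos = end;
    }
    return texts;
}
```

Splitting the result of join_att returns the original texts, which is a convenient way to check that the separator is placed only between entries.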