ccLexEx – a tool for lexicon extraction from comparable corpora

The tool was developed during the experiments described in the papers on lexicon extraction from comparable corpora. It consists of scripts for:

  • building context vectors for a list of headwords from each
    corpus
  • translating features of context vectors from the source
    language (SL) to the target language (TL) via a seed lexicon
  • calculating best translation candidates between headwords in
    SL and TL
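
The three steps above can be sketched roughly as follows. This is a minimal illustration and not the tool's actual scripts: the function names, the window size, the use of raw co-occurrence counts as feature weights, and cosine similarity as the ranking measure are all assumptions made for the example, and the toy German/English data is invented.

```python
from collections import Counter
import math


def build_context_vectors(tokens, headwords, window=2):
    """Count the words that occur within +/- `window` tokens of each headword."""
    vectors = {h: Counter() for h in headwords}
    for i, tok in enumerate(tokens):
        if tok in vectors:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][tokens[j]] += 1
    return vectors


def translate_vector(vector, seed_lexicon):
    """Map SL features to TL features; features missing from the lexicon are dropped."""
    translated = Counter()
    for feature, weight in vector.items():
        for target in seed_lexicon.get(feature, ()):
            translated[target] += weight
    return translated


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * b[f] for f, w in a.items() if f in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(sl_vector, tl_vectors, seed_lexicon, top_n=3):
    """Return up to `top_n` (TL headword, score) pairs, best first."""
    translated = translate_vector(sl_vector, seed_lexicon)
    scored = [(cosine(translated, vec), head) for head, vec in tl_vectors.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(head, score) for score, head in scored[:top_n]]


# Toy data, invented for illustration only.
sl_tokens = "der hund bellt laut der hund frisst fleisch".split()
tl_tokens = "the dog barks loudly the cat drinks milk the dog eats meat".split()
seed_lexicon = {
    "der": ["the"],
    "bellt": ["barks"],
    "laut": ["loudly"],
    "frisst": ["eats"],
    "fleisch": ["meat"],
}

sl_vectors = build_context_vectors(sl_tokens, ["hund"])
tl_vectors = build_context_vectors(tl_tokens, ["dog", "cat"])
ranked = rank_candidates(sl_vectors["hund"], tl_vectors, seed_lexicon)
print(ranked)
```

On this toy input the TL headword "dog" is ranked above "cat" for the SL headword "hund", because its context shares more translated features. A real setup would normally use large corpora and a weighted association measure in place of raw counts.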

The tool is distributed under the Apache 2.0 license.

Please contact me for a copy.