ccLexEx – a tool for lexicon extraction from comparable corpora
The tool was developed for the experiments described in the papers on lexicon extraction from comparable corpora. It consists of scripts for:
- building context vectors for a list of headwords from each corpus
- translating the features of the context vectors from the source language (SL) to the target language (TL) via a seed lexicon
- calculating the best translation candidates between headwords in SL and TL
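The three steps above can be illustrated with a minimal Python sketch. This is not the code shipped with ccLexEx: the function names, the fixed token window, the raw co-occurrence counts, and the use of cosine similarity for ranking are all assumptions chosen to keep the example short, and the toy corpora and seed lexicon are invented for illustration.

```python
from collections import Counter
from math import sqrt


def build_context_vectors(tokens, headwords, window=2):
    """Count the words occurring within +/- `window` tokens of each headword."""
    vectors = {h: Counter() for h in headwords}
    for i, tok in enumerate(tokens):
        if tok in vectors:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][tokens[j]] += 1
    return vectors


def translate_vector(vector, seed_lexicon):
    """Map SL features to TL features; features missing from the lexicon are dropped."""
    translated = Counter()
    for feature, weight in vector.items():
        if feature in seed_lexicon:
            translated[seed_lexicon[feature]] += weight
    return translated


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank_candidates(sl_vectors, tl_vectors, seed_lexicon):
    """For each SL headword, sort all TL headwords by similarity, best first."""
    ranked = {}
    for sl_word, sl_vec in sl_vectors.items():
        translated = translate_vector(sl_vec, seed_lexicon)
        scores = [(tl_word, cosine(translated, tl_vec))
                  for tl_word, tl_vec in tl_vectors.items()]
        ranked[sl_word] = sorted(scores, key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy data (hypothetical): one tiny "corpus" per language and a four-entry seed lexicon.
sl_tokens = "hund bellt laut katze miaut leise".split()
tl_tokens = "dog barks loudly cat meows quietly".split()
seed_lexicon = {"bellt": "barks", "laut": "loudly", "miaut": "meows", "leise": "quietly"}

sl_vectors = build_context_vectors(sl_tokens, ["hund", "katze"])
tl_vectors = build_context_vectors(tl_tokens, ["dog", "cat"])
ranked = rank_candidates(sl_vectors, tl_vectors, seed_lexicon)

for sl_word, candidates in ranked.items():
    print(sl_word, "->", candidates[0][0])
```

In practice the raw counts would usually be replaced by an association measure and the corpora would be far larger; the sketch only shows how the three steps fit together.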
The tool is distributed under the Apache 2.0 license.
Please contact me for a copy.