Machine Translation
Rule-based MT & MT evaluation
Jörg Tiedemann
jorg.tiedemann@lingfil.uu.se
Department of Linguistics and Philology
Uppsala University
September 2009
simplistic approach: only low-level pre/post-processing (tokenization, etc.)
advanced approach: handle some specific phenomena
- identification & handling of syntactic ambiguity
- morphological processing/synthesis
- word re-ordering rules
- rules for prepositions
- handling of compounds and idioms, ...

Is it feasible?
- a lot of compositionality in natural language
- many similarities between languages (especially between related languages)
- example: Systran (in daily use by the European Commission)
- > 1.6 million dictionary units
- dictionaries for different domains
- more and more transfer-based

Motivation:
- complete analysis of source language sentences
- transfer step covers divergences between languages
- handle lexical & structural ambiguity in one formalism
→ What kind of information/tools do we need?
- source language parser (morpho-syntactic analysis)
- transfer engine (e.g. unification-based grammar)
- target language generator
→ modular design

Transfer-based MT
- English to Spanish:
  NP → Adjective1 Noun2 ⇒ NP → Noun2 Adjective1
- Chinese to English:
  VP → PP[+Goal] V ⇒ VP → V PP[+Goal]
- English to Japanese:
  VP → V NP ⇒ VP → NP V
  NP → NP1 RelClause2 ⇒ NP → RelClause2 NP1

on → på
come.vb → kom.vb
come on → kom igen
sit.vb on NP → sitta.vb på NP
sit.vb on the couch → sitta.vb i soffan
→ Common: preference for more specific rules
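
The two kinds of rules above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how a real transfer engine is built: the rule table is a hypothetical dictionary holding fixed word sequences only (no `NP` variables, no `.vb` features), "more specific" is approximated as "longest match first", and the tag names `ADJ` and `NOUN` are made up for the example.

```python
# Hypothetical lexical transfer table, taken from the fixed-string examples above.
LEXICAL_RULES = {
    ("on",): ["på"],
    ("come",): ["kom"],
    ("come", "on"): ["kom", "igen"],
    ("sit", "on", "the", "couch"): ["sitta", "i", "soffan"],
}


def lexical_transfer(tokens, rules=LEXICAL_RULES):
    """Translate left to right, always trying the longest (most specific) rule first."""
    max_len = max(len(src) for src in rules)
    output, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            src = tuple(tokens[i:i + n])
            if src in rules:
                output.extend(rules[src])
                i += n
                break
        else:
            # No rule matches at this position: copy the word unchanged.
            output.append(tokens[i])
            i += 1
    return output


def reorder_adj_noun(tagged):
    """Apply NP -> Adjective1 Noun2  =>  NP -> Noun2 Adjective1 to (word, tag) pairs."""
    out, i = [], 0
    while i < len(tagged):
        if (i + 1 < len(tagged)
                and tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN"):
            out.extend([tagged[i + 1], tagged[i]])  # swap adjective and noun
            i += 2
        else:
            out.append(tagged[i])
            i += 1
    return out
```

With this table, `lexical_transfer(["come", "on"])` picks the two-word rule rather than translating `come` and `on` separately, which is the "specific first" preference in miniature.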

- many lexical transfer rules
- often feature-based representations
- rules can copy, delete, transfer, assign features
- fixed rule preference (e.g. specific first)
- morphological generation

- lots of grammar engineering (writing rules ...)
- language-pair specific rules
- exponential ambiguity
- variation & preference
- coverage & robustness

Advantages:
- no language-pair specific transfer
- simple (?) to add new languages (add new analysis/generation component)
Disadvantages:
- need to design interlingua that covers all language phenomena
- need semantic representation (and that's hard!)
- may even fail for simple (direct) examples

→ Too much manual work involved!
Is there no hope for rule-based systems?
- Domain-specific tasks
- Rule-induction
- Hybrid systems
Domain-specific MT
Typical setup:

Score  Adequacy  Fluency
5      All       Flawless English
4      Most      Good English
3      Much      Non-native English
2      Little    Disfluent English
1      None      Incomprehensible

Compare MT engines:
- rank proposed translations
- measure relative quality
- could include manual translation
- could rank selected segments only

- constant evaluation is necessary for system development
- ... but manual evaluation is too expensive!
→ Automatic evaluation is required!
Comparison of MT output with reference translations:
BLEU, NIST, METEOR, WER, PER, TER, ROUGE ...

Why are there so many automatic evaluation measures?
- only approximations of adequacy & fluency
- different types of correlations with human evaluation
- possible bias towards certain approaches
- tuning on automatic measures makes them inappropriate
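
Among the reference-based measures listed above (BLEU, NIST, METEOR, WER, ...), word error rate (WER) is probably the easiest one to write down: the word-level edit distance between MT output and a reference, divided by the reference length. The sketch below assumes whitespace tokenization and a single reference; the function name is chosen here for illustration only.

```python
def word_error_rate(hypothesis, reference):
    """Word-level Levenshtein distance divided by the number of reference words."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing in output
                            curr[j - 1] + 1,      # extra word in output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Lower is better here, and the score can exceed 1 when the output is much longer than the reference.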

- introduced in 2002 by Papineni et al.
- desperately needed by rapid MT development
- quickly adopted by the statistical MT community
- created a boom in MT research/experiments
→ Many MT papers report only BLEU scores and don't even look at the translations ...

Basic idea:
- translation is better if it is closer to given (correct) reference translations
- "closeness" can be measured in terms of N-gram overlaps
  → modified form of precision
- add "brevity penalty" to account for sentence length
→ High correlation with human judgments (0.99 & 0.96 in original paper)!
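
The two ingredients named above, modified (clipped) N-gram precision and the brevity penalty, can be sketched as follows. This is a simplified sentence-level illustration, not a reference implementation: BLEU as originally proposed is computed over a whole test corpus, and the function names, the default of N-grams up to length 4 with uniform weights, and the choice of the closest reference length are assumptions made for this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clip each candidate n-gram count by its maximum count in any reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalize candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties go to the shorter reference.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n=4):
    """Brevity penalty times the geometric mean of the modified precisions."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_avg = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_avg)
```

Clipping is what makes the precision "modified": a candidate consisting of seven times `the` gets credit for only as many occurrences of `the` as the best reference contains, instead of a perfect unigram precision.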
Next
Lab:
1. try to manually evaluate on-line translation services
2. evaluation experiment: play a little game
   - try to guess the type of translations (automatic or manual)
   - test if automatic translations are understandable or not
   - challenge the system and find out MT weaknesses
Next lecture:
- The amazing utility of parallel corpora (part I)