Machine translation systems have been under development for a long time and have passed through three generations of technology.
The first generation was the rule-based translation method, which was developed over the course of many years. This method relied on translation rules that were written by hand. Thus, if the input sentence completely matched a rule, the output sentence had the best quality. However, because natural language uses so many different expressions, this technology had very small coverage. In addition, the main problems were that the cost of writing rules was too high and that maintaining the rules was hard.
The second generation was the example-based machine translation method. This method finds a similar sentence in a corpus and generates a similar output sentence. The problem with this method is how to calculate the similarity. Many methods, such as dynamic programming (DP), are available. However, they are heuristic and intuitive rather than mathematically grounded.
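As an illustration of the DP-based similarity mentioned above, the following is a minimal sketch that retrieves the most similar example by word-level edit distance. The choice of edit distance as the similarity measure, the function names `edit_distance` and `most_similar`, and the toy corpus are assumptions made for illustration; they are not taken from any particular example-based system.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists (DP)."""
    # prev[j] holds the distance between the first i-1 words of a
    # and the first j words of b.
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def most_similar(query, corpus):
    """Return the (source, target) pair whose source is closest to query."""
    q = query.split()
    return min(corpus, key=lambda pair: edit_distance(q, pair[0].split()))


# Hypothetical toy corpus of (source, target) example pairs.
corpus = [
    ("i like green tea", "TARGET-1"),
    ("where is the station", "TARGET-2"),
]
best = most_similar("i like black tea", corpus)
```

The retrieved pair differs from the query by one word, which shows why the quality of such a system depends so strongly on the similarity measure that is chosen.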
The third generation is the statistical machine translation method, which is very popular now. This method is based on statistics and is therefore well founded. Many models for statistical machine translation are available. Early statistical machine translation was based on IBM Models 1 to 5. These models are word-based, and thus a ``null word'' model is needed. This ``null word'' model sometimes causes serious problems, especially in decoding. Thus, recent statistical machine translation systems usually use phrase-based models.
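For reference, statistical machine translation is commonly written in the noisy-channel form below. The symbols are assumptions introduced here for illustration: J is the Japanese input sentence, E is a candidate English sentence, P(J | E) is the translation model, and P(E) is the language model.

```latex
\[
  \hat{E} = \arg\max_{E} P(E \mid J) = \arg\max_{E} P(J \mid E)\, P(E)
\]
```

The second equality follows from Bayes' rule, since P(J) does not depend on E.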
Incidentally, the English sentences produced by machine translation are evaluated on two criteria: adequacy and fluency. We believe that adequacy is related to the translation model and that fluency is related to the language model. Therefore, we need phrase tables containing long phrases to achieve high adequacy. For similar languages, such as English and German, short phrases may be enough. However, languages that differ greatly, such as Japanese and English, need long phrases, and phrase tables with long phrases require a large number of Japanese-English parallel sentences. We also believe that a word trigram model is enough to express the fluency of English.
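To make the role of the word trigram model concrete, the following is a minimal sketch that scores English sentences with trigram counts. The add-k smoothing, the function names `train_trigram` and `log_prob`, and the two training sentences are assumptions made for illustration; a real system would use a dedicated language modeling toolkit and a much larger corpus.

```python
import math
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word histories over padded sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for s in sentences:
        words = ["<s>", "<s>"] + s.split() + ["</s>"]
        vocab.update(words)
        for i in range(2, len(words)):
            tri[tuple(words[i - 2:i + 1])] += 1
            bi[tuple(words[i - 2:i])] += 1
    return tri, bi, len(vocab)


def log_prob(sentence, model, k=1.0):
    """Log probability of a sentence under an add-k smoothed trigram model."""
    tri, bi, vocab_size = model
    words = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(2, len(words)):
        num = tri[tuple(words[i - 2:i + 1])] + k
        den = bi[tuple(words[i - 2:i])] + k * vocab_size
        total += math.log(num / den)
    return total


# Hypothetical toy training data.
model = train_trigram(["the cat sat", "the dog sat"])
fluent = log_prob("the cat sat", model)
scrambled = log_prob("sat cat the", model)
```

With this toy data, the word order seen in training receives a higher score than the scrambled order, which is the sense in which a trigram model captures fluency.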
We implemented our statistical machine translation system using a large number of Japanese-English parallel sentences and phrase tables with long phrases. As a result, our system resembles a statistical example-based translation system. We believe that these concepts provide the best method for Japanese-English translation.
We collected 698,973 Japanese-English parallel sentences and extracted 3,769,988 phrase-table entries from them. We also used general tools for statistical machine translation, such as "GIZA++"[1], "moses"[2], and "training-phrase-model.perl"[3]. Using this data and these tools, we participated in the IWSLT07 evaluation and obtained a BLEU score of 0.4321.