For English, we applied a punctuation separation step: each "," and "." was replaced with " , " and " . ", respectively. We did not apply any special handling of English letter case. Because of limits on computer memory and computational cost, we could not use all of the available data; we used only the NTCIR-7 data, which amounts to 1,798,571 sentences.
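
The punctuation step described above can be sketched as follows. This is a minimal illustration, not the actual preprocessing script: the function names `separate_punctuation` and `whitespace_tokens` are hypothetical, and the whitespace split is an assumption about how the text is tokenized afterwards. Letter case is left untouched, in line with the description.

```python
def separate_punctuation(sentence: str) -> str:
    """Literally replace each "," with " , " and each "." with " . "."""
    return sentence.replace(",", " , ").replace(".", " . ")


def whitespace_tokens(sentence: str) -> list:
    # Assumption: later stages split on whitespace, so the repeated
    # spaces produced by the substitution do not create empty tokens.
    return separate_punctuation(sentence).split()


if __name__ == "__main__":
    print(whitespace_tokens("Hello, world."))
```

Because the substitution is purely literal, a period inside a number is separated as well, so "3.5" becomes the three tokens "3", ".", "5".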