As mentioned above, our experiments show that the sentence recognition rate improved for 23 of the 70 sentences. However, we found that in 8 of these 23 sentences the improvement arose because the first candidate was a sentence ending in a noun that was both ungrammatical and semantically incorrect (UGS). Below is an example of such a sentence.
Input sentence:
彼女は英語を始めた
``Kanojyo wa eigo wo hajimeta.''
`She began to study English.'
The first candidate:
彼女は英国は自宅. (UGS)
``Kanojyo wa eikoku wa jitaku.'' (UGS)
`She is English is home.' (UGS)
Such sentences can easily be eliminated by a simple rule that selects only sentences ending with a verb or verb phrase, without using the valency patterns.
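The rule described above can be sketched as a filter over the recognizer's candidate list. This is only an illustrative sketch, not the procedure used in the experiments: it assumes each candidate is already segmented and tagged, and the tag names (`VERB`, `AUX`, `PUNCT`), the function names, and the segmentation of the example sentences are all hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)

# Hypothetical tag set; a real morphological analyzer defines its own labels.
VERB_FINAL_TAGS = {"VERB", "AUX"}  # a verb, or an auxiliary closing a verb phrase
IGNORED_TAGS = {"PUNCT"}           # trailing punctuation is skipped


def ends_with_verb(tokens: List[Token]) -> bool:
    """Return True if the last non-punctuation token is a verb or auxiliary."""
    for _surface, pos in reversed(tokens):
        if pos in IGNORED_TAGS:
            continue
        return pos in VERB_FINAL_TAGS
    return False  # empty candidate, or punctuation only


def filter_candidates(candidates: List[List[Token]]) -> List[List[Token]]:
    """Keep only verb-final candidates, preserving the recognizer's ranking."""
    return [c for c in candidates if ends_with_verb(c)]


# Toy candidate list in recognizer order (segmentation and tags are assumed).
candidates = [
    [("彼女", "NOUN"), ("は", "PART"), ("英国", "NOUN"),
     ("は", "PART"), ("自宅", "NOUN")],
    [("彼女", "NOUN"), ("は", "PART"), ("英語", "NOUN"),
     ("を", "PART"), ("始め", "VERB"), ("た", "AUX")],
]
kept = filter_candidates(candidates)
```

With this toy input, the noun-final first candidate is dropped and the verb-final candidate becomes the top remaining one. The rule needs only part-of-speech tags for the final word, so it is much cheaper than matching valency patterns.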
In this study, we used a word bigram model for the speech recognition program and calculated the word bigrams from sentences collected from one year of newspaper issues [6]. These issues contained many headlines, which usually end with a noun. Accordingly, many sentences ending with a noun were used as data for calculating the word bigrams. Consequently, the resulting word bigram model tends to favor candidates in which the sentence ends with a noun.
We should therefore remove the headlines from the newspaper text before calculating the word bigrams.
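The effect of headlines on the bigram statistics can be illustrated with a toy corpus. The sketch below is an assumption-laden illustration, not the preprocessing used for [6]: the headline test (a line that does not end with the full stop "。"), the sentence boundary symbols `<s>` and `</s>`, the unsmoothed maximum-likelihood estimate, and the two example lines are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List

BOS, EOS = "<s>", "</s>"  # assumed sentence boundary symbols


def is_headline(tokens: List[str]) -> bool:
    # Assumed heuristic: body sentences end with "。", headlines do not.
    return not tokens or tokens[-1] != "。"


def bigram_counts(lines: Iterable[List[str]]) -> Counter:
    """Count word bigrams, including the sentence boundary symbols."""
    counts: Counter = Counter()
    for tokens in lines:
        padded = [BOS] + tokens + [EOS]
        counts.update(zip(padded, padded[1:]))
    return counts


def cond_prob(counts: Counter, prev: str, word: str) -> float:
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts[(prev, word)] / total if total else 0.0


# Toy pre-segmented corpus: one noun-final headline and one body sentence.
corpus = [
    ["首相", "訪米"],
    ["首相", "は", "訪米", "し", "た", "。"],
]

raw = bigram_counts(corpus)
clean = bigram_counts(line for line in corpus if not is_headline(line))

p_raw = cond_prob(raw, "訪米", EOS)      # 0.5: the headline ends right after the noun
p_clean = cond_prob(clean, "訪米", EOS)  # 0.0: no body sentence ends with this noun
```

In this toy case, keeping the headline gives the noun a substantial probability of being followed by the end of the sentence, while filtering the headline out removes it. A real system would also apply smoothing, which is omitted here for brevity.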