To err is computers: Why machine translation cannot catch up to human translators [Part 1 of 2]

  • 2017/01/31
  • Editor's Perspective
  • Translation

– K.N., Senior Translator

While machine translation may sometimes be able to provide a gist translation, it almost invariably generates low-quality translations with major flaws in terms of accuracy and readability. Why is it that, in an age in which cars can drive themselves, robots can perform neurosurgery, and computer software can beat shogi masters, and after many decades of research and development, computers still fail to capably render text between different languages?


A simple and commonly cited reason is that words can have multiple meanings, and computers cannot choose the appropriate one based on the context. Many words have multiple meanings and usages (some have as many as 179), and which one applies depends on the context. The word “pitch”, for example, has different meanings in music, baseball, mountaineering, and business, as well as additional meanings in British English. Even if computers could identify the field, they would still be powerless to select the correct term when homographs are used within the same field, such as 生物 (which may mean “organism”, “uncooked food”, or “biology”) and バリウム (“barium” or “Valium”). The same applies to phrases and sentences: “chemistry test” has entirely different meanings in classroom settings and in the film industry, and the meaning of “He turned two.” depends on whether the subject is a toddler or an infielder. It is thus necessary to judge the context not only at the sentence level, but also on a broader scale far beyond the reach of machine translation. Moreover, many terms diverge into multiple concepts in other languages; for example, in Spanish “fish” splits into “pez” and “pescado” depending on the state of the fish, and “understand” translates into either “entender” or “comprender” depending on the situation. Still others have no true equivalents, such as やるせない and ~してしまった; translating such expressions requires human ingenuity. These are just a few of the countless cases in which no one-to-one correspondence exists between languages. The meaning of words and phrases always depends on the context. Because machine translation is not capable of accurately judging context, it inevitably produces erroneous translations.
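The limits of field-based disambiguation described above can be illustrated with a toy lookup. The following is only a minimal sketch under stated assumptions: the glossary, its English glosses, the field label "medicine", and the function names are all hypothetical, and no real translation system is being described. It shows that knowing the field settles “pitch” but leaves バリウム with two candidates.

```python
# Hypothetical toy glossary: (word, field) -> candidate senses.
# The entries and the field label "medicine" are illustrative assumptions.
SENSES = {
    ("pitch", "music"): ["the highness or lowness of a tone"],
    ("pitch", "baseball"): ["a throw of the ball to the batter"],
    ("pitch", "mountaineering"): ["a section of a climb between belay points"],
    ("pitch", "business"): ["a sales presentation"],
    ("バリウム", "medicine"): ["barium", "Valium"],
}


def candidate_senses(word, field):
    """Return every sense the toy glossary lists for `word` in `field`."""
    return SENSES.get((word, field), [])


def is_resolved(word, field):
    """True only when the field alone narrows the word to a single sense."""
    return len(candidate_senses(word, field)) == 1


if __name__ == "__main__":
    for word, field in [("pitch", "baseball"), ("バリウム", "medicine")]:
        print(word, field, candidate_senses(word, field), is_resolved(word, field))
```

In this sketch, choosing between “barium” and “Valium” would require information that the lookup key does not contain, which is the broader context the paragraph above refers to.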


Second, in addition to the challenge of translating the words that appear in the text, it is also necessary to determine and reflect information that is not explicit. For instance, since subjects are often omitted in Japanese sentences but are necessary in English, a Japanese-to-English translator must deduce the subject from the context. Similarly, because Japanese lacks articles and generally does not mark nouns as singular or plural, these details must also be supplied from the context when translating into English. The choice of words in the translation also depends on inferred elements such as flow, tone, placement of emphasis, subtle nuances, and interplay between words, as well as background information such as stylistic needs, purpose, and target readership. Consideration of the above, which is essential for fully understanding and translating the meaning of the text, entails reading between the lines, and may also require research skills, specialist knowledge, and cultural fluency. These are uniquely human skills that computers do not possess. If the meaning of a given text were the proverbial iceberg, individual words would merely constitute its visible tip; much of the meaning lies beneath the surface, and it takes human brainpower to understand and express the meaning in its entirety.

[Click here for Part 2]








  • 2017/01/31
  • Very helpful

Very good article. I'll look forward to Part II. AC