Machine translation
Machine translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.
Current machine translation software often allows for customisation by domain or profession (such as weather reports), improving output by limiting the scope of allowable substitutions. This technique is particularly effective in domains where formal or formulaic language is used. It follows then that machine translation of government and legal documents more readily produces usable output than conversation or less standardised text.
Improved output quality can also be achieved by human intervention: for example, some systems are able to translate more accurately if the user has unambiguously identified which words in the text are names. With the assistance of these techniques, MT has proven useful as a tool to assist human translators, and in some cases can even produce output that can be used "as is".
History
Real progress was much slower, however, and after the ALPAC report (1966), which found that the ten-year-long research had failed to fulfill expectations, funding was greatly reduced. Beginning in the late 1980s, as computational power increased and became less expensive, more interest was shown in statistical models for machine translation.
The idea of using digital computers for translation of natural languages was proposed as early as 1946 by A. D. Booth and possibly others. The Georgetown experiment was by no means the first such application, and a demonstration was made in 1954 on the APEXC machine at Birkbeck College (University of London) of a rudimentary translation of English into French. Several papers on the topic were published at the time, and even articles in popular journals (see for example Wireless World, Sept. 1955, Cleave and Zacharov). A similar application, also pioneered at Birkbeck College at the time, was reading and composing Braille texts by computer.
Translation can be viewed as a two-step process: decoding the meaning of the source text, and re-encoding that meaning in the target language.
Therein lies the challenge in machine translation: how to program a computer that will "understand" a text as a person does, and that will "create" a new text in the target language that "sounds" as if it has been written by a person. This problem may be approached in a number of ways.
It is often argued that the success of machine translation requires the problem of natural language understanding to be solved first.
Generally, rule-based methods parse a text, usually creating an intermediary, symbolic representation, from which the text in the target language is generated. According to the nature of the intermediary representation, an approach is described as interlingual machine translation or transfer-based machine translation. These methods require extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules.
Given enough data, machine translation programs often work well enough for a native speaker of one language to get the approximate meaning of what a native speaker of another language has written. The difficulty is getting enough data of the right kind to support the particular method. For example, the large multilingual corpus needed for statistical methods to work is not necessary for grammar-based methods. Grammar-based methods, however, need a skilled linguist to carefully design the grammar that they use.
To translate between closely related languages, a technique referred to as shallow-transfer machine translation may be used.
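The rule-based pipeline described above (analyse the source text, build an intermediate representation, generate the target text) can be illustrated with a deliberately tiny sketch. The part-of-speech table, the bilingual lexicon, and the single reordering rule below are all illustrative assumptions, not the resources of any real system, which would need far larger lexicons and rule sets.

```python
# Toy English-to-French resources; every entry here is an illustrative assumption.
POS_TAGS = {"the": "DET", "black": "ADJ", "cat": "NOUN", "sleeps": "VERB"}
BILINGUAL_LEXICON = {"the": "le", "black": "noir", "cat": "chat", "sleeps": "dort"}


def analyse(sentence):
    """Analysis: turn the source sentence into (tag, word) pairs.

    Raises KeyError for any word outside the toy lexicon.
    """
    return [(POS_TAGS[w], w) for w in sentence.lower().split()]


def transfer(tagged):
    """Transfer: reorder ADJ NOUN to NOUN ADJ, then swap in target-language words."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][0] == "ADJ" and out[i + 1][0] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return [(tag, BILINGUAL_LEXICON[word]) for tag, word in out]


def generate(tagged):
    """Generation: linearise the transferred structure into a target sentence."""
    return " ".join(word for _, word in tagged)


def translate(sentence):
    return generate(transfer(analyse(sentence)))


print(translate("The black cat sleeps"))  # prints "le chat noir dort"
```

The list of tagged words stands in for the intermediate representation. A real transfer-based system would typically use a parse tree and many structural and lexical rules, and would also handle agreement and morphology, which this sketch ignores.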
Main article: Rule-based machine translation
Main article: Transfer-based machine translation
Main article: Interlingual machine translation
Dictionary-based
Main article: Dictionary-based machine translation
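As a rough illustration of the dictionary-based idea, which corresponds to the simple word-for-word substitution described at the start of the article, the following sketch looks each word up in a small lexicon. The lexicon entries and the function name `translate_word_by_word` are assumptions made for this example only.

```python
# Toy English-to-French lexicon; the entries are illustrative assumptions.
LEXICON = {
    "the": "le",
    "black": "noir",
    "cat": "chat",
    "sleeps": "dort",
}


def translate_word_by_word(sentence, lexicon=LEXICON):
    """Replace each source word with its dictionary entry.

    Words missing from the lexicon are passed through unchanged.
    """
    words = sentence.lower().split()
    return " ".join(lexicon.get(word, word) for word in words)


print(translate_word_by_word("The black cat sleeps"))  # prints "le noir chat dort"
```

The output keeps English word order ("le noir chat" rather than "le chat noir"), which shows why pure substitution is only a starting point: it does no reordering, agreement, or disambiguation.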
Disambiguation
Shallow approaches assume no knowledge of the text. They simply apply statistical methods to the words surrounding the ambiguous word. Deep approaches presume a comprehensive knowledge of the word. So far, shallow approaches have been more successful.
The late Claude Piron, a long-time translator for the United Nations and the World Health Organization, wrote that machine translation, at its best, automates the easier part of a translator's job; the harder and more time-consuming part usually involves doing extensive research to resolve ambiguities in the source text, which the grammatical and lexical exigencies of the target language require to be resolved:
Why does a translator need a whole workday to translate five pages, and not an hour or two? ... About 90% of an average text corresponds to these simple conditions. But unfortunately, there's the other 10%. It's that part that requires six [more] hours of work. There are the ambiguities one has to resolve. For instance, the author of the source text, an Australian physician, cited the example of an epidemic which was declared during World War II in a "Japanese prisoner of war camp". Was he talking about an American camp with Japanese prisoners or a Japanese camp with American prisoners? The English has two senses. It's necessary therefore to do research, maybe to the extent of a phone call to Australia.
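The shallow approach mentioned above, applying statistics to the words surrounding the ambiguous word, can be sketched as follows. The co-occurrence counts, the sense labels, and the window size are all hypothetical values chosen for illustration; a real system would estimate such counts from a large corpus.

```python
from collections import Counter

# Hypothetical counts of how often each context word was seen near each sense
# of the ambiguous word "bank". These numbers are invented for illustration.
SENSE_CONTEXT_COUNTS = {
    "bank/finance": Counter({"money": 8, "loan": 5, "account": 6}),
    "bank/river": Counter({"river": 7, "water": 4, "fishing": 3}),
}


def disambiguate(tokens, target_index, window=3, sense_counts=SENSE_CONTEXT_COUNTS):
    """Pick the sense whose context counts best match the surrounding words.

    Only the `window` tokens on each side of the ambiguous word are considered.
    Ties are resolved in favour of the first sense listed.
    """
    start = max(0, target_index - window)
    context = tokens[start:target_index] + tokens[target_index + 1:target_index + 1 + window]
    scores = {
        sense: sum(counts[word] for word in context)  # Counter yields 0 for unseen words
        for sense, counts in sense_counts.items()
    }
    return max(scores, key=scores.get)


tokens = "they went fishing on the bank of the river".split()
print(disambiguate(tokens, tokens.index("bank")))  # prints "bank/river"
```

No grammatical or world knowledge is involved: the choice rests entirely on which words happen to appear nearby. That is also why a case like Piron's "Japanese prisoner of war camp" is out of reach for this kind of method, since the surrounding words do not settle the question.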
SYSTRAN, which powers AltaVista's Babel Fish
Despite their inherent limitations, MT programs are used around the world. Probably the largest institutional user is the European Commission. Toggletext uses a transfer-based system (known as Kataku) to translate between English and Indonesian. Google has claimed that promising results were obtained using a proprietary statistical machine translation engine. In tests conducted by the National Institute of Standards and Technology (Summer 2006), the statistical translation engine used in the Google language tools for Arabic <-> English and Chinese <-> English achieved an overall BLEU-4 score of 0.4281, ahead of runner-up IBM's score of 0.3954. Uwe Muegge has implemented a demo website that uses a controlled language in combination with the Google tool to produce fully automatic, high-quality machine translations of his English, German, and French web sites.
With the recent focus on terrorism, military sources in the United States have been investing significant amounts of money in natural language engineering. In-Q-Tel[10] (a venture capital fund, largely funded by the US Intelligence Community, to stimulate new technologies through private sector entrepreneurs) has backed companies such as Language Weaver. Currently the military community is interested in translation and processing of languages like Arabic, Pashto, and Dari. The Information Processing Technology Office at DARPA hosts programs like TIDES and Babylon Translator. The US Air Force has awarded a $1 million contract to develop a language translation technology.
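For readers unfamiliar with the BLEU-4 scores quoted above, the following is a simplified, single-reference, sentence-level sketch of the metric: the geometric mean of clipped 1-gram to 4-gram precisions, multiplied by a brevity penalty. It is an assumption-laden illustration, not the evaluation procedure used in the cited tests, which score a whole corpus against multiple reference translations. The function names are chosen for this example.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified BLEU for one candidate and one reference (both token lists).

    Returns 0.0 if any n-gram order has no match, since no smoothing is applied.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalise candidates that are shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the quick brown fox jumps over the lazy dog".split()
candidate = "the quick brown fox jumped over the lazy dog".split()
print(round(bleu(candidate, reference), 4))
```

Scores range from 0 to 1, with 1 meaning the candidate reproduces the reference exactly, so the 0.4281 and 0.3954 figures above sit on this same scale.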
Relying exclusively on unedited machine translation ignores the fact that communication in human language is context-embedded, and that it takes a human to adequately comprehend the context of the original text. Even purely human-generated translations are prone to error. Therefore, to ensure that a machine-generated translation will be of publishable quality and useful to a human, it must be reviewed and edited by a human. It has, however, been asserted that in certain applications, e.g. product descriptions written in a controlled language, a dictionary-based machine-translation system has produced satisfactory translations that require no human intervention.