Friday, November 29, 2013

SYSTRAN: A Brief History of Machine Translation

When we last looked at the history of machine translation (MT), we covered the ALPAC report and prior to that, the Georgetown-IBM experiment. Today we're looking at SYSTRAN, one of the oldest technologies in MT.

SYSTRAN traces its origins back to the Georgetown-IBM experiment, and in 1968, the company was founded by Dr. Peter Toma. Despite the lack of funding available to MT research following the ALPAC report, SYSTRAN survived and would work closely with the US Department of Defense.

In 1969, SYSTRAN was contracted by the US Air Force (USAF) to provide MT. As was usual during the Cold War, US military branches were very interested in what the Russians were up to. Translations were from Russian to English and covered various domains, though the USAF was particularly interested in scientific and technical documents.

If you have used MT before, you will know that the quality tends to lag far behind that of human translators. The same could be said for the translations provided by SYSTRAN during the Cold War. Despite their shortcomings, the translations were generally understood by those using them.

A barbel fish, not to be confused with BabelFish.
SYSTRAN was contracted to work for the Commission of European Communities (CEC) in 1975. Work began on a new system in 1976 operating from English to French. The system for French to English arrived the following year, and a third language combination was provided in 1979.

By 1981, the CEC was using SYSTRAN on an experimental basis for English-French, French-English, and English-Italian. At the time, French translators did not show the same zeal towards the systems as those translating between English and Italian. In 1982, 293 pages were translated from English to Italian with the assistance of SYSTRAN and 330 pages were translated from French to English. That said, these numbers equated to 50% of the Italian workload and only 25% of the French workload.

SYSTRAN had also provided services for Xerox as of 1978 and had been shown to increase productivity, though in-house translators still expected a higher degree of quality than that of the MT provided. English was translated into six target languages for Xerox, and SYSTRAN reported that they were satisfied with the results.

Xerox staff were encouraged by SYSTRAN to change the way they worked in order to maximise the efficiency of their products, whereas the CEC did not report as much productivity as Xerox. The USAF was also still using SYSTRAN and incorporating the newer language pairings as they became available.

In 1995, SYSTRAN released SYSTRAN PRO on Windows, and by 1997, search engine AltaVista's BabelFish, powered by SYSTRAN, was providing real-time translations on the internet. For many years SYSTRAN provided rule-based MT and helped power Google's language tools until 2007 and the translation widget in Mac OS X, among other things.

SYSTRAN also provided MT combining rule-based translation and statistical machine translation in 2010, one of the first products on the marketplace to do so. Though SYSTRAN is still a distance from the levels attained by human translators, the research conducted throughout the decades could be argued to have helped MT to survive until now.

Friday, November 22, 2013

The ALPAC Report: The Failings of Machine Translation

One of the organisations interested in the potential of machine translation.
Not long ago, we had a look at the birth of machine translation (MT) with the Georgetown-IBM experiment. Following the experiment, optimism was at an all-time high for MT, and the problem was expected to be solved promptly. Today we're looking at the next important milestone in early MT, the ALPAC Report. Unfortunately, our tale includes a lot of government bodies and research groups, so expect a lot of acronyms.

In the US, the Department of Defense, the National Science Foundation, and the Central Intelligence Agency (CIA) were very interested in the prospect of automatically processing languages and MT. In the case of the Department of Defense and the CIA, this was mainly because the US was extremely curious about and sceptical of the Russians and wanted to know what they were up to. By 1964 they had promoted and funded work in the field for almost a decade, and together the three organisations founded the Joint Automatic Language Processing Group (JALPG).

In 1964, JALPG set up the Automatic Language Processing Advisory Committee (ALPAC) in order to assess the progress of research. ALPAC was, in essence, founded by the US Government to ensure that funds were being spent wisely.

John R. Pierce, head of ALPAC.
The group was headed by chairman John R. Pierce, an employee of Bell Labs, who was assisted by various researchers into MT, linguists, a psychologist and an artificial intelligence researcher. They worked together in order to produce the 1966 ALPAC report, which was published in November of that year.

Titled "Languages and machines: computers in translation and linguistics", the report would appear to have a focus not only on MT, but also on computational linguistics as a whole. However, the report viewed MT very narrowly, from the perspective of its applications in terms of the US government and military, and how they could use the technology exclusively with the Russian language.

The report showed that since most scientific publications were in English, it would actually be quicker and therefore more cost-effective to learn and read Russian than to pay for translations into English. It also noted that there was an abundance of translators and that the supply of translators outweighed the demand for them, meaning that there was even less demand for research into MT to replace human translators.

While the report evaluated the translation industry in general, it also covered research into MT. It condemned the work done in Georgetown, as there was little evidence to support quality translations from the same place that had spawned the idea that the MT issue was close to being solved.

In fact, Georgetown's MT project had produced no translations of scientific texts, nor had it any immediate plans to do so. The report had defined MT as a process that required no human interaction, and the fact that Georgetown's work still required human post-editing led ALPAC to deem it a failure.

One of the criticisms of the unedited output of the MT was that though it could be deciphered by a human reader, it was sometimes inaccurate or completely wrong. The report also compared Georgetown's work unfavourably with the 1954 experiment, stating that the output from 10 years earlier was not only better, but that the programme had shown little progress since that time.

Though the input for the original experiment was extremely limited and the systems tested by ALPAC were experimental, this did not lead to ALPAC cutting Georgetown any slack. ALPAC stated that MT was not an issue with a foreseeable resolution, contrary to what the Georgetown-IBM experiment had suggested.

Though ALPAC hardly praised MT, it did appear to approve of the idea of "machine-aided translation", which effectively refers to the translation tools that are fairly commonplace in today's translation industry. The report assessed that MT had advanced the field of linguistics more than it had the field of computing, and that MT was not deserving of more funding. Before it could receive more funding, certain criteria would have to be met.

In conclusion, ALPAC suggested the following:
  1. practical methods for evaluation of translations; 
  2. means for speeding up the human translation process;
  3. evaluation of quality and cost of various sources of translations;
  4. investigation of the utilization of translations, to guard against production of translations that are never read;
  5. study of delays in the over-all translation process, and means for eliminating them, both in journals and in individual items;
  6. evaluation of the relative speed and cost of various sorts of machine-aided translation;
  7. adaptation of existing mechanized editing and production processes in translation;
  8. the over-all translation process; and
  9. production of adequate reference works for the translator, including the adaptation of glossaries that now exist primarily for automatic dictionary look-up in machine translation
It would be fair to say that given the aim of the report, ALPAC achieved its objective of assessing MT. The downside to the report is that research into MT was effectively suspended for two decades, since all significant government funding was cut.

Perhaps we are a little bitter that the ALPAC report was so damning of the work of MT merely because we can still see failings in modern-day MT, such as our "favourite" Google Translate. However, it would be fascinating to see what MT could have achieved had it been funded with as much fervour during the 60s, 70s, and 80s as it had been in the mid-to-late 50s.

Do you feel we would be better off had MT research continued? Or do you think "Machine-Aided Translation" was the correct avenue to pursue? Tell us your thoughts in the comments below. If you wish to read the 1966 ALPAC report, a full copy can be found here.

Wednesday, November 20, 2013

Intro to Translation Studies: Equivalence

When we first introduced Translation Studies (TS), we discussed the emergence of the linguistic turn, whereby TS drew most of its fledgling knowledge from its sister discipline, linguistics. When first considering a translation, there is a question that every translator asks themselves. Does the target text (TT) accurately reflect what is written in the source text (ST)?

Obviously, for every good translator the answer to this question should be "yes". However, it surely can't be that simple, can it? Unfortunately, the answer seems to be "no".

As we covered in our introduction to the series, prior to the 1950s there was not a significant call for TS as a discipline. However, by the 1950s there were studies being conducted and even classes being taught in comparative linguistics, whereby established academics and students alike were formalising the field of TS.

The concept that we will be discussing today, equivalence, put simply is finding an equal value (hence equi and valence) between the ST and the TT. However, you will soon see, as with many things in TS, that it's not that simple.

The idea of natural equivalence proposed that languages have pre-existing equivalents before translation takes place. To oversimplify, if, prior to contact with each other, the French and the English had made bread, surely there would be a word for "bread" in both English and French that shares a natural equivalence. Any time the word bread is used in English, it could surely be translated as pain in French, and vice-versa.

A Korean road sign. Would anyone care to tell us how it translates literally?
Of course, as anyone with any practical experience in translation knows, this is rather idealistic. Unsurprisingly, natural equivalence is fairly prescriptive and of little use to practising translators.

The other, and perhaps more useful, side of the coin is directional equivalence, whereby the translation does not pre-exist the act of translation. The French TS scholars Vinay and Darbelnet humorously discussed this in an anecdote concerning road signs on Canada's highways, particularly in Quebec, where the signs are in both English and French.

On one particular road sign, the word slow in English is represented as lentement in French, a translation of the English. This is seen to be peculiar as in France the word ralentir is used, as chosen by the French to convey the message, rather than being a translation from English. The relationship between slow and ralentir in road signs is seen as natural equivalence, whereas the relationship between slow and lentement is seen as directional equivalence.

In the 1950s, American Eugene Nida looked at TS through a linguistic lens. His earliest works were based on structural linguistics, a field of linguistics stemming from Ferdinand de Saussure's seminal work. While Nida's work in linguistics may be of interest to the linguistic field, it was his work in TS in the 1960s that we will be paying closer attention to today.

Nida's most important work on equivalence could be argued to have helped bridge the linguistic turn to the cultural turn that we will soon be covering. Nida's work concerned directional equivalence, which was subdivided into formal and dynamic equivalence.

In formal equivalence, the structure (such as syntax and grammar) is strictly adhered to, which in some cases can create unnatural-sounding and unwieldy expressions in the target language. Dynamic equivalence, however, renders the TT in more idiomatic and natural ways in the target language, while maintaining the meaning of the ST. However, these are not a "one-size-fits-all" solution and have been criticised as oversimplifying issues in translation.

Nida's theories on equivalence, though sometimes criticised, are important to TS as they were some of the first to treat culture as the main focus of the translation act, and they were pioneering for their time.

Friday, November 15, 2013

Four Ways You Can Become a Better Language Tutor by Ron G

It’s tough becoming a good language tutor.

I've had experience tutoring all kinds of people, ranging from students in intensive language programs, to translators needing help passing certification exams, to college students needing to learn how to write better term papers and essays.

Early in my tutoring career, I was frustrated because I could tell I wasn't helping people as well as I wanted to. With practice and effort, however, I improved.

Based on what I learned as a language tutor, here are four ways to become a better tutor and get the most out of your students.

1. Be Kind

Most people get into tutoring because they like to help people. Yet some people think that to really help their students, they have to be hard on them.

That might not be the best idea.

Tutoring is so effective because you’re dealing with a person in a one-to-one setting and can therefore focus more of your attention. That same kind of intimacy, though, amplifies any criticism or negative comments. Being too harsh, even if your intentions are good, can easily cause a student to become discouraged and clam up.

While you’re correcting students, use quite a bit of tact. Err on the side of kindness. It’s definitely better to be too nice than to be too mean.

2. Tailor your instruction to the student

While studying Spanish, I decided to hire a tutor to help me with my conversation skills. My tutor was a native Spanish speaker who was very intelligent and understood Spanish really well. Unfortunately, during our first session together, he didn’t individualize his instruction for me at all. He simply read from Chapter One of a Spanish textbook.

I got very little out of that session. Had he spent even a couple minutes assessing my needs, he would’ve understood that I needed conversation practice and not a grammar lecture. He would’ve also understood that I was at an intermediate level at the time, well past Chapter One of a beginner’s textbook.

Treat each of your students as an individual. Figure out what they’re strong at and where they need specific improvement, and then come up with an appropriate plan of instruction.

3. Keep your student’s goals in mind

What is your student really trying to accomplish? Be specific when answering this question. Is she trying to:
  • Become conversationally fluent?
  • Pass a test, class, or certification exam?
  • Improve a specific skill, like vocabulary?
  • Prepare for international travel?
Each goal requires its own plan of attack. You probably wouldn't teach a student intricate grammar points if her goal is simply to speak better in social settings. You probably would do that, however, if she were preparing for a CEFR exam.

If your student doesn't have a specific goal in mind, help her identify one. Figure out exactly what she wants to achieve, when she wants to achieve it, and how you’ll measure whether she’s successful.


4. Pick your battles

You can’t nitpick a student to death, especially during conversation.

If you use written assignments, administer quizzes, or perform targeted drills, it’s fine to mark a student’s answers as correct or incorrect. Most people are comfortable with being “graded” if the rules and criteria are well-established and the correction is limited to the context of the activity.

But if you’re helping a student practice conversation, you have to let the student talk.

Language learners speaking their new language have to pronounce words correctly, use proper grammar, and remember vocabulary words accurately. With so much to keep track of and so many ways to slip up, they’re going to make a lot of mistakes.

Try not to correct every single error during speaking. Nitpicking backfires for several reasons. It will slow down the session, and it will frustrate, discourage, and overwhelm the student.

Instead, pick your battles. During conversation practice, try not to interrupt or correct the student unless you:
  • Cannot understand what the student is trying to say.
  • Notice recurring errors.
  • Spot an error that would cause the student social embarrassment.
Wrapping Up

Tutoring well requires quite a bit of finesse, which comes with practice and time. In the meantime, if you try to keep in mind the four tips above, you should see immediate improvements in your ability to connect with your students and help them achieve their language goals.

Ron G. is a technical writer and translator from Orlando, FL. He writes about language learning at www.languagesurfer.com.


Monday, November 11, 2013

Intro to Translation Studies: Part 2

On Friday we started our new Intro to Translation Studies series with a brief overview of the background of the field, discussing translation theorists before there was any solid translation theory. Whilst little was being done in the field of translation studies, linguistics had always seemed to be around and everybody believed it was more than sufficient in its own right to describe any phenomena that could occur in translation.

For many years, any study of translation fell into the hands of linguists, at least until the late 1950s and early 1960s, when scholars felt that translation needed its own academic discipline. Slowly but surely, translation studies was born.

The first translation theories were still founded in linguistics. As linguistics is the study of language and translation was definitely a language-based activity, basing the initial theories within linguistics seemed almost inevitable. For a while, this would do just fine. The field hardly boomed and struggled with attempts at becoming a globally-recognised discipline.

This linguistic-based view of translation would later be referred to as the linguistic turn, with two more turns coming. These first theorists were generally from North America and Western Europe, Britain in particular.

Jean-Paul Vinay and Jean Darbelnet were two linguistic translation theorists during the linguistic turn. Though both were born in Paris, France, they would later emigrate to Canada, where they would complete their most influential works in translation studies.

A British shorthair, perhaps named John...
Vinay and Darbelnet mainly worked between the languages of French and English, using Quebec and the Northeast of the United States as their inspiration for analysing the two languages through comparative linguistics.

British-born John Catford would further add to translation studies' budding corpus, basing his work firmly in the field of linguistics, as would be expected from someone who studied phonetics.

American Eugene Nida was famed for his theory on dynamic equivalence, which he devised principally for Bible translations. His work is also often discussed alongside the ideas of foreignization and domestication, terms usually credited to Lawrence Venuti, which we will be covering later in the series.

These four translation theorists are certainly not the only four to have contributed to translation studies' linguistic turn, but we certainly felt they would be apt examples to whet your appetite as we approach the next milestone in our story, the cultural turn.


Friday, November 8, 2013

Intro to Translation Studies: Part 1

When it comes to languages, there is an activity which can appear incredibly simple and horrifyingly complex at the same time. This bipolar activity is translation. As part of a new series on The Lingua File, we'll be running a crash course on the "science" behind translation. We're not claiming to be experts on the subject; rather, we're looking to inform all you language lovers out there that there is more to translation than making "language A" into "language B".

As you know, there are many languages in the world, and there have been for a very long time. Throughout history, peoples of different cultures, races, and languages tended to kill each other long before trying to communicate with one another, but when they actually did communicate, there was an inherent need for translation.

A marble bust of Cicero around age 60.
The act of translation was rarely studied prior to modern-day history, though translation scholars will often point to the first century BCE and the Roman philosopher and jack of all trades Cicero. For the purpose of translation studies, not much needs to be known about Cicero, other than that he was one of the first notable figures to openly think about the translation process whilst translating Greek philosophical theory into Latin. Cicero obviously knew both Greek and Latin, and he is known to have stated that the act of translation improved his speaking skills in both languages.

The sparse history of translation studies doesn't end there, though it certainly is lacking in events. It would be another 500 years before any notable figures would grace the translation studies scene, at least in Western translation studies history. The patron saint of translation is St. Jerome, whose feast day is observed as International Translation Day by translators and the more secular-minded. For more information, you can see how we "loosely" covered the history of St. Jerome last year.

We'll be back on Monday continuing our first steps into the fascinating field of translation studies.


Friday, November 1, 2013

Algerian War of Independence: The Languages Of Algeria

On this day in 1954, the Algerian War of Independence, which is known as the Algerian War in English, Guerre d'Algérie in French, and الثورة الجزائرية in Arabic, began. The conflict, which lasted over seven years, resulted in Algeria's independence from France and became an important conflict in world history for the lessons learnt in terms of decolonisation and what is known as an asymmetrical war.

The Tassili n'Ajjer mountain range in Algeria.
Algeria had been a French colony for over 100 years following its invasion by the French in 1830. However, today marks what was labelled a "Red All Saints' Day", or Toussaint Rouge, the day when members of the National Liberation Front (Front de Libération Nationale or FLN in French) staged attacks on both military and civilian targets and broadcast a proclamation for Muslims in Algeria to restore the state to a sovereign nation under the principles of Islam.

Though a fairly sombre day to be remembered, the result of the war would allow Algeria its independence and result in the fall of France's Fourth Republic. Rather than dwell on the depressing nature of war, we felt it would be wiser to spend today commemorating Algeria's linguistic diversity, some of which is due to its relationship with France.

Official Languages

Arabic

Algeria's sole official language is Modern Standard Arabic. This form of the macrolanguage is considered to be the most formal and is based on Classical Arabic, the type of Arabic that is used in the Qur'an.

Nearly 3 in 4 Algerians speak Arabic as their native language. Algerian Arabic has been influenced by Berber, another native language of Algeria, not to mention French and Spanish, owing to the country's colonial past.

85% of Algeria's population can speak Algerian Arabic and of those who speak Arabic in the country, 83% speak this form of Arabic.

National Languages

Berber

Along with Arabic, the Berber language is also considered to be native to Algeria. Berber is considered to be a language family that spans across a large portion of North Africa, including Burkina Faso, Egypt, Libya, Mali, Mauritania, Morocco, and Niger.

Over a quarter of Algeria's population speak a Berber dialect, and since 2001 Berber has held official status in Algeria as a national language.

Regional Languages

Hassaniya Arabic

The variety of Arabic spoken in the west of Algeria, Hassaniya Arabic, is estimated to have around 3.3 million speakers. Though the language is spoken principally in Mauritania and the disputed area of Western Sahara, it can also be found in the bordering areas of Algeria.

It is considered to be somewhat distinct from Algerian Arabic, and as a result holds no official status in the country.

The province of El Taref in northeast Algeria.
Korandje

This language is spoken by only 3,000 people in the town of Tabelbala, in the southwest of Algeria. This town is unique in Algeria as being the only town to not speak Arabic or Berber as a principal language. The Korandje language is considered to be a member of the Songhay language family, a group of languages generally found around the central areas of the Niger River, though very few studies have been conducted or published on it.

Immigrant Languages

French

Algeria is considered to be the second largest Francophone country in the world. 1 in 3 Algerians are said to be able to read and write French, and despite independence from France, government and official affairs are still sometimes conducted in French.

French is still the most commonly studied language in Algeria, and many university classes are still conducted in French. However, education and bureaucracy in Algeria are becoming more and more Arabic in their affairs.

Are there any important languages spoken in Algeria that we have missed? Tell us about them in the comments below.