How sick and tired I am of translation, and what a losing battle it is always.
Wish I had the courage to wash my hands of it all, I mean leave it to others
and try and get on with some work. - Samuel Beckett
One day when I was in Alcaná de Toledo, I saw a boy selling some old
papers and portfolios to a silk dealer; now, since I am fond of reading - even
old scraps of paper found in the street - I followed my natural inclination
and took one of the notebooks the lad was selling; I recognised the script as
Arabic. Although I recognised it I could not read it, so I cast about for a
baptised and lettered Moor to read it for me; it was not very difficult to find
such an interpreter, since even translators of an older and better language
were available. Fortune found me one who, when I told him what I was after and
put the book in his hands, opened it in the middle and, after reading a few
lines, began to laugh. I asked him what the joke was, and he said it was a note
written in the margin. I asked him to tell me what it was, and he, still laughing,
said:
- As I said, in the margin it says: "This Dulcinea del Toboso, who is so
frequently mentioned in this story, is said to have salted pork better than
any other woman in La Mancha".
When I heard the name "Dulcinea del Toboso" I was astonished and amazed,
realising that those notebooks contained the story of Don Quixote. I therefore
urged him to read the beginning and he, immediately turning the Arabic into
Castilian Spanish, told me that it said: The History of Don Quixote of La Mancha,
written by Sidi Hamete Benengeli, Arab Historian. Great presence of mind was
required to conceal my satisfaction when the title of the book reached my ears;
racing to the silk dealer's I bought all the papers and portfolios from the
boy for half a real (i.e. one eighth of a peseta); had he had the wit to see
how much I wanted them, he could have got more than six reales out of the deal.
I then took the Moorish convert to the cloister of the cathedral and asked him
to translate into Castilian everything concerning Don Quixote, omitting nothing
and adding nothing, for the price that he would name. He settled for two arrobas
of raisins and three bushels of wheat, promising a faithful and prompt translation;
however, to simplify matters, and so as not to let slip such a bargain, I brought
him to my house, where in little over a month and a half he translated the lot,
just as it is set out here.
I: Translators
It took me fifty minutes to draft that translation of 424 words from Don Quixote.
Another ten minutes to change a few words, without keying them in, and there
were two points I had to check with a Spaniard and a better dictionary. Let's
say 400 words per hour for a 400-year-old text in my fourth-best foreign language.
Even assuming I gathered speed as I got the hang of the text it would still
take me six months to the Moor's six weeks. And while my version might just
about pass muster, his is one of the finest bits of Spanish ever written. If
that were all, I might still sleep at nights. But look at his rates! Since he
actually asked for goods rather than cash, we know that the retail value of
his two arrobas (22.5kg) of raisins is 100 Swiss francs. The quantity of wheat
was just over 0.1 cubic metres in modern units. I don't know how much that costs,
but I suspect that whereas the United Nations offers the (rather mean) rate
of 220 Swiss francs per thousand words, that translator traitor took no more
than 220 francs for half a million words. It is true that he got office space
and per diem in spite of himself, but the idea of a man with so much talent
and so little sense makes Don Quixote himself look like a boring civil servant.
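The comparison of rates above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (400 words per hour, half a million words, 220 Swiss francs per thousand words at the United Nations, and the suspected total of 220 francs for the Moor). The 40-hour working week is an assumption added for illustration, and the variable names are illustrative.

```python
# Figures taken from the text above.
WORDS_PER_HOUR = 400        # drafting speed on the Cervantes passage
TOTAL_WORDS = 500_000       # "half a million words"
UN_RATE_PER_1000 = 220      # Swiss francs per thousand words
MOOR_TOTAL_FEE = 220        # suspected value of the raisins and wheat, in francs

# Assumption for illustration only: a 40-hour working week.
HOURS_PER_WEEK = 40

hours_needed = TOTAL_WORDS / WORDS_PER_HOUR            # drafting time at a constant speed
weeks_needed = hours_needed / HOURS_PER_WEEK           # the same, in working weeks
un_fee = UN_RATE_PER_1000 * TOTAL_WORDS / 1000         # fee for the whole book at the UN rate
moor_rate_per_1000 = MOOR_TOTAL_FEE / (TOTAL_WORDS / 1000)
ratio = un_fee / MOOR_TOTAL_FEE                        # how many times dearer the UN rate is

print(f"Hours at 400 words/hour: {hours_needed:.0f}")
print(f"Working weeks (assumed 40 h/week): {weeks_needed:.2f}")
print(f"Fee at the UN rate: {un_fee:.0f} francs")
print(f"Moor's rate per thousand words: {moor_rate_per_1000:.2f} francs")
print(f"UN rate / Moor's rate: {ratio:.0f}")
```

On these figures, which rest on a guess about the price of wheat, the United Nations rate is five hundred times the Moor's.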
"La traduction est un travail de con qu'un con ne peut pas faire".
Be careful how you translate that maxim into English. And do not infer from
it that inability to translate is a sign of stupidity. But to assume that, because
you can speak two languages indifferently (as the French say) you can translate
between them is like saying "I think therefore I am a brain-surgeon".
A nameless Egyptian in the fifteenth century BC advised "Put writing in
your heart that you may protect yourself from hard labour of any kind".
Those scribes kept the writing system complicated: it was not in their interest
to develop a simple alphabet. In the fifteenth century AD, Confucian scholars
in Korea fought a furious rearguard action to defend the even more complex Sino-Korean
system against Han'gul, which must be the most simple, accurate and elegant
writing system there is. After ruling the roost since Gutenberg, the print workers
were replaced in the 1980s with the word processor. Surely it is only a matter
of time till translators go the way of the scribe and the typesetter?
I have another admission to make: my average speed over the year is nothing
like 400 words per hour. You see, I translated the Cervantes on a Saturday afternoon
when I was rested, I had read the text several times in the past, and above
all it made sense. At work it isn't always like that. Besides which, whereas
the narrator seems to have found the best and cheapest translator in history
in a few minutes and without any trouble, I spend much of my time and ingenuity
finding and keeping track of translators I can trust.
Translation has been described as the second oldest profession. I didn't choose
to become a translator, but once when I was down on my luck someone offered me a few
pounds for a quick job and one thing led to another. It wasn't the sort of work
I thought I had any emotional attachment to, until they spoke of replacing me,
or some of my functions, with a machine. "The Good Ship Venus" sprang
to mind: "The first mate's name was MacQueen, who invented a (certain)
machine". I don't recall all the words, but I remember it ended badly.
And it served him right.
II: The forgotten history of automatic translation
Machine translation is seen as a new discipline, something that began in the
aftermath of the Second World War. Many failures are forgiven a science in its
infancy - though a fifty-year infancy would stretch the patience of most parents.
But the theoretical basis of such schemes goes back not five decades but three-and-a-half
centuries.
I haven't finished reading Don Quixote, though this at least I have understood:
the author is debunking the impossible quest of the knight-errant. The 17th
century saw the final debunking of another impossible quest that had been going
on even longer - a millennium or two - not for the Holy Grail, but for the philosopher's
stone, the alcahest, the universal solvent, that which could transmute lead
into gold. One quip that catches the victory of chemistry over alchemy is the
one attributed to Robert Boyle, author of The Sceptical Chymist (1661). He pointed
out that if the fabled universal solvent were to be discovered, there wouldn't
be a container for it. (There is a solution to that, but this isn't the place
for it.) The mystical was out, the practical was in. Of greater relevance to
translation, the word was being ousted by the number. And surely number is something
of a universal language? From this point onward when people looked for the universal
solution to linguistic problems and differences, they turned not to real or
imagined links with Hebrew (as the language everyone spoke before the collapse
of the tower of Babel), but to numbers.
The likes of Athanasius Kircher pursued the quest for the Holy Grail in terms
of a procedure for turning one language into another without understanding it.
His system (published in 1663) is described by Umberto Eco. Here is part of a
table for translation between Latin, Italian, Spanish, French and German:
Latin            Italian            Spanish          French          German
abalienare I.1   astenere I.4       abstenir I.4     abstenir I.4    abhalten I.4
abdere I.2       abbracciare II.10  abbraçar II.10   abayer XII.35   abschneiden I.5
An enthusiast sent Kircher a message of congratulation using the method, but
Kircher couldn't decipher it.
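The principle behind such a table, substitution by code without any understanding, can be sketched in a few lines. The example below uses only the entries in the fragment quoted above. It assumes, as the table suggests, that a code shared between two columns marks the same concept; the function names are hypothetical, and nothing here claims to reproduce Kircher's full procedure.

```python
# Codes exactly as they appear in the fragment of the table quoted above.
TABLE = {
    "Latin":   {"abalienare": "I.1", "abdere": "I.2"},
    "Italian": {"astenere": "I.4", "abbracciare": "II.10"},
    "Spanish": {"abstenir": "I.4", "abbraçar": "II.10"},
    "French":  {"abstenir": "I.4", "abayer": "XII.35"},
    "German":  {"abhalten": "I.4", "abschneiden": "I.5"},
}

def encode(word, language):
    """Return the code listed for a word, or None if the word is not in the table."""
    return TABLE[language].get(word)

def decode(code, language):
    """Return the word listed under a code in the given language, or None."""
    for word, listed_code in TABLE[language].items():
        if listed_code == code:
            return word
    return None

def translate(word, source, target):
    """Word-for-word substitution: source word -> code -> target word."""
    code = encode(word, source)
    if code is None:
        return None
    return decode(code, target)

print(translate("astenere", "Italian", "German"))
print(translate("abbracciare", "Italian", "Spanish"))
print(translate("abayer", "French", "German"))  # no German entry shares this code
```

The last call returns nothing because no German word in the fragment carries the code XII.35: a lookup of this kind can only hand back what the table already contains.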
It is here, perhaps for the first time, that we see that combination of translation
and code-breaking in mathematical operations or attempts at such that re-emerged
in an egregious memorandum on machine translation three centuries later (more
on this anon). And notice that with this emphasis on codes and secrecy, we are
talking of language as power first, communication second.
While the intellectual argument against the impossible quest of alchemy was
won, the urge, or one debased form of it, persisted. For the dream of turning
base metal into gold was so seductive to those who needed gold for whatever
reason that a certain category of alchemists realised that, in the theory itself,
they had a way of turning base metal into gold. They used it to talk money out
of their patrons. There is a painting by Rembrandt in the Louvre entitled "The
Philosopher in Meditation" - wrongly, I think, since it follows the iconography
of the alchemist: the little man at the bellows over the furnace in the foreground
is the "puffer", the alchemist's assistant who did all the dirty work
with mercury, sulphur etcetera. The irony is that the puffer and the financier
were often the same person.
In the course of the 17th century, the product on offer changed. Marin Mersenne
was a priest who took an interest in prime numbers, especially the ones that
can be expressed as 2^p - 1, where p is a prime number; this made him a fore-runner
of modern cryptography. In 1629, Mersenne told Descartes that a lawyer called
des Vallées had discovered the matrix language whereby all others could
be understood. Descartes smelled a rat: "... et si-tost que ie voy seulement
le mot d'arcanum en quelque proposition, ie commence à en avoir mauvaise
opinion". Richelieu asked des Vallées to print his project; des
Vallées stalled and asked for a state pension. Richelieu, who didn't
win his cardinal's hat in a raffle, let the matter drop. Cave Beck, author of
The Universal Character (1657), claimed that a universal language would benefit
mankind in terms of trade and would make great savings on interpreters' fees;
de Maimieux (Pasigraphie, 1797) claimed that his written-only language would
permit communication between Europe and Africa, serve to check translations,
and expedite diplomatic, civil and military operations.
While Descartes had dreamed of a language whose units would be the building
bricks of thought, one which would do what de Maimieux claimed his Pasigraphie
had achieved, he did realize it was a dream. The attempts of Leibniz to bring
it to fruition failed, basically because he found that there is no objective,
non-arbitrary classification of concepts. The encyclopaedia retreated to the
pragmatic practice of mapping out existing areas of knowledge. As D'Alembert
put it in the "Discours Préliminaire" to the Encyclopédie
(Paris, 1751), the order of the encyclopaedia "...is a kind of map
intended to show the main countries, their positions and mutual relations, and
the shortest path from one to the next - a path that is often blocked by a thousand
obstacles ... which can be depicted only in highly detailed individual
maps. These individual maps are the various articles of the Encyclopedia...".
I have to insist on the importance of this failure, because it crippled automatic
translation from the outset. It was not a total failure - as we shall see. Perhaps
just as important as the failure itself is that many proponents of automatic
and later machine translation have remained blissfully unaware of it. As late
as 1889 Frederick William Dyer published his "Lingualumina", which
is reviewed in the Histoire de la langue universelle:
"M: quantity L: space
S: existence B: state
Z: personality K: relation
V: species or class J: 'interchange'" etc.
"The radicals are formed by joining various vowels to those consonants,
whether before or after them...
Li: space eil: limit
lee: line eela: point
lai: angle aila: side" etc.
"The verbs consist of 3 significant letters: the first shows whether
the subject is a person or a thing (remember that z = personality); the second
indicates number (the short vowels i, a, o, for the singular, the long vowels
ee, ah, au for the plural); the 3rd indicates the tense: b, past; d, present;
g, future. Thus the present tense of the verb to be is: zinda, zanda, zonda;
zeeda, zahda, zauda: I am, you are, etc. Past tense: zimba, zamba, zomba, etc.
Future: zinga, zanga, zonga, etc. There is also a perfect tense, produced by
devoicing the past tense: zimpa, zampa, zompa; zeepa, zarpa, zorpa.
"The author further complicates his conjugation with other "subtleties".
Obviously, this so-called logical system could not be more arbitrary, fantastic
and irregular. It has another drawback that derives from its author's nationality:
an Englishman will never be able to conceive of a correct, international phonetic
system, because of the execrable pronunciation his language accustoms him to.
For what could be more absurd than to pronounce the simple letter I as though
it were two vowels (aï), while rendering the simple I sound with two letters
(ea, ee)?". Incidentally, the anglophobia that comes out in this passage
seems to me to undermine the basic project of the authors, that of introducing
an auxiliary language for international communication. If you can't stand your
neighbour you are unlikely to agree with him on a new language to adopt.
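The conjugation rule quoted in the review can be turned into a short sketch, which also shows how much the rule leaves unsaid. The code below takes the marker z, the number vowels and the tense consonants from the quotation. The nasal before the tense consonant in the singular (n, or m before b) and the final -a are not stated in the rule; they are inferred here from the quoted forms and should be read as assumptions. The function name is hypothetical.

```python
SHORT_VOWELS = ["i", "a", "o"]      # singular, as quoted
LONG_VOWELS = ["ee", "ah", "au"]    # plural, as quoted
TENSE = {"past": "b", "present": "d", "future": "g"}  # as quoted

def conjugate(tense, plural=False):
    """Build the three forms of the verb 'to be' for one tense and number.

    Assumptions inferred from the quoted examples, not stated in the rule:
    singular forms insert a nasal (m before b, otherwise n) before the
    tense consonant, and every form ends in -a.
    """
    consonant = TENSE[tense]
    if plural:
        return ["z" + vowel + consonant + "a" for vowel in LONG_VOWELS]
    nasal = "m" if consonant == "b" else "n"
    return ["z" + vowel + nasal + consonant + "a" for vowel in SHORT_VOWELS]

print(conjugate("present"))               # singular present
print(conjugate("present", plural=True))  # plural present
print(conjugate("past"))                  # singular past
print(conjugate("future"))                # singular future
```

The sketch reproduces the present, past and future forms given in the review. The plural past and future are not quoted, so its output for those is a guess, and the perfect plural as quoted (zeepa, zarpa, zorpa) does not follow from any rule stated here.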
The same authors point out that the attempt at pasigraphy, a written-only language
that would be understood by all, did produce some systems which, though arbitrary,
did win acceptance. The Dewey decimal bibliographical classification is one,
and they show how it could tend towards language:
"Here is how the classifying numbers are formed: the corpus of human
knowledge is divided into ten major classes, designated by the ten digits, 0
to 9:
000 Generalities
100 Philosophy and psychology
200 Religion
300 Social sciences
400 Language
500 Natural sciences and mathematics
600 Technology (applied science)
700 The arts
800 Literature and rhetoric
900 Geography and history.
"It is easy to see how this process of subdivision could be continued
until a given idea or subject were set in a class of its own; it would be designated
unambiguously by the series of figures that showed all the successive divisions.
Here is an example of such progressive determination or specification:
61 Medicine
612 Physiology
612.3 Digestion
612.31 Mouth
612.313 Salivary glands
612.313.6 Disorders of the salivary glands
612.313.63 Salivary microbes."
Thus, as the authors point out,
"31 = statistics,
331.2 = wages,
677 = textile industry,
31:331.2:677 would thus mean: statistics on wages in the textile industry."
Of course, however refined, it would be of no help at all in most sentences:
this one, for example. (It's also dated rather quickly, as librarians will know;
who would give ten per cent to theology etc. these days?)
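The two mechanisms described in the quotation, specification by adding digits and combination by colons, can be sketched as follows. The class numbers and labels are those listed above (with 677 for the textile industry); the function names and the choice to ignore the dots when comparing prefixes are assumptions made for illustration, not part of the classification itself.

```python
# Labels from the quoted example, keyed by their digits with the dots removed.
CLASSES = {
    "61": "Medicine",
    "612": "Physiology",
    "6123": "Digestion",
    "61231": "Mouth",
    "612313": "Salivary glands",
    "6123136": "Disorders of the salivary glands",
    "61231363": "Salivary microbes",
}

# Subjects from the authors' compound example.
SUBJECTS = {
    "31": "statistics",
    "331.2": "wages",
    "677": "textile industry",
}

def specification_chain(number):
    """List every labelled class that the number falls under, broadest first."""
    digits = number.replace(".", "")
    return [CLASSES[digits[:i]]
            for i in range(1, len(digits) + 1)
            if digits[:i] in CLASSES]

def decode_compound(expression):
    """Split a colon-joined expression and look up each component."""
    return [SUBJECTS[part] for part in expression.split(":")]

print(specification_chain("612.313.63"))
print(decode_compound("31:331.2:677"))
```

The output is a list of labels, not a sentence: the scheme names subjects and their combination, and leaves the relation between them ("statistics on wages in") to the reader.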
Wittgenstein's fresh attempt in the Tractatus (1921) to give rational order
to the world led to what seems an even greater failure, in that, by the time
of the Philosophical Investigations (1945, published 1953) he had given up on
a coherent scheme: "The best that I could write would never be more than
philosophical remarks; my thoughts were soon crippled if I tried to force them
on in any single direction against their natural inclination. - And this was,
of course, connected with the very nature of the investigation. For this compels
us to travel over a wide field of thought criss-cross in every direction. -
The philosophical remarks in this book are, as it were, a number of sketches
of landscapes which were made in the course of these long and involved journeyings.
"The same or almost the same points were always being approached afresh
from different directions."
If it seemed impossible even for one man to give coherence to his thoughts,
how could there be any interlinguistic system robust enough to bear the weight
of meaning in translation? Could the answer have been lost with the "Filene-Finlay
speech translator"?
I have come across a little volume from the library of the J.J. Rousseau Institute
in Geneva, called International Communication: A Symposium on the Language Problem,
by Herbert N. Shenton, Edward Sapir and Otto Jespersen (London, 1931).
Shenton, Professor of Sociology at Syracuse University, contributes an essay
entitled "A Social Problem", and asks "Can social engineers improve
the international language situation?" He writes:
"Various mechanical procedures have been devised such as ... the Filene-Finlay
Speech Translator, now regularly used by the International Labour Office as
a permanent feature of its conference machinery. This telephonic device for
the simultaneous translation of a speech into several languages was conceived
by Mr Edward A. Filene of Boston, and was developed by him in consultation with
Thomas A. Edison, General J.J. Carty, and others. It was finally perfected by
Professor Gordon Finlay, a British scientist." This wonderful machine had
been backed by J.J. Carty, a vice-president of AT&T, and Thomas Edison of
light-bulb fame. What had become of it? Who switched it off? ILO is just across
the road from where I work, so I skipped across one lunch-time. Filene was a
prominent Rotarian and founder of the Credit Union Movement. He was given to
printing his thoughts on "Why Men Strike" and "The European Problem:
A Businessman's View".
I could find no trace of the machine at ILO. Shenton does provide a footnote:
"Described in detail in Commercial Standards Monthly, November, 1930 (a
US Government publication)". I have not been able to obtain it. Not even
the British Library has a copy.
Never mind, there's the IBM home page. It says:
"1931... Accounting machines are introduced in Japan. New products: - 400-series
alphabetical accounting machines. 600-series calculating machines, which handle
multiplication and division. First permanent installation of the Filene-Finlay
Translator is set up at League of Nations in Geneva. 5% stock dividend declared."
All in a year's work for Big Blue. But wait! What is this machine? The League
of Nations archives provide the answer, in a letter of 9 November 1927 from
the Marquis G. Paulucci de Calboli Barone, Under Secretary-General of the League
of Nations, to Mr Ake W. Hj. Hammarskjoeld, Registrar, Permanent Court of International
Justice, The Peace Palace, The Hague, Holland. His excellency explains that
Filene's scheme for simultaneous telephonic interpretation was tried out on
a small scale in 1926 at the International Labour Conference. He goes on:
"The method employed at this year's [1927 International Labour] Conference
was roughly as follows. The interpreter sits in the hall itself at a short distance
from the speaker and hears the speech of course in the ordinary way as it is
pronounced. He then speaks his interpretation in a low voice into a microphone
which is mounted on his table and with which all the head-phones are connected.
It has been found in practice that by using a microphone in this way an interpreter
is able to dictate his translation in such a low voice that it is quite inaudible
to the speaker and does not interfere with him in any way ...
"... his system of interpreting is, of course, in many ways much
more difficult than the ordinary method as the interpreter has to listen to
one sentence while translating the preceding one. It was thought at first that
this difficulty might prove insurmountable, but several of the interpreters
at the Labour Conference achieved excellent results."
Which goes to show that you can't take everything you see on the IBM home page
as gospel. Not only did the Filene-Finlay translator not translate, but IBM
didn't invent it. However, I would like to pay tribute at this point to Edward
A. Filene, that "odd chap" as he was called in an internal memo of
the League of Nations. He pursued for many years, and against the better judgment
of the experts, a system without which international gatherings today could
hardly function. At the League of Nations, the head of the French translation
and interpreting section was given a report by one of his interpreters on 3
September 1930. Here is part of it:
"The interpretation is imperfect because the interpreters must speak at
the same time as the orator. This compels them to miss out sentences - those
pronounced whilst they are speaking themselves. They do not have the opportunity
to make an intelligent summary or to include the passages omitted. It may be
that the sentence pronounced by the speaker while they are translating and which
they do not hear is one of the most important. The general effect is incoherent,
and it is completely impossible to follow an idea or a line of thought in its
entirety ... then there is the obvious unhealthiness of working in those
little booths ..."
Who could deny his arguments? But we now know that simultaneous interpretation
works; it's like riding a bicycle. He does, however, make a point that has come
home to roost: "One fears that the introduction of a telephonic system
might open the door to the demand for recognition of one or more additional
official languages, which would entail costs infinitely higher than those of
the installations themselves." Incidentally, it is generally believed that
simultaneous interpreting was first used on a large scale at the Nuremberg War
Tribunal; we can now see that it was used in major conferences more than twenty
years earlier.
Now, after three centuries of re-inventing the language cracker we come to the
document that is often quoted as the first step in machine translation: a memorandum
sent in 1947 by Warren Weaver, one of the founders of the discipline: "One
naturally wonders if the problem of translation could conceivably be treated
as a problem of cryptography. When I look at an article in Russian, I say 'This
is really written in English, but it has been coded in some strange symbols.
I will now proceed to decode.'" Apparently this nincompoop worked for the
Rockefeller Foundation. The addressee was a crystallographer.
The foreword to a recent book on machine translation admits that "it is
an open question whether the great investment that has been made in the enterprise
(of machine translation) since the first systems were put to use in the 1960s
has resulted in any real improvement". One begins to see why.
Of course, machine translation (MT) has produced results: a Catholic chaplain
on a cruise ship headed for the Antarctic wrote to Machine Translation Today:
Translating and the Computer to say he had used a translation machine to translate
his sermon into French and that it had gone down well with his French congregation.
We are not told whether he used it for confessions. Seriously, the Canadian
weather service has been using a machine to translate its weather reports for
a long time now - it always requires some revision, but it works. Systran is
available free on the Web, and I for one will use it rather than pay a translator
for a version of an article from German that I might or might not really need.
If I do really need it, then I'll see about a proper translation. This modest
service seems to be the best that MT can offer customers today: "The military
has always been a great believer in MT, especially the Pentagon. The Forward
Area Language Converter (FALCon) was developed by the American defense section
and is currently used by US Forces in Bosnia to assess the military significance
of documents and to determine whether they should be translated. Six prototype
systems are currently used in Bosnia by the Army's V Corps Forces and Special
Operations Forces to translate documents from Serbian and Croatian into English."
If better were available, the Pentagon would have it.
III: Art or Science?
Consider what has happened in five fields since the publication of Don Quixote
in 1600 (the same year as Hamlet): literature, alchemy, chemistry, lexicography
and automatic translation. There has been no discernible progress in literature.
Indeed, the sight of the Hale-Bopp comet in the morning sky reminded me that
there has been no discernible progress in literature at all. The last time that
comet was near the earth, 4,200 years ago, was shortly before The Epic of Gilgamesh
was written, and considerably before the God of Abraham cleared his throat.
Gilgamesh poses the big questions just as sharply as Cervantes and Shakespeare.
No answer.
Alchemy has bitten the dust. Consider the following description of five illustrations
in an early 17th century alchemical manuscript:
"First, there is depicted a leper hanged on a golden gibbet: this is the
operation of calcination. Next, a leper with his hands tied behind his back
is about to be decapitated by the executioner, also leprous: this is distillation.
The leper attached to a gilded wheel represents coagulation, and the silver
chalice and three dice, solution. The fifth miniature, of a half-woman half-serpent
having a leprous bust and transfixing a leper with a golden lance, while a leprous
woman stands beneath the lance, represents the extraction of philosophers' mercury
from the prime matter by means of the philosophic fire. The whole ... represents
the exaltation of the common base metals, which are throughout considered allegorically
as in a state of sin".
Shed the mystical, allegorical and moral aspects and you have simple chemical
procedures (at least as regards the first four). Chemistry arises from the wreckage,
with much more modest aims: not Truth, but classification of phenomena in formulae
of clarity, concision and elegance. Its progress in three centuries is almost
unbelievable.
Another discipline arose, or gained maturity, in the same period: lexicography.
For this was the century of the dictionary, from the Accademia della Crusca,
the Académie Française and Dr Johnson. Just as chemistry is a
classification of material phenomena, lexicography defines words, ultimately
in a circular or systematic way, in terms of each other. Like chemistry, its
aims are modest, sceptical, and strictly non-metaphysical. The dictionary satisfies
itself with classification in mere alphabetical order, which is absurd but eminently
usable.
While lexicography and the exact sciences progressed by leaps and bounds from
then till now, automatic translation has never kept its promises. As the November
1996 issue of Scientific American puts it, "'Few informed people still see
the original goal of fully automatic high-quality translation of arbitrary texts
as a realistic goal for the foreseeable future', writes Martin Kay, a longtime
machine-translation researcher at the Xerox Palo Alto Research Laboratories"
(p.24).
Why is this?
Because translation often confronts us with conundrums like that little question
which turn out to involve not one discipline but several. And whereas in chemistry
and lexicography a question is either tackled or deemed out of bounds, nothing
is beyond the bounds of translation: it is a constant to-ing and fro-ing among
systems, in order to guard against the "Chinese whispers" effect -
which of course is the sort of thing language machines produce all the time,
because they cannot tell which system or what situation a given utterance belongs
to. There is no algorithm for solving problems which, ostensibly verbal, are
connected root and branch, sense and synapse, with what we do and with what
we say about it.
Language is social and historical: it takes two to talk, and that takes time.
The weird thing is that that gets forgotten. I am fed up reading one old saw
from Saussure. Steven Pinker's version is "Since a word is a pure symbol,
the relation between its sound and its meaning is utterly arbitrary". Arbitrary
in that there is no organic connection between the word "dog" and
the slavering quadruped. In social terms, though, it is not arbitrary but conventional:
there is a world of difference. And in semantic terms, if words are arbitrary,
there can be no etymology. No etymology means no history, no sense. Garbage
in, garbage out. (Incidentally, I disagree not only with Pinker's deduction
but also with Saussure's premise; but that's another story.)
Let's stay with Pinker: he's a revelation.
"But to get these languages of thought to subserve reasoning properly,
they would have to look much more like each other than either one does to its
spoken counterpart, and it is likely that they are the same: a universal mentalese." (p.82)
"But grouping words into phrases is also necessary to connect grammatical
sentences with their proper meanings, chunks of mentalese." (p.101) "Deep
structure is the interface between the mental dictionary and phrase structure."
(p.121) "At the very least I hope you are impressed at how syntax is a
Darwinian 'organ of extreme perfection and complication'." (p.124) "...children's
minds seem to be designed with the logic of word structure built in." (p.146)
"When memory has been emptied of all its incomplete dangling branches,
we experience the mental "click" that signals that we have just heard
a complete grammatical sentence." (p.200) "His results corroborate
the suggestion that this particular universal is caused by the way that morphological
rules are computed in the brain..." (p.236) "...grammars can hop among
the grooves made available by the universal grammar in everyone's mind."
(p.244)
Like the description of alchemical procedures quoted above, Pinker's book provides
genuine science (phonological in this case). Also a coherent grammar - Chomsky's
system, which is neither as elegant as Panini's Sanskrit grammar nor as manageable
as Kennedy's Latin Primer, but more universally applicable. From the evidence
provided, though, it is, like the other two, a descriptive/prescriptive grammar,
which means that its place is in the arts, not the sciences. For one reason
and another that isn't seen as good enough. So, as the above quotations indicate,
there are constant rhetorical claims that Chomsky's grammar is part of the brain.
Judging from the plaudits on the back of the book, many people are sold on the
idea. Just as early attempts were being made to model the brain on computer,
someone came along to say that the brain behaved like a computer. It was so
good, it had to be true. If, as a wise man once said, "meaning is what essence
becomes when it is divorced from the object of reference and wedded to the word",
there will always be those who want to have their cake and eat it, with meaning
and essence as one thing, particle and wave at once. It's alchemy again: there
is no end to it.
IV: Wax Fruit
Automatic translation - the attempt to treat translation as a branch of cryptography
- floundered along for over three centuries, and failed in its grander aims
when it became clear that lack of processing power was not the central problem.
What was the main problem? And what can be salvaged from the wreckage? For unless,
like Robert Boyle and the early chemists, its practitioners set clear and achievable
goals, it too will muddle on, promising clients what they want to hear, with
predictable results.
The central problem, the intrinsic problem of machine translation, has been
hidden by the extrinsic problems that must be referred to here, because of the
damage they do. Where demand is desperate and supply unlikely, charlatans thrive.
When in addition the discipline is one without a memory, those concerned are
unlikely to notice that history is recycling itself as neurodisney, as in Weaver's
memo, above, or as in the many schemes for synthetic languages that still appear
(Eurolingua is one I saw recently on the internet, with its elegant coinages
such as "weekfini" for weekend).
In addition to farce and outright fraud there is sharp practice, as evinced
in the following argument:
"Translators are naturally reluctant to be responsible for what they consider
an inferior product. Their instinct is to revise MT output to a quality expected
from human translators, and they are as concerned with 'stylistic'
quality as with accuracy and intelligibility.
"In assessing MT they need to adopt a different attitude, to acknowledge
that perfectionism is neither always desirable nor always appreciated, particularly
if it results in higher charges. An MT system gives them the option of adjusting
'stylistic' quality to users' needs without sacrificing accuracy and
consistency."
This seems to blame translators for doing their job well,
and to make a spurious distinction of style from accuracy and intelligibility
- which are the two main elements of style in translation. It is further stated
that, without an MT system, a translator is unable to adjust the type of translation
to the user's needs. This is pernicious nonsense.
The intrinsic problem of MT, the one that serious practitioners have to confront,
is that language is not code. Three demonstrations:
1 An interlingua (whether a matrix for translation like that of Des Vallées
or an international language) depends, semantically, on the establishing of
a non-arbitrary ordering of concepts and phenomena such as that attempted by
Descartes and Leibnitz and relinquished by the encyclopaedists, and by Wittgenstein.
It depends on such an ordering because otherwise, while anyone could use the
codes to translate from it, only its inventor could translate into it, since
only he or she would know where a particular concept was filed. As we have seen,
natural language covers many incommensurable fields, some of which evolve while
others apparently do not. To put it simply, people change, the world changes,
and any scheme for representing them must change: it must either lose its consistency
or lose its relevance. It must be a living language.
A living language will be full of inconsistencies. The most used constructions
will be the least regular. One word will gather a wild concatenation of meanings:
"Durman" in Russian denotes different flowers, depending on the area.
It also means a fool, a kind of tobacco, a type of Astrakhan grape, an impassably
thick forest, a strong wind on a lake, and the ore that contains "kamen'
samosvet", some kind of shiny stone that I can't find in the dictionary.
Obviously I could have found similar examples in English. I chose Russian because
it's from a dictionary that was begun in 1965 and whose 31st volume (part of
the letter P) has just appeared. The print run has fallen from 4,500 on volume
one to 1,200 on the latest volume, the paper of volume one will have rotted
before the last volume comes out, but it struggles on, and I wish it well. Now,
one can envisage a set of procedures whereby a machine would more often than
not distinguish an Astrakhan grape from a wind on the lake and put the correct
solution in a fruit bowl. But it would still be wax fruit.
2 The cryptographic approach to translation is statistical and abhors ambiguity.
The human approach is contextual and depends on ambiguity. When we don't know
or don't want to say we leave it vague; as diplomats and poets know, the latitude
of that vagueness can be very precisely determined. A translator has to know
a lot, not just about languages. And the one thing he must know better than
anything else is when he does not know. He has to go up to the edge of that
vagueness and look in. When it cannot be resolved to something definite, it
must be reproduced in translation.
3 A code is the transformation of a set of signs. A language is a transformation
of existence. Both code and language take the form of sign systems - hence the
confusion on which MT is based - but in transit from one language to another
the signs must run to earth. This can be a frightening process (read Sartre's
Nausea for symptoms), though even if, as usual, it entails nothing more dramatic
than passage through a translator's head, the sense can travel a long way from
the signs before reformulation, as the following excerpt from a humdrum procedural
text does show: "En effet, celles-ci seront en général amenées
au P.M.A. par simple brancardage. Il est donc souhaitable de faire en sorte
que la distance parcourue par les secouristes soit la plus courte possible,
ce qui entraînera pour eux une fatigue moindre et leur donnera une plus
grande disponibilité donc la possibilité de prendre en charge
chacun plus de victimes." Translation: "In general, patients have
to be carried to the medical outpost on stretchers, so the closer it is, the
better for the rescue workers, since it leaves them less tired and better able
to cope with a larger number of casualties."
V: The Cost
Let's return to Alcaná de Toledo and that miraculously cheap, accurate
and elegant text. How do you cut the cost of writing and translation? There
are four obvious ways. One is for everyone to use the same language. Perhaps
impatient with the many attempts to devise and impose a universal language,
one author pointed out that there already exists a perfect language. Why don't
we just drop the others and use it? Antoine de Rivarol was writing in 1785,
and he meant French (De l'universalité de la langue française),
but today's candidate is English. It has swept the board in the sciences, on
screen and on line, and in international air and sea traffic. Even the warheads
of Soviet SS-20 missiles are to be replaced with American talking heads. Perhaps
the most remarkable sign of the dominance of English was turned up recently
by my daughter, who showed me a book in French on how to train dogs, which advises
owners to issue the commands in English because English is more concise. On
the other hand, maybe the subtext is that English is a language fit for dogs...
I remember talking to one Cameroonian who saw this linguistic dominance as a
straightforward matter of progress: in Africa there are hundreds of languages,
in Europe dozens, and in the United States, one (or two).
Truth to tell, there are still over 100 languages in the United States. Linguistic
diversity is another of those resources we can destroy but could not replace.
English has hegemony and indeed monopoly in many areas, but remember the arrogance
of M. de Rivarol, and how his view of the world was to be overturned a few short
years after he put it in print. English threatens many other languages in the
world today - even languages such as French, Spanish, Arabic and Russian. But
English itself has broken up: the Germanic slant of American English will soon
be hispanicized too, and more importantly the different purposes it is used
for have produced powerful dialects of their own. It would seem prudent as well
as politic, therefore, to maintain the other international languages, against
the day when English becomes finally incoherent, so another can take the strain.
Those languages, in their turn, should allow for the others - not by coining
vocabulary in them for abstruse areas of science, but by allowing them to live.
This age of number, which began in the 17th century, has now reached a pitch
where whatever can't be digitized has no value. With the speed of digital transactions,
value has left meaning far behind. But meaning will survive with any of the
neglected languages that escapes destruction.
A second cost-cutting option is the automatic generation of documents, which
is a much simpler procedure than automatic translation. Consider this:
"In principle, if the communicational purpose is sufficiently closely defined,
systematisation of the kind the European tradition developed makes possible
an entirely automatic generation of the discourses that may be required. In
practice, rhetoricians usually did not go this far since, quite apart from the
laborious analysis involved, they preferred to emphasise the creative aspects
of their art rather than its mechanical character. None the less, it is worth
pointing out that precisely this step, which is basic to modern theories of
generative grammar, was already taken within the scope of the ars dictaminis
at the beginning of the fourteenth century by Lawrence of Aquilegia. Lawrence
reduced letter-writing to a system which could easily be formulated as a set
of rewrite rules of the type employed in the phrase-structure part of a transformational
grammar. His system of communicational roles postulates seven types of addressee,
which cover a wide range from popes down to heretics. For each type of addressee
he provides a tabulation of alternative phrases and clauses, arranged consecutively
in groups, in such a way that one item from each group must be selected at each
successive stage in the derivation of a letter... This has the advantage, from
the scribe's point of view, that it is no longer necessary to invent any materials,
or consider the order of parts, or choice of words. Provided one can find one's
way through the epistolary chart provided, writing a letter, as one commentator
has pointed out, does not even require the writer to know the language."
The relevance of this to international organizations needs no elaboration.
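The chart described in the passage quoted above lends itself to a short sketch.
What follows is a minimal illustration in Python, not a reconstruction of
Lawrence's actual tables: the phrases, the two addressee types shown (of his
seven) and the names CHART, derive_letter and all_letters are all invented for
the purpose. It shows only the mechanism, which is that one item is picked from
each group in turn and the picks are strung together.

```python
import itertools

# Hypothetical epistolary chart: for each type of addressee, an ordered list
# of groups; each group holds interchangeable phrases. The wording is invented
# for illustration and is not taken from Lawrence's tables.
CHART = {
    "pope": [
        ["To the most holy father", "To the most blessed lord"],
        ["your devoted servant sends", "your humble son offers"],
        ["reverent greeting.", "dutiful obedience."],
    ],
    "heretic": [
        ["To the enemy of the faith", "To the wanderer from the truth"],
        ["the writer sends", "the undersigned offers"],
        ["a call to repentance.", "no greeting at all."],
    ],
}


def derive_letter(addressee, choices):
    """Build one salutation by taking choices[i] from the i-th group."""
    groups = CHART[addressee]
    if len(choices) != len(groups):
        raise ValueError("exactly one choice is needed per group")
    return " ".join(group[pick] for group, pick in zip(groups, choices))


def all_letters(addressee):
    """List every salutation the chart can produce for one addressee."""
    return [" ".join(parts) for parts in itertools.product(*CHART[addressee])]


print(derive_letter("pope", [0, 1, 0]))
print(len(all_letters("pope")), "possible salutations for a pope")
```

With three groups of two alternatives each, the chart yields eight distinct
salutations per addressee, and the scribe who picks his way through it need
not understand a word of any of them.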
The third way to cut the translation budget is simply to translate less: control
the traffic. As it is, the user of information is almost drowned in a mass of
junk. Automatic translation both increases the quantity of such rubbish and
reduces its residual cogency. It becomes increasingly difficult to find any
text of value. By contrast, there is no closer reading of a text than that done
by a translator; the only practice approaching it is that of advocates arguing
over the niceties of a contract in a court case, and for all the egregious expense
of translation, it is much cheaper than the law.
Before I go on to option number 4, which will bring us back to the Good Ship
Venus, I should explain where my conclusions come from, because this little
essay has been an attempt to put into perspective 18 years' experience as a
professional translator and 20-odd years' awareness that machine translation
was somewhere out there - the ultimate in vapourware or the ultimate solution.
I have not focussed on recent (machine) translation theory - partly I admit
because of the traditional hostility (to be found in many fields) between those
who do it and those who talk about it, and partly because the principles of
a discipline are most clearly seen in its initial phases; later they get submerged
in detail.
Option 4 is computerization, and the three different approaches it entails have
17th-century ancestors: linguistic databases derive from lexicography, machine
translation from cryptography, and computer-assisted translation from a compromise
between the two.
Computers are perfect for lexicography, which manipulates unambiguously defined
units in conventionally arranged (alphabetical) fields. Computer databases are
of immense and increasing help to translators.
In 350 years the automation of the translation process itself has not succeeded
in crossing the divide from code to language. It may take as long again to do
so, it might never succeed, or it might, through being used in any case, twist
human language towards itself. (I believe the third scenario is the most likely:
machine pidgin will be the vector of trade and the bane of my life.)
Computer-assisted translation - the compromise - will work better at the pole
of lexicography than at that of translation. When, in other words, the text
in hand approximates to a list, computer-assisted translation will be useful;
when it is a highly articulate text, CAT can be a liability: it's misleading
to be told that the version proposed by the computer is a 95% accurate rendering
of the text when the 5% that is different completely reverses or subtly qualifies
the meaning. It also slows you down when you are presented with six variously
defective proposals for a phrase instead of being given the space to think.
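The trap is easy to reproduce. The sketch below, in Python, scores the likeness
of two sentences with the standard library's difflib.SequenceMatcher. That choice
is an assumption made for illustration: commercial CAT tools use their own scoring
methods, and the function name fuzzy_match_score and both sentences are invented.
The point is only that a character-level score can sit well above 90% while the
sense of the sentence is reversed.

```python
from difflib import SequenceMatcher


def fuzzy_match_score(segment: str, stored: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, segment, stored).ratio()


# Hypothetical translation-memory entry and a new segment to be translated.
stored = "The valve must be closed before the pump is started."
new_segment = "The valve must not be closed before the pump is started."

score = fuzzy_match_score(new_segment, stored)
print(f"match: {score:.0%}")
```

A translator offered the stored rendering on the strength of that figure would
have to spot the single word that turns the instruction on its head.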
So all of you who are still looking for the anonymous Moor from Alcaná
de Toledo, stop wasting your time. If you want a text that talks to you, use
a human being, and pay him (or her) like one. If you want a quick and very rough
approximation use a machine translation programme. Remember that more and more
texts are composed more or less automatically and more or less carelessly by
cut and paste; nobody writes them and nobody reads them. It's perfectly apt
to use a machine: that way nobody translates them either.
To expedite both processes in the long run, put 95% of your info investment
into developing or buying the relevant databases and 5% into buying inexpensive
machine translation systems off the shelf. Give them to your translators and
watch what happens.
VI: The Future
I sent a draft of this article to the poet Edwin Morgan, one of the great
translators, and in a far more difficult field than mine. Part of his comment
was "I must admit I am divided in my mind about the whole business. I like
the Shelleyan, evolutionary, all-things-are-possible viewpoint, and I suppose
I engage in the 'impossibilities' of translation for that reason, in other words
we, everyone, even machines, can get better at it, and all avenues should be
kept open. On the other hand, the creative juxtapositions of language can be
simultaneously so unexpected and so powerful, especially in poetry, that one
is tempted to want to say, Keep it, keep off! - this is it, nothing else will
do!"
What can I say? You may have noticed by this stage the frightened undertone
of the threatened scribe in my words. I don't deny it's there. But I have tried
my best to see the matter clearly. The central question seems to be this: is
the dream of machine translation to be assimilated to that of perpetual motion
or to that of manned flight?
My answer is that the current approach - which uses the cryptographic metaphor
for language - falls into the former category. To demonstrate the point, I'd
like to conclude by considering the mirror of this argument: we've been asking
if numbers can cope with natural language. Can natural language cope with numbers?
I don't know if any of you has tried Albert Einstein's ABC of Relativity. I
have. I read the first few pages, quite enjoying myself, rather like the man
who is heard, as he falls past the windows after jumping from the top of a tall
building in New York, muttering "so far so good, so far so good". I
met the same end when old Albert suddenly launched into a few formulae that
were at best Greek to me. But even if you go to someone like Richard Feynman
who, in addition to being a genius of a physicist, was an astounding teacher,
you have problems. His QED: The Strange Theory of Light and Matter (1985) is
the best attempt I have ever seen by a scientist to put a difficult subject
across to the innumerate public. But even there, though I understood the lectures
step by step, I was never able to paraphrase what I thought I had understood.
You can only go so far without maths. Perhaps it would be more precise to say
that you can go only so far without realising that, if you want to go further
without wasting all your time translating every formula into words, you had
better learn some maths. I venture to suggest that the moral applies equally
and conversely to those who would use maths to render language: it works up
to a point, beyond which the game is not worth the candle.