Friday, 23 July 2010

TRANSLATING THE TRANSLATION 1

A Romanian company promoting its heavy industrial machinery in the German market; a Vietnamese travel company describing the stunning beauty of Halong Bay to a Japanese audience; a Chinese company persuading a world audience to buy its solar energy panels: in the global market, translations and translators are essential.


The quality of translation you need depends on what you want it for. If you translate simply for information, then some of the machine-based systems such as Google Translate are probably adequate. But if your aim is to publish material in another language, machine-based translators don’t do the job.


Tests of computer translators show between 50% and 65% accuracy. In other words, machine-based translators can usually give readers the gist: the main points or ideas of a text. But machines are programmed to translate words; they are not programmed to understand complex grammar, idiomatic language such as ‘pick up’, or nuances of vocabulary.


English, with its enormous vocabulary, is particularly difficult to translate. Take the word ‘dog’, for example. For most non-native users of English it’s a noun meaning a four-legged animal descended from wolves. But it’s not that simple. The Merriam-Webster online dictionary lists 11 possible meanings for the word ‘dog’! The computer translator probably ‘knows’ 2 meanings of the word. So how well could a machine-based translation system translate this passage from a business report about an office computer networking system?


The most serious problems were mechanical. Trials of the system were dogged by breakdowns. In the main unit, for example, the central core was fastened to the main frame by a plastic dog. The dog constantly came loose from the central core and the system stopped operating. After doggedly testing and repairing the system for a month, mechanical engineers described it as a ‘dog’.


Of course this is a made-up (another idiom!) text, but it shows how a machine-based translation system would struggle to find suitable words for these four uses of ‘dog’ and its derivatives.


And now a real translation! In the original Chinese this web page paragraph made sense but this is what Yahoo’s online translation system produced as the English version:

The company introduces United the peaceful abundant stationery industry Limited company to establish in 1997, was a fair stationery development, the production in a body's specialized company. The company since was established, gathered one group of professionals, introduces Taiwan most advanced complete set production equipment, specialized manufacture high quality folder, material book, organ package, capital feed bag and so on several series more than 200 variety work stationery products.


Apart from all of the other problems, the computer did not recognize that ‘abundant peace’ is the name of the company! The other difficulties include vocabulary such as ‘capital feed bag’, ‘set production’ and ‘fair stationery’; grammar, such as ‘to establish’ rather than ‘was established’; and non-English sentence structures. This is not the language you want splashed all over your expensive, state-of-the-art new website or colourful, glossy brochure.


The alternative is to contract a professional, warm-bodied human to translate your commercial documents and website. But wait! Before you rush out and Google ‘professional human translators’, beware: contracting human translators opens up another can of worms! 'Can of worms'!? Translate that! Into Dutch, no problem: ‘kan van wormen’. And Italian? ‘Latta delle viti senza fine’. And in English that is? 'Latta of the lives without end'. OK, point made.


Wednesday, 21 July 2010

HOW COMMON?


English is a mixture of four main languages (German, French, Latin and Danish), so it has an enormous vocabulary. No-one can say for sure exactly how big the English vocabulary is, but there are around 250 000 distinct English words. Most educated native users of English know about 50 000 words, but for everyday communication English speakers use only about 2 500 words.

So some words are used a lot, and many others are not used much and may not be known by most users of English. In writing for websites, or in any promotional writing, the language must be clear and easily understood by the target audience. But how do you know which words are common or used often and which are rare, uncommon, unfamiliar? (See how many words express the same idea!) A measure of how often words are used across the language is a 'word frequency ranking'. (Note that 'word frequency' can also refer to how often words are used in a single text.)
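The second sense of 'word frequency', how often each word appears in one text, is easy to check for yourself. The following is a minimal Python sketch, not the method used by any of the tools mentioned here; the function name `word_frequencies` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word occurs in a text, ignoring upper/lower case."""
    # Treat runs of letters (and apostrophes, as in "don't") as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Hypothetical sample text, for illustration only.
sample = "The dog came loose and the system stopped. The dog was a problem."
freq = word_frequencies(sample)

# Show the three most frequent words with their counts.
for word, count in freq.most_common(3):
    print(word, count)
```

Note that this only counts words inside one text. It says nothing about how common a word is in English as a whole, which is what a ranking tool tells you.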

There are many websites that provide word frequency rankings, but some of these are quite technical. Two of the easiest to use are the Wiktionary frequency lists, which show the most common words based on Project Gutenberg*, and Wordcount (http://wordcount.org/main.php), which is a quick way to check the frequency of single words. To use Wordcount, type the word into the 'find word' box and then click the arrow. The word will appear with its frequency ranking.


Word frequency is a huge topic but, for business writing in English, or any kind of writing for that matter, knowing a little about word frequency and word ranking helps writers to use words that their readers will understand, and to avoid sentences like this (from a travel brochure): 'The style trades on a melding of familiar lines and the panache of vibrant tropical colours and appeal'. This is almost impossible to understand. Some of the words are not used correctly, but many of the words are simply very low frequency; that is, they are not used often by native users of English. According to the Wordcount ranking system, 'vibrant' is ranked 1 424th. Most native users of English will know and understand this word. 'Panache', at 26 906th, may be known by educated users of English. 'Melding', however, is ranked 60 940th! Very few native users of English are likely to know this word. The phrasal verb 'trades on' is also low frequency.

Many of the people you want to read your material are, of course, not native users of English. They are people who use English as a second language. They have to read in English because about 80% of the published information in the world is in English. Non-native users of English do not, of course, have as big a vocabulary as native users. This is yet another reason to use familiar, high frequency words in your writing.

Does this mean that your writing will be characterless, lifeless, humdrum, stale, tedious? No, because English has such a rich vocabulary to choose from. I used five synonyms for 'boring'. The words 'boring', 'stale', 'lifeless' and 'tedious' all fall within the 20 000 word range and are likely to be known by almost all native users of English. 'Humdrum' and 'characterless' are lower frequency, 33 596th and 51 026th respectively.

Another tool that can help you create clear, easily read text is Textalyser (http://textalyser.net/). This tool analyses a paragraph, or even a whole text or website, for word frequency. Textalyser also reports a range of other features of the text, such as how often individual words are used and the average number of syllables per word. Although this information is useful, what you are really looking for is a 'readability' score. Without going into detail, readability is simply how easy or difficult a text is to read. Textalyser gives two readability scores. The first one, the Gunning Fog Index, is probably the most accurate and easy to understand. The range is basically from 1 to 20, with 1-6 being 'easy' and 20 or more 'hard'. I used Textalyser to check the first two paragraphs of this article and got a score of 10.2, i.e. about the middle of the range, which is about the level of readability that you should aim for.
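For readers who want to see what lies behind a Gunning Fog score, here is a rough Python sketch of the standard formula: 0.4 x (average words per sentence + percentage of words with three or more syllables). The syllable counter is a crude assumption (it counts groups of vowels), and Textalyser's own counting rules are not known, so this will not reproduce Textalyser's numbers exactly. The function names are illustrative only.

```python
import re

def count_syllables(word):
    """Rough estimate: each group of consecutive vowels counts as one syllable."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def gunning_fog(text):
    """Approximate Gunning Fog Index of a text (higher means harder to read)."""
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # 'Complex' words are those estimated to have three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    average_sentence_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (average_sentence_length + percent_complex)

print(round(gunning_fog("The cat sat. The dog ran."), 1))
```

Short sentences made of short words give a low score; long sentences full of long words push the score up.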

So, armed with a thesaurus, a word frequency ranking tool and a text analyser, you can create text that is rich and interesting, but easily read and understood by your target audience.

References:
Wordcount - http://wordcount.org/main.php

Wiktionary - http://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

Textalyser - http://textalyser.net/
Note: The results of the Textalyser analysis are quite academic and some of them may not be very reliable. However, it does give a general idea of how easy it is for your readers to understand your writing.


*Project Gutenberg - an ongoing project to create a library of e-books. To date over 30 000 books have been digitised for online reading (see http://www.gutenberg.org/wiki/Main_Page).


Wednesday, 14 July 2010

GOOD BAD AND UGLY

Good written English, on websites, in brochures, in emails, in documents such as contracts and proposals, is clear, accurate and to the point. The reader has no problems reading and understanding the text.

Ugly writing
Ugly English uses fancy words and/or long, difficult sentence structures. Readers need a map and a dictionary to find their way around the text! A good example of ugly English is this from a travel brochure:
'As you stumble upon the exquisite little offerings left all over the island that materialise as if by magic, you'll see that their tiny tapestry of colours and textures is a metaphor of Bali itself.'
Did you keep going until the end of the sentence or did you give up? The 'exquisite little offerings' come out of nowhere. There is no explanation in the previous sentences. And what 'materialises': the island? The offerings? The sentence makes the reader struggle. Readers don't want to struggle. They just want the information.

When this sentence was analysed for readability*, it scored 21.6 points. In other words, it is difficult. I tried to rewrite the text:
'People leave offerings to their gods all over the island. You find these suddenly in many places. The colours, feel and look of these offerings remind you of the variety and richness of Bali itself.'
It's not great, but the readability of this text is 57.5 points, i.e. 'quite easy'.
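The article does not say which readability scale produced 21.6 and 57.5. A scale on which a higher score means easier reading is consistent with the Flesch Reading Ease formula, so the Python sketch below assumes that scale. The formula is 206.835 - 1.015 x (words per sentence) - 84.6 x (syllables per word). The syllable counter is a crude vowel-group estimate, so the results will differ from those of any particular online tool, and the function names are illustrative only.

```python
import re

def count_syllables(word):
    """Rough estimate: each group of consecutive vowels counts as one syllable."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text):
    """Approximate Flesch Reading Ease of a text (higher means easier to read)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

original = ("As you stumble upon the exquisite little offerings left all over "
            "the island that materialise as if by magic, you'll see that their "
            "tiny tapestry of colours and textures is a metaphor of Bali itself.")
rewrite = ("People leave offerings to their gods all over the island. "
           "You find these suddenly in many places.")

# The shorter sentences and shorter words of the rewrite should score higher.
print(flesch_reading_ease(original) < flesch_reading_ease(rewrite))
```

Splitting one long sentence into several short ones is usually the quickest way to raise a score like this.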

Word frequency and readability
There are about 250 000 distinct words in English (it’s impossible to say exactly how many), but we only use about 10% of these words regularly. 'Word frequency' is how often words are used. So, for example, 'that' is ranked the 8th most common word in English. 'Materialise', on the other hand, is ranked 20 176th! There's a good chance a native English speaker would not know this word, and a second language English speaker would almost certainly not know 'materialise'. When you choose words to write, always keep the audience in mind and use words that they are likely to know.

Bad writing
Writing may be 'bad' because it is inaccurate: the spelling, punctuation or grammar is poor, words are used wrongly, or words are left out of sentences. For example:
'We are cycling in a farm road through the pineapple and rice fields at 36 kilometres.'
We can guess that 'in' is meant to be 'on', and probably the pineapples and rice are not growing in the same field. But what about 'at 36 kilometres'? Is it 'at 36 kilometres per hour'? Or 'for 36 kilometres'? Bad writing confuses readers. Also, when the writing is not accurate, the reader forms a low opinion of the company because the company hasn't bothered to proofread the material before publishing it. Sometimes the writing is so bad that it is impossible to even guess what it means. For example:
'Real legends coming from the Tunnel are over human imaginativeness.'
When readers are faced with a sentence like this, they look for a rubbish bin, or for their mouse to click to a new website.

Translation
The sentence 'Real legends coming from the Tunnel are over human imaginativeness' is probably the work of a writer translating word for word from another language into English. Transposing words from one language to another doesn't work! Different languages usually don't work the same way.

City authorities in non-English-speaking countries have often tried unsuccessfully to translate information signs into English. One famous example is this in a Shanghai Metro station: 'After first under on, do riding with civility'. Apparently it means something like: Be polite. Let other passengers off before boarding (the train). But, of course, to a user of English it is meaningless. It is a direct translation using the Chinese words and sentence structure. Ideas can be translated, but usually not words.

So, when you write in English, always remember your audience. Before you publish, check
• will the audience be able to read and understand the words easily?
• is the writing accurate? (spelling, punctuation, grammar, word order)
• is this real English or a direct translation from my language?

* Readability simply means how easy or difficult a text is to read. It is based on vocabulary, sentence structure and accuracy (spelling, punctuation etc.)

OOPS! PROOFREADING DOES COUNT

The need for careful proofreading hit the headlines recently. The publisher Penguin Group Australia had to spend $18 000 reprinting 9 000 copies of a cook-book. The reason? One, yes, one single word was wrong!

Many recipes in the Italian cook-book included the instruction to add "salt and freshly ground black pepper", but on one page, a recipe for tagliatelle, a pasta dish, the recipe read "salt and freshly ground black people."

Some readers complained to the company about the use of the words 'ground' and 'black people'; they complained that these words made the book sound racist. The company had to withdraw the book from sale and send the withdrawn books to be pulped, that is, turned into paper porridge!


The publishers called the error a 'silly mistake' and it seems to have been a simple, honest error. However, the audience for any published material is often quick to take offence or make judgments about errors.


And this week more books have had to be pulped because they had not been proofread. The publishers of American novelist Jonathan Franzen's new novel had printed and started to sell the book when it was discovered that a draft of the book had been printed. There were only about 50 errors in the published book, but that was enough for the publisher to withdraw the book and turn the 80 000 copies into paper porridge!


These events highlight the need for careful proofreading. It can save a lot of pain, and money!

THE CITY THAT ISN’T

English is international. People know English. Sometimes they think they know it too well. That's when non-native users of English can make mistakes. Big mistakes!

For example, there were the people hired by the Jerusalem city administration to give the city a much needed publicity boost. In 2006 the city hosted a music and culture show. The administration decided this was a good opportunity to put the city back on the international tourist map. Not an easy job.

Jerusalem is an important city for the followers of three of the world's major religions: Christians, Muslims and, of course, Jews. It has been the focus of conflict for eons.

To overcome the city's poor reputation the city administration hired a public relations company to design and write a brochure advertising the cultural event. But more importantly, the PR company was set the task of promoting Jerusalem as a tourist destination to an international audience.

The PR company was asked to come up with a snappy, catchy slogan for the brochure. They did, but not in the way they intended.

Idiom is difficult to master in any language and English is trickier than most. If you want to say that something or somebody is very special, unique, you can say 'there is none better' i.e. it is the best. Or you can say 'one of a kind', or 'There is no city like it'. This was the idiomatic phrase the PR company writers were probably thinking about when they came up with their memorable slogan: 'Jerusalem: There is no such city!' Thousands of glossy brochures had to be recalled and angry Jews, Christians and Muslims placated.

The city could, of course, have saved much embarrassment and cost if they had given the brochure to a native user of English to proofread before it was published!

READING BLUES

You've spent hours getting the words right. Then editing. Suddenly the words begin to sparkle. The words say it all clearly and concisely. Web users quickly scanning the website page will pick up the message effortlessly. Your blog is just about ready for publication. And then some designer techie guy suggests putting a blue, wavy wash over the whole website. "Then it'll stand out!" Which it does. But the readers feel as if they are drowning in wavy, blue water!

When readers have to struggle to read text they usually give up. Sometimes looking good and readability are not compatible. In which case, the look has to give way to the text.
Our eyes are used to reading black text against a white background. Of course we can also read the reverse combination, but white text against black soon tires the eyes. Any other combination of text and colour has to be thought about carefully. A light pastel shaded background can work well as long as there is no interference from background patterns. Again, eyes tire quickly when they have to distinguish coded symbols (letters and words) from meaningless patterns. How many times is a good children's illustrated book spoiled by having text wandering into pictures? The children can't read it, and adults reading aloud struggle to disentangle the letters from the roots of trees or swirling smoke. The same applies to magazine articles where the text crawls into illustrations and the reader's eyes have to switch from reading black on white to reading white on black or, even worse, some other combination of colours such as red on green!

The words come first; looks second. Enhance the look of pages by whatever means designers and illustrators can create, but never sacrifice the word for the look. And never, never write anything in yellow!