You can achieve amazing results if you
commit a lot of computing horsepower and countless man-hours of
software engineers, linguists, coders and consultants with hybrid
backgrounds, and supporting personnel. It is even possible to achieve
some semblance of 'recognising' linguistic and communicative
contexts. Still, on the technical side, what the machine does is parse the original text against an automated checklist and perhaps subsequently validate the results in some sort of a QA procedure (e.g. another
checklist, back-translation and so on). When we get to the bottom of
it, it comes down to two things: 1) instructions, 2) data.
All that a computer does is execute logical operations on data. Computers can appear to 'design' their
further operations on their own, or to 'learn', except that they
don't really have a creative or learning process. They still execute
a program which, as a result of working through a flow chart of
conditions and commands, generates another program with its own
set of conditions and commands. At the end of the day all of it rests
on some core instructions implanted by a human engineer. Any creative
spark came from that engineer.
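To put that in concrete terms, here is a deliberately tiny sketch (the sentence pairs and the procedure are invented for illustration) of what such 'learning' boils down to: a fixed, human-written routine that grinds new 'rules' out of whatever data it is fed.

```python
from collections import Counter

# The 'learning' here is nothing but a fixed, human-written procedure:
# for each source word, count which target words it co-occurs with and
# keep the most frequent one as a new 'rule'.

def derive_glossary(aligned_pairs):
    counts = {}
    for src, tgt in aligned_pairs:
        for s_word in src.lower().split():
            counts.setdefault(s_word, Counter()).update(tgt.lower().split())
    # The 'generated program': a lookup table of likeliest equivalents.
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

pairs = [
    ("the red wine", "le vin rouge"),
    ("the red car", "la voiture rouge"),
]
# Prints a crude, partly wrong glossary; with only co-occurrence counts
# to go on, 'the' ends up paired with 'rouge'.
print(derive_glossary(pairs))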
A machine does not 'know' anything which you haven't previously either: 1) handed to it outright as data, or 2) equipped it to generate as the product of its procedures. Even those machines which supposedly 'learn' still operate under these same natural constraints. Try subjecting language to mathematical analysis and you'll see how limiting it is when logical operations and input data are your sole guides to language. Or look at what happens when humans attempt to reduce the law to a set of syllogistic rules. Computers do 'think' much faster than humans, except they don't think but compute, or calculate, which is what computing means. On a conceptual level they are a glorified calculator. A tool. The rest is magical thinking on the part of their users and the lay public in general.
Thus, you can certainly take not just dictionaries but also grammar references and convert them into flow charts (algorithms) and ultimately program code. And data banks. You can certainly process the Bescherelle or Murphy in this fashion, and dictionaries have long since been transcribed at the expense of many, many hours of work, work which could have been used elsewhere, just like the money paid for that porting could have been spent in other ways.
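For instance, a single Bescherelle-style table turned into code might look like the sketch below. This is only an illustration assuming the standard textbook rule for regular French -er verbs in the present tense; real MT would need thousands of such tables, plus every exception.

```python
# One page of the Bescherelle as code: the rule for regular French
# -er verbs in the present tense.

ER_PRESENT_ENDINGS = {
    "je": "e", "tu": "es", "il": "e",
    "nous": "ons", "vous": "ez", "ils": "ent",
}

def conjugate_er(verb, person):
    """Conjugate a regular -er verb in the present tense."""
    if not verb.endswith("er"):
        raise ValueError("only regular -er verbs are covered here")
    stem = verb[:-2]                         # 'parler' -> 'parl'
    return stem + ER_PRESENT_ENDINGS[person]

print(conjugate_er("parler", "nous"))        # parlons
print(conjugate_er("aller", "je"))           # 'alle', but the right answer is 'vais':
                                             # 'aller' is irregular and needs its own table
```

The irregular verb at the end is the whole problem in miniature: every table you encode calls for yet more tables of exceptions.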
Perhaps it's worth noting here that machine translation may be cheaper than a human translator once you have it, but it takes resources to get to the point where a machine can even begin to offer a very imperfect alternative to human translation, and one that still cannot do without supervision. Many times that amount of time and those other resources, with geometrically diminishing returns the further you pursue perfection, would be required to get it to the point where it can pose a serious challenge in terms of quality.
Those were hours of either menial typing or inventive design work aimed at avoiding menial labour, and an engineer who designs ways of reducing the labour charges more for his services than a humbler labourer's wage, but in any case it takes man-hours and other resources. Whether you type, scan manually, construct a scanner with a conveyor belt and huge throughput, or improve the OCR software, it always comes down to having to commit the resources. MT doesn't grow on trees. Plus, apart from the cost of any investment that does eventually yield a profitable return, there is also the risk of a negative return when you get side-tracked.
So, with all that huge expenditure and risk, after being done with dictionaries and simpler grammar books you could then proceed to more advanced, scholarly works on grammar, syntax, punctuation and other aspects of language or translation. In practice that means more complicated flow charts for you and your machines, more nuanced rules, even rules for rule conflict resolution (meta rules), exceptions and overrides and what have you. All of it serves to avoid blunders (perhaps similarly to how a non-native speaker or writer tries to avoid them) and to simulate the 'recognition' of non-standard usage and common mistakes. A human would probably not be fooled easily by those, whereas a machine, being a strict prescriptivist by nature, as all it does is execute instructions, could totally misinterpret them by applying correct grammatical rules in an unbending fashion, failing to decode even the most obvious substance where the form fails to comply.
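In code, that layering of rules, exceptions and meta rules can be pictured roughly like this (the rules below are invented for the example; the only 'meta rule' here is that the more specific rule wins):

```python
# Toy rule stack: each rule is (priority, condition, note). When several
# rules match a word, the meta rule is simply that the most specific,
# highest-priority rule wins.

RULES = [
    (1, lambda w: w.endswith("s"),               "looks plural: check agreement"),
    (2, lambda w: w in {"news", "physics"},      "ends in -s but is singular"),
    (2, lambda w: w in {"sheep", "series"},      "same form in singular and plural"),
]

def analyse(word):
    matches = [(priority, note) for priority, cond, note in RULES if cond(word)]
    if not matches:
        return "no rule applies"
    # Meta rule: resolve the conflict in favour of the highest priority.
    return max(matches)[1]

print(analyse("cats"))   # looks plural: check agreement
print(analyse("news"))   # ends in -s but is singular (the exception overrides rule 1)
```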
Just as long as the rules of a given language or translation method or procedure are logical, and a reasonable, lucid explanation exists for them rather than a fuzzy, arbitrary and whimsical one, you should be all right most of the time (well, what if you need to be safe all of the time?). This is all the easier now that you have multi-core processors clocked in the 2–4 GHz range, working with memory in the teens of gigabytes and cheap terabytes of storage even in home computers. Still, even if the software hypothetically were to stop needing to be baby-sat by a human, the fire would still burn on the root spark of creativity left by a human engineer. Whole teams of such engineers, to be precise.
Back to language, though. A computer does not get to work with the benefit of the social, cultural and other experience which a human being starts accumulating even before birth and keeps accumulating until death. A computer can operate a whole laboratory to analyse samples in compliance with predefined procedures and in this way expand its data banks and fill its memory. It certainly can 'analyse' faster than a human, often more reliably, without suffering from exhaustion or a limited attention span, but intuition, the subconscious? Nope, it will have none of those.
You could make a computer simulate the thought process of a person with a specific personality, education, beliefs, mental disorders, but it will be… a simulation. Which is what it was supposed to be from the beginning. Bottom line: a machine, for all its tremendous computing power, still runs on 1) algorithms written by humans (or derived from such algorithms), and 2) data either filled in by humans outright or arrived at by the machine using instructions left by humans. There is no 'awakening' point at which a computer takes full possession of its enlightened self, previously dormant, i.e. taking a nap in its printed circuits.
Executing logical operations on data is actually not so far from how language works in a human brain – perhaps even more similar to a non-native speaker's reliance on interlanguage – but it's still just like Deep Blue's game of chess in the best cases. A chess program will typically beat its own creator at the game (due to the calculating advantage), but a computer still can't play chess unless you – the human – 'teach' it the rules, supply it with the data to crunch. You can even equip the machine with a checklist to analyse its records of games played, so that it will expand its repertoire with time and become better prepared to handle what human opponents can throw at it.
If the machine brute-forces its way
through the 'analysis' of billions of possibilities, validating
hypothetical choices with further analysis before committing –
something which a human brain could possibly require millions of
years to process — then yes, the machine can do marvellous things.
It can already translate some easy texts on par with human
translators, even translate them more reliably than a messy or
underqualified translator, but this is all the product of countless hours invested by humans. And there are still limitations anyway because, as we've already noted, a computer is a tool and not a master, a glorified calculator.
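For the technically curious, the skeleton of that 'analyse the possibilities, validate before committing' routine is a plain search, something like the minimax sketch below. It is game-agnostic and deliberately bare, and the legal_moves, apply and evaluate methods it calls are placeholders you would have to supply for an actual game.

```python
# A bare-bones minimax search: try every legal move, score the positions
# that result, keep the best line. Deep Blue's machinery was vastly more
# sophisticated, but this is the brute-force skeleton underneath.
# 'legal_moves', 'apply' and 'evaluate' are placeholders to be supplied
# by whatever game you plug in.

def minimax(state, depth, maximising):
    moves = state.legal_moves()
    if depth == 0 or not moves:
        return state.evaluate()                # static score of the position
    scores = [minimax(state.apply(m), depth - 1, not maximising) for m in moves]
    return max(scores) if maximising else min(scores)

def best_move(state, depth=3):
    """Pick the move whose resulting position scores best for our side."""
    return max(state.legal_moves(),
               key=lambda m: minimax(state.apply(m), depth - 1, False))
```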
To draw a different parallel, yes, you
could construct androids capable of typing on a keyboard. But it
would still be cheaper to hire even multiple typists; there are
obviously better ways of making two machines communicate, and you
don't need an android for what a bot can do anyway. And it doesn't
even pay to get a bot to do something which is not repeatable.
Research and development costs money, and the investment must bring a
sizeable return. Otherwise it's a waste and it doesn't pay. This
applies to machine translation as well. Even if there were people who
really wanted to put human translators out of business — as opposed
to making translation cheaper, which is another matter — those guys
wouldn't be working with unlimited resources; their creative staff, other workers and equipment would still cost them.
Back to the chess parallel again: language is not a board of 64 squares populated by 32 pieces and governed by a relatively short list of consistent and transparent rules with great potential for analysis. Chess was designed for that sort of nutcracking; language was not.
Now, as for the fallibility of human translators: it often results from thinking just like a computer 'would'. Their blunders are not unlike a human player fooling the AI in a war game by knowing its routines, for example peppering a line of infantry with arrows from archers to bait them into loosening their formation, so as to then break them up with a quick charge of cavalry in a wedge formation. When a human gets fooled into the wrong choice, it isn't worlds apart from tricking the AI's check: IF condition THEN command. A case-insensitive computer check works just the same as a translator who fails to recognise proper capital letters in a text printed all in upper case: he will not know the VAT on wine from a wine vat, at least not without parsing the text for clues elsewhere.
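To make the VAT/vat point literal, here is a throwaway sketch (the glossary entries are invented) of a case-insensitive lookup of the kind many naive checks use: once everything is folded to one case, the two senses collapse into a single entry.

```python
# A case-insensitive lookup of the kind many naive checks rely on.
# Folding everything to lower case merges the two glossary entries,
# so 'VAT' the tax and 'vat' the container can no longer be told apart.

GLOSSARY = {
    "vat": "large container for liquids",
    "VAT": "value added tax",
}

FOLDED = {key.lower(): sense for key, sense in GLOSSARY.items()}  # later entry wins

def lookup(term):
    return FOLDED.get(term.lower(), "unknown")

print(lookup("VAT"))   # value added tax
print(lookup("vat"))   # value added tax as well: the container sense is gone
```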
In fact, such checks are not unlike QA
procedures already used in the translation 'industry'. Human
proofreaders also effectively parse the texts entrusted to them for errors,
occasionally producing false positives or false negatives when
something pops up which isn't covered by their procedures or doesn't
occur in their data banks, for example when their knowledge of the
applicable rules is not complete or when they fail to execute the
call to a reference book, as a programmer would say.
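By way of a toy illustration only (the checks and the segments are invented), this is roughly what such a parse-against-a-checklist pass looks like, false positive included: the second example flags a 'number mismatch' even though the three was legitimately spelled out, because counting digits is all the check knows how to do.

```python
import re

# A toy QA pass: a fixed checklist applied to source/target pairs.

def qa_check(source, target):
    issues = []
    if re.findall(r"\d+", source) != re.findall(r"\d+", target):
        issues.append("number mismatch")
    if "  " in target:
        issues.append("double space in target")
    return issues

print(qa_check("Version 2 released.", "Version 2 veröffentlicht."))  # []
print(qa_check("3 copies enclosed.", "Three copies enclosed."))
# ['number mismatch'], a false positive: the digit was legitimately spelled out
```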
So, yeah, there are similarities. This
is definitely true. However, not unlike amateurs versus professionals
or junior professionals versus senior professionals (with individual
exceptions), computers are always one step behind humans. They don't have anything they didn't first receive from humans or produce with the procedures humans gave them. They can't fully simulate a human being. Unlike the gap between human students and teachers, the divide between humans and their computers is a fundamental one. Guns send bullets, but guns don't kill people.
For all the human frailty and computing
horsepower, MT will always need at least some PEMT-ing (post-editing
of machine translation), even if we give it thousands of years. Sure,
there will be fewer traditional human translators and more PEMTors.
Perhaps there will be a preference to use linguists and their skills
in MT design work and not in actual translation assignments, other
things being equal. Perhaps some companies interested in that kind of
thing will stubbornly overinvest in the pursuit of the goal of making
translators no longer needed, hoping for a huge long-term ROI in
spite of the continual loss in the short term…
… But no one in
his right mind is going to invest hundreds of hours of R&D in a
one-off bespoke job, and the ROI curve is nasty enough even on
somewhat repeatable or commonplace assignments in specialist
translation and copywriting. Ironically, we are simply cheaper than machine translation, as long as more than an appearance of translation, or a gist of the meaning, is sought. The only sad part is that machine translation plus light post-editing may become the standard for easier tasks in popular languages within a couple of decades, perhaps sooner, and obviously not all of us would welcome a role change or the need to work solely on the harder cases, with increased mental strain.