TRADUZIR INGLÊS
cadufranco -
8/12/2019 TRADUZIR INGLS
1/5
The OEC: Facts about the language
The 20-volume historical Oxford English Dictionary is the largest record of words used in English, past
and present. It contains words that are now obsolete or rare (such as xenagogue 'a person who guides
strangers' and vicine 'neighbouring or adjacent') in addition to the latest coinages such
as phishing and podcast.
The second edition of the OED, published in 1989 and consisting of twenty volumes, contains more than
615,000 entries, and the third, available online, is expanding all the time, with batches of 2,500 new and
revised words and phrases being added in regular quarterly updates.
How many words are there in English?
It is a question often asked, but not so easily answered. Even the OED does not set out to include every
specialized technical term or slang or dialect expression ever used. New words are constantly being
invented, developed from existing words, or adopted from other languages. Most will be used rarely, or
only by a small group of people. This means that an unlimited number of words may occur in speech and
writing which will never be recorded in even the largest dictionary.
Furthermore, what exactly is a word? Clearly we should include single units such as cat and dog. But are
the plurals cats and dogs separate words? Should we include compounds such as walking stick, which are
made up of two existing words? There are an almost unlimited number of such two-word compounds,
which can't all be included in a dictionary. And what about abbreviations like BBC and Dr, or proper
names such as London, Nelson, and Harry Potter: are they words? As you can see, the question is not a
straightforward one.
How many words do we use?
Although it may be impossible to know the number of words in English, the Oxford English Corpus can
help us assess the number of words in current use.
Instead of talking about words, it's more useful in this context to talk about lemmas, a lemma being the
base form of a word. For example, climbs, climbing, and climbed are all examples of the one lemma climb.
Just ten different lemmas (the, be, to, of, and, a, in, that, have, and I) account for a remarkable 25% of all
the words used in the Oxford English Corpus. If you were to read through the corpus, one word in four
(ignoring proper names) would be an example of one of these ten lemmas. Similarly, the 100 most
common lemmas account for 50% of the corpus, and the 1,000 most common lemmas account for 75%.
But to account for 90% of the corpus you would need a vocabulary of 7,000 lemmas, and to get to 95% the
figure would be around 50,000 lemmas.
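The idea of counting lemmas rather than surface forms can be sketched in a few lines of Python. This is only an illustration: the `LEMMA_OF` table and the `count_lemmas` helper are hypothetical names invented here, and a real corpus would rely on a full lemmatizer rather than a hand-written lookup.

```python
from collections import Counter

# Hypothetical lookup from inflected form to lemma (an assumption for this
# sketch); a real corpus would use a complete morphological lexicon.
LEMMA_OF = {
    "climbs": "climb",
    "climbing": "climb",
    "climbed": "climb",
    "is": "be",
    "was": "be",
}

def count_lemmas(tokens):
    """Count tokens by lemma, falling back to the lower-cased token itself."""
    counts = Counter()
    for token in tokens:
        word = token.lower()
        counts[LEMMA_OF.get(word, word)] += 1
    return counts

tokens = "She climbs while he climbed and they were climbing".split()
lemma_counts = count_lemmas(tokens)
print(lemma_counts["climb"])  # 3
```

All three inflected forms are tallied under the single lemma climb, which is the unit the percentages above refer to.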
The remaining 5% of the corpus consists of a very large number of lemmas which occur rarely: words
like moidore or parados, which may occur only once every several million words. Like all natural
languages, English consists of a small number of very common words, a larger number of intermediate
ones, and then an indefinitely long 'tail' of very rare terms.
Vocabulary size (no. lemmas)   % of content in OEC   Example lemmas
10                             25%                   the, of, and, to, that, have
100                            50%                   from, because, go, me, our, well, way
1000                           75%                   girl, win, decide, huge, difficult, series
7000                           90%                   tackle, peak, crude, purely, dude, modest
50,000                         95%                   saboteur, autocracy, calyx, conformist
>1,000,000                     99%                   laggardly, endobenthic, pomological
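Figures of this kind can in principle be derived from a lemma frequency list. The sketch below is a minimal illustration, not the method used for the OEC: the function names `coverage` and `vocabulary_for` are hypothetical, and the counts are toy numbers invented for the example.

```python
from collections import Counter

def coverage(counts, n):
    """Fraction of all tokens accounted for by the n most frequent lemmas."""
    total = sum(counts.values())
    top = sum(freq for _, freq in counts.most_common(n))
    return top / total

def vocabulary_for(counts, target):
    """Smallest number of top-ranked lemmas whose coverage reaches target."""
    total = sum(counts.values())
    running = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        running += freq
        if running / total >= target:
            return rank
    return len(counts)

# Toy counts, invented for illustration only (not OEC data).
toy = Counter({"the": 50, "be": 25, "to": 15, "laggardly": 6, "pomological": 4})
print(coverage(toy, 1))          # 0.5
print(vocabulary_for(toy, 0.8))  # 3
```

In the toy data a single lemma covers half the tokens and three lemmas pass 80%, while the last few percent require the rare items, which mirrors the shape of the table on a very small scale.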
The long tail means that to account for 99% of the Oxford English Corpus you would need a vocabulary of
more than a million lemmas. This would include some words which may occur only once or twice in the
whole corpus: highly technical terms like chondrogenesis or dicarboxylate, and one-off coinages
like bootlickingly or unsurfworthy that people would probably understand but would be unlikely to use.
If we decide that around 90-95% of the corpus gives a reasonable idea of an average vocabulary, we are left
with a figure somewhere in the range of 7,000-50,000 lemmas: say, 25,000. What does a vocabulary of
this size represent? It represents the set of most significant words in English: those which occur reasonably
frequently and which account for all but a small part of everything we may encounter in speech or writing.
It includes all the words that we actively use in general everyday life.
It's interesting to note that most reasonably sized dictionaries contain significantly more than 25,000
lemmas. The 11th edition of the Concise Oxford English Dictionary, for example, lists more than 75,000
single-word lemmas, which means that the majority of its entries must belong to the long tail of extremely
rare words. This makes good sense: such terms occur very infrequently, but when they do they are likely to
be crucial to what's being said, and the reader might well want to look them up. The idea of a quantifiable
vocabulary should be seen in this light: the words we ignore for the purposes of the exercise may be very
rare, but in context they may be very important.
What is the commonest word?
Based on the evidence of the Oxford English Corpus, which currently contains over 2 billion words, the 100
commonest English words found in writing around the world are as follows:
1 the
2 be
3 to
4 of
5 and
6 a
7 in
8 that
9 have
10 I
11 it
12 for
13 not
14 on
15 with
16 he
17 as
18 you
19 do
20 at
21 this
22 but
23 his
24 by
25 from
26 they
27 we
28 say
29 her
30 she
31 or
32 an
33 will
34 my
35 one
36 all
37 would
38 there
39 their
40 what
41 so
42 up
43 out
44 if
45 about
46 who
47 get
48 which
49 go
50 me
51 when
52 make
53 can
54 like
55 time
56 no
57 just
58 him
59 know
60 take
61 people
62 into
63 year
64 your
65 good
66 some
67 could
68 them
69 see
70 other
71 than
72 then
73 now
74 look
75 only
76 come
77 its
78 over
79 think
80 also
81 back
82 after
83 use
84 two
85 how
86 our
87 work
88 first
89 well
90 way
91 even
92 new
93 want
94 because
95 any
96 these
97 give
98 day
99 most
100 us
It's noticeable that many of the most frequently used words are short ones whose main purpose is to join
other, longer words rather than determine the meaning of a sentence. These are known as 'function words'.
It could be said that it's more interesting to explore the frequency of 'content words', as shown in the list
below:
Rank   Nouns        Verbs    Adjectives
1      time         be       good
2      person       have     new
3      year         do       first
4      way          say      last
5      day          get      long
6      thing        make     great
7      man          go       little
8      world        know     own
9      life         take     other
10     hand         see      old
11     part         come     right
12     child        think    big
13     eye          look     high
14     woman        want     different
15     place        give     small
16     work         use      large
17     week         find     next
18     case         tell     early
19     point        ask      young
20     government   work     important
21     company      seem     few
22     number       feel     public
23     group        try      bad
24     problem      leave    same
25     fact         call     able
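One rough way to separate content words from function words in a ranked list is to filter against a stop list. The sketch below assumes a small hand-picked `FUNCTION_WORDS` set and a hypothetical `content_words` helper. It is not how the lists above were compiled, since those are also split by part of speech.

```python
# Small hand-picked set of function words, chosen for illustration only;
# a practical stop list would be much longer.
FUNCTION_WORDS = {"the", "to", "of", "and", "a", "in", "that", "it", "for", "not"}

def content_words(ranked):
    """Keep the ranked order but drop anything listed in FUNCTION_WORDS."""
    return [word for word in ranked if word not in FUNCTION_WORDS]

# Hypothetical ranked list mixing both kinds of word.
ranked = ["the", "to", "of", "and", "time", "a", "person", "in", "year"]
print(content_words(ranked))  # ['time', 'person', 'year']
```

The filter preserves relative rank, so the first surviving items are the most frequent content words in whatever list is supplied.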
Nouns
The commonest nouns are time, person, and year, followed by way and day (month is 40th). The majority
of the top 25 nouns (15) are from Old English, and of the remainder, most came into medieval English from
Old French, and before that from Latin. Notice that many of these words are very common because they
have more than one meaning: way and part, for example, are listed in the Concise OED as having 18 and
16 different meanings respectively. They often also form part of common phrases: some of the frequency
of time, for example, comes from its use in adverbial phrases like on time, in time, last time, next time, this
time, etc.
Verbs
As you would expect, the commonest verbs express basic concepts. Strikingly, the 25 most frequent verbs
are all one-syllable words; the first two-syllable verbs are become (26th) and include (27th). Of these 25,
20 are Old English words, and three more, get, seem, and want, entered English from Old Norse in the
early medieval period. Only try and use came from Old French. It seems that English prefers terse, ancient
words to describe actions or occurrences.
Adjectives
Again, most of the top adjectives are one-syllable words, and 17 out of 25 derive from Old English:
only different, large, and important are from Latin. In terms of the words' meanings, great is higher in the
ranking than big, probably because of its informal sense 'very good'. Little is surprisingly high at 7, as
compared with small at 15. Bad is unexpectedly low at 23: is this because we have such a large choice of
synonyms available for expressing 'bad things'?