TRADUZIR INGLÊS


  • 8/12/2019 TRADUZIR INGLÊS

    1/5

    The OEC: Facts about the language

    The 20-volume historical Oxford English Dictionary is the largest record of words used in English,
    past and present. It contains words that are now obsolete or rare (such as xenagogue 'a person who
    guides strangers' and vicine 'neighbouring or adjacent') in addition to the latest coinages such as
    phishing and podcast.

    The second edition of the OED, published in 1989 and consisting of twenty volumes, contains more
    than 615,000 entries, and the third, available online, is expanding all the time, with batches of
    2,500 new and revised words and phrases being added in regular quarterly updates.

    How many words are there in English?

    It is a question often asked, but not so easily answered. Even the OED does not set out to include
    every specialized technical term or slang or dialect expression ever used. New words are constantly
    being invented, developed from existing words, or adopted from other languages. Most will be used
    rarely, or only by a small group of people. This means that an unlimited number of words may occur
    in speech and writing which will never be recorded in even the largest dictionary.

    Furthermore, what exactly is a word? Clearly we should include single units such as cat and dog.
    But are the plurals cats and dogs separate words? Should we include compounds such as walking
    stick, which are made up of two existing words? There are an almost unlimited number of such
    two-word compounds, which can't all be included in a dictionary. And what about abbreviations like
    BBC and Dr, or proper names such as London, Nelson, and Harry Potter: are they words? As you can
    see, the question is not a straightforward one.

    How many words do we use?

    Although it may be impossible to know the number of words in English, the Oxford English Corpus
    can help us assess the number of words in current use.

    Instead of talking about words, it's more useful in this context to talk about lemmas, a lemma
    being the base form of a word. For example, climbs, climbing, and climbed are all examples of the
    one lemma climb. Just ten different lemmas (the, be, to, of, and, a, in, that, have, and I) account
    for a remarkable 25% of all the words used in the Oxford English Corpus. If you were to read
    through the corpus, one word in four (ignoring proper names) would be an example of one of these
    ten lemmas. Similarly, the 100 most common lemmas account for 50% of the corpus, and the 1,000 most
    common lemmas account for 75%. But to account for 90% of the corpus you would need a vocabulary of
    7,000 lemmas, and to get to 95% the figure would be around 50,000 lemmas.

    The remaining 5% of the corpus consists of a very large number of lemmas which occur rarely: words
    like moidore or parados, which may occur only once every several million words. Like all natural
    languages, English consists of a small number of very common words, a larger number of
    intermediate ones, and then an indefinitely long 'tail' of very rare terms.
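
    The idea of counting lemmas rather than word forms can be sketched in a few lines of Python. This
    is only an illustration: the LEMMA_OF lookup and the sample sentence are invented for the example,
    and a real corpus such as the OEC would rely on a full lemmatizer rather than a hand-made table.

```python
from collections import Counter

# Hypothetical hand-made lookup from inflected forms to lemmas.
# Assumption for illustration only; not how the OEC is actually lemmatized.
LEMMA_OF = {
    "climbs": "climb", "climbing": "climb", "climbed": "climb",
    "cats": "cat", "dogs": "dog",
    "is": "be", "was": "be", "are": "be",
}

def lemma_counts(tokens):
    """Count occurrences per lemma, falling back to the lowercased token."""
    counts = Counter()
    for token in tokens:
        word = token.lower()
        counts[LEMMA_OF.get(word, word)] += 1
    return counts

tokens = "The cat climbs and the dogs climbed while the cats are climbing".split()
counts = lemma_counts(tokens)

# Lemmas seen only once form the start of the 'long tail'.
singletons = sorted(lemma for lemma, freq in counts.items() if freq == 1)
print(counts["climb"], counts["cat"], singletons)
```

    Here climbs, climbed, and climbing all count towards the single lemma climb, while lemmas that
    occur once in the sample play the role of the rare terms in the tail.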

    Vocabulary size (no. lemmas)   % of content in OEC   Example lemmas
    10                             25%                   the, of, and, to, that, have
    100                            50%                   from, because, go, me, our, well, way
    1000                           75%                   girl, win, decide, huge, difficult, series
    7000                           90%                   tackle, peak, crude, purely, dude, modest
    50,000                         95%                   saboteur, autocracy, calyx, conformist
    >1,000,000                     99%                   laggardly, endobenthic, pomological
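
    Coverage figures of this kind come from a cumulative sum over lemmas ranked by frequency. The
    sketch below shows the arithmetic under stated assumptions: the function names coverage and
    lemmas_needed are hypothetical, and the toy frequencies are invented for illustration and are not
    OEC data.

```python
from collections import Counter

def coverage(counts, top_n):
    """Fraction of all tokens accounted for by the top_n most frequent lemmas."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(freq for _, freq in counts.most_common(top_n))
    return top / total

def lemmas_needed(counts, target):
    """Smallest number of top-ranked lemmas whose combined share reaches target."""
    total = sum(counts.values())
    running = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        running += freq
        if running / total >= target:
            return rank
    return len(counts)

# Toy frequencies, invented for illustration only (not OEC data).
toy = Counter({"the": 50, "be": 25, "climb": 10, "cat": 10,
               "calyx": 4, "pomological": 1})

print(coverage(toy, 1))         # 0.5
print(coverage(toy, 2))         # 0.75
print(lemmas_needed(toy, 0.9))  # 4
```

    Even in this tiny example the pattern in the table shows up: one lemma covers half the tokens,
    while each further gain in coverage needs more and more lemmas.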

    The long tail means that to account for 99% of the Oxford English Corpus you would need a
    vocabulary of more than a million lemmas. This would include some words which may occur only once
    or twice in the whole corpus: highly technical terms like chondrogenesis or dicarboxylate, and
    one-off coinages like bootlickingly or unsurfworthy that people would probably understand but would
    be unlikely to use.

    If we decide that around 90-95% of the corpus gives a reasonable idea of an average vocabulary, we
    are left with a figure somewhere in the range of 7,000-50,000 lemmas: say, 25,000. What does a
    vocabulary of this size represent? It represents the set of most significant words in English:
    those which occur reasonably frequently and which account for all but a small part of everything
    we may encounter in speech or writing. It includes all the words that we actively use in general
    everyday life.

    It's interesting to note that most reasonably sized dictionaries contain significantly more than
    25,000 lemmas. The 11th edition of the Concise Oxford English Dictionary, for example, lists more
    than 75,000 single-word lemmas, which means that the majority of its entries must belong to the
    long tail of extremely rare words. This makes good sense: such terms occur very infrequently, but
    when they do they are likely to be crucial to what's being said, and the reader might well want to
    look them up. The idea of a quantifiable vocabulary should be seen in this light: the words we
    ignore for the purposes of the exercise may be very rare, but in context they may be very
    important.

    What is the commonest word?

    Based on the evidence of the Oxford English Corpus, which currently contains over 2 billion words,
    the 100 commonest English words found in writing around the world are as follows:

    1 the
    2 be
    3 to
    4 of
    5 and
    6 a
    7 in
    8 that
    9 have
    10 I
    11 it
    12 for
    13 not
    14 on
    15 with
    16 he
    17 as
    18 you
    19 do
    20 at
    21 this
    22 but
    23 his
    24 by
    25 from
    26 they
    27 we
    28 say
    29 her
    30 she
    31 or
    32 an
    33 will
    34 my
    35 one
    36 all
    37 would
    38 there
    39 their
    40 what
    41 so
    42 up
    43 out
    44 if
    45 about
    46 who
    47 get
    48 which
    49 go
    50 me
    51 when
    52 make
    53 can
    54 like
    55 time
    56 no
    57 just
    58 him
    59 know
    60 take
    61 people
    62 into
    63 year
    64 your
    65 good
    66 some
    67 could
    68 them
    69 see
    70 other
    71 than
    72 then
    73 now
    74 look
    75 only
    76 come
    77 its
    78 over
    79 think
    80 also
    81 back
    82 after
    83 use
    84 two
    85 how
    86 our
    87 work
    88 first
    89 well
    90 way
    91 even
    92 new
    93 want
    94 because
    95 any
    96 these
    97 give
    98 day
    99 most
    100 us

    It's noticeable that many of the most frequently used words are short ones whose main purpose is
    to join other, longer words rather than determine the meaning of a sentence. These are known as
    'function words'. It could be said that it's more interesting to explore the frequency of 'content
    words', as shown in the list below:

    Rank  Nouns       Verbs   Adjectives
    1     time        be      good
    2     person      have    new
    3     year        do      first
    4     way         say     last
    5     day         get     long
    6     thing       make    great
    7     man         go      little
    8     world       know    own
    9     life        take    other
    10    hand        see     old
    11    part        come    right
    12    child       think   big
    13    eye         look    high
    14    woman       want    different
    15    place       give    small
    16    work        use     large
    17    week        find    next
    18    case        tell    early
    19    point       ask     young
    20    government  work    important
    21    company     seem    few
    22    number      feel    public
    23    group       try     bad
    24    problem     leave   same
    25    fact        call    able
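
    The distinction between function words and content words drawn earlier in this section can be
    illustrated with a simple filter. This is a minimal sketch, assuming a small hand-picked set of
    function words taken from the top of the overall ranking; the name FUNCTION_WORDS and the function
    content_words are hypothetical, and a realistic function-word list would be far longer.

```python
# Small illustrative set of function words (an assumption for this example).
# 'be' and 'have' are left out because the text ranks them among the verbs.
FUNCTION_WORDS = {
    "the", "to", "of", "and", "a", "in", "that", "i",
    "it", "for", "not", "on", "with",
}

def content_words(ranked_words):
    """Keep the words that are not in the function-word set, preserving order."""
    return [w for w in ranked_words if w.lower() not in FUNCTION_WORDS]

sample = ["the", "time", "of", "year", "and", "person"]
print(content_words(sample))  # ['time', 'year', 'person']
```

    Filtering a frequency ranking this way leaves the nouns, verbs, and adjectives that carry most of
    the meaning, which is what the three columns of content words list.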

    Nouns

    The commonest nouns are time, person, and year, followed by way and day (month is 40th). The
    majority of the top 25 nouns (15) are from Old English, and of the remainder, most came into
    medieval English from Old French, and before that from Latin. Notice that many of these words are
    very common because they have more than one meaning: way and part, for example, are listed in the
    Concise OED as having 18 and 16 different meanings respectively. They often also form part of
    common phrases: some of the frequency of time, for example, comes from its use in adverbial
    phrases like on time, in time, last time, next time, this time, etc.

    Verbs

    As you would expect, the commonest verbs express basic concepts. Strikingly, the 25 most frequent
    verbs are all one-syllable words; the first two-syllable verbs are become (26th) and include
    (27th). Of these 25, 20 are Old English words, and three more, get, seem, and want, entered English
    from Old Norse in the early medieval period. Only try and use came from Old French. It seems that
    English prefers terse, ancient words to describe actions or occurrences.

    Adjectives

    Again, most of the top adjectives are one-syllable words, and 17 out of 25 derive from Old
    English: only different, large, and important are from Latin. In terms of the words' meanings,
    great is higher in the ranking than big, probably because of its informal sense 'very good'.
    Little is surprisingly high at 7, as compared with small at 15. Bad is unexpectedly low at 23: is
    this because we have such a large choice of synonyms available for expressing 'bad things'?