
ALLT2112: David Tugwell

**What is Sketch Engine? [#x1b0af01]

-David is the original author of Word Sketch, at the University of Brighton

-Corpora for 42 languages so far:
--The biggest is a Russian corpus of 20 billion words.
-65 preloaded corpora
--Balanced corpora: BNC
--Specialized corpora: CHILDES, BASE, BAWE
--Web corpora: de-duplicated, cleaned
---the "TenTen" range, named after the size 10^10 = 10 billion
-Chinese
--Chinese GigaWord, Chinese TaiwanWaC
-Load your own corpora
--automatic lemmatization, tagging, word sketches
-WebBootCat
--low-density languages & subject areas
--seed collection process
--results are cleaned and processed into a new corpus
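
The seed collection step can be sketched roughly as follows. This is only an illustration and not WebBootCat's actual code: it assumes the BootCaT-style approach of combining seed words into random tuples, each of which would be sent as one web search query. The function names (seed_tuples, to_queries) and the default sizes are hypothetical.

```python
import itertools
import random


def seed_tuples(seeds, tuple_size=3, max_tuples=10, rng=None):
    """Combine seed words into random tuples, one tuple per search query.

    Hypothetical helper: the tuple size and tuple count are illustrative
    defaults, not values taken from WebBootCat.
    """
    rng = rng or random.Random()
    # Normalise and de-duplicate the seed list.
    unique = sorted({w.strip().lower() for w in seeds if w.strip()})
    if len(unique) < tuple_size:
        raise ValueError("need at least tuple_size distinct seed words")
    # Enumerate all combinations (fine for a short seed list), then sample.
    combos = list(itertools.combinations(unique, tuple_size))
    return rng.sample(combos, min(max_tuples, len(combos)))


def to_queries(tuples):
    """Turn each tuple into a plain query string for a search engine."""
    return [" ".join(t) for t in tuples]


if __name__ == "__main__":
    seeds = ["corpus", "lemma", "tagger", "collocation", "concordance"]
    for query in to_queries(seed_tuples(seeds, rng=random.Random(0))):
        print(query)
```

The pages returned for each query would then be downloaded, cleaned, and de-duplicated before being loaded as a new corpus. Those steps are left out here because they depend on the search service and the cleaning tools used.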

**Use of SkE [#da605c08]
-Lexicography
--Collins, Macmillan, CUP, OUP, Le Robert, Cornelsen Verlag, Shogakukan, Instituut voor Nederlandse
-Language Learning

**WebBootCat [#p9446d22]

-To create a seed word list
--One option is to write the seed words in English and use Google Translate to translate them into the target language





