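[[FrontPage]]

**Seminar in Language Education 2014 (Thursday, 3rd period; first and second semesters) [#s66093f3]
#contents

***Classroom [#h9ccd2fd]
-The classroom has changed to Room 201 (from May 15).

***Course title (Japanese) [#p3f948d9]
--Corpus-based analysis of learner English and its application to teaching-materials development (J)

***Course objectives [#qb27a80b]
--To acquire the basic concepts of corpus linguistics and to gain an overview of one of its applications: corpus analysis of CEFR-aligned learner data.

***Course overview [#ef5fd5b5]
--Students learn the basic concepts of corpus linguistics by reading the textbook while also acquiring the basics of corpus processing.

***Course schedule [#aaf24698]
--Students build specialist knowledge from the textbook and become proficient with Sketch Engine, a corpus query system.
 Session 1 (4/10) Introduction: corpora and the development of language teaching materials
 Session 2 (4/17) Foundations of corpus linguistics (1): definition of a corpus and its historical development (CBLS Unit A1)
 Session 3 (5/1) Foundations of corpus linguistics (2): representativeness, balance, and sampling (CBLS Unit A2) + homework (Units A5 and A8 --> summary in English)
 Session 4 (5/15) Foundations of corpus linguistics (3): Markup & annotation (CBLS Unit A3 & A4) + homework (Unit A7 --> summary in English)
 Session 5 (5/29) Foundations of corpus linguistics (4): corpora and areas of application (CBLS Unit A10)
 Session 6 (6/5) Foundations of corpus linguistics (5): corpora and dictionaries (CBLS Unit C1) + basic use of BNCWeb and collocation statistics
 Session 7 (6/12) Foundations of corpus linguistics (6): corpora and grammar research (CBLS Unit C2) + hands-on search for help (to) using Sketch Engine
 Session 8 (6/19) Foundations of corpus linguistics (7): corpora and language acquisition research (CBLS Unit C3)
 Session 9 (6/26) Foundations of corpus linguistics (8): corpora and translation studies (CBLS Unit C6)
 Session 10 (7/3) Analysis of CEFR-aligned learner data (1): overview of the CEFR and English Profile
 Session 11 (7/10) Analysis of CEFR-aligned learner data (2): overview of learner data analysis in English Profile
 Session 12 Analysis of CEFR-aligned learner data (3): error tagging in practice
 Session 13 Analysis of CEFR-aligned learner data (4): overview of research on extracting criterial features
 Session 14 Analysis of CEFR-aligned learner data (5): types of learner corpora and basic processing (hands-on)
 Session 15 Wrap-up and test on corpus query processing

***Textbook [#d3f4c1c2]
|ISBN|0415286239|
|Title|Corpus-based language studies : an advanced resource book|
|Authors|McEnery, T., Xiao, R., & Tono, Y.|
|Publisher|Routledge|
|Year|2006|

***PDF [#sdd6219f]
Until the textbook arrives, please download and use the following:
-[[Chapter A1>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/CBLS-A1.pdf]]
-[[Chapter A2>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/CBLS-A2.pdf]]
-[[Chapter A3>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/CBLS-A3.pdf]]
-[[Chapter A4>http://www.corpus4u.org/forum/upload/forum/2005052212175880.pdf]]

**READING: [#a8ffc990]
***Discussion questions [#ha205653]
-Chapter 1:
--What is a corpus? Discuss some common features by comparing different definitions.
--Why use computers to study language?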
What is your intuitive answer to this? What other reasons did you find in the text?
--Discuss the use of corpora and the use of intuition. Are they mutually exclusive?
--Is corpus linguistics a methodology or a theory?
--How different are corpus-based vs. corpus-driven approaches? Can you think of any concrete examples?
-Chapter 2:
-2.2
--What is "representativeness"?
--What does Biber mean when he says "Representativeness refers to the extent to which a sample includes the full range of variability in a population"? (p.13)
--What are "internal" and "external" criteria used to select texts for a corpus? (p.14)
--The authors say that it is problematical to use internal criteria as the primary parameters for the selection of corpus data. Why? (p.14)
--Explain what Biber calls a 'cyclical fashion'. (p.14)
--Static sample corpora, if resampled, may also allow the study of language change over time. (p.15) How?
-2.3
--What are "general" vs. "specialized" corpora? How is representativeness achieved in these corpora?
-2.4
--How is the acceptable balance of a corpus determined?
--Any claim of corpus balance is largely an act of faith. (p.16) What does this mean?
--Explain the design of the British National Corpus, using the terms 'domain', 'time', 'medium', 'demographic' and 'context-governed'. How is it balanced?
--Elaborate on the following statements:
---Representativeness links to research questions. (p.18)
---Representativeness is a fluid concept. (p.18)
-2.5
--Explain the notion of sampling using the following terms:
---sample/ population/ sampling unit/ sampling frame
--What is the difference between 'simple random sampling' and 'stratified random sampling'?
--Describe the pros and cons of 'full text samples'.
-Chapter 3
-3.2
--What are the three reasons for corpus mark-up? Discuss each case with complete examples.
-3.3
--Here, you should at least familiarize yourself with the following schemes:
---COCOA (dated)
---TEI (current standard) << [[website>http://www.tei-c.org/index.xml]] >> --> header vs. body
Q1. What does the TEI header specify?
Q2. What kind of information is in the TEI body?
--Corpus Encoding Standard (CES) & XCES << [[website>http://www.xces.org/]] >>
-3.4
--Please read the following webpage for your reference:
---[[Introduction to character encoding>http://www.dotnetnoob.com/2011/12/introduction-to-character-encoding.html]]
-Chapter 4
--What is corpus annotation and how is it different from corpus mark-up?
-4.2
--What are the four advantages for corpus annotation?
--What are some of the criticisms against corpus annotation? What is the authors' response?
-4.3
-Look at concrete examples for each type of annotation:
--POS tagging
---[[Online tagging system by University of Illinois, Urbana Champaign>http://cogcomp.cs.illinois.edu/demo/pos/?id=4]]
--Lemmatization
---[[Online stemmer and lemmatizer (Python NLTK) >http://text-processing.com/demo/stem/]]
--Parsing
---[[Online parser (Stanford)>http://nlp.stanford.edu:8080/parser/]]
--Semantic annotation
---[[Lancaster USAS tag>http://ucrel.lancs.ac.uk/usas/]]
--Coreference annotation
---[[Image of coreference annotation>http://www.bart-coref.org/images/mmax_muc.png]]
--Pragmatic annotation
---[[Examples (MICASE pragmatic tags)>http://www.ualberta.ca/~aacl2009/PDFs/NesiAhmadIbrahim2009AACL.pdf]]
--Stylistic annotation
---[[Example: Speech, Thought & Presentation>http://stylistics.minb.de/index.php?c=Speech%20and%20Thought%20Presentation]]
--Error tagging
---[[Example: Granger (2003)>http://calico.org/html/article_289.pdf]]
--Problem-oriented annotation
-Chapter 5-9
--Make a summary on your own
-Chapter 10
--Summarize the use of corpus data in the following areas briefly
--The major areas of linguistics
---lexicographic and lexical studies (10.2)
---grammatical studies (10.3)
---register variation and genre analysis
(10.4)
---dialect distinction and language variety (10.5)
---contrastive and translation studies (10.6)
---diachronic study and language change (10.7)
---language learning and teaching (10.8)
--Other areas which have started to use corpus data
---Semantics (10.9)
---Pragmatics (10.10)
---Sociolinguistics (10.11)
---Discourse analysis (10.12)
---Stylistics and literary studies (10.13)
---Forensic linguistics (10.14)
--What is the limitation of corpus data? (10.15)

**Sketch Engine CQL memo [#me8bd703]
-help + bare infinitives
--Query for the Brown Family corpora (CLAWS7 tagset):
---[lemma="help"] [tag="VVI|VV0"]
-help + to + V
---[lemma="help"] [lemma="to"] [tag="VVI"]
-make + a + adj(optional) + NOUN
---[lemma="make"] [word="a"] [tag="JJ"]? [tag="NN1"]

**Chi-square test and other statistics [#hdcbd091]
-[[js-STAR>http://www.kisnet.or.jp/nappa/software/star/]]
--Statistical tests that can be run in a web browser

**Sample data for error tagging [#maa3863a]
-[[Download>http://www.tufs.ac.jp/ts/personal/corpuskun/data/sample_A2.zip]]
-[[Raw file Download>http://www.tufs.ac.jp/ts/personal/corpuskun/data/sample_A2.zip]]
-[[Tagged file Download>http://www.tufs.ac.jp/ts/personal/corpuskun/data/sample_A2_tagged.zip]]

**TUTORIALS: Corpus query tools [#j37d45ad]
-[[Sketch Engine>http://www.sketchengine.co.uk]]
--On campus, you can access it from anywhere by clicking IP auth.
-AntConc (for Windows & Mac)
--[[Download>http://www.antlab.sci.waseda.ac.jp/antconc_index.html]]
-CasualConc (for Mac)
--[[Download>https://sites.google.com/site/casualconcj/]]

***Concordance [#m5895ce7]
-[[Tutorial materials (6/12)>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/SkETutorial-04.pdf]]
-[[Tutorial materials (5/29)>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/SkETutorial-03.pdf]]
-[[Tutorial materials (5/1)>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/SkETutorial-02.pdf]]
-[[Tutorial materials (4/17)>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/SkETutorial-01.pdf]]
-[[Exercises (5/1)>http://www.tufs.ac.jp/ts/personal/corpuskun/pdf/2014/Drill-01.pdf]]
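
As a rough companion to the CQL memo and the chi-square section earlier on this page, the Python sketch below shows how hit counts for help + bare infinitive and help + to-infinitive in two corpora could be compared with a 2x2 chi-square test. This is only an illustration under stated assumptions: the function name chi_square_2x2 is hypothetical, the counts are invented placeholders rather than results from any corpus, and no continuity correction is applied. It is not the procedure used by js-STAR, which may report slightly different values depending on its settings.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Shortcut formula for 2x2 tables: n(ad - bc)^2 / product of the marginal totals
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Placeholder counts, NOT real corpus results.
# Rows: corpus A, corpus B. Columns: help + bare infinitive, help + to-infinitive.
counts = [[30, 70], [60, 40]]
(a, b), (c, d) = counts
chi2, p_value = chi_square_2x2(a, b, c, d)
print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")
```

To try it with real data, replace the placeholder counts with the hit counts returned by the two help queries in each corpus. When any expected cell frequency is small, an exact test is usually a safer choice than chi-square.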