Seminar in Language Education 2014

Seminar in Language Education 2014 (Thursday, 3rd period; spring and fall semesters)

READING:
- Discussion questions
Sketch Engine CQL memo
Chi-square test, etc.
Sample data for error tagging
TUTORIALS: Corpus query tools
- Concordance

Classroom

The class has moved to Room 201 (effective May 15).

Course title (Japanese)

Corpus-based analysis of learner English and its application to teaching materials development (J)

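Course objectives

To acquire the basic concepts of corpus linguistics and to gain an overview of one of its applications: corpus analysis of CEFR-aligned learner data.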

Course overview

Students learn the basic concepts of corpus linguistics by reading the textbook and, in parallel, acquire basic corpus-processing skills.

Course schedule

Students build specialist knowledge through the textbook and become proficient in using Sketch Engine, a corpus query system.

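Session 1 (4/10)   Introduction: corpora and the development of language teaching materials
Session 2 (4/17)   Foundations of corpus linguistics (1): definition and historical development of corpora (CBLS Unit A1)
Session 3 (5/1)    Foundations of corpus linguistics (2): representativeness, balance, and sampling (CBLS Unit A2) + homework (Units A5 and A8 --> summary in English)
Session 4 (5/15)   Foundations of corpus linguistics (3): markup & annotation (CBLS Units A3 & A4) + homework (Unit A7 --> summary in English)
Session 5 (5/29)   Foundations of corpus linguistics (4): corpora and their areas of application (CBLS Unit A10)
Session 6 (6/5)    Foundations of corpus linguistics (5): corpora and dictionaries (CBLS Unit C1) + basic use of BNCWeb and collocation statistics
Session 7 (6/12)   Foundations of corpus linguistics (6): corpora and grammar research (CBLS Unit C2) + hands-on practice searching for help (to) with Sketch Engine
Session 8 (6/19)   Foundations of corpus linguistics (7): corpora and language acquisition research (CBLS Unit C3)
Session 9 (6/26)   Foundations of corpus linguistics (8): corpora and translation studies (CBLS Unit C6)
Session 10 (7/3)   Analysis of CEFR-aligned learner data (1): overview of the CEFR and English Profile
Session 11 (7/10)  Analysis of CEFR-aligned learner data (2): overview of learner data analysis in English Profile
Session 12         Analysis of CEFR-aligned learner data (3): error tagging in practice
Session 13         Analysis of CEFR-aligned learner data (4): survey of research on extracting criterial features
Session 14         Analysis of CEFR-aligned learner data (5): types of learner corpora and basic processing (hands-on practice)
Session 15         Summary and test on corpus query processing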
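
As preparation for the help (to) search practice scheduled for 6/12, the same question can be approximated offline. The following is a minimal Python sketch, not Sketch Engine's own query syntax: the sample sentences are invented, `find_help_patterns` is a hypothetical helper, and the regular expression only recognizes a form of "help", an optional pronoun object, and an optional "to" before the next word. A real corpus query would rely on POS tags instead.

```python
import re

# Hypothetical sample sentences (invented; not taken from any corpus).
SAMPLE = [
    "She helped me to finish the report.",
    "This will help you understand the data.",
    "They help to organise the event.",
    "Can you help carry these boxes?",
]

# A form of "help", an optional pronoun object, an optional "to",
# then the next word (assumed here to be the verb).
PATTERN = re.compile(
    r"\b(help(?:s|ed|ing)?)\b"
    r"(?:\s+(me|you|him|her|us|them|it))?"
    r"\s+(to\s+)?(\w+)",
    re.IGNORECASE,
)

def find_help_patterns(sentences):
    """Return (form, object, has_to, next_word) for every match."""
    hits = []
    for sentence in sentences:
        for m in PATTERN.finditer(sentence):
            hits.append(
                (m.group(1).lower(), m.group(2), m.group(3) is not None, m.group(4).lower())
            )
    return hits

for form, obj, has_to, word in find_help_patterns(SAMPLE):
    label = "help to V" if has_to else "help V"
    print(f"{label:10} {form} {obj or '-'} {word}")
```

Counting the two labels over a larger sample gives a first impression of how often "to" is kept after "help"; the pronoun list and the assumption that the next word is a verb are simplifications.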

Textbook

ISBN	0415286239
Title	Corpus-based language studies: an advanced resource book
Authors	McEnery, T., Xiao, R., & Tono, Y.
Publisher	Routledge
Year	2006

PDF

Until the textbook arrives, please download and use the following:

READING:

Discussion questions

Chapter 1:

What is a corpus? Discuss some common features by comparing different definitions.
Why use computers to study language? What is your intuitive answer to this? What other reasons did you find in the text?
Discuss the use of corpora and the use of intuition. Are they mutually exclusive?
Is corpus linguistics a methodology or a theory?
How do corpus-based and corpus-driven approaches differ? Can you think of any concrete examples?

Chapter 2:

What is "representativeness"?
What does Biber mean when he says, "Representativeness refers to the extent to which a sample includes the full range of variability in a population"? (p.13)
What are "internal" and "external" criteria used to select texts for a corpus? (p.14)
The authors say that it is problematical to use internal criteria as the primary parameters for the selection of corpus data. Why? (p.14)
Explain what Biber means by a 'cyclical fashion'. (p.14)
Static sample corpora, if resampled, may also allow the study of language change over time. (p.15) How?

What are "general" vs. "specialized" corpora? How is representativeness achieved in these corpora?

How is the acceptable balance of a corpus determined?
Any claim of corpus balance is largely an act of faith. (p.16) What does this mean?
Explain the design of the British National Corpus, using the terms 'domain', 'time', 'medium', 'demographic' and 'context-governed'. How is it balanced?
Elaborate on the following statements:
- Representativeness links to research questions. (p.18)
- Representativeness is a fluid concept. (p.18)

Explain the notion of sampling using the following terms:
- sample/ population/ sampling unit/ sampling frame
What is the difference between 'simple random sampling' and 'stratified random sampling'?
Describe the pros and cons of 'full text samples'.
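
To make the sampling terms above concrete, here is a minimal Python sketch contrasting simple random sampling with stratified random sampling. Everything in it is an assumption for illustration: the sampling frame is an invented list of 100 text IDs, genre is used as the stratum, and the stratified version uses proportional allocation, which is only one of several possible schemes.

```python
import random
from collections import defaultdict

# Hypothetical sampling frame: each sampling unit is a (text_id, genre) pair.
FRAME = (
    [(f"news_{i}", "news") for i in range(60)]
    + [(f"fiction_{i}", "fiction") for i in range(30)]
    + [(f"academic_{i}", "academic") for i in range(10)]
)

def simple_random_sample(frame, n, seed=0):
    """Draw n units from the whole frame; every unit has the same chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def stratified_random_sample(frame, n, seed=0):
    """Split the frame into strata (genres), then sample randomly within each
    stratum in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[unit[1]].append(unit)
    sample = []
    for units in strata.values():
        k = round(n * len(units) / len(frame))  # proportional allocation
        sample.extend(rng.sample(units, k))
    return sample

def genre_counts(sample):
    """Count how many sampled units fall into each genre."""
    counts = defaultdict(int)
    for _, genre in sample:
        counts[genre] += 1
    return dict(counts)

print("simple:    ", genre_counts(simple_random_sample(FRAME, 20)))
print("stratified:", genre_counts(stratified_random_sample(FRAME, 20)))
```

With a simple random sample, a small stratum such as the academic texts can be under-represented or missed by chance; the stratified sample fixes each genre's share in advance. Note that rounding in proportional allocation may not always sum exactly to n for other frame sizes.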

Chapter 3

3.2
- What are the three reasons for corpus mark-up? Discuss each case with concrete examples.

3.3
- Here, you should at least familiarize yourself with the following schemes:
  - COCOA (dated)
  - TEI (current standard) << website >>
    - header vs. body
    - Q1. What does the TEI header specify?
    - Q2. What kind of information is in the TEI body?
  - Corpus Encoding Standard (CES) & XCES << website >>
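
As a starting point for the header vs. body questions, the sketch below parses a tiny TEI-style document with Python's standard library and reads one item from each part. The document itself is invented for illustration and is far smaller than a real corpus file; the element names follow the TEI P5 namespace, but the content and the choice of fields are assumptions.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical minimal TEI document, written only for this illustration.
DOC = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample learner essay</title></titleStmt>
      <publicationStmt><p>Unpublished classroom sample</p></publicationStmt>
      <sourceDesc><p>Written for this illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>I am agree with this opinion.</p>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(DOC)

# The header holds information about the text (here, its title).
title = root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", TEI_NS).text

# The body holds the text itself (here, its paragraphs).
paragraphs = [p.text for p in root.findall("tei:text/tei:body/tei:p", TEI_NS)]

print("header title:", title)
print("body paragraphs:", paragraphs)
```

Keeping the two parts separate means a query tool can filter texts by header information and search only the body.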

3.4
- Please read the following webpage for your reference:
  - Introduction to character encoding
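
A short Python sketch can show why character encoding matters for corpus files. It encodes the same two-character Japanese string in UTF-8 and in Shift_JIS, then decodes the UTF-8 bytes with the wrong encoding. The sample string and the choice of these two encodings are assumptions made for illustration.

```python
# The same text stored under two different encodings.
text = "言語"  # two characters

utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")

# Character count versus byte counts under each encoding.
print(len(text), len(utf8_bytes), len(sjis_bytes))

# Decoding with the encoding that was used to write the bytes restores the text.
restored = utf8_bytes.decode("utf-8")

# Decoding with the wrong encoding yields garbled text (mojibake);
# errors="replace" substitutes a marker for undecodable bytes instead of raising.
garbled = utf8_bytes.decode("shift_jis", errors="replace")

print(restored == text)
print(garbled == text)
```

The byte sequences differ although the text is identical, so a corpus tool has to be told, or has to detect, which encoding a file uses before it can tokenize or search it.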

Chapter 4

What is corpus annotation and how is it different from corpus mark-up?

4.2
- What are the four advantages of corpus annotation?
- What are some of the criticisms against corpus annotation? What is the authors' response?

4.3
Look at concrete examples for each type of annotation:

POS tagging
- Online tagging system by the University of Illinois at Urbana-Champaign

Lemmatization
- Online stemmer and lemmatizer (Python NLTK)

Parsing
- Online parser (Stanford)

Semantic annotation
- Lancaster USAS tag

Coreference annotation
- Image of coreference annotation

Pragmatic annotation
- Examples (MICASE pragmatic tags)

Stylistic annotation
- Example: Speech, Thought & Presentation

Error tagging
- Example: Granger (2003)

Problem-oriented annotation

Chapter 5-9

Write your own summary.

Chapter 10

Briefly summarize the use of corpus data in the following areas.

The major areas of linguistics
- lexicographic and lexical studies (10.2)
- grammatical studies (10.3)
- register variation and genre analysis (10.4)
- dialect distinction and language variety (10.5)
- contrastive and translation studies (10.6)
- diachronic study and language change (10.7)
- language learning and teaching (10.8)

Other areas which have started to use corpus data
- Semantics (10.9)
- Pragmatics (10.10)
- Sociolinguistics (10.11)
- Discourse analysis (10.12)
- Stylistics and literary studies (10.13)
- Forensic linguistics (10.14)