Abstract

「JLPTUFS作文コーパスの構築について ―全学日本語プログラムで学ぶ日本語学習者の作文データベース化―」

Building the JLPTUFS Composition Corpus
―Turning compositions written by learners of Japanese enrolled in the Japanese Language Program of Tokyo University of Foreign Studies into electronic data―

SUZUKI Tomomi NAKAMURA Akira HAN Jinzhu

This paper is an interim report (as of fall 2009) on the project to build the JLPTUFS Composition Corpus, which runs from academic year 2008 to 2010.

The outline of this project is as follows:

(1) The Purpose: By converting a large number of compositions written by learners of Japanese into electronic data, we aim to analyze the usage of grammatical items, vocabulary, kanji, etc., and to investigate how the compositions that learners write correlate with their native languages and their levels of Japanese proficiency. We will use our findings to improve Japanese language education.

(2) The Contents: Of all the compositions written for courses in the Japanese Language Program of Tokyo University of Foreign Studies, those whose writers have given us permission will be converted into electronic data and used to build a corpus of Japanese learners' compositions.

(3) The Budget: The project is funded through "The Development of Global Japanese Language Standards" (Tokyo University of Foreign Studies), which the Ministry of Education, Culture, Sports, Science and Technology selected in 2008 as one of its Programs for Promoting High-Quality University Education. (From fall 2008)

The JLPTUFS Composition Corpus consists of three cross-linked components: a list containing information about each composition (an Excel file), the compositions as text files, and the compositions as PDF files. Writers' names and other personal names appearing in the compositions are deleted when the compositions are converted into electronic data.
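
The cross-linking and name removal described above might be sketched as follows. This is only an illustration under stated assumptions, not the project's actual procedure: the shared `composition_id` key, the file-naming scheme, the metadata fields, and the `[NAME]` placeholder are all hypothetical (the text says only that names are deleted).

```python
from dataclasses import dataclass


@dataclass
class CompositionRecord:
    """One row of the composition list (hypothetical fields)."""
    composition_id: str   # assumed shared key linking list, text file and PDF
    native_language: str
    level: str

    @property
    def text_file(self) -> str:
        # Assumed naming scheme: the text file is named after the shared ID.
        return f"{self.composition_id}.txt"

    @property
    def pdf_file(self) -> str:
        # Assumed naming scheme: the PDF is named after the same ID.
        return f"{self.composition_id}.pdf"


def mask_names(text: str, names: list[str], placeholder: str = "[NAME]") -> str:
    """Replace each listed personal name with a placeholder.

    Longer names are replaced first so that a full name is not
    partially masked by a shorter name it contains.
    """
    for name in sorted(names, key=len, reverse=True):
        text = text.replace(name, placeholder)
    return text
```

Passing `placeholder=""` would delete the names outright instead of marking where they stood. The list of names has to be supplied by hand here; the sketch does not attempt to detect names automatically.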

In academic year 2008, we made preparations such as designing the corpus and deciding on the procedure. We started collecting compositions in academic year 2009 and have been incorporating them into the electronic database. In the spring term of 2009, we collected about 500 compositions. We plan to collect and incorporate compositions following the same procedure in the fall term of 2009 and in 2010.

Keywords:
composition corpus, Japanese Language Program, text files, Education GP "The Development of Global Japanese Language Standards"