1. Download TreeTagger from [[here>ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-windows-3.2.zip]]

2. Download the English parameter file from [[here>ftp://ftp.ims.uni-stuttgart.de/pub/corpora/english-par-linux-3.2.bin.gz]]

3. Uncompress both downloads and follow the instructions in INSTALL.txt (reproduced below).


1. Install a Perl interpreter (if you have not already installed one).
   You can download a Perl interpreter for Windows for free at

2. Move the TreeTagger directory to the root directory of drive C:.

3. Download the PC parameter files for the languages you need, decompress
   them (e.g. using Winzip or 7zip) and move them to the subdirectory lib.
   Rename the parameter files to <language>.par
   Example: Rename french-par-linux-3.1.bin to french.par

4. Add the path C:\TreeTagger\bin to the PATH environment variable.

5. Open a shell and type the command
   set PATH=C:\TreeTagger\bin;%PATH%

6. Change to the directory C:\TreeTagger

7. Now you can test the tagger, e.g. by analyzing this file with the command
   tag-english INSTALL.txt

 c:\TreeTagger>set PATH=C:\TreeTagger\bin;%PATH%
 c:\TreeTagger>tag-english INSTALL.txt > INSTALL.tag.txt
        reading parameters ...
        tagging ...

Chunker (for Mac-Intel)
Download the chunker parameter file english-chunker-par-linux-3.2.bin.gz.

-Unzip the parameter file into the lib directory:
 gzip -cd english-chunker-par-linux-3.2.bin.gz > lib/english-chunker.par

 $ echo 'I was reading a book.' | cmd/tagger-chunker-english > hello.txt
 $ cat hello.txt
 I	PP	I 
 was	VBD	be
 reading	VBG	read
 a	DT	a
 book	NN	book
 .	SENT	.
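
The output above is tab-separated, with the token, the tag, and the lemma in columns 1 to 3. As a minimal sketch of post-processing such a file with standard tools, the snippet below uses a hypothetical inline sample (the variable names `sample`, `lemmas`, and `verbs` are illustrative, not part of TreeTagger) so that it runs without the tagger installed.

```shell
# Hypothetical sample in the token<TAB>tag<TAB>lemma layout of hello.txt.
sample=$(printf 'I\tPP\tI\nwas\tVBD\tbe\nreading\tVBG\tread\na\tDT\ta\nbook\tNN\tbook\n.\tSENT\t.\n')

# Column 3 holds the lemma; join all lemmas into a single line.
lemmas=$(printf '%s\n' "$sample" | cut -f3 | paste -sd' ' -)
printf '%s\n' "$lemmas"

# Print the tokens (column 1) whose tag (column 2) starts with V.
verbs=$(printf '%s\n' "$sample" | awk -F'\t' '$2 ~ /^V/ { print $1 }')
printf '%s\n' "$verbs"
```

To run this against real output, replace the `printf '%s\n' "$sample"` part of each pipeline with `cat hello.txt`.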

On Linux and Mac, the Penn Treebank tagset is the default. If you want VV for lexical verbs, VB for forms of be, and VH for forms of have, edit the last line of the file 'tree-tagger-english' as follows:

 #last line
 perl -pe 's/\tV[BDHV]/\tVB/;s/\tIN\/that/\tIN/;'
 --> perl -pe 's/\tIN\/that/\tIN/;'

To process multiple files in one go, you can use the following batch file:

 @echo off
 for %%A in (%1) do call tag-english %%A > %%A.tag
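
On Linux or Mac, a comparable loop can be sketched as a shell function. This is only a sketch: the function name `tag_all` and the `TAGGER` variable are hypothetical, and the default path `cmd/tree-tagger-english` assumes the command is run from the TreeTagger directory.

```shell
# Tag every file given as an argument and write the result to <file>.tag.
# TAGGER is a hypothetical override; by default the tree-tagger-english
# script in the cmd directory is assumed.
tag_all() {
  tagger=${TAGGER:-cmd/tree-tagger-english}
  for f in "$@"; do
    "$tagger" "$f" > "$f.tag"
  done
}
```

For example, `tag_all texts/*.txt` would write one `.tag` file next to each input file.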
