1. Download TreeTagger from here

2. Download the English parameter file from here

3. Uncompress the archives from steps 1 and 2, then read INSTALL.txt (reproduced below)


1. Install a Perl interpreter (if you have not already installed one).

  You can download a Perl interpreter for Windows for free at

2. Move the TreeTagger directory to the root directory of drive C:.

3. Download the PC parameter files for the languages you need, decompress
  them (e.g. using Winzip or 7zip) and move them to the subdirectory lib.
  Rename the parameter files to <language>.par
  Example: Rename french-par-linux-3.1.bin to french.par

4. Add the path C:\TreeTagger\bin to the PATH environment variable.

5. Open a shell and type the command

  set PATH=C:\TreeTagger\bin;%PATH%

6. Change to the directory C:\TreeTagger

7. Now you can test the tagger, e.g. by analyzing this file with the command

  tag-english INSTALL.txt


c:\TreeTagger>set PATH=C:\TreeTagger\bin;%PATH%

c:\TreeTagger>tag-english INSTALL.txt > INSTALL.tag.txt
       reading parameters ...
       tagging ...
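
The renaming rule from step 3 (keep everything before "-par-" and add ".par") can be sketched as a small shell function for a Unix-like shell. This is only an illustration: par_name is a hypothetical helper, not part of TreeTagger, and it assumes the downloaded file names follow the <language>-par-... pattern shown in the example.

```shell
#!/bin/sh
# par_name: hypothetical helper (not shipped with TreeTagger).
# Maps a downloaded parameter file name to the <language>.par name.
par_name() {
  base=${1##*/}                        # drop any directory part
  base=${base%.gz}                     # drop a trailing .gz, if present
  printf '%s.par\n' "${base%%-par-*}"  # keep the part before the first "-par-"
}

par_name french-par-linux-3.1.bin              # prints french.par
par_name english-chunker-par-linux-3.2.bin.gz  # prints english-chunker.par
```

The printed name is the one to use for the file inside the lib directory.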

Chunker (for Mac-Intel)

Download english-chunker-par-linux-3.2.bin.gz, then:

  • decompress the parameter file into the lib directory:
    gzip -cd english-chunker-par-linux-3.2.bin.gz > lib/english-chunker.par
  • test:
    $ echo 'I was reading a book.' | cmd/tagger-chunker-english > hello.txt
    $ cat hello.txt
    I	PP	I 
    was	VBD	be
    reading	VBG	read
    a	DT	a
    book	NN	book
    .	SENT	.
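
The output has one token per line with three tab-separated columns (word, tag, lemma), so standard tools such as cut can post-process it. A minimal sketch, using the sample lines above as stand-in input; the function names tagged, lemmas and tag_counts are made up for this example.

```shell
#!/bin/sh
# Stand-in for real tagger output: word<TAB>tag<TAB>lemma, copied from the test above.
tagged() {
  printf 'I\tPP\tI\n'
  printf 'was\tVBD\tbe\n'
  printf 'reading\tVBG\tread\n'
  printf 'a\tDT\ta\n'
  printf 'book\tNN\tbook\n'
  printf '.\tSENT\t.\n'
}

# Print the lemma column (3rd field) on a single line.
# -s skips lines that contain no tab (for example markup lines).
lemmas() {
  cut -s -f3 | paste -sd' ' -
}

# Count how often each tag (2nd field) occurs.
tag_counts() {
  cut -s -f2 | sort | uniq -c
}

tagged | lemmas       # prints: I be read a book .
tagged | tag_counts
```

With a real run, replace tagged with cat hello.txt.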


For Linux and Mac, the Penn Treebank tagset is the default. If you want VV for lexical verbs, VB for forms of "be", and VH for forms of "have", edit the last line of the file 'tree-tagger-english' as follows:

# last line, before:
perl -pe 's/\tV[BDHV]/\tVB/;s/\tIN\/that/\tIN/;'

# last line, after:
perl -pe 's/\tIN\/that/\tIN/;'

To process multiple files at once on Windows, you can use the following batch file:

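@echo off
rem %1 is a file name or a wildcard pattern such as *.txt
rem "call" returns control to this loop after each run of the tagger batch file
for %%A in (%1) do call tag-english %%A > %%A.tag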
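
On Linux or Mac the same loop can be written as a shell script. A minimal sketch, assuming the wrapper script tree-tagger-english is on the PATH; the TAGGER variable and the tag_all function are hypothetical names introduced here so that a different wrapper can be substituted.

```shell
#!/bin/sh
# TAGGER: hypothetical override; defaults to the wrapper script named on this page.
TAGGER="${TAGGER:-tree-tagger-english}"

# Tag every file given as an argument and write <file>.tag next to it.
tag_all() {
  for f in "$@"; do
    "$TAGGER" "$f" > "$f.tag" || echo "tagging failed: $f" >&2
  done
}

# Files passed to the script are tagged; with no arguments nothing happens.
tag_all "$@"
```

Saved as, say, tag-all.sh, it would be run as sh tag-all.sh *.txt.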

Last-modified: 2013-01-14 (Mon) 23:50:52