
Download
------------

1. Download TreeTagger from [[here>ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-windows-3.2.zip]]

2. Download the English parameter file from [[here>ftp://ftp.ims.uni-stuttgart.de/pub/corpora/english-par-linux-3.2.bin.gz]]

3. Uncompress both downloads and read INSTALL.txt (reproduced below).

Installation
------------

1. Install a Perl interpreter (if you have not already installed one).
   You can download a Perl interpreter for Windows for free at
   http://www.activestate.com/activeperl/

2. Move the TreeTagger directory to the root directory of drive C:.

3. Download the PC parameter files for the languages you need, decompress
   them (e.g. using Winzip or 7zip) and move them to the subdirectory lib.
   Rename the parameter files to <language>.par
   Example: Rename french-par-linux-3.1.bin to french.par

N.B. As of 2020, there are two English parameter files, one for the Penn tagset and one for the BNC tagset, so be careful when installing with the .sh file. Both English parameter files will be called "english.par", and the one you downloaded becomes the default parameter file. You cannot switch between the Penn tagset and the BNC tagset on the fly. To switch to the other tagset, copy in the parameter file you want and run the shell script again.

4. Add the path C:\TreeTagger\bin to the PATH environment variable.

5. Open a shell (command prompt). To set the path for the current session
   only, type the command
   set PATH=C:\TreeTagger\bin;%PATH%

6. Change to the directory C:\TreeTagger

7. Now you can test the tagger, e.g. by analyzing this file with the command
   tag-english INSTALL.txt
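
The tagset switch described in the N.B. above can be sketched for Linux and Mac as a small shell function. This is only a sketch under assumptions: english-penn.par and english-bnc.par are hypothetical names for copies of the two decompressed parameter files kept in lib, and the tagging scripts are assumed to read lib/english.par. It overwrites the default file directly instead of re-running the install script.

```shell
# Sketch only: english-penn.par and english-bnc.par are hypothetical names
# for the two decompressed English parameter files, kept next to english.par.
use_tagset() {
    # $1 = tagset name (penn or bnc), $2 = TreeTagger lib directory (default: lib)
    tagset=$1
    libdir=${2:-lib}
    src="$libdir/english-$tagset.par"
    if [ ! -f "$src" ]; then
        echo "missing parameter file: $src" >&2
        return 1
    fi
    # Overwrite the default parameter file with the chosen one.
    cp "$src" "$libdir/english.par"
}
```

Run from the TreeTagger directory, "use_tagset bnc" would make the BNC file the default and "use_tagset penn" would switch back.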


Examples
------------
 c:\TreeTagger>set PATH=C:\TreeTagger\bin;%PATH%
 
 c:\TreeTagger>tag-english INSTALL.txt > INSTALL.tag.txt
        reading parameters ...
        tagging ...
        finished.

Chunker (for Mac-Intel)
------------
Download the chunker parameter file english-chunker-par-linux-3.2.bin.gz first.

-Decompress the parameter file into the lib directory:
 gzip -cd english-chunker-par-linux-3.2.bin.gz > lib/english-chunker.par

-test:
 $ echo 'I was reading a book.' | cmd/tagger-chunker-english > hello.txt
 
 $ cat hello.txt
 <NC>
 I	PP	I 
 </NC>
 <VC>
 was	VBD	be
 reading	VBG	read
 </VC>
 <NC>
 a	DT	a
 book	NN	book
 </NC>
 .	SENT	.
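
The chunker output above is easy to post-process. The following is a minimal sketch, not part of TreeTagger: noun_chunks is a hypothetical helper, and it assumes the format shown above, with chunk tags on their own lines and tokens as tab-separated token/tag/lemma lines.

```shell
# Print each noun chunk (<NC> ... </NC>) on one line, tokens joined by spaces.
# Assumes chunk tags start at the beginning of a line and token lines are
# tab-separated: token, tag, lemma.
noun_chunks() {
    awk -F '\t' '
        /^<NC>/   { in_nc = 1; chunk = ""; next }
        /^<\/NC>/ { if (in_nc) print chunk; in_nc = 0; next }
        in_nc     { chunk = (chunk == "" ? $1 : chunk " " $1) }
    ' "$1"
}
```

For the hello.txt file above, "noun_chunks hello.txt" should print two lines: "I" and "a book".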



Notes
------------
On Linux and Mac, the Penn Treebank tagset is the default. If you want VV for lexical verbs, VB for forms of be, and VH for forms of have, edit the last line of the file 'tree-tagger-english' as follows:

 #last line
 perl -pe 's/\tV[BDHV]/\tVB/;s/\tIN\/that/\tIN/;'
 
 --> perl -pe 's/\tIN\/that/\tIN/;'

------------
To process multiple files at once, you can use a batch file like the following:

 @echo off
 rem Argument 1 is a file pattern such as *.txt; output for NAME goes to NAME.tag
 for %%A in (%1) do call tag-english %%A > %%A.tag
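
On Linux and Mac, the same loop could be written as a shell function. This is a sketch only: tag_all and the TAGGER variable are hypothetical names, and the default path assumes the current directory is the TreeTagger directory.

```shell
# Tag every file given as an argument, writing FILE.tag next to each FILE.
# TAGGER is a hypothetical override; the default assumes the current
# directory is the TreeTagger directory.
tag_all() {
    tagger=${TAGGER:-cmd/tree-tagger-english}
    for f in "$@"; do
        "$tagger" "$f" > "$f.tag"
    done
}
```

For example, "tag_all corpus/*.txt" would write corpus/a.txt.tag for corpus/a.txt, and so on for each matching file.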

