
Download


1. Download TreeTagger from here

2. Download the English parameter file from here

3. Uncompress both downloads and read INSTALL.txt (reproduced below)

Installation


1. Install a Perl interpreter (if you have not already installed one).

  You can download a Perl interpreter for Windows for free at
  http://www.activestate.com/activeperl/

2. Move the TreeTagger directory to the root directory of drive C:.

3. Download the PC parameter files for the languages you need, decompress
  them (e.g. using Winzip or 7zip) and move them to the subdirectory lib.
  Rename the parameter files to <language>.par
  Example: Rename french-par-linux-3.1.bin to french.par

N.B. As of 2020, there are two English parameter files, one for the Penn tagset and one for the BNC tagset, so be careful when installing with the .sh file. Both are installed as "english.par", and whichever one you downloaded becomes the default parameter file. You cannot switch between the Penn and BNC tagsets on the fly. To switch to a different tagset, copy the parameter file you want and run the shell script again.

4. Add the path C:\TreeTagger\bin to the PATH environment variable.

5. Open a shell and type the command

  set PATH=C:\TreeTagger\bin;%PATH%

6. Change to the directory C:\TreeTagger

7. Now you can test the tagger, e.g. by analyzing this file with the command

  tag-english INSTALL.txt
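For the Linux/Mac case described in the N.B. above, switching tagsets can be sketched as a small shell function. This is only an illustration: the file names english-penn.par and english-bnc.par are hypothetical names for copies you keep yourself, and it assumes the tagging scripts read lib/english.par, as the note implies.

```shell
#!/bin/sh
# Sketch: choose which English parameter file is active.
# ASSUMPTION: both decompressed parameter files were saved in lib/ under
# the hypothetical names english-penn.par and english-bnc.par.
use_tagset() {
    tagset="$1"                 # "penn" or "bnc"
    libdir="${2:-lib}"          # TreeTagger lib directory (default: ./lib)
    src="$libdir/english-$tagset.par"
    if [ ! -f "$src" ]; then
        echo "missing parameter file: $src" >&2
        return 1
    fi
    # Overwrite the default file with the chosen tagset.
    cp "$src" "$libdir/english.par"
}
```

Run from the TreeTagger directory, `use_tagset bnc` would make the BNC file the default, and `use_tagset penn` would switch back.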

Examples


c:\TreeTagger>set PATH=C:\TreeTagger\bin;%PATH%

c:\TreeTagger>tag-english INSTALL.txt > INSTALL.tag.txt
       reading parameters ...
       tagging ...
       finished.
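
A quick way to inspect a tagged file is to count how often each tag occurs. The sketch below assumes a Unix-like shell (on Windows, something like WSL or Git Bash) and assumes the output has one token per line with tab-separated word, tag and lemma fields; the function name tag_counts is made up for this example.

```shell
#!/bin/sh
# Sketch: print "count<TAB>tag" for every tag in a tagged file,
# most frequent first.
# ASSUMPTION: each token line is word<TAB>tag<TAB>lemma.
tag_counts() {
    awk -F'\t' '
        NF >= 2 { n[$2]++ }                       # count the tag column
        END     { for (t in n) print n[t] "\t" t }
    ' "$1" | sort -rn
}
```

For example, `tag_counts INSTALL.tag.txt` would list the tags found in the file produced above.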

Chunker (for Mac-Intel)


Download english-chunker-par-linux-3.2.bin.gz, then:

  • decompress the parameter file into the lib directory:
    gzip -cd english-chunker-par-linux-3.2.bin.gz > lib/english-chunker.par
  • test:
    $ echo 'I was reading a book.' | cmd/tagger-chunker-english > hello.txt
    
    $ cat hello.txt
    <NC>
    I	PP	I 
    </NC>
    <VC>
    was	VBD	be
    reading	VBG	read
    </VC>
    <NC>
    a	DT	a
    book	NN	book
    </NC>
    .	SENT	.
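
The chunker output above can be flattened to one line per chunk with a short awk filter. This is a sketch, not part of TreeTagger: the function name chunks is made up, and it assumes chunk tags such as <NC> and </NC> stand alone on their own lines with tab-separated token lines in between. Nested chunks, if present, are reported at each level.

```shell
#!/bin/sh
# Sketch: print each chunk as "TYPE<TAB>word word ...".
# ASSUMPTION: <XX> opens a chunk, </XX> closes it, and token lines are
# word<TAB>tag<TAB>lemma. Tokens outside any chunk are ignored.
chunks() {
    awk -F'\t' '
        /^<\/[A-Z]+>/ { print type[d] "\t" words[d]; d--; next }   # closing tag
        /^<[A-Z]+>/   { d++                                        # opening tag
                        type[d] = substr($0, 2, length($0) - 2)
                        words[d] = ""
                        next }
        d > 0         { for (i = 1; i <= d; i++)                   # token in a chunk
                            words[i] = (words[i] == "" ? $1 : words[i] " " $1) }
    ' "$@"
}
```

With the hello.txt shown above, `chunks hello.txt` prints three lines: NC with "I", VC with "was reading", and NC with "a book".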

Notes


For Linux and Mac, the Penn Treebank tagset is the default. If you want VV for lexical verbs, VB for forms of be, and VH for forms of have, edit the file 'tree-tagger-english' as follows:

# last line (original)
perl -pe 's/\tV[BDHV]/\tVB/;s/\tIN\/that/\tIN/;'

# change it to
perl -pe 's/\tIN\/that/\tIN/;'

To process multiple files in one go, you can use the following batch file, passing a file pattern such as *.txt as its first argument:

@echo off
for %%A in (%1) do tag-english %%A > %%A.tag
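
On Linux or Mac, the same idea can be sketched as a shell function. The default script path cmd/tree-tagger-english is an assumption about where your tagging script lives; set TAGGER to match your installation.

```shell
#!/bin/sh
# Sketch: tag every file given as an argument, writing <file>.tag next to it.
# ASSUMPTION: TAGGER points at your tagging script; adjust as needed.
TAGGER="${TAGGER:-cmd/tree-tagger-english}"

tag_all() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip unmatched patterns and directories
        "$TAGGER" "$f" > "$f.tag"
    done
}
```

For example, `tag_all corpus/*.txt` (where corpus is a hypothetical directory) would create one .tag file beside each text file.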

Last-modified: 2020-03-05 (Thu) 22:36:35