Download
1. Download TreeTagger from here
2. Download the English parameter file from here
3. Uncompress both downloads and read INSTALL.txt (reproduced below)
Installation
1. Install a Perl interpreter (if you have not already installed one).
You can download a Perl interpreter for Windows for free at http://www.activestate.com/activeperl/
2. Move the TreeTagger directory to the root directory of drive C:.
3. Download the PC parameter files for the languages you need, decompress them (e.g. using WinZip or 7-Zip), and move them to the subdirectory lib. Rename the parameter files to <language>.par.
Example: rename french-par-linux-3.1.bin to french.par
4. Add the path C:\TreeTagger\bin to the PATH environment variable.
5. Open a shell and type the command
set PATH=C:\TreeTagger\bin;%PATH%
6. Change to the directory C:\TreeTagger
7. Now you can test the tagger, e.g. by analyzing this file with the command
tag-english INSTALL.txt
Examples
c:\TreeTagger>set PATH=C:\TreeTagger\bin;%PATH%
c:\TreeTagger>tag-english INSTALL.txt > INSTALL.tag.txt
reading parameters ...
tagging ...
finished.
Chunker (for Mac-Intel)
Download english-chunker-par-linux-3.2.bin.gz, then decompress it into the lib directory as english-chunker.par:
gzip -cd english-chunker-par-linux-3.2.bin.gz > lib/english-chunker.par
$ echo 'I was reading a book.' | cmd/tagger-chunker-english > hello.txt
$ cat hello.txt
<NC>
I	PP	I
</NC>
<VC>
was	VBD	be
reading	VBG	read
</VC>
<NC>
a	DT	a
book	NN	book
</NC>
.	SENT	.
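The chunker output can be post-processed with standard tools. The following is a minimal sketch that prints each noun chunk on one line. It assumes the output has one token per line in tab-separated word, tag, lemma form, with the <NC> and </NC> markers on lines of their own; the function name noun_chunks is illustrative and not part of TreeTagger.

```shell
# Print the words of every <NC> ... </NC> chunk read from stdin, one chunk per line.
# Assumption: tokens are tab-separated (word, tag, lemma) and chunk markers
# stand alone on their own lines.
noun_chunks() {
    awk -F '\t' '
        $0 == "<NC>"  { in_nc = 1; phrase = ""; next }          # chunk opens
        $0 == "</NC>" { if (in_nc) print phrase; in_nc = 0; next }  # chunk closes
        in_nc         { phrase = (phrase == "" ? $1 : phrase " " $1) }  # collect words
    '
}
```

Usage: noun_chunks < hello.txt. Changing the two marker strings to <VC> and </VC> would collect verb chunks in the same way.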
Notes
On Linux and Mac, the Penn Treebank tagset is the default. If you want VV for lexical verbs, VB for forms of be, and VH for forms of have, edit the last line of the file 'tree-tagger-english' as follows:
#last line
perl -pe 's/\tV[BDHV]/\tVB/;s/\tIN\/that/\tIN/;'
-->
perl -pe 's/\tIN\/that/\tIN/;'
To process multiple files in one run, you can use a batch file such as the following:
@echo off
for %%A in (%1) do tag-english %%A > %%A.tag
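On Linux or Mac, the same loop might be written as a shell function. This is a minimal sketch, assuming the tagger script is cmd/tree-tagger-english relative to the current directory; the TAGGER variable and the tag_all name are illustrative and not part of TreeTagger.

```shell
# Path to the tagging script. Assumption: adjust this to your installation.
TAGGER=${TAGGER:-cmd/tree-tagger-english}

# Tag every file given as an argument, writing the result to <file>.tag.
tag_all() {
    for f in "$@"; do
        # Skip arguments that are not regular files (e.g. an unmatched glob)
        [ -f "$f" ] || continue
        "$TAGGER" "$f" > "$f.tag"
    done
}
```

Usage: tag_all *.txt writes one .tag file next to each input file, as the batch file above does on Windows.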