Download
------------
1. Download TreeTagger from [[here>ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-windows-3.2.zip]]
2. Download the English parameter file from [[here>ftp://ftp.ims.uni-stuttgart.de/pub/corpora/english-par-linux-3.2.bin.gz]]
3. Uncompress both downloads and read INSTALL.txt (reproduced below).

Installation
------------
1. Install a Perl interpreter (if you have not already installed one). You can download a Perl interpreter for Windows for free at http://www.activestate.com/activeperl/
2. Move the TreeTagger directory to the root directory of drive C:.
3. Download the PC parameter files for the languages you need, decompress them (e.g. using Winzip or 7zip) and move them to the subdirectory lib. Rename the parameter files to <language>.par
   Example: Rename french-par-linux-3.1.bin to french.par
   N.B. As of 2020, there are two English parameter files, one for the Penn tagset and one for the BNC tagset, so be careful when installing with the .sh file. Both are installed as "english.par", so whichever one you downloaded becomes the default parameter file. You cannot switch between the Penn and BNC tagsets on the fly. To switch to a different tagset, copy the parameter file you want and run the shell script again.
4. Add the path C:\TreeTagger\bin to the PATH environment variable.
5. Open a shell and type the command
   set PATH=C:\TreeTagger\bin;%PATH%
6. Change to the directory C:\TreeTagger
7. Now you can test the tagger, e.g.
by analyzing this file with the command
   tag-english INSTALL.txt

Examples
------------
 c:\TreeTagger>set PATH=C:\TreeTagger\bin;%PATH%
 c:\TreeTagger>tag-english INSTALL.txt > INSTALL.tag.txt
 reading parameters ...
 tagging ...
 finished.

Chunker (for Mac-Intel)
------------
- Download english-chunker-par-linux-3.2.bin.gz.
- Unzip the parameter file and move it to the lib directory:
 gzip -cd english-chunker-par-linux-3.2.bin.gz > lib/english-chunker.par
- Test:
 $ echo 'I was reading a book.' | cmd/tagger-chunker-english > hello.txt
 $ cat hello.txt
 <NC>
 I	PP	I
 </NC>
 <VC>
 was	VBD	be
 reading	VBG	read
 </VC>
 <NC>
 a	DT	a
 book	NN	book
 </NC>
 .	SENT	.

Notes
------------
For Linux and Mac, the Penn Treebank tagset is the default. The first substitution on the last line of the file 'tree-tagger-english' rewrites every tag that begins with VB, VD, VH, or VV so that it begins with VB. If you want VV for lexical verbs, VB for be-verbs, and VH for have, remove that substitution by editing the last line as follows:
 # last line
 perl -pe 's/\tV[BDHV]/\tVB/;s/IN\/that/\tIN/;'
 -->
 perl -pe 's/IN\/that/\tIN/;'
------------
If you want to process multiple files in one run on Windows, you can put the following commands in a batch file:
 @echo off
 for %%A in (%1) do tag-english %%A > %%A.tag
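
On Linux or Mac, the same loop could be written as a shell function. This is only a minimal sketch and not part of the TreeTagger distribution: the function name tag_all and the TAGGER variable are hypothetical, the default path cmd/tree-tagger-english is an assumption about where the wrapper script sits in your installation, and the wrapper is assumed to read its input from standard input in the same way as the chunker command in the test above.

```shell
# Hypothetical helper: tag every file given as an argument and write
# the result next to it as <file>.tag, mirroring the batch file.
# TAGGER is an assumed variable; adjust the default path to match
# your own installation.
TAGGER="${TAGGER:-cmd/tree-tagger-english}"

tag_all() {
    for f in "$@"; do
        # Skip anything that is not a regular file, e.g. an
        # unmatched glob pattern or a directory.
        [ -f "$f" ] || continue
        # The wrapper script is assumed to read text on standard input.
        "$TAGGER" < "$f" > "$f.tag"
    done
}
```

After loading the function into your shell, a call such as tag_all corpus/*.txt (corpus being a placeholder directory name) would leave one .tag file beside each input file. Quoting "$f" keeps file names that contain spaces intact.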