~nolda/makedulko

A build system for learner corpora annotated with the EXMARaLDA (Dulko) tools.

#makeDulko

This repository provides a build system for learner corpora which generates ANNIS data from EXMARaLDA sources annotated with the EXMARaLDA (Dulko) tools.

#Prerequisites

makeDulko requires a Unix-like system such as Linux, macOS, or the Windows Subsystem for Linux, with the following prerequisites:

  • GNU make (provided by the Debian package make)
  • GNU sed (provided by the Debian package sed)
  • Java (provided by the Debian package openjdk-8-jre)
  • rsync (provided by the Debian package rsync)
  • zip (provided by the Debian package zip)
  • exb2exb.sh (part of EXMARaLDA (Dulko))
  • exb2metadata.sh (part of EXMARaLDA (Dulko))
  • the XSLT stylesheets exb2exb.xsl, exb2exb-annis.xsl, exb2exb-tiers.xsl, exb2metadata.xsl, and metadata.xsl, required by exb2exb.sh and exb2metadata.sh (part of EXMARaLDA (Dulko))
  • the Dulko template dulko.template.exb, required by the XSLT stylesheets (part of EXMARaLDA (Dulko))
  • Pepper (available at https://corpus-tools.org/pepper/)
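
Before building, it can help to confirm that the command-line tools listed above are on your PATH. The following is a minimal sketch, not part of makeDulko: check_prereqs is a hypothetical helper name, and the check only covers the five packaged tools. It does not look for the EXMARaLDA (Dulko) scripts, the stylesheets, the template, or Pepper, and it does not verify that make and sed are the GNU variants.

```shell
#!/bin/sh
# check_prereqs is a hypothetical helper, not part of makeDulko.
# It reports which of the given commands cannot be found on PATH.
check_prereqs() {
    missing=""
    for cmd in "$@"; do
        # command -v is the POSIX way to test whether a command exists
        if ! command -v "$cmd" >/dev/null 2>&1; then
            missing="$missing $cmd"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all prerequisites found"
}

# Check the packaged tools named in the list above.
check_prereqs make sed java rsync zip || true
```

If anything is reported as missing, installing the corresponding Debian package named in the list should resolve it on Debian-based systems.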

In order to run ANNIS with the generated data, you will also have to install ANNIS and PostgreSQL.

makeDulko has been tested with Java 1.8.0, Pepper 3.2.7, PostgreSQL 9.6, and ANNIS 3.5.1.

#Usage

  1. Copy or link your EXMARaLDA sources to src/exmaralda/corpus/.

  2. Optionally, edit CORPUS and VERSION in src/Makefile.

  3. Open a terminal and cd to src/.

  4. Run make or make all in order to generate ANNIS data in annis/ from your EXMARaLDA sources in src/exmaralda/corpus/.

  5. Optionally, run make dist in order to generate a ZIP file in dist/ with the ANNIS data in annis/. Running make src will generate an additional ZIP file with the build system and the EXMARaLDA sources.

  6. Run the ANNIS kickstarter and import either the ANNIS data in annis/ or the corresponding ZIP file in dist/.

  7. Optionally, run make clean in order to remove intermediate build files. Running make distclean also removes ANNIS data and ZIP files, if any.
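
The build-related steps above (1, 3 to 5, and 7) can be sketched as a small script run from the repository root. This is only an illustration under stated assumptions: the names run, build_corpus, and DRY_RUN are hypothetical and not part of makeDulko, /path/to/exb is a placeholder for wherever your EXMARaLDA sources live, and `make -C src` is used in place of changing to src/ first. With DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# run, build_corpus and DRY_RUN are illustrative names, not part of makeDulko.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: directory holding your EXMARaLDA sources (placeholder path below).
# Call this from the repository root.
build_corpus() {
    run cp -R "$1"/. src/exmaralda/corpus/   # step 1: copy the sources
    run make -C src all                      # steps 3-4: generate the ANNIS data
    run make -C src dist                     # step 5: ZIP file with the ANNIS data
    run make -C src clean                    # step 7: remove intermediate build files
}

# Preview the commands; set DRY_RUN=0 to actually execute them.
DRY_RUN=1
build_corpus /path/to/exb
```

Step 2 (editing CORPUS and VERSION) and step 6 (importing into ANNIS) are left out because they are done by hand.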

Andreas Nolda (andreas@nolda.org)