Using the Semantic Answer Extractor

Ephyra comes with a semantic answer extractor for factoid and list questions that is not enabled by default. This approach extracts answers with high precision and usually improves overall system performance by a few percent. However, it is slow, and it depends on the ASSERT semantic role labeling system, which requires Linux. This tutorial describes how to set up Ephyra to use the semantic answer extractor on a Linux machine.

  1. Download the ASSERT semantic role labeling system, unzip it, and run the script install.sh.
  2. Set the environment variable ASSERT to point to your ASSERT directory.
  3. Add the pipeline components for the semantic answer extractor to the initFactoid() method in the main class that you are running. The semantic extractor will then be used for both factoid and list questions. Note that these components may already have been added to the factoid pipeline for you, but they do not take effect if ASSERT is not installed or the ASSERT environment variable is not set correctly.
    • Add the following line to the query generation part of the init method:
      • QueryGeneration.addQueryGenerator(new PredicateG());
        (Generates a query string from predicate-argument structures extracted from the question.)
    • Add the following three lines to the answer extraction part of the init method (these filters must come before any answer selection or projection filters; if you are uncertain about the order, add them at the top of the list):
      • AnswerSelection.addFilter(new WebDocumentFetcherFilter());
        (Retrieves web documents that contain the search engine snippets.)
      • AnswerSelection.addFilter(new PredicateExtractionFilter());
        (Extracts predicate-argument structures from candidate sentences.)
      • AnswerSelection.addFilter(new FactoidsFromPredicatesFilter());
        (Extracts factoid answers from predicate-argument structures.)
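
Because the added components have no effect when ASSERT is missing or the ASSERT variable is wrong, it can help to check the environment before starting Ephyra. The following is a minimal sketch and not part of Ephyra: the class name AssertPreflight and the method problemWith are hypothetical. It only verifies that the variable is set and names an existing directory; it does not verify that install.sh completed successfully.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not an Ephyra class) that checks the ASSERT variable. */
class AssertPreflight {

    /**
     * Returns null if the given value looks usable as an ASSERT directory,
     * otherwise a short description of the problem.
     */
    static String problemWith(String assertHome) {
        if (assertHome == null || assertHome.trim().isEmpty()) {
            return "the ASSERT environment variable is not set";
        }
        Path dir;
        try {
            dir = Paths.get(assertHome);
        } catch (InvalidPathException e) {
            return "ASSERT is not a valid path: " + assertHome;
        }
        if (!Files.isDirectory(dir)) {
            return "ASSERT does not point to an existing directory: " + assertHome;
        }
        return null;
    }

    public static void main(String[] args) {
        // Read the variable set in step 2 of the tutorial.
        String problem = problemWith(System.getenv("ASSERT"));
        if (problem == null) {
            System.out.println("ASSERT directory found.");
        } else {
            System.out.println("Semantic extractor will likely be inactive: " + problem);
        }
    }
}
```

Running this class in the same shell that launches Ephyra shows whether the variable is visible to the Java process, which is a common cause of the components being silently skipped.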

Comments about this tutorial? Please email Nico Schlaefer.