Overview of the Ephyra Architecture
Ephyra is a modular and extensible framework for question answering. This document gives an overview of the pipeline layout and the main data structures used in this pipeline.
Overall Pipeline
The system is organized as a pipeline of standardized components for question analysis, query generation, search, and answer extraction and selection. These components can be combined and arranged arbitrarily, which facilitates experimenting with different setups and finding the most effective configuration. Furthermore, multiple approaches and knowledge sources can be combined in one system, and components can be shared among different approaches. An overview of a typical pipeline setup is shown below.
Note that, unlike most other QA systems, Ephyra combines the answer extraction and selection stages. Both stages perform similar operations on the same data structures, and interleaving answer extraction and selection components may be beneficial.
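To make the stage layout concrete, here is a minimal sketch of such a pipeline. It is not Ephyra's actual API: the function run_pipeline, the toy components, and the CORPUS list are all hypothetical names chosen for illustration. The point is only that each stage is a swappable component, and that extraction and selection components operate on the same list of results in whatever order they are arranged.

```python
def run_pipeline(question, analyze, generate_queries, search, answer_steps):
    """Run question analysis, query generation, search, then the answer steps in order."""
    analyzed = analyze(question)
    queries = generate_queries(analyzed)
    results = []
    for query in queries:
        results.extend(search(query))
    # Extraction and selection components share one result list,
    # so they can be interleaved freely.
    for step in answer_steps:
        results = step(results)
    return results


# Toy stand-ins for real components (all hypothetical, not part of Ephyra).
CORPUS = [
    "Pittsburgh is located in Pennsylvania.",
    "Paris is the capital of France.",
]

def analyze(question):
    # Question analysis: keep lowercased words longer than three characters.
    return [w.lower() for w in question.rstrip("?").split() if len(w) > 3]

def generate_queries(keywords):
    # Query generation: a single bag-of-words query.
    return [" ".join(keywords)]

def search(query):
    # Search: return documents containing at least one query term.
    terms = query.split()
    return [doc for doc in CORPUS if any(t in doc.lower() for t in terms)]

def extract_sentences(results):
    # Extraction: turn documents into sentence-level answer candidates.
    return [s.strip() for doc in results for s in doc.split(".") if s.strip()]

def keep_shortest(results):
    # Selection: keep only the shortest candidate.
    return sorted(results, key=len)[:1]

answers = run_pipeline(
    "What is the capital of France?",
    analyze, generate_queries, search,
    [extract_sentences, keep_shortest],
)
print(answers)
```

Swapping in a different search component or reordering the entries in answer_steps changes the configuration without touching run_pipeline, which is the kind of experimentation the pipeline design is meant to support.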
Main Data Structures
The following data structures are used to pass information between the pipeline stages.
AnalyzedQuestion
An AnalyzedQuestion represents a syntactic and semantic analysis of a question. It serves as an interface between the question analysis and query generation stages.
Query
A Query is a search engine query generated at the query generation stage and executed at the search stage.
Result
A Result is a document retrieved at the search stage or an answer candidate in the answer extraction and answer selection stages.
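The relationship between the three structures can be sketched as follows. The class names mirror the ones described above, but every field (question, keywords, query_string, analyzed_question, answer, query, score) is an assumption made for illustration and should not be read as Ephyra's actual class layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnalyzedQuestion:
    """Output of question analysis, input to query generation."""
    question: str
    keywords: List[str] = field(default_factory=list)  # assumed field


@dataclass
class Query:
    """A search engine query, created at query generation and run at search."""
    query_string: str
    analyzed_question: Optional[AnalyzedQuestion] = None  # assumed back-reference


@dataclass
class Result:
    """Either a retrieved document or an answer candidate."""
    answer: str
    query: Optional[Query] = None  # assumed back-reference
    score: float = 0.0             # assumed confidence score


aq = AnalyzedQuestion("Who wrote Hamlet?", ["wrote", "Hamlet"])
q = Query(" ".join(aq.keywords), aq)
doc = Result("Hamlet is a tragedy written by William Shakespeare.", q)
candidate = Result("William Shakespeare", q, score=0.9)
```

Using a single Result type for both documents and answer candidates is what lets extraction and selection components be chained in any order: each one takes a list of results and returns a list of results.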