Grid of Points

JASIST 2017

Towards an Anatomy of Search Engine Component Performances

Download the paper Editor's paper version

In this paper, we start to overcome the limitations of the current experimental methodology and we provide the means to estimate the effects of the different components of an IR system. In particular, we develop a methodology, based on General Linear Mixed Model (GLMM) and ANalysis Of VAriance (ANOVA), which makes use of a Grid of Points (GoP) containing all the possible combinations of inspected components.

We create extensive GoPs covering 6 different stop lists, 6 types of stemmers, 8 flavors of n-grams, and 17 distinct IR models, which together represent nearly all the state-of-the-art components that form the common denominator of almost any IR system for English retrieval. The proposed methodology then allows us to break down system performance into the contributions of these stop lists, stemmers or n-grams, and IR models, as well as to study their interactions.

The main contributions of this work are:

the methodology for breaking down component effects and analysing the GoPs across multiple evaluation measures;
the GoPs themselves, a valuable resource that can also be exploited for other kinds of analyses and is available to the research community. They also provide a very extensive sample of the most commonly used components for English retrieval;
the application of the proposed methodology to two different search tasks, namely news search and Web search, and a thorough analysis of both in order to derive insights on English retrieval.

We considered three main components of an IR system: stop list, Lexical Unit Generator (LUG) and IR model. We selected a set of alternative implementations of each component and, using the open source Terrier system, created a run for each system obtained by combining the available components in all possible ways. The components we selected are:

stop list: nostop, indri, lucene, smart, terrier;
LUG: nolug, weak Porter, Porter, Krovetz, Lovins, 4grams, 5grams, 6grams, 7grams, 8grams, 9grams, 10grams;
model: BB2, BM25, DFIZ, DFRee, DirichletLM, DLH, DPH, HiemstraLM, IFB2, InB2, InL2, InexpB2, Js_KLs, LemurTFIDF, LGD, PL2, TFIDF.
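
To make "combining the available components in all possible ways" concrete, the following is a minimal sketch that enumerates the grid for the components listed above. It only builds the list of (stop list, LUG, model) labels; it does not invoke Terrier, and the variable names are assumptions made for this illustration.

```python
from itertools import product

# Component labels as listed on this page (used here as plain strings).
stop_lists = ["nostop", "indri", "lucene", "smart", "terrier"]
lugs = [
    "nolug", "weak Porter", "Porter", "Krovetz", "Lovins",
    "4grams", "5grams", "6grams", "7grams", "8grams", "9grams", "10grams",
]
models = [
    "BB2", "BM25", "DFIZ", "DFRee", "DirichletLM", "DLH", "DPH",
    "HiemstraLM", "IFB2", "InB2", "InL2", "InexpB2", "Js_KLs",
    "LemurTFIDF", "LGD", "PL2", "TFIDF",
]

# Every point of the grid is one system configuration, i.e. one run.
grid = list(product(stop_lists, lugs, models))

print(len(grid))   # number of system configurations in this grid
print(grid[0])     # first configuration in enumeration order
```

Because the design is fully crossed, every level of one component appears together with every level of the others, which is what allows component effects and their interactions to be estimated separately.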

Note that some stemmers and the n-grams are not natively implemented by Terrier 4.1. It is possible to download the extensions to Terrier 4.1 here: Get Terrier extensions.

Get the data

Get the code for replicating the experiments in the paper

SIGIR 2016

A General Linear Mixed Models Approach to Study System Component Effects

Download the paper Download the presentation slides

We face the problem of studying system variance in order to better understand how much system components contribute to overall performances. We propose a methodology based on General Linear Mixed Model (GLMM) to develop statistical models able to isolate system variance, component effects as well as their interaction.
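
As a rough illustration of what such a model can look like, a crossed-effects formulation over topics and the three component types might be written as below. The symbols are assumptions chosen for this sketch and are not quoted from the paper: Y is the score of the system built from stop list j, LUG k and IR model l on topic i; mu is the grand mean; tau, alpha, beta and gamma are the topic, stop list, LUG and IR model effects; the parenthesised terms are interactions; epsilon is the error.

```latex
Y_{ijkl} = \mu + \tau_i + \alpha_j + \beta_k + \gamma_l
         + (\alpha\beta)_{jk} + (\alpha\gamma)_{jl} + (\beta\gamma)_{kl}
         + (\alpha\beta\gamma)_{jkl} + \varepsilon_{ijkl}
```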

We apply the proposed methodology to the analysis of TREC Ad-hoc data in order to show how it works and discuss some interesting outcomes of this new kind of analysis. Finally, we extend the analysis to different evaluation measures, showing how they impact on the sources of variance.
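
The idea of attributing variance to its sources can be sketched with a plain sum-of-squares decomposition. The example below is a simplification, not the analysis from the paper: it assumes a balanced, fully crossed design, considers main effects only, and uses small hand-made scores (all effect values are hypothetical and chosen only so the arithmetic is easy to follow).

```python
from itertools import product
from statistics import mean

topics = ["t1", "t2", "t3"]
stop_lists = ["nostop", "smart"]
models = ["BM25", "TFIDF"]

# Hypothetical additive effects; these are NOT results from the paper.
topic_eff = {"t1": -0.10, "t2": 0.00, "t3": 0.10}
stop_eff = {"nostop": -0.02, "smart": 0.02}
model_eff = {"BM25": 0.03, "TFIDF": -0.03}

# One toy score per (topic, stop list, model) cell of the grid.
scores = {
    (t, s, m): 0.30 + topic_eff[t] + stop_eff[s] + model_eff[m]
    for t, s, m in product(topics, stop_lists, models)
}

grand_mean = mean(scores.values())


def main_effect_ss(position, levels):
    """Sum of squares of the factor stored at `position` in the score keys."""
    ss = 0.0
    for level in levels:
        cell = [v for key, v in scores.items() if key[position] == level]
        ss += len(cell) * (mean(cell) - grand_mean) ** 2
    return ss


ss_total = sum((v - grand_mean) ** 2 for v in scores.values())
ss = {
    "topic": main_effect_ss(0, topics),
    "stop list": main_effect_ss(1, stop_lists),
    "model": main_effect_ss(2, models),
}
# Whatever the main effects do not explain (interactions and error).
ss_residual = ss_total - sum(ss.values())

for name, value in ss.items():
    print(f"{name}: {value / ss_total:.1%} of total variation")
```

Since the toy scores are purely additive, the residual is zero up to rounding; with real runs, the residual would absorb interaction and error terms, which a full model would estimate explicitly.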

We selected a set of alternative implementations of each component and, using the open source Terrier system, created a run for each system obtained by combining the available components in all possible ways. The components we selected are:

stop list: nostop, indri, lucene, smart, terrier;
LUG: nolug, weak Porter, Porter, Krovetz, Lovins, 4grams, 5grams;
model: BB2, BM25, DFRBM25, DFRee, DLH, DLH13, DPH, HiemstraLM, IFB2, InL2, InexpB2, InexpC2, LGD, LemurTFIDF, PL2, TFIDF.

Note that some stemmers and the n-grams are not natively implemented by Terrier 4.1. It is possible to download the extensions to Terrier 4.1 here: Get Terrier extensions.

Get the data

Get the code for replicating the experiments in the paper

CLEF 2016

The CLEF Monolingual Grid of Points

Download the paper Download the presentation slides

In this paper we run a systematic series of experiments to create a grid of points in which many combinations of the retrieval methods and components adopted by MultiLingual Information Access (MLIA) systems are represented. This grid of points aims to provide insights into the effectiveness of the different components and their interaction, and to identify suitable baselines against which all comparisons can be made.

Full reference: Nicola Ferro and Gianmaria Silvello (2016). The CLEF Monolingual Grid of Points. In Fuhr, N., Quaresma, P., Gonçalves, T., Larsen, B., Balog, K., Macdonald, C., Cappellato, L., and Ferro, N., editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Seventh International Conference of the CLEF Association (CLEF 2016), pages 13-24. Lecture Notes in Computer Science (LNCS) 9822, Springer, Heidelberg, Germany.

Get the data

Get the code for replicating the experiments in the paper

Grid of Points

Grid of points for component-based evaluation in information retrieval

Grid of Points (GoP):
A deeper look into the components of IR systems and their interactions
