The FedWeb 2012 dataset was created to support research on federated search for the web.
If you are interested in this dataset, you might also want to have a look at the TREC FedWeb track, organized in 2013.

Description

The dataset contains the following:

Sample files

Obtaining the dataset

Fill out and sign the license agreement, scan it, and e-mail it to the address below. Once your application has been approved, you will receive details on how to obtain the dataset as soon as possible.

Contact

Djoerd Hiemstra
Database Group
University of Twente
The Netherlands
Phone: +31 53 4892335 / +31 53 4893690

References

Please cite the following paper when using the dataset:

Dong Nguyen, Thomas Demeester, Dolf Trieschnigg, and Djoerd Hiemstra. "Federated Search in the Wild: The Combined Power of over a Hundred Search Engines". In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM), 2012.

Papers using the dataset

Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, and Djoerd Hiemstra. "What Snippets Say About Pages in Federated Web Search". In Proceedings of the 8th Asia Information Retrieval Societies Conference (AIRS), 2012.

Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, Chris Develder, and Djoerd Hiemstra. "Snippet-based Relevance Predictions for Federated Web Search". In Proceedings of the 35th European Conference on Information Retrieval (ECIR), 2013.