The FedWeb 2012 dataset has been created to support research on federated search for the web.
If you are interested in this dataset, you might also want to have a look at the TREC FedWeb track, organized in 2013.
The dataset contains the following:
- 108 resources (web search engines)
- Resource samplings (pages/snippets, random/zipf/top)
- Top 10 results of each resource and corresponding relevance judgements for the TREC web track 2010 queries
Obtaining the dataset
Fill out and sign the license agreement, scan it, and e-mail it to the address below. Once your application is approved, you will receive details on how to obtain the dataset as soon as possible.
Djoerd Hiemstra ()
University of Twente
Phone: +31 53 4892335 / +31 53 4893690
References
Please cite the following paper when using the dataset:
Dong Nguyen, Thomas Demeester, Dolf Trieschnigg, and Djoerd Hiemstra. "Federated Search in the Wild: The Combined Power of over a Hundred Search Engines". In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM), 2012 [pdf]
Papers using the dataset
Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, and Djoerd Hiemstra. "What Snippets Say About Pages in Federated Web Search". In Proceedings of the 8th Asia Information Retrieval Societies Conference (AIRS), 2012 [pdf]
Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, Chris Develder, and Djoerd Hiemstra. "Snippet-based Relevance Predictions for Federated Web Search". In Proceedings of the 35th European Conference on Information Retrieval (ECIR), 2013 [pdf]