Since the focus of this workshop is on systems, we invite participants to build new systems or to adapt their existing systems to demonstrate various aspects of collaboration. We leave the choice of collections to the participants, but offer the following suggestions of promising collections for exploring collaborative search:
- The ICWSM blog dataset, which covers January-February 2011. For more information, see the ICWSM web site.
- The TREC Blog dataset, hosted by the University of Glasgow. See the University of Glasgow site for more information.
- Other TREC datasets.
- For those interested, we may also offer access to the CiteSeer dataset. This data, made available by CiteSeer under the CC NC SA license, is about 40GB in size and includes text and metadata for academic papers from a variety of sources. The data can be obtained with permission from CiteSeer, or from us. Please contact Gene to request access to the FTP site for the dataset.
- Medline databases, including abstracts, are available from the NLM.
These datasets offer sufficiently rich content to support a variety of recall-oriented collaborative search tasks. We expect participants to demonstrate how their systems can help teams find information, and we expect workshop attendees to try one another’s tools. This kind of hands-on experience should improve our collective understanding of the possibilities for improving collaboration in information seeking.