TREC Conversational Assistance Track (CAsT)

There are currently few datasets appropriate for training and evaluating models for Conversational Information Seeking (CIS). The main aim of TREC CAsT is to advance research on conversational search systems. The goal of the track is to create a reusable benchmark for open-domain, information-centric conversational dialogues.

The track will run in 2019 and establish a concrete and standard collection of data with information needs to make systems directly comparable.

This is the first year of TREC CAsT, which will run as a track in TREC. This year we aim to focus on candidate information ranking in context:

Read the dialogue context: Track the evolution of the information need in the conversation, identifying salient information needed for the current turn in the conversation
Retrieve Candidate Response Information: Perform retrieval over a large collection of paragraphs (or knowledge base content) to identify relevant information

NEW: Year 1 Task Guidelines

Year 1 task guidelines
Comments and feedback are welcome.

Data

Topics

Training topics year 1 V1.0 - 30 example training topics
Coming soon: Partial judgment data for a subset of training topics
Additional resources: MS MARCO Conversational Search Sessions. The conversational search data and training data have been released.

Collection

The corpus is a combination of three standard TREC collections: MS MARCO Passage Ranking, Wikipedia (TREC CAR), and News (Washington Post):
The MS MARCO Passage Ranking collection
The TREC CAR paragraph collection v2.0
The TREC Washington Post Corpus. Note: this requires an organizational agreement.

Document ID format

The document ID has the form [collection_id_paragraph_id], with the collection ID and the paragraph ID separated by an underscore.
The collection IDs are in the set: {MARCO, CAR, WAPO}.
For MARCO and CAR, the paragraph IDs are the standard IDs provided by those collections. For WAPO, the paragraph ID is [article_id-paragraph_index], with the two parts separated by a single dash; paragraph_index is the 0-based position of the paragraph in the provided paragraph markup.
Example WaPo combined document id: [WAPO_903cc1eab726b829294d1abdd755d5ab-1], or CAR: [CAR_6869dee46ab12f0f7060874f7fc7b1c57d53144a]
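
The ID format above can be sketched in a few lines of Python. This is only an illustration, not part of the official TREC-CAsT tools: the function names are hypothetical, and it assumes the square brackets in the examples are notation rather than part of the ID. WAPO paragraph IDs are split on the last dash, on the assumption that an article ID may itself contain dashes.

```python
from typing import Tuple

# Collection IDs listed in the document ID format description.
COLLECTION_IDS = {"MARCO", "CAR", "WAPO"}


def make_wapo_paragraph_id(article_id: str, paragraph_index: int) -> str:
    """Build a WAPO paragraph ID: article ID, a single dash, paragraph index."""
    return f"{article_id}-{paragraph_index}"


def make_doc_id(collection_id: str, paragraph_id: str) -> str:
    """Join a collection ID and a paragraph ID with an underscore."""
    if collection_id not in COLLECTION_IDS:
        raise ValueError(f"unknown collection id: {collection_id!r}")
    return f"{collection_id}_{paragraph_id}"


def parse_doc_id(doc_id: str) -> Tuple[str, str]:
    """Split a combined document ID at the first underscore."""
    collection_id, sep, paragraph_id = doc_id.partition("_")
    if not sep or not paragraph_id or collection_id not in COLLECTION_IDS:
        raise ValueError(f"malformed document id: {doc_id!r}")
    return collection_id, paragraph_id


def split_wapo_paragraph_id(paragraph_id: str) -> Tuple[str, int]:
    """Split a WAPO paragraph ID at the last dash into (article_id, index)."""
    article_id, sep, index = paragraph_id.rpartition("-")
    if not sep or not article_id or not index.isdigit():
        raise ValueError(f"malformed WAPO paragraph id: {paragraph_id!r}")
    return article_id, int(index)
```

For example, `make_doc_id("WAPO", make_wapo_paragraph_id("903cc1eab726b829294d1abdd755d5ab", 1))` reproduces the WaPo example ID shown above, and `parse_doc_id` recovers the collection and paragraph parts from either example.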

Code and tools

TREC-CAsT Tools repository with code and scripts for processing data.
Note: This will evolve over time; it currently contains topic definition files.

Year 1 Planning slides

Year 1 planning information
Comments and feedback are welcome.

Information Needs

~50-100 topics with manually defined trajectories
Start from initial general topic
Conversation evolves across ‘realistic’ facets for ~10 turns
Manually created topics from crowdsourcing

News

May 23: Training data released
April 18th: Guidelines released
November 13: Announcement that the track will run next year.
March 19: Sample topic data for conversational and MARCO sessions available
May 1st: Track guidelines are released

Contact

Twitter: @treccast
Slack: treccast.slack.com
Google Groups: [email protected]

Important Dates

Training data release: May 23rd
Test topic release: June 12th
Run submission: August 16th

Evaluation

Forthcoming

Organizers

Jeff Dalton, University of Glasgow
Chenyan Xiong, Microsoft Research
Jamie Callan, Carnegie Mellon University

Advisory Committee

Laura Dietz, University of New Hampshire
Jimmy Lin, University of Waterloo
Julia Kiseleva, Microsoft Research
Vanessa Murdock, Amazon Research
Paul Bennett, Microsoft Research
Zhiting Hu, CMU
Anton Leuski, USC
