jamiecallan/treccastweb

TREC Conversational Assistance Track (CAsT)

There are currently few datasets appropriate for training and evaluating models for Conversational Information Seeking (CIS). The main aim of TREC CAsT is to advance research on conversational search systems. The goal of the track is to create a reusable benchmark for open-domain, information-centric conversational dialogues.

The track will run in 2019 and establish a concrete and standard collection of data with information needs to make systems directly comparable.

This is the first year of TREC CAsT, which will run as a track in TREC. This year we aim to focus on candidate information ranking in context:

  • Read the dialogue context: Track the evolution of the information need in the conversation, identifying salient information needed for the current turn in the conversation
  • Retrieve Candidate Response Information: Perform retrieval over a large collection of paragraphs (or knowledge base content) to identify relevant information

NEW: Year 1 Task Guidelines

Data

Topics

Collection

Document ID format

  • The document id is [collection_id_paragraph_id], with the collection id and paragraph id separated by an underscore.
  • The collection ids are in the set: {MARCO, CAR, WAPO}.
  • For MARCO and CAR, the paragraph ids are the standard ids provided by those collections. For WAPO, the paragraph id is [article_id-paragraph_index], with the two parts separated by a single dash, where paragraph_index is the 0-based position of the paragraph in the provided paragraph markup.
  • Example WaPo combined document id: [WAPO_903cc1eab726b829294d1abdd755d5ab-1], or CAR: [CAR_6869dee46ab12f0f7060874f7fc7b1c57d53144a]
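
The id format above can be sketched in code. The following is a minimal Python illustration, not part of the official TREC-CAsT tools: the names `DocId`, `parse_doc_id`, and `make_wapo_id` are hypothetical. It assumes the collection id never contains an underscore, and it splits WAPO paragraph ids on the last dash so that an article id which itself contains dashes would still parse.

```python
from typing import NamedTuple, Optional

# Collection ids listed in the README.
COLLECTIONS = {"MARCO", "CAR", "WAPO"}


class DocId(NamedTuple):
    """Hypothetical container for a parsed combined document id."""
    collection: str
    paragraph_id: str
    article_id: Optional[str] = None       # only set for WAPO
    paragraph_index: Optional[int] = None  # only set for WAPO (0-based)


def parse_doc_id(doc_id: str) -> DocId:
    """Split a combined id of the form collection_id + '_' + paragraph_id."""
    # Split on the first underscore: everything before it is the collection id.
    collection, sep, paragraph_id = doc_id.partition("_")
    if not sep or collection not in COLLECTIONS or not paragraph_id:
        raise ValueError(f"not a valid combined document id: {doc_id!r}")
    if collection == "WAPO":
        # WAPO paragraph ids are article_id + '-' + paragraph_index.
        # Split on the last dash in case the article id contains dashes.
        article_id, dash, index = paragraph_id.rpartition("-")
        if not dash or not article_id or not (index.isascii() and index.isdigit()):
            raise ValueError(f"not a valid WAPO paragraph id: {paragraph_id!r}")
        return DocId(collection, paragraph_id, article_id, int(index))
    # MARCO and CAR paragraph ids are used as provided by the collection.
    return DocId(collection, paragraph_id)


def make_wapo_id(article_id: str, paragraph_index: int) -> str:
    """Build a combined WAPO id from an article id and a 0-based paragraph index."""
    if paragraph_index < 0:
        raise ValueError("paragraph_index must be 0-based and non-negative")
    return f"WAPO_{article_id}-{paragraph_index}"


if __name__ == "__main__":
    print(parse_doc_id("WAPO_903cc1eab726b829294d1abdd755d5ab-1"))
    print(parse_doc_id("CAR_6869dee46ab12f0f7060874f7fc7b1c57d53144a"))
```

Validating the collection prefix when parsing a run file is one way to catch ids that were written without the collection prefix before submission.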

Code and tools

  • TREC-CAsT Tools repository with code and scripts for processing data.
  • Note: This will evolve over time; it currently contains topic definition files.

Year 1 Planning slides

Information Needs

  • ~50-100 topics with manually defined trajectories
  • Start from initial general topic
  • Conversation evolves across ‘realistic’ facets for ~10 turns
  • Manually created topics from crowdsourcing

News

  • May 23: Training data released
  • April 18th: Guidelines released
  • November 13: Announcement that the track will run next year.
  • March 19: Sample topic data for conversational and MARCO sessions available
  • May 1st: Track guidelines are released

Contact

Important Dates

  • Training data release: May 23rd
  • Test topic release: June 12th
  • Run submission: August 16th

Evaluation

Forthcoming

Organizers

Advisory Committee

  • Laura Dietz, University of New Hampshire
  • Jimmy Lin, University of Waterloo
  • Julia Kiseleva, Microsoft Research
  • Vanessa Murdock, Amazon Research
  • Paul Bennett, Microsoft Research
  • Zhiting Hu, CMU
  • Anton Leuski, USC
