Stanford Question Answering Dataset (SQuAD)

Submitted on Dec 11 2019

By: Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang

Project: https://rajpurkar.github.io/SQuAD-explor...

From: Raj Purkar

Paper: https://arxiv.org/abs/1606.05250


Resource Type: Data

License:

Language:

Data Format:

Description

The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. Each question is either answered by a segment of text (a span) from the corresponding reading passage or is unanswerable. SQuAD1.1, the previous version of the dataset, contains 100,000+ question-answer pairs on 500+ articles. SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD2.0 is a challenging natural language understanding task for existing models, and we release SQuAD2.0 to the community as the successor to SQuAD1.1.
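
To make the span-or-abstain structure concrete, here is a minimal Python sketch of reading SQuAD2.0-style records. It assumes the nested JSON layout of the publicly released files (`data`, `paragraphs`, `context`, `qas`, `answers`, `answer_start`, `is_impossible`). The sample record is hand-written for illustration and is not taken from the dataset, and `iter_examples` is a hypothetical helper name.

```python
# Hand-written sample in the assumed SQuAD2.0 layout (not real dataset rows).
CONTEXT = "SQuAD was released by Stanford in 2016."

SAMPLE = {
    "version": "v2.0",
    "data": [{
        "title": "Illustrative_Article",
        "paragraphs": [{
            "context": CONTEXT,
            "qas": [
                {
                    "id": "q1",
                    "question": "Who released SQuAD?",
                    "is_impossible": False,
                    # answer_start is a character offset into the context.
                    "answers": [{"text": "Stanford",
                                 "answer_start": CONTEXT.index("Stanford")}],
                },
                {
                    "id": "q2",
                    "question": "Who funded SQuAD?",
                    # Unanswerable: the passage does not support any answer.
                    "is_impossible": True,
                    "answers": [],
                },
            ],
        }],
    }],
}


def iter_examples(dataset):
    """Yield (question, answer) pairs; answer is None when the system should abstain."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    yield qa["question"], None
                    continue
                answer = qa["answers"][0]
                start = answer["answer_start"]
                end = start + len(answer["text"])
                # The answer is recovered as a slice (span) of the passage.
                yield qa["question"], context[start:end]


examples = list(iter_examples(SAMPLE))
for question, answer in examples:
    print(question, "->", answer if answer is not None else "<no answer>")
```

Answerable questions resolve to a slice of the passage, while unanswerable ones map to `None`, which is the case a SQuAD2.0 system is expected to detect and abstain on.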

Categorized in: Machine Learning | Natural Language | Question Answering
