Update files from the datasets library (from 1.7.0)

Release notes: https://github.com/huggingface/datasets/releases/tag/1.7.0
system 2022-01-25 16:43:48 +01:00
parent 2e1341868c
commit 7b6ddf6458

@@ -1,4 +1,5 @@
 ---
+paperswithcode_id: imdb-movie-reviews
 ---
 # Dataset Card for "imdb"
@@ -6,12 +7,12 @@
 ## Table of Contents
 - [Dataset Description](#dataset-description)
 - [Dataset Summary](#dataset-summary)
-- [Supported Tasks](#supported-tasks)
+- [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
 - [Languages](#languages)
 - [Dataset Structure](#dataset-structure)
 - [Data Instances](#data-instances)
 - [Data Fields](#data-fields)
-- [Data Splits Sample Size](#data-splits-sample-size)
+- [Data Splits](#data-splits)
 - [Dataset Creation](#dataset-creation)
 - [Curation Rationale](#curation-rationale)
 - [Source Data](#source-data)
@@ -42,7 +43,7 @@
 Large Movie Review Dataset.
 This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
-### Supported Tasks
+### Supported Tasks and Leaderboards
 [More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
@@ -78,7 +79,7 @@ The data fields are the same among all splits.
 - `text`: a `string` feature.
 - `label`: a classification label, with possible values including `neg` (0), `pos` (1).
-### Data Splits Sample Size
+### Data Splits
 | name |train|unsupervised|test |
 |----------|----:|-----------:|----:|
@@ -92,10 +93,22 @@ The data fields are the same among all splits.
 ### Source Data
+#### Initial Data Collection and Normalization
+[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
+#### Who are the source language producers?
 [More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
 ### Annotations
+#### Annotation process
+[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
+#### Who are the annotators?
 [More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
 ### Personal and Sensitive Information