ValueError: Column name input_col not in the dataset. Current columns in the dataset: [] #291
Thanks for your bug report!
This means your dataset is an empty dataset, right? 🤔 Could you check if your two datasets,
@zhaochenyang20 Here are examples with input questions and context passages, along with their expected outputs:
input="Question: What city did Super Bowl 50 take place in? Context: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the "golden anniversary" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as "Super Bowl L"), so that the logo could prominently feature the Arabic numerals 50."
input="Question: What river runs through Warsaw? Context: Warsaw (Polish: Warszawa [varˈʂava] ( listen); see also other names) is the capital and largest city of Poland. It stands on the Vistula River in east-central Poland, roughly 260 kilometres (160 mi) from the Baltic Sea and 300 kilometres (190 mi) from the Carpathian Mountains. Its population is estimated at 1.740 million residents within a greater metropolitan area of 2.666 million residents, which makes Warsaw the 9th most-populous capital city in the European Union. The city limits cover 516.9 square kilometres (199.6 sq mi), while the metropolitan area covers 6,100.43 square kilometres (2,355.39 sq mi)."
input="Question: The Ottoman empire controlled territory on three continents, Africa, Asia and which other? Context: The Ottoman Empire was an imperial state that lasted from 1299 to 1923. During the 16th and 17th centuries, in particular at the height of its power under the reign of Suleiman the Magnificent, the Ottoman Empire was a powerful multinational, multilingual empire controlling much of Southeast Europe, Western Asia, the Caucasus, North Africa, and the Horn of Africa. At the beginning of the 17th century the empire contained 32 provinces and numerous vassal states. Some of these were later absorbed into the empire, while others were granted various types of autonomy during the course of centuries."
Then it shows:
However, it finally reports an error:
After that, it reports an error like this.
Oops. Let me see!
BTW, we are developing our Colab demo. Maybe it would be easier to use.
Hi, I don't think this is fixed. I just ran the
Which dataset?
This line: Line 321 in e11144e
|
I am pretty familiar with Have you checked it? |
When I run python cli_demo.py, it reports errors:
Generating examples: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 26273.52it/s]
The generated dataset is ready.
The model has not been trained.
Processing datasets.
Traceback (most recent call last):
  File "/home/bufang/prompt2model/cli_demo.py", line 435, in <module>
    main()
  File "/home/bufang/prompt2model/cli_demo.py", line 321, in main
    t5_modified_dataset_dicts = t5_processor.process_dataset_dict(
  File "/home/bufang/prompt2model/prompt2model/dataset_processor/base.py", line 100, in process_dataset_dict
    dataset_dict[dataset_split]
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 563, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 528, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2901, in map
    return self.remove_columns(remove_columns)
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 563, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 528, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/fingerprint.py", line 511, in wrapper
    out = func(dataset, *args, **kwargs)
  File "/home/bufang/yes/envs/pt2model/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2064, in remove_columns
    raise ValueError(
ValueError: Column name input_col not in the dataset. Current columns in the dataset: []
How to fix this error? Thanks
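The traceback goes from `map` straight into `remove_columns`, and the reported column list is `[]`, which suggests that at least one split reaching `process_dataset_dict` has no rows and no columns. A minimal sketch of a guard that could run before processing is shown below. The names `validate_splits` and `EmptySplitError` are hypothetical and not part of prompt2model or `datasets`; the helper takes plain mappings so it runs without either library.

```python
from typing import Iterable, Mapping, Sequence


class EmptySplitError(ValueError):
    """Hypothetical error: a split has no rows or lacks a required column."""


def validate_splits(
    columns_by_split: Mapping[str, Sequence[str]],
    rows_by_split: Mapping[str, int],
    required: Iterable[str] = ("input_col",),
) -> None:
    """Raise a descriptive error before any column is renamed or removed."""
    required = tuple(required)
    for split, columns in columns_by_split.items():
        n_rows = rows_by_split.get(split, 0)
        if n_rows == 0:
            raise EmptySplitError(
                f"Split {split!r} has no rows; columns: {list(columns)}"
            )
        missing = [name for name in required if name not in columns]
        if missing:
            raise EmptySplitError(
                f"Split {split!r} is missing columns {missing}; "
                f"columns: {list(columns)}"
            )


if __name__ == "__main__":
    # A healthy split passes silently.
    validate_splits({"train": ["input_col", "output_col"]}, {"train": 100})
    # An empty split fails early with a message that names the split.
    try:
        validate_splits({"train": []}, {"train": 0})
    except EmptySplitError as err:
        print(err)
```

With a `datasets.DatasetDict` named `dd` (an assumed variable), the two mappings could be built from the `column_names` and `num_rows` attributes of each split, for example `{k: v.column_names for k, v in dd.items()}` and `{k: v.num_rows for k, v in dd.items()}`.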