Issues Previewing/exporting Instagram data collected with Zeeschuimer #387

leelum · 2023-09-07T10:53:56Z

Describe the bug
After importing a collection of Instagram data collected through Zeeschuimer, I have been unable to preview the data or export it to CSV.

Trying to explore the data results in an Internal Server Error
Trying to download as CSV results in a failed download.
Attempting to use the Convert NDJSON file to CSV tool results in the conversion hanging, with the following log:

Thu Sep 7 10:49:59 2023: Processing 'Convert NDJSON file to CSV' started for dataset 7bbfdf83551737468f517850d178113b
Thu Sep 7 10:49:59 2023: Processing data
Thu Sep 7 10:50:01 2023: Converting file
Thu Sep 7 10:50:01 2023: Processor crashed ('NoneType' object is not subscriptable), trying again later

To Reproduce
Collect data from Instagram using Zeeschuimer, focusing predominantly on the Instagram Reels of more than one account via Instagram.com//reels.

4CAT Environment

Own server/desktop
If accessing via your own server/desktop, what is the environment and are you using Docker?: MacOS, using Docker Desktop.

Relevant error log from Docker:

2023-09-07 11:50:01 4cat_backend | 07-09-2023 10:50:01 | ERROR at search_instagram.py:200: Processor convert-ndjson-csv raised TypeError while processing dataset 7bbfdf83551737468f517850d178113b (via 18ece519fe48b0b80c835f4d386d5def) in ndjson_to_csv.py:52->dataset.py:348->search_instagram.py:64->search_instagram.py:200:
2023-09-07 11:50:01 4cat_backend | 'NoneType' object is not subscriptable
2023-09-07 11:50:01 4cat_backend |
2023-09-07 11:50:02 4cat_backend | 07-09-2023 10:50:02 | INFO at processor.py:159: Running processor convert-ndjson-csv on dataset 59a4884ac3ca752755f85bac47c0cd4d
2023-09-07 11:50:04 4cat_backend | 07-09-2023 10:50:04 | ERROR at search_instagram.py:200: Processor convert-ndjson-csv raised TypeError while processing dataset 59a4884ac3ca752755f85bac47c0cd4d (via 18ece519fe48b0b80c835f4d386d5def) in ndjson_to_csv.py:52->dataset.py:348->search_instagram.py:64->search_instagram.py:200:
2023-09-07 11:50:04 4cat_backend | 'NoneType' object is not subscriptable
2023-09-07 11:50:04 4cat_backend |

dale-wahl · 2023-09-07T11:21:05Z

Odd, it looks like Zeeschuimer collected a post without a username, and 4CAT is failing because it expects every post to have one.

We are working on a fix that will skip bad items like that, update the log to let you know, and allow you to at least preview the data. For the moment you can download the NDJSON file to view the collected data (you could even find the invalid post). Repairing the file so that 4CAT can still run processors on it is a bit more complicated, so that will have to wait until we are able to deploy the fix mentioned above.

As to why a post has no username, that's something Zeeschuimer will need to address. I just made an issue there. I think it might help if you posted the NDJSON (if it is not too large) or, better yet, just the offending post if possible.
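For anyone who wants to track down the offending post in the downloaded NDJSON, a minimal sketch along these lines might help. It assumes each line is a JSON object that wraps the post under a `data` key and that the account sits in a `user` object with a `username` field. Those key names, the helper name `find_items_without_username`, and the file name in the usage note are assumptions for illustration, not confirmed details of the Zeeschuimer export, so adjust them to whatever the file actually contains.

```python
import json


def find_items_without_username(lines, wrapper_key="data", user_key="user"):
    """Yield (line_number, item) for every item lacking a usable username.

    The key names are assumptions about the export layout; pass different
    ones if the file is structured differently.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        item = json.loads(line)
        # Use the wrapped post if present, otherwise treat the line as the post.
        post = item.get(wrapper_key, item) if isinstance(item, dict) else None
        user = post.get(user_key) if isinstance(post, dict) else None
        # A missing or null user object is what triggers the TypeError in the log.
        if not isinstance(user, dict) or not user.get("username"):
            yield line_number, item
```

Usage would be something like `with open("export.ndjson", encoding="utf-8") as f: bad = list(find_items_without_username(f))`, where the file name is a placeholder. The line numbers it returns point at the posts worth attaching to the Zeeschuimer issue.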

leelum · 2023-09-07T12:58:23Z

Brill - thanks for looking into this!

dale-wahl · 2024-01-08T13:14:05Z

1101a0a fixes this. Now 4CAT will skip items that do not map correctly (such as this example of a post without a username) and notify the user (and update the dataset log) as well as the administrator of 4CAT.
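The skip-and-report behaviour described above can be illustrated with a small sketch. This is not the code from the commit; `map_item`, `map_all`, and the item layout are hypothetical and only show the general pattern of catching a mapping failure per item instead of letting it crash the whole processor.

```python
def map_item(item):
    """Hypothetical mapper that expects every post to carry a user object."""
    user = item["user"]  # may be None for a malformed post
    return {"id": item["id"], "author": user["username"]}


def map_all(items):
    """Map every item, collecting failures instead of aborting on the first."""
    mapped, skipped = [], []
    for index, item in enumerate(items):
        try:
            mapped.append(map_item(item))
        except (TypeError, KeyError) as error:
            # Record which item failed and why, so it can be reported
            # to the user (for example in a dataset log).
            skipped.append((index, str(error)))
    return mapped, skipped
```

With a post whose `user` is `None`, `map_item` raises the same kind of `TypeError` seen in the Docker log, and `map_all` records it in `skipped` while still returning the remaining posts.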

dale-wahl mentioned this issue Sep 7, 2023

Instagram collecting posts without username digitalmethodsinitiative/zeeschuimer#18

Closed

dale-wahl closed this as completed Jan 8, 2024
