[twitter] '"replies": "self"' keeping '"quoted": true' from downloading quotetweeted replies? #4007
Ever since I got my issues setting up gallery-dl fully sorted out and started downloading months ago, my Twitter config has contained this specific combination of settings (on the advice and with the help of mikf, see #3552):
with the …
However, in the past couple of weeks, I realized that for a while (I think only since updating to 1.25.1) this setup had been broken compared to how it worked before, presumably due to some change in gallery-dl between 1.24.4 and 1.25.1, some part of Twitter's changes to its API, or some interaction between the two. Instead, gallery-dl was now grabbing a bunch of text replies from people other than the OP (while skipping media replies, but also grabbing quoted tweets from replies). This was filling up my gallery-dl downloads with hundreds of thousands of Twitter user folders containing nothing but one or a few text tweet .txt files each.
So, I went to check the configuration page here on GitHub to see if anything had changed with the Twitter options, assuming that maybe the thing my …
However, just today I noticed a new problem arising with this changed setup: when downloading a quote tweet along with the tweet it's quoting, if the quoted tweet is a reply to some other tweet rather than a standalone tweet, gallery-dl will skip downloading it on the basis that it is a reply. This is a problem for my use case, where I really only want to exclude the replies to the input tweet (unless posted by the same person as the input tweet), while still downloading any quoted tweet whether it's a reply or not.
Is there any change I can make to the …
Replies: 1 comment 1 reply
NOTE 11/3/2024: Saw this over a year later while browsing for help with an unrelated problem and realized I never actually marked it as solved when I should have. I decided to move my solution from the question into a comment and mark it as the answer so the discussion is properly closed. Original text below:
I think I actually might have an idea what's going wrong now.
Specifically, I think it's the "postprocessors" option being set to ["content"]. I was messing around with the image-filter to try to get it working again. The most recent attempt came after I had combed through the changes to the twitter extractor's code between 1.24.4 and 1.25.1, looking for any inspiration for a change to the image-filter that would work. I noticed the bits where "self" was defined as the case where the user is the same as the author, presumably so that things like "replies": "self" work. So, I thought, maybe "self" is defined as a term that can be used generally in something like the image-filter.
I set "image-filter": "quote_id or retweet_id or self" and ran the program. It errored and failed to download the file of the input URL, which told me I was wrong about "self" being defined like that, but I noticed it downloaded the text of the tweet I input anyway.
I checked back in my config and noticed: "#": "write text content for all tweets". That line had been unchanged since I initially downloaded the file, and I had been ignoring it, thinking it was irrelevant to this problem. But given the wording of that comment, combined with the fact that the .txt of the input tweet was written while the main download failed, I think the postprocessor is working almost separately from the rest of the extractor's options, and so isn't being affected by the image-filter.
I don't know if this is a bug in the postprocessor or not, but I think I could now add a filter to the postprocessor's options to make it work right at the very least.
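To make the "self" failure above concrete, here is a minimal Python sketch. It is not gallery-dl's actual code: it only assumes that a filter is a Python expression evaluated with the tweet's metadata fields as the available names. The helper `evaluate_filter` and the sample metadata values are hypothetical and exist purely for illustration.

```python
# Illustration only: this is NOT gallery-dl's implementation. It mimics
# evaluating a filter expression where the only known names are the
# tweet's metadata fields.

def evaluate_filter(expression, metadata):
    """Evaluate a Python expression using metadata keys as variable names."""
    code = compile(expression, "<filter>", "eval")
    return bool(eval(code, {"__builtins__": {}}, dict(metadata)))

# Hypothetical metadata; the values are made up for this example.
quoted_tweet = {"tweet_id": 2, "quote_id": 1, "retweet_id": 0, "reply_id": 5}
plain_reply = {"tweet_id": 3, "quote_id": 0, "retweet_id": 0, "reply_id": 1}

# A quoted tweet passes: "or" short-circuits before it reaches "self".
quoted_ok = evaluate_filter("quote_id or retweet_id or self", quoted_tweet)

# Any other tweet reaches the undefined name "self" and raises NameError.
try:
    evaluate_filter("quote_id or retweet_id or self", plain_reply)
    self_error = None
except NameError as exc:
    self_error = str(exc)

# An expression that only uses existing fields evaluates without error.
reply_ok = evaluate_filter("quote_id or retweet_id", plain_reply)
```

Under these assumptions, a tweet that is neither a quote nor a retweet (such as the input tweet itself) is exactly the case that hits the undefined name, which would match the error seen on the input URL. Whether the same kind of expression can be attached to the "content" postprocessor is something to confirm against the gallery-dl configuration documentation.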