This repository has been archived by the owner on May 12, 2021. It is now read-only.

Twitter client does not pick up URLs that are tweeted #46

Open
rowan08 opened this issue Jul 19, 2019 · 4 comments

Comments

@rowan08
Member

rowan08 commented Jul 19, 2019

Received a message querying a book tweet that was not saved by the Altmetrics service:
• "We found a tweet about one of our books but the API doesn’t show it: https://twitter.com/OpenEditionNews/status/1140910422431797250?s=20"

Investigation showed that this tweet was not picked up because it references the book by its URL (books.openedition.org/oep/8999), whereas we only search for books by DOI. Adding the URL to the Twitter search did not help: the search still failed to find the tweet. The following combinations were tested:

DOI and all URLs:
keywords = ['"https://books.openedition.org/oep/8999"', '"https://books.openedition.org/oep/pdf/8999"', '"https://books.openedition.org/oep/epub/8999"', '"10.4000/books.oep.8999"']

Only the relevant URL:
keywords = ['"https://books.openedition.org/oep/8999"']

Only the relevant URL, without 'https://':
keywords = ['"books.openedition.org/oep/8999"']

None of these searches with the Twitter client returned anything.
Ideally, we should be able to search for tweets about a book that is mentioned by its URL, not just its DOI.

@rowan08
Member Author

rowan08 commented Jul 23, 2019

Update: the problem seems to be that Twitter converts every tweeted URL to a shortened URL (see https://help.twitter.com/en/using-twitter/how-to-tweet-a-link). The original URL therefore does not form part of the tweet text, and will not be matched when searching tweets with the URL as a keyword.

Moreover, the shortened URLs are neither predictable nor unique (i.e. there can be more than one shortened URL for a given expanded URL).
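To illustrate why matching on tweet text fails, here is a minimal sketch. It assumes the tweet payload carries the original link under `entities.urls[].expanded_url`, as Twitter's v1.1 tweet objects do. The sample payload, the `t.co` link and the helper names `normalise` and `tweet_mentions_url` are made up for illustration and are not part of our client.

```python
from urllib.parse import urlsplit


def normalise(url: str) -> str:
    """Reduce a URL to 'host/path' so scheme, query string and trailing slash are ignored."""
    parts = urlsplit(url if "://" in url else "//" + url)
    return (parts.netloc.lower() + parts.path).rstrip("/")


def tweet_mentions_url(tweet: dict, book_url: str) -> bool:
    """Return True if any expanded link in the tweet points at book_url."""
    target = normalise(book_url)
    links = tweet.get("entities", {}).get("urls", [])
    return any(normalise(link.get("expanded_url") or "") == target for link in links)


# Hypothetical payload, trimmed to the fields used here.
sample_tweet = {
    "text": "New book out now https://t.co/abc123",
    "entities": {
        "urls": [
            {
                "url": "https://t.co/abc123",
                "expanded_url": "https://books.openedition.org/oep/8999",
            }
        ]
    },
}

# The book URL is absent from the text but present in the expanded link.
in_text = "books.openedition.org" in sample_tweet["text"]
in_entities = tweet_mentions_url(sample_tweet, "books.openedition.org/oep/8999")
```

A check like this only helps once a tweet has already been fetched; it does not solve the problem of finding the tweet in the first place.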

@rowan08
Member Author

rowan08 commented Jul 23, 2019

It looks like Twitter does allow you to search on the expanded URLs, by prefixing a keyword with 'url:'. I have tested this with limited success: it does not work for every URL, but it works for at least some, so we should probably include it.

Based on: https://stackoverflow.com/questions/3584482/how-to-find-tweets-that-contain-a-url

Note: searching with the https:// prefix seems to work fine, but combining more than one URL in a single search tends to return nothing.
For example, the following works (at the time of writing):

keywords = ['url:books.openedition.org/oep/9068']  # or 'url:https://books.openedition.org/oep/9068'

but the following does not:

keywords = [
    'url:http://books.openedition.org/oep/9068',
    'url:https://books.openedition.org/oep/9068',
]
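Given that behaviour, one option is to issue one search per URL rather than combining them. The sketch below only builds the keyword lists; it assumes the client accepts a list of keywords per search, as in the snippets above. The function name `url_queries` is hypothetical, and the `/pdf/` variant is included purely as an example of a second URL for the same book.

```python
from urllib.parse import urlsplit


def url_queries(urls):
    """Yield one single-keyword list per distinct URL, using the 'url:' prefix."""
    seen = set()
    for url in urls:
        # Drop the scheme so http:// and https:// variants collapse into one search.
        parts = urlsplit(url if "://" in url else "//" + url)
        bare = (parts.netloc.lower() + parts.path).rstrip("/")
        if bare and bare not in seen:
            seen.add(bare)
            yield [f"url:{bare}"]


queries = list(url_queries([
    "http://books.openedition.org/oep/9068",
    "https://books.openedition.org/oep/9068",
    "https://books.openedition.org/oep/pdf/9068",
]))
```

Each element of `queries` would be passed to the client as its own search. Dropping the scheme relies on the observation above that the bare and https:// forms both worked; whether that holds for every URL is untested.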

@rowan08
Member Author

rowan08 commented Jul 24, 2019

Worth noting: After some testing, it looks like the keywords above act as a DB filter, i.e. we will need a separate search for DOIs and URLs.

This may require a change in search strategy, since running separate searches per book multiplies the number of requests counted against Twitter's rate limit.
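As a rough sketch of what that could look like: build one DOI search plus one search per URL for each book, then cap the batch at a request budget and defer the rest. The budget value, the `plan_searches` name and the shape of the `books` records are all assumptions for illustration; the real limit depends on the Twitter endpoint and authentication used.

```python
def plan_searches(books, max_requests):
    """Return (run_now, deferred) lists of (doi, keywords) searches within a request budget."""
    searches = []
    for book in books:
        # DOI search, quoted as in the original keyword lists.
        searches.append((book["doi"], [f'"{book["doi"]}"']))
        # One separate 'url:' search per URL, since combined searches returned nothing.
        for url in book["urls"]:
            searches.append((book["doi"], [f"url:{url}"]))
    return searches[:max_requests], searches[max_requests:]


books = [
    {
        "doi": "10.4000/books.oep.8999",
        "urls": [
            "books.openedition.org/oep/8999",
            "books.openedition.org/oep/pdf/8999",
            "books.openedition.org/oep/epub/8999",
        ],
    }
]

# Placeholder budget of 3 requests, not Twitter's actual limit.
run_now, deferred = plan_searches(books, max_requests=3)
```

With one DOI and three URLs this book needs four searches instead of one, which is the cost the rate limit makes relevant.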

@yoannspace
Contributor

@rowan08 Is it fair to say this was solved? The Twitter example is available on the API now.
