This repository has been archived by the owner on Apr 20, 2023. It is now read-only.
v2.0.0
New features
Below is a list of new features
scrape tools
json_from_soup
Returns JSON Instagram data from BeautifulSoup
flatten_dict
Returns a flattened dictionary of all leaf nodes in a tree of JSON data
- New `flatten` argument for the `json_from_*` functions, which returns a flattened dictionary
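To make "a flattened dictionary of all leaf nodes" concrete, here is a minimal sketch of the idea. The name `flatten_leaves`, its signature, and its handling of duplicate keys are assumptions for illustration only; this is not the library's `flatten_dict` implementation, which may key or merge leaves differently.

```python
from typing import Any, Dict, Optional


def flatten_leaves(
    tree: Any, parent_key: str = "", out: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Collect every leaf value of a nested dict/list structure into one flat dict.

    Hypothetical helper, not instascrape's code. Each leaf is stored under the
    nearest enclosing dict key. If a key occurs more than once, the first value
    encountered is kept (an arbitrary choice made for this sketch).
    """
    if out is None:
        out = {}
    if isinstance(tree, dict):
        for key, value in tree.items():
            flatten_leaves(value, key, out)
    elif isinstance(tree, list):
        for item in tree:
            flatten_leaves(item, parent_key, out)
    else:
        # A leaf: keep the first value seen for this key.
        out.setdefault(parent_key, tree)
    return out


# Made-up sample data shaped loosely like nested page JSON.
sample = {
    "user": {"username": "example", "id": "1", "counts": {"followers": 10}},
    "posts": [{"shortcode": "abc", "likes": 3}, {"shortcode": "def", "likes": 5}],
}
flat = flatten_leaves(sample)
```

With this sketch, `flat` holds only scalar values, one per distinct key name, and no nested containers.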
scrapers
- New `inplace` argument for the `scrape` method
  Similar to the `pandas` `inplace` parameter, except the default is `True` as opposed to `pandas`'s `False`. By default, `scrape` will modify an instance in place, setting attributes equal to the scraped data. If `False`, the current instance will remain untouched and `scrape` will instead return another instance with the scraped data. Useful for chaining methods.
- New `session` parameter for the `scrape` method
  Allows passing of a custom session object
- New `webdriver` parameter for the `scrape` method
  Uses a webdriver for scraping the data instead of a session
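The `inplace` behaviour described above can be sketched with a toy class. Everything here is hypothetical: `ToyScraper`, the `fetch` callable (standing in for a session or webdriver), and `fake_fetch` are illustration only and are not part of instascrape. Returning `None` when `inplace=True` mirrors `pandas` and is an assumption about the convention, not a statement about the library.

```python
import copy
from typing import Callable, Dict, Optional


class ToyScraper:
    """Hypothetical stand-in for a scraper class; not instascrape's code."""

    def __init__(self, url: str) -> None:
        self.url = url

    def scrape(
        self,
        fetch: Callable[[str], Dict[str, object]],
        inplace: bool = True,
    ) -> Optional["ToyScraper"]:
        # With inplace=True the data lands on this instance; otherwise it is
        # set on a shallow copy and the original is left untouched.
        target = self if inplace else copy.copy(self)
        for name, value in fetch(self.url).items():
            setattr(target, name, value)
        return None if inplace else target


def fake_fetch(url: str) -> Dict[str, object]:
    """Pretend network call that returns made-up scraped data."""
    return {"likes": 42, "source": url}


post = ToyScraper("https://example.com/p/abc/")

# inplace=False: the original stays untouched and a new instance is returned,
# which is what makes method chaining possible.
scraped = post.scrape(fake_fetch, inplace=False)
original_untouched = not hasattr(post, "likes")

# Default (inplace=True): attributes are set on the instance itself.
result = post.scrape(fake_fetch)
```

Because the non-inplace call returns an instance, it can be followed directly by further method calls on the result.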
Fixes
- Fixed `Post` scraper `KeyError` that was occurring on all scrapes
Breaking changes
Below is a list of breaking changes to the library
- Renamed `instascrape.scrapers.json_tools` to `instascrape.scrapers.scrape_tools`
- Renamed `parse_json_from_mapping` function to `parse_data_from_json`
- Removed `FlatJSONDict`, replaced with the `flatten_dict` function in `scrape_tools` that will flatten any dictionary
- `json_from_*` functions now return a list of all JSON dictionaries from the page as opposed to just the first dictionary.
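Code written against the old return shape of the `json_from_*` functions may need a small adapter. The helper below is a hypothetical sketch, not part of the library: `first_json_dict` and the `JsonDict` alias are made-up names. It accepts either the old shape (a single dictionary) or the new one (a list of dictionaries) and returns the first dictionary, which is what earlier versions returned.

```python
from typing import Any, Dict, List, Union

JsonDict = Dict[str, Any]


def first_json_dict(result: Union[JsonDict, List[JsonDict]]) -> JsonDict:
    """Return the first JSON dictionary from either return shape.

    Hypothetical compatibility helper; not provided by instascrape.
    """
    if isinstance(result, dict):
        # Old shape: a single dictionary was returned directly.
        return result
    if not result:
        raise ValueError("no JSON dictionaries were found on the page")
    # New shape: a list of every JSON dictionary found on the page.
    return result[0]


old_style = first_json_dict({"a": 1})
new_style = first_json_dict([{"a": 1}, {"b": 2}])
```

Both calls yield the same dictionary, so downstream code that expects a single dictionary keeps working while it is migrated to handle the full list.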
Non-breaking changes behind the scenes
Below is a list of everything that changed behind the scenes that has no bearing on the API
- Refactored out a lot of complexity from `instascrape.core._static_scraper._StaticHtmlScraper`'s implementation, greatly improving code readability
- Changed imports to reflect file moves
- Reimplemented to rely more on reusable functions as opposed to static methods unnecessarily bound to classes
- Changed how data is loaded into the namespace when using the `scrape` method to make room for the `inplace` argument. `inplace` defaults to `True`, so this doesn't break any existing code but instead provides a new alternative.
- Updated documentation with docstrings