This repository has been archived by the owner on Apr 20, 2023. It is now read-only.

v2.0.0

@chris-greening chris-greening released this 17 Jan 04:34
· 23 commits to master since this release
78255b5

New features

Below is a list of new features

scrape tools

  • json_from_soup

Returns JSON Instagram data from BeautifulSoup

  • flatten_dict

Returns a flattened dictionary of all leaf nodes in a tree of JSON data

  • New flatten argument for json_from_* functions, which makes them return a flattened dictionary
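
To illustrate what "a flattened dictionary of all leaf nodes" means, here is a minimal sketch of the idea in plain Python. The function name flatten_leaves and the sample data are hypothetical and are not the library's flatten_dict implementation; in particular, how the real function treats duplicate keys or list values is not stated in these notes, so the choices below are assumptions.

```python
from typing import Any, Dict, Optional


def flatten_leaves(tree: Any, out: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Collect the leaf values of nested dicts/lists into one flat dict.

    Hypothetical helper for illustration only (not instascrape's code).
    Assumptions: a later duplicate key overwrites an earlier one, and
    scalars sitting directly inside a list are skipped because they
    have no key of their own.
    """
    if out is None:
        out = {}
    if isinstance(tree, dict):
        for key, value in tree.items():
            if isinstance(value, (dict, list)):
                flatten_leaves(value, out)  # descend into nested containers
            else:
                out[key] = value  # leaf node: keep it under its own key
    elif isinstance(tree, list):
        for item in tree:
            flatten_leaves(item, out)
    return out


# Made-up nested data standing in for JSON scraped from a page
nested = {
    "user": {"id": 1, "name": "chris"},
    "media": {"count": 42, "edges": [{"shortcode": "abc"}]},
}
flat = flatten_leaves(nested)
```

With this sample input, flat holds the four leaf values under the keys id, name, count and shortcode, with no nesting left.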

scrapers

  • New inplace argument for the scrape method

Similar to the pandas inplace parameter, except that the default is True rather than pandas's False. By default, scrape modifies the instance in place, setting its attributes to the scraped data. If inplace is False, the current instance remains untouched and scrape instead returns another instance with the scraped data, which is useful for chaining methods.

  • New session parameter for the scrape method

Allows passing of a custom session object

  • New webdriver parameter for the scrape method

Uses a webdriver for scraping the data instead of a session
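
The inplace behaviour described above can be sketched with a toy class. ToyScraper, its _fetch method and the sample data are hypothetical stand-ins, not the library's scrapers. Returning None when inplace is True is also an assumption borrowed from pandas, since these notes only say what is returned when inplace is False.

```python
import copy


class ToyScraper:
    """Hypothetical stand-in for a scraper class (not instascrape's code)."""

    def __init__(self, url):
        self.url = url

    def _fetch(self):
        # Stand-in for the real network request; returns made-up data
        return {"followers": 100, "following": 50}

    def scrape(self, inplace=True):
        # Choose where the scraped data lands: this instance or a copy of it
        target = self if inplace else copy.copy(self)
        for name, value in self._fetch().items():
            setattr(target, name, value)
        # Assumption: mirror pandas and return None when modifying in place
        return None if inplace else target


# Default: the instance itself is populated
a = ToyScraper("https://example.com/profile")
a.scrape()

# inplace=False: b stays untouched and a populated instance is returned,
# so the call can be chained directly off the constructor
b = ToyScraper("https://example.com/profile")
c = b.scrape(inplace=False)
followers = ToyScraper("https://example.com/profile").scrape(inplace=False).followers
```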

Fixes

  • Fixed Post scraper KeyError that was occurring on all scrapes

Breaking changes

Below is a list of breaking changes to the library

  • Renamed instascrape.scrapers.json_tools to instascrape.scrapers.scrape_tools
  • Renamed parse_json_from_mapping function to parse_data_from_json
  • Removed FlatJSONDict, replaced with the flatten_dict function in scrape_tools that will flatten any dictionary
  • json_from_* functions now return a list of all JSON dictionaries from the page as opposed to just the first dictionary.
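
Code written against the earlier behaviour, where a json_from_* call produced a single dictionary, now has to pick an element from the returned list. One way to bridge both shapes during a migration is sketched below. first_json_dict is a hypothetical helper, not part of the library, and the sample values are made up.

```python
def first_json_dict(result):
    """Return the first dictionary whether `result` is a list or a dict.

    Hypothetical migration helper (not part of instascrape): accepts
    either the old single-dictionary shape or the new list-of-dictionaries
    shape. An empty list yields an empty dictionary.
    """
    if isinstance(result, list):
        return result[0] if result else {}
    return result


# Made-up values standing in for the two return shapes
old_style = {"shortcode": "abc"}
new_style = [{"shortcode": "abc"}, {"config": {"viewer": None}}]

first_old = first_json_dict(old_style)
first_new = first_json_dict(new_style)
```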

Non-breaking changes behind the scenes

Below is a list of everything that changed behind the scenes that has no bearing on the API

  • refactored out a lot of complexity from instascrape.core._static_scraper._StaticHtmlScraper's implementation, greatly improving code readability
  • Changed imports to reflect file moves
  • Reimplemented to rely more on reusable functions as opposed to static methods unnecessarily bound to classes
  • Changed how data is loaded into the namespace when using the scrape method to make room for the inplace argument. inplace defaults to True, so this doesn't break any existing code but instead provides a new alternative.
  • updated documentation with docstrings