add AuthorizedSession #22

Open · wants to merge 8 commits into base: main
12 changes: 12 additions & 0 deletions README.md
@@ -94,6 +94,18 @@ data_link_log = logging.getLogger("nasdaqdatalink")
data_link_log.setLevel(logging.DEBUG)
```

### Session

By default, every API request creates a new session, which has a performance impact when you make multiple requests (see #16). You can use `AuthorizedSession` to reuse a single session across requests:

```python
import nasdaqdatalink
session = nasdaqdatalink.AuthorizedSession()
data1 = session.get_table('ZACKS/FC', ticker='AAPL')
data2 = session.get_table('ZACKS/FC', ticker='MSFT')
data3 = session.get_table('ZACKS/FC', ticker='NVDA')
```
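Under the hood, `AuthorizedSession` wraps a single `requests.Session` with a retry-aware `HTTPAdapter` and your proxy settings, so the connection is configured once and reused across calls.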

### Detailed Usage

Our API can provide more than just data. It can also be used to search and provide metadata or to programmatically retrieve data. For these more advanced techniques please follow our [Detailed Method Guide](./FOR_DEVELOPERS.md).
1 change: 1 addition & 0 deletions nasdaqdatalink/__init__.py
@@ -10,6 +10,7 @@
from .model.point_in_time import PointInTime
from .model.data import Data
from .model.merged_dataset import MergedDataset
from .model.authorized_session import AuthorizedSession
from .get import get
from .bulkdownload import bulkdownload
from .export_table import export_table
23 changes: 23 additions & 0 deletions nasdaqdatalink/api_config.py
@@ -17,6 +17,18 @@ class ApiConfig:
retry_status_codes = [429] + list(range(500, 512))
verify_ssl = True

def read_key(self, filename=None):
Contributor

Is adding read_key as a method to ApiConfig necessary? I'm not sure it's ever used. If users define their own session and pass in an ApiConfig object, how should the API-key poking be reconciled? It appears this ApiConfig.read_key() only considers file-based API keys, and is never called anywhere -- am I missing something?

__init__ will call read_key() and kick off reading the user's key from Env or a file, preferring ENV vars first.

Contributor Author

This function should be called read_key_from_file. I was thinking it would be nice to allow each ApiConfig instance to read keys from a different file.

Contributor

Ah, but even so it still duplicates the namespace method, and I'd like to avoid having multiple methods that do the same thing. Ideally, the behaviour for finding a key is (sketched below):

  • check if the env var exists, and return what it is so we can set it;
  • otherwise, check if a file (user-defined or default) exists and read from it: if a value is provided, set it;
  • otherwise, we couldn't detect an API key, so we must fail.

I didn't catch any logic that attempted to call ApiConfig.read_key() or what you propose should be ApiConfig.read_key_from_file() -- so I'm curious whether that's the correct level to provide those helper methods.

ApiConfig is just being used as a dataclass anyway, and at some point in the future we should convert it to one. We can have helper methods in it, for sure, but we need to make sure it's actually the right place to be responsible for those tasks.
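A minimal sketch of that lookup order, for illustration only (`resolve_api_key` is a hypothetical helper name, not part of this PR; the environment variable name follows the `NASDAQ_DATA_LINK_API_KEY` used in the tests below):

```python
import os


def resolve_api_key(filename=None):
    # 1. Prefer the environment variable if it is set.
    env_key = os.environ.get("NASDAQ_DATA_LINK_API_KEY")
    if env_key:
        return env_key

    # 2. Otherwise fall back to a key file (user-defined or a default path).
    if filename and os.path.isfile(filename):
        with open(filename, "r") as f:
            for line in f:
                if line.strip():
                    return line.strip()

    # 3. No key could be detected, so fail.
    raise ValueError("could not find an API key in the environment or in a key file")
```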

Contributor Author (@runawaycoast), Jun 17, 2022

I was thinking of something like this:

import nasdaqdatalink as ndl
from nasdaqdatalink.api_config import ApiConfig

api_config1 = ApiConfig()
# api_config1 will have whatever is inside the ApiConfig class
ndl.get('table', api_config=api_config1)

api_config2 = ApiConfig()
api_config2.read_key_from_file('test2.txt')  # proposed instance method
ndl.get('table', api_config=api_config2)

Basically, I was trying to avoid using ApiConfig as mutable global state, but I understand it does add complexity for users by requiring them to create a context beforehand.

Contributor (@couture-ql), Jun 17, 2022

Yea, we'd need to go through some trouble to make ApiConfig immutable. I'm okay with the mutability of ApiConfig, since we should provide a consistent means of allowing users to set their api key if they need more configuration options.

ApiConfig() should, on create, do the heavy lifting of finding the API key in an env var or in a file, using defaults.
We should provide convenience methods to allow callers to override the defaults if they wish, but we should encourage people to simply use the defaults.

We can modify __init__ to no longer call read_key(), as this must be taken care of for callers: if one does not define the api_config kwarg, then we use our default object state.

We could move the helper methods into ApiConfig instead and avoid duplicating definitions. I would prefer we have consistent logic in one place, rather than maintain multiple definitions of the same logic.
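A small illustration of that fallback, in the spirit of the `get_config_from_kwargs` helper added in this PR (the function below is a simplified, hypothetical stand-in, not the PR's code):

```python
from nasdaqdatalink.api_config import ApiConfig


def pick_config(api_config=None):
    # Use the caller's configured instance if one was supplied; otherwise fall
    # back to the library-wide default state held on the ApiConfig class itself.
    return api_config if isinstance(api_config, ApiConfig) else ApiConfig
```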


Is this only invoked in the tests?

if not os.path.isfile(filename):
raise_empty_file(filename)

with open(filename, 'r') as f:
apikey = get_first_non_empty(f)

if not apikey:
raise_empty_file(filename)

self.api_key = apikey


def create_file(config_filename):
# Create the file as well as the parent dir if needed.
@@ -102,3 +114,14 @@ def read_key(filename=None):
read_key_from_environment_variable()
elif config_file_exists(filename):
read_key_from_file(filename)


def get_config_from_kwargs(kwargs):
result = ApiConfig
if isinstance(kwargs, dict):
params = kwargs.get('params')
if isinstance(params, dict):
result = params.get('api_config')
if not isinstance(result, ApiConfig):
result = ApiConfig
return result
32 changes: 24 additions & 8 deletions nasdaqdatalink/connection.py
Expand Up @@ -8,12 +8,17 @@

from .util import Util
from .version import VERSION
from .api_config import ApiConfig
from .api_config import ApiConfig, get_config_from_kwargs
from nasdaqdatalink.errors.data_link_error import (
DataLinkError, LimitExceededError, InternalServerError,
AuthenticationError, ForbiddenError, InvalidRequestError,
NotFoundError, ServiceUnavailableError)

KW_TO_REMOVE = [
'session',
'api_config'
]


class Connection:
@classmethod
@@ -22,31 +27,37 @@ def request(cls, http_verb, url, **options):
headers = options['headers']
else:
headers = {}
api_config = get_config_from_kwargs(options)

accept_value = 'application/json'
if ApiConfig.api_version:
accept_value += ", application/vnd.data.nasdaq+json;version=%s" % ApiConfig.api_version
if api_config.api_version:
accept_value += ", application/vnd.data.nasdaq+json;version=%s" % api_config.api_version

headers = Util.merge_to_dicts({'accept': accept_value,
'request-source': 'python',
'request-source-version': VERSION}, headers)
if ApiConfig.api_key:
headers = Util.merge_to_dicts({'x-api-token': ApiConfig.api_key}, headers)
if api_config.api_key:
headers = Util.merge_to_dicts({'x-api-token': api_config.api_key}, headers)

options['headers'] = headers

abs_url = '%s/%s' % (ApiConfig.api_base, url)
abs_url = '%s/%s' % (api_config.api_base, url)

return cls.execute_request(http_verb, abs_url, **options)

@classmethod
def execute_request(cls, http_verb, url, **options):
session = cls.get_session()
session = options.get('params', {}).get('session', None)
if session is None:
session = cls.get_session()

api_config = get_config_from_kwargs(options)

cls.options_kw_strip(options)
try:
response = session.request(method=http_verb,
url=url,
verify=ApiConfig.verify_ssl,
verify=api_config.verify_ssl,
**options)
if response.status_code < 200 or response.status_code >= 300:
cls.handle_api_error(response)
@@ -118,3 +129,8 @@ def handle_api_error(cls, resp):
klass = d_klass.get(code_letter, DataLinkError)

raise klass(message, resp.status_code, resp.text, resp.headers, code)

@classmethod
    def options_kw_strip(cls, options):
for kw in KW_TO_REMOVE:
options.get('params', {}).pop(kw, None)
5 changes: 3 additions & 2 deletions nasdaqdatalink/get_point_in_time.py
@@ -23,6 +23,7 @@ def get_point_in_time(datatable_code, **options):

data = None
page_count = 0
api_config = options.get('api_config', ApiConfig)
while True:
next_options = copy.deepcopy(options)
next_data = PointInTime(datatable_code, pit=pit_options).data(params=next_options)
@@ -32,10 +33,10 @@
else:
data.extend(next_data)

if page_count >= ApiConfig.page_limit:
if page_count >= api_config.page_limit:
raise LimitExceededError(
Message.WARN_DATA_LIMIT_EXCEEDED % (datatable_code,
ApiConfig.api_key
api_config.api_key
)
)

5 changes: 3 additions & 2 deletions nasdaqdatalink/get_table.py
@@ -14,6 +14,7 @@ def get_table(datatable_code, **options):

data = None
page_count = 0
api_config = options.get('api_config', ApiConfig)
while True:
next_options = copy.deepcopy(options)
next_data = Datatable(datatable_code).data(params=next_options)
@@ -23,10 +24,10 @@
else:
data.extend(next_data)

if page_count >= ApiConfig.page_limit:
if page_count >= api_config.page_limit:
raise LimitExceededError(
Message.WARN_DATA_LIMIT_EXCEEDED % (datatable_code,
ApiConfig.api_key
api_config.api_key
)
)

57 changes: 57 additions & 0 deletions nasdaqdatalink/model/authorized_session.py
@@ -0,0 +1,57 @@
import nasdaqdatalink
from nasdaqdatalink.api_config import ApiConfig
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter
import requests
import urllib.request


def get_retries(api_config=nasdaqdatalink.ApiConfig):
retries = None
if not api_config.use_retries:
return Retry(total=0)

Retry.BACKOFF_MAX = api_config.max_wait_between_retries
retries = Retry(total=api_config.number_of_retries,
connect=api_config.number_of_retries,
read=api_config.number_of_retries,
status_forcelist=api_config.retry_status_codes,
backoff_factor=api_config.retry_backoff_factor,
raise_on_status=False)
return retries


class AuthorizedSession:
def __init__(self, api_config=ApiConfig) -> None:
super(AuthorizedSession, self).__init__()
if not isinstance(api_config, ApiConfig):
api_config = ApiConfig
self._api_config = api_config
self._auth_session = requests.Session()
retries = get_retries(self._api_config)
adapter = HTTPAdapter(max_retries=retries)
self._auth_session.mount(api_config.api_protocol, adapter)

proxies = urllib.request.getproxies()
if proxies is not None:
self._auth_session.proxies.update(proxies)

def get(self, dataset, **kwargs):

Outside of this ticket's scope, but I would like to see us pass connection details (session, api_config) as explicit arguments and leave kwargs to the params we should send to the API call. This would let us avoid all the kwarg popping down the line (see the sketch after this file's diff).

        return nasdaqdatalink.get(dataset, session=self._auth_session,
                                  api_config=self._api_config, **kwargs)

def bulkdownload(self, database, **kwargs):
        return nasdaqdatalink.bulkdownload(database, session=self._auth_session,
                                           api_config=self._api_config, **kwargs)

def export_table(self, datatable_code, **kwargs):
        return nasdaqdatalink.export_table(datatable_code, session=self._auth_session,
                                           api_config=self._api_config, **kwargs)

def get_table(self, datatable_code, **options):
        return nasdaqdatalink.get_table(datatable_code, session=self._auth_session,
                                        api_config=self._api_config, **options)

def get_point_in_time(self, datatable_code, **options):
        return nasdaqdatalink.get_point_in_time(datatable_code, session=self._auth_session,
                                                api_config=self._api_config, **options)
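A hedged sketch of the explicit-argument style suggested in the review comment above. Nothing below is part of this PR or the library's current API; the endpoint path and defaults are illustrative only, and the point is simply that keyword-only `session`/`api_config` arguments keep `**params` free of connection details, so no kwarg stripping is needed:

```python
from typing import Any, Optional

import requests


def get(dataset: str,
        *,
        session: Optional[requests.Session] = None,
        api_config: Optional[object] = None,
        **params: Any) -> Any:
    # Connection details arrive explicitly, so **params only carries values
    # destined for the API query string and nothing has to be popped later.
    http = session if session is not None else requests.Session()
    base = getattr(api_config, "api_base", "https://data.nasdaq.com/api/v3")
    # Illustrative endpoint shape; the real client builds its URLs internally.
    response = http.get("%s/datasets/%s.json" % (base, dataset), params=params)
    response.raise_for_status()
    return response.json()
```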
13 changes: 7 additions & 6 deletions nasdaqdatalink/model/database.py
@@ -3,7 +3,7 @@
from six.moves.urllib.parse import urlencode, urlparse

import nasdaqdatalink.model.dataset
from nasdaqdatalink.api_config import ApiConfig
from nasdaqdatalink.api_config import get_config_from_kwargs
from nasdaqdatalink.connection import Connection
from nasdaqdatalink.errors.data_link_error import DataLinkError
from nasdaqdatalink.message import Message
@@ -21,15 +21,16 @@ def get_code_from_meta(cls, metadata):
return metadata['database_code']

def bulk_download_url(self, **options):
api_config = get_config_from_kwargs(options)
url = self._bulk_download_path()
url = ApiConfig.api_base + '/' + url
url = api_config.api_base + '/' + url

if 'params' not in options:
options['params'] = {}
if ApiConfig.api_key:
options['params']['api_key'] = ApiConfig.api_key
if ApiConfig.api_version:
options['params']['api_version'] = ApiConfig.api_version
if api_config.api_key:
options['params']['api_key'] = api_config.api_key
if api_config.api_version:
options['params']['api_version'] = api_config.api_version

if list(options.keys()):
url += '?' + urlencode(options['params'])
5 changes: 3 additions & 2 deletions nasdaqdatalink/utils/request_type_util.py
@@ -1,5 +1,5 @@
from urllib.parse import urlencode
from nasdaqdatalink.api_config import ApiConfig
from nasdaqdatalink.api_config import get_config_from_kwargs


class RequestType(object):
@@ -13,7 +13,8 @@ class RequestType(object):
@classmethod
def get_request_type(cls, url, **params):
query_string = urlencode(params['params'])
request_url = '%s/%s/%s' % (ApiConfig.api_base, url, query_string)
api_config = get_config_from_kwargs(params)
request_url = '%s/%s/%s' % (api_config.api_base, url, query_string)
if RequestType.USE_GET_REQUEST and (len(request_url) < cls.MAX_URL_LENGTH_FOR_GET):
return 'get'
else:
55 changes: 55 additions & 0 deletions test/test_api_config.py
@@ -132,3 +132,58 @@ def test_read_key_from_file_with_tab(self):
def test_read_key_from_file_with_multi_newline(self):
given = "keyfordefaultfile\n\nanotherkey\n"
self._read_key_from_file_helper(given, TEST_DEFAULT_FILE_CONTENTS)

def test_default_instance_will_have_share_values_with_singleton(self):
os.environ['NASDAQ_DATA_LINK_API_KEY'] = 'setinenv'
ApiConfig.api_key = None
read_key()
api_config = ApiConfig()
self.assertEqual(api_config.api_key, "setinenv")
# make sure change in instance will not affect the singleton
api_config.api_key = None
self.assertEqual(ApiConfig.api_key, "setinenv")

def test_get_config_from_kwargs_return_api_config_if_present(self):
api_config = get_config_from_kwargs({
'params': {
'api_config': ApiConfig()
}
})
self.assertTrue(isinstance(api_config, ApiConfig))

def test_get_config_from_kwargs_return_singleton_if_not_present_or_wrong_type(self):
api_config = get_config_from_kwargs(None)
self.assertTrue(issubclass(api_config, ApiConfig))
self.assertFalse(isinstance(api_config, ApiConfig))
api_config = get_config_from_kwargs(1)
self.assertTrue(issubclass(api_config, ApiConfig))
self.assertFalse(isinstance(api_config, ApiConfig))
api_config = get_config_from_kwargs({
'params': None
})
self.assertTrue(issubclass(api_config, ApiConfig))
self.assertFalse(isinstance(api_config, ApiConfig))

def test_instance_read_key_should_raise_error(self):
api_config = ApiConfig()
with self.assertRaises(TypeError):
api_config.read_key(None)
with self.assertRaises(ValueError):
api_config.read_key('')

def test_instance_read_key_should_raise_error_when_empty(self):
save_key("", TEST_KEY_FILE)
api_config = ApiConfig()
with self.assertRaises(ValueError):
# read empty file
api_config.read_key(TEST_KEY_FILE)

def test_instance_read_the_right_key(self):
expected_key = 'ilovepython'
save_key(expected_key, TEST_KEY_FILE)
api_config = ApiConfig()
api_config.api_key = ''
api_config.read_key(TEST_KEY_FILE)
self.assertEqual(ApiConfig.api_key, expected_key)

