diff --git a/.gitignore b/.gitignore
index 11f3bb7..f37ec52 100644
--- a/.gitignore
+++ b/.gitignore
@@ -163,4 +163,11 @@ cython_debug/
# Custom
data_bridges_api_config.yaml
-ROADMAP.md
\ No newline at end of file
+.Rproj.user
+.RData
+.Rhistory
+*.Rproj
+*.yaml
+sandbox.py
+*.csv
+.vscode
\ No newline at end of file
diff --git a/LICENSE.md b/LICENSE
similarity index 83%
rename from LICENSE.md
rename to LICENSE
index be3f7b2..bc08fe2 100644
--- a/LICENSE.md
+++ b/LICENSE
@@ -1,21 +1,23 @@
- GNU AFFERO GENERAL PUBLIC LICENSE
- Version 3, 19 November 2007
+ GNU GENERAL PUBLIC LICENSE
+ Version 3, 29 June 2007
- Copyright (C) 2007 Free Software Foundation, Inc.
+ Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
- The GNU Affero General Public License is a free, copyleft license for
-software and other kinds of works, specifically designed to ensure
-cooperation with the community in the case of network server software.
+ The GNU General Public License is a free, copyleft license for
+software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
-our General Public Licenses are intended to guarantee your freedom to
+the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
-software for all its users.
+software for all its users. We, the Free Software Foundation, use the
+GNU General Public License for most of our software; it applies also to
+any other work released this way by its authors. You can apply it to
+your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
@@ -24,34 +26,44 @@ them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
- Developers that use our General Public Licenses protect your rights
-with two steps: (1) assert copyright on the software, and (2) offer
-you this License which gives you legal permission to copy, distribute
-and/or modify the software.
-
- A secondary benefit of defending all users' freedom is that
-improvements made in alternate versions of the program, if they
-receive widespread use, become available for other developers to
-incorporate. Many developers of free software are heartened and
-encouraged by the resulting cooperation. However, in the case of
-software used on network servers, this result may fail to come about.
-The GNU General Public License permits making a modified version and
-letting the public access it on a server without ever releasing its
-source code to the public.
-
- The GNU Affero General Public License is designed specifically to
-ensure that, in such cases, the modified source code becomes available
-to the community. It requires the operator of a network server to
-provide the source code of the modified version running there to the
-users of that server. Therefore, public use of a modified version, on
-a publicly accessible server, gives the public access to the source
-code of the modified version.
-
- An older license, called the Affero General Public License and
-published by Affero, was designed to accomplish similar goals. This is
-a different license, not a version of the Affero GPL, but Affero has
-released a new version of the Affero GPL which permits relicensing under
-this license.
+ To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights. Therefore, you have
+certain responsibilities if you distribute copies of the software, or if
+you modify it: responsibilities to respect the freedom of others.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received. You must make sure that they, too, receive
+or can get the source code. And you must show them these terms so they
+know their rights.
+
+ Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+
+ For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software. For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+
+ Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the manufacturer
+can do so. This is fundamentally incompatible with the aim of
+protecting users' freedom to change the software. The systematic
+pattern of such abuse occurs in the area of products for individuals to
+use, which is precisely where it is most unacceptable. Therefore, we
+have designed this version of the GPL to prohibit the practice for those
+products. If such problems arise substantially in other domains, we
+stand ready to extend this provision to those domains in future versions
+of the GPL, as needed to protect the freedom of users.
+
+ Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary. To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
@@ -60,7 +72,7 @@ modification follow.
0. Definitions.
- "This License" refers to version 3 of the GNU Affero General Public License.
+ "This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
@@ -537,45 +549,35 @@ to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
- 13. Remote Network Interaction; Use with the GNU General Public License.
-
- Notwithstanding any other provision of this License, if you modify the
-Program, your modified version must prominently offer all users
-interacting with it remotely through a computer network (if your version
-supports such interaction) an opportunity to receive the Corresponding
-Source of your version by providing access to the Corresponding Source
-from a network server at no charge, through some standard or customary
-means of facilitating copying of software. This Corresponding Source
-shall include the Corresponding Source for any work covered by version 3
-of the GNU General Public License that is incorporated pursuant to the
-following paragraph.
+ 13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
-under version 3 of the GNU General Public License into a single
+under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
-but the work with which it is combined will remain governed by version
-3 of the GNU General Public License.
+but the special requirements of the GNU Affero General Public License,
+section 13, concerning interaction through a network will apply to the
+combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
-the GNU Affero General Public License from time to time. Such new versions
-will be similar in spirit to the present version, but may differ in detail to
+the GNU General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
-Program specifies that a certain numbered version of the GNU Affero General
+Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
-GNU Affero General Public License, you may choose any version ever published
+GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
-versions of the GNU Affero General Public License can be used, that proxy's
+versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
@@ -615,47 +617,3 @@ reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
-
- END OF TERMS AND CONDITIONS
-
- How to Apply These Terms to Your New Programs
-
- If you develop a new program, and you want it to be of the greatest
-possible use to the public, the best way to achieve this is to make it
-free software which everyone can redistribute and change under these terms.
-
- To do so, attach the following notices to the program. It is safest
-to attach them to the start of each source file to most effectively
-state the exclusion of warranty; and each file should have at least
-the "copyright" line and a pointer to where the full notice is found.
-
-
- Copyright (C)
-
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU Affero General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU Affero General Public License for more details.
-
- You should have received a copy of the GNU Affero General Public License
- along with this program. If not, see .
-
-Also add information on how to contact you by electronic and paper mail.
-
- If your software can interact with users remotely through a computer
-network, you should also make sure that it provides a way for users to
-get its source. For example, if your program is a web application, its
-interface could display a "Source" link that leads users to an archive
-of the code. There are many ways you could offer source, and different
-solutions will be better for different programs; see section 13 for the
-specific requirements.
-
- You should also get your employer (if you work as a programmer) or school,
-if any, to sign a "copyright disclaimer" for the program, if necessary.
-For more information on this, and how to apply and follow the GNU AGPL, see
-.
diff --git a/README.md b/README.md
index e69de29..b7be5dc 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,110 @@
+# Data Bridges Connect
+
+This Python module allows you to get data from the WFP Data Bridges API, including household survey data, market prices, exchange rates, GORP (Global Operational Response Plan) data, and food security data (IPC equivalent). It is a wrapper for the [Data Bridges API Client](https://github.com/WFP-VAM/DataBridgesAPI), giving data analysts an easier way to get VAM and monitoring data using their language of choice (Python, R and STATA).
+
+## Installation
+
+> NB: This is the dev version of the `data_bridges_utils` and API client package; it is updated frequently and is not yet stable.
+
+You can install the `data_bridges_utils` package using `pip` and the Git repository URL:
+
+```
+pip install --force-reinstall git+https://github.com/WFP-VAM/DataBridgesConnect.git@dev
+```
+
+## Configuration
+1. Create a ```data_bridges_api_config.yaml``` file in the main folder you're running your code from.
+2. The structure of the file is:
+ ```
+ NAME: ''
+ VERSION : ''
+ KEY: ''
+ SECRET: ''
+ SCOPES:
+ - ''
+ - ''
+ ```
+3. Fill in `KEY` and `SECRET` with your actual API key and secret from the Data Bridges API, and update the `SCOPES` list with the scopes required for your use case.
+4. (For WFP users) Credentials and scopes for the DataBridges API can be requested by opening a ticket with the [TEC Digital Core team](https://dev.azure.com/worldfoodprogramme/Digital%20Core/_workitems). See the [documentation](https://docs.api.wfp.org/consumers/index.html#application-accounts).
+5. External users can reach out to [wfp.vaminfo@wfp.org](mailto:wfp.vaminfo@wfp.org) for support on getting the API credentials.
+
+### Python
+Run the following code to extract household survey data.
+
+```python
+from data_bridges_utils import DataBridgesShapes
+
+CONFIG_PATH = "data_bridges_api_config.yaml"
+
+client = DataBridgesShapes(CONFIG_PATH)
+
+# Get household data for survey id
+survey_data = client.get_household_survey(survey_id=3329, access_type='full')
+print(survey_data.head())
+```
+A sample Python file with additional examples for other endpoints is provided in the repo.
+
+### STATA
+1. Make sure you declare where your Python instance is by setting ```python set exec "path/to/python/env"```
+2. Run the following code to extract household survey data and load it into STATA as a flat dataset with value labels. Make sure to edit ```stata_path``` and ```stata_version``` to match the version installed on your system.
+
+```stata
+python set exec "path/to/python/env"
+
+python:
+
+"""
+Read a 'base' Household dataset from Data Bridges and load it into STATA.
+Only works if user has STATA 18+ installed and added to PATH.
+"""
+
+from data_bridges_utils import DataBridgesShapes, get_column_labels, get_value_labels, map_value_labels
+from data_bridges_utils.load_stata import load_stata
+import stata_setup
+
+# set installation path for STATA
+stata_path = r"C:/Program Files/Stata18"
+# set stata version
+stata_version = "se"
+
+stata_setup.config(stata_path, stata_version)
+from sfi import Data, Macro, SFIToolkit, Frame, Datetime as dt
+
+# Path to YAML file containing Data Bridges API credentials
+CONFIG_PATH = r"data_bridges_api_config.yaml"
+
+# Example dataset and questionnaire from 2023 Congo CFSVA
+CONGO_CFSVA = {
+ 'questionnaire': 1509,
+ 'dataset': 3094
+}
+
+# Initialize DataBridges client with credentials from YAML file
+client = DataBridgesShapes(CONFIG_PATH)
+
+# Get household data for survey id
+survey_data = client.get_household_survey(survey_id=CONGO_CFSVA["dataset"], access_type='base') # base is the standardized-only dataset
+questionnaire = client.get_household_questionnaire(CONGO_CFSVA["questionnaire"])
+
+# Map the categories to survey_data
+mapped_survey_data = map_value_labels(survey_data, questionnaire)
+
+# Get variable labels
+variable_labels = get_column_labels(questionnaire)
+# Get value labels
+value_labels = get_value_labels(questionnaire)
+
+# Return flat dataset with value labels
+survey_data_with_value_labels = map_value_labels(survey_data, questionnaire)
+
+# Load into STATA dataframe
+ds = load_stata(survey_data_with_value_labels, stata_path, stata_version)
+
+end
+```
+
+## Contributing
+Contributions are welcome! Please open an issue or submit a pull request if you have any improvements or bug fixes.
+
+## License
+This project is licensed under the GNU GPL 3.0 License.
diff --git a/ROADMAP.md b/ROADMAP.md
new file mode 100644
index 0000000..80f20a9
--- /dev/null
+++ b/ROADMAP.md
@@ -0,0 +1,62 @@
+# Roadmap for DataBridgesUtils
+
+This document outlines the planned features and improvements for the `DataBridgesUtils` package, which provides a wrapper for the WFP Data Bridges API.
+
+## Upcoming Release: 1.0.0 (DEV)
+
+### Wrapper Endpoints
+
+The following endpoints will be added or improved in the upcoming release:
+
+- [X] Exchange rates
+- [X] Food security (IPC)
+- [X] GORP (Global Operational Response Plan)
+- [X] Market prices
+- [X] Surveys
+- [X] XLSForms
+
+### Labels
+- [X] Get variable labels
+- [X] Get value labels
+- [ ] Output dta with value and variable labels
+
+### Examples and Documentation
+
+- [X] Provide an example file in Python demonstrating the usage of available endpoints
+- [X] Provide an example file in STATA demonstrating the usage of available endpoints
+ - [ ] Test
+- [ ] Provide an example file in R demonstrating the usage of available endpoints
+ - [ ] Test
+- [X] Update the README file with Python usage examples
+- [X] Add documentation for STATA users in the README file
+
+### Repository Setup
+
+- [X] Create a GitHub repository for the `DataBridgesUtils` package
+- [X] Configure optional dependencies in the project files
+- [X] Set up the package installation process
+
+## Improvements (1.1.0)
+
+## Bug fixing
+- [ ] DPO change for GORP
+- [ ] DPO change for XLSForm
+- [ ] Fix optional dependencies for STATA
+- [ ] Handle SSL certificate error
+
+## Wrapper endpoints
+- [ ] Economic data
+- [ ] Commodities
+- [ ] Commodity units
+- [ ] Markets
+- [ ] RPME (Resource Planning and Monitoring Environment)
+
+## Future Releases (2.0.0)
+
+- [ ] Improve error handling and logging
+- [ ] Add unit tests and integration tests
+- [ ] Enhance documentation and provide more usage examples
+- [ ] Optimize performance and improve code efficiency
+- [ ] Implement additional features based on user feedback and requirements
+
+Please note that this roadmap is subject to change, and the priorities may be adjusted based on the project's needs and available resources.
diff --git a/data_bridges_api_config_sample.yaml b/data_bridges_api_config_sample.yaml
index ffabff0..3cb1aab 100644
--- a/data_bridges_api_config_sample.yaml
+++ b/data_bridges_api_config_sample.yaml
@@ -2,8 +2,6 @@ NAME: ''
VERSION : ''
KEY: ''
SECRET: ''
-KEY3.0: ''
-SECRET3.0: ''
SCOPES:
- ''
- ''
\ No newline at end of file
diff --git a/data_bridges_utils/__init__.py b/data_bridges_utils/__init__.py
index 53207b2..21f08ca 100644
--- a/data_bridges_utils/__init__.py
+++ b/data_bridges_utils/__init__.py
@@ -1,10 +1,10 @@
# encoding: utf-8
"""
-Wrapper for DataBridges client, making it easier to load data in Python, R and STATA.
+Wrapper for DataBridges client.
"""
from .get_data import DataBridgesShapes
-from .load_stata import load_stata
+from .labels import get_column_labels, get_value_labels, map_value_labels
-__all__ = ['DataBridgesShapes', 'load_stata']
+__all__ = ['DataBridgesShapes', 'get_column_labels', 'get_value_labels', 'map_value_labels']
diff --git a/data_bridges_utils/get_data.py b/data_bridges_utils/get_data.py
index 46acb73..a444eec 100644
--- a/data_bridges_utils/get_data.py
+++ b/data_bridges_utils/get_data.py
@@ -2,13 +2,24 @@
import logging
from datetime import timedelta, date
import pandas as pd
+import numpy as np
import yaml
from data_bridges_client.rest import ApiException
from data_bridges_client.token import WfpApiToken
import data_bridges_client
+
+logname = "data_bridges_api_calls.log"
+logging.basicConfig(filename=logname,
+ filemode='a',
+ format='%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
+ datefmt='%H:%M:%S',
+ level=logging.INFO)
+
logger = logging.getLogger(__name__)
+
+
class DataBridgesShapes:
"""
Retrieves survey data using the specified configuration and access type.
@@ -24,8 +35,10 @@ class DataBridgesShapes:
"""
- def __init__(self, yaml_config_path):
+ def __init__(self, yaml_config_path, env='prod'):
self.configuration = self.setup_configuration_and_authentication(yaml_config_path)
+ self.env = env
+ self.data = {}
def __repr__(self):
return "DataBridgesShapes(yamlpath='%s')" % self.configuration.host
@@ -46,7 +59,7 @@ def setup_configuration_and_authentication(self, yaml_config_path):
version = databridges_config['VERSION']
uri = "https://api.wfp.org/vam-data-bridges/"
host = str(uri + version)
-
+
logger.info("DataBridges API: %s", host)
token = WfpApiToken(api_key=key, api_secret=secret)
@@ -55,7 +68,7 @@ def setup_configuration_and_authentication(self, yaml_config_path):
)
return configuration
- def get_household_survey(self, survey_id, access_type, page_size=200):
+ def get_household_survey(self, survey_id, access_type, page_size=600):
"""Retrieves survey data using the specified configuration and access type.
Args:
@@ -77,9 +90,10 @@ def get_household_survey(self, survey_id, access_type, page_size=200):
page += 1
with data_bridges_client.ApiClient(self.configuration) as api_client:
api_instance = data_bridges_client.IncubationApi(api_client)
- env = 'prod'
+ env = self.env
try:
+ logger.info(f"Calling get_household_survey for survey {survey_id}")
# Select appropriate API call based on access_type
api_survey = {
'': api_instance.household_public_base_data_get,
@@ -87,7 +101,8 @@ def get_household_survey(self, survey_id, access_type, page_size=200):
'draft': api_instance.household_draft_internal_base_data_get,
'official': api_instance.household_official_use_base_data_get,
'public': api_instance.household_public_base_data_get
- }.get(access_type)(survey_id=survey_id, page=page, env=env)
+ }.get(access_type)(survey_id=survey_id, page=page, page_size=page_size, env=env)
+
logger.info("Fetching page %s", page)
logger.info("Items: %s", len(api_survey.items))
responses.extend(api_survey.items)
@@ -97,9 +112,12 @@ def get_household_survey(self, survey_id, access_type, page_size=200):
except ApiException as e:
logger.error("Exception when calling Household data-> %s%s\n", access_type, e)
- exit()
+ raise
df = pd.DataFrame(responses)
+
+ # Normalise missing values to None, consistent with the other endpoints
+ df = df.replace({np.nan: None})
return df
def get_prices(self, country_iso3, survey_date, page_size=1000):
@@ -128,7 +146,7 @@ def get_prices(self, country_iso3, survey_date, page_size=1000):
page += 1
with data_bridges_client.ApiClient(self.configuration) as api_client:
api_instance = data_bridges_client.MarketPricesApi(api_client)
- env = 'prod'
+ env = self.env
try:
api_prices = api_instance.market_prices_price_monthly_get(
@@ -141,8 +159,10 @@ def get_prices(self, country_iso3, survey_date, page_size=1000):
time.sleep(1)
except ApiException as e:
logger.error("Exception when calling Market price data->market_prices_price_monthly_get: %s\n", e)
- exit()
+ raise
+
df = pd.DataFrame(responses)
+ df = df.replace({np.nan: None})
return df
def get_exchange_rates(self, country_iso3, page_size=1000):
@@ -165,7 +185,7 @@ def get_exchange_rates(self, country_iso3, page_size=1000):
page += 1
with data_bridges_client.ApiClient(self.configuration) as api_client:
api_instance = data_bridges_client.CurrencyApi(api_client)
- env = 'prod'
+ env = self.env
try:
api_exchange_rates = api_instance.currency_usd_indirect_quotation_get(
@@ -178,37 +198,138 @@ def get_exchange_rates(self, country_iso3, page_size=1000):
time.sleep(1)
except ApiException as e:
logger.error("Exception when calling Exchange rates data->household_full_data_get: %s\n", e)
- exit()
+ raise
df = pd.DataFrame(responses)
+ df = df.replace({np.nan: None})
return df
+
+
+ def get_gorp(self, data_type, page=None):
+ """
+ Retrieves data from the Global Operational Response Plan (GORP) API.
+
+ Args:
+ data_type (str): The type of GORP data to retrieve. Can be one of 'country_latest', 'global_latest', 'latest', 'list', or 'regional_latest'.
+ page (int, optional): The page number for paginated results. Defaults to None.
+ Note: the environment ('prod' or 'dev') is taken from the `env` value set when the client is created.
- def get_ipc_equivalent(self, country_iso3: str, year: int, page_size=1000):
+ Returns:
+ The requested GORP data.
"""
- Retrieves food security data for a given country ISO3 code from the Data Bridges API.
+ with data_bridges_client.ApiClient(self.configuration) as api_client:
+ gorp_api_instance = data_bridges_client.GorpApi(api_client)
+ env = self.env
+
+ responses = []
+
+ try:
+ if data_type == 'country_latest':
+ gorp_data = gorp_api_instance.gorp_country_latest_get(env=env)
+ elif data_type == 'global_latest':
+ gorp_data = gorp_api_instance.gorp_global_latest_get(env=env)
+ elif data_type == 'latest':
+
+ gorp_data = gorp_api_instance.gorp_latest_get(page=page, env=env)
+ elif data_type == 'list':
+ gorp_data = gorp_api_instance.gorp_list_get(page=page, env=env)
+ elif data_type == 'regional_latest':
+ gorp_data = gorp_api_instance.gorp_regional_latest_get(env=env)
+ else:
+ raise ValueError(f"Invalid data_type: {data_type}")
+ except ApiException as e:
+ logger.error("Exception when calling GORP data (%s): %s\n", data_type, e)
+ raise
+
+ if "GorpGlobalApiDto" in gorp_data.__doc__:
+ responses.extend(item for item in gorp_data)
+ else:
+ try:
+ responses.extend(item.to_dict() for item in gorp_data.items)
+ except AttributeError:
+ responses.extend(item.to_dict() for item in gorp_data)
+
+ df = pd.DataFrame(responses)
+ df = df.replace({np.nan: None})
+ return df
+
+ def get_food_security(self, country_iso3=None, year=None, page=None, env='prod'):
"""
- responses = []
- total_items = 20
- max_item = 0
- page = 0
- while total_items > max_item:
- page += 1
- with data_bridges_client.ApiClient(self.configuration) as api_client:
- api_instance = data_bridges_client.FoodSecurityApi(api_client)
- env = 'prod'
+ Retrieves food security data from the Data Bridges API.
+ Args:
+ country_iso3 (str, optional): The ISO3 code of the country to retrieve data for. Defaults to None.
+ year (int, optional): The year to retrieve data for. Defaults to None.
+ page (int, optional): The page number for paginated results. Defaults to None.
+ env (str, optional): The environment to use. Can be 'prod' or 'dev'. Defaults to 'prod'.
+
+ Returns:
+ The requested food security data.
+ """
+ responses =[]
+ with data_bridges_client.ApiClient(self.configuration) as api_client:
+ food_security_api_instance = data_bridges_client.FoodSecurityApi(api_client)
+
+ try:
+ food_security_data = food_security_api_instance.food_security_list_get(
+ iso3=country_iso3,
+ year=year,
+ page=page,
+ env=env
+ )
+ except data_bridges_client.ApiException as e:
+ logger.error(f"Exception when calling Food Security data: {e}")
+ raise
+
+ responses.extend(item.to_dict() for item in food_security_data.items)
+ return pd.DataFrame(responses)
+
+ def get_household_questionnaire(self, xls_form_id, env='prod', page_size=200):
+ """
+ This function fetches questionnaire data for a given form ID from the Data Bridges API.
+
+ Args:
+ xls_form_id (int): The ID of the questionnaire form to retrieve data for.
+ page_size (int, optional): The maximum number of items to retrieve per API call. Defaults to 200.
+
+ Returns:
+ pandas.DataFrame: A DataFrame containing the fetched questionnaire data.
+
+ Raises:
+ ApiException: If an error occurs while calling the Data Bridges API.
+ """
+
+ page = 0
+ with data_bridges_client.ApiClient(self.configuration) as api_client:
+ api_instance = data_bridges_client.IncubationApi(api_client)
+ env = self.env
+ responses = []
+ try:
+ # Fetch the XLSForm definition for the given form ID
+ api_survey = api_instance.xls_forms_definition_get(xls_form_id=xls_form_id, env=env)
+ page += 1
try:
- api_food_security = api_instance.food_security_list_get(
- country_iso3=country_iso3, year=year, page=page, env=env,
- )
- responses.extend(item.to_dict() for item in api_food_security.items)
- total_items = api_food_security.total_items
- logger.info("Fetching page %s", page)
- max_item = page * page_size
- time.sleep(1)
- except ApiException as e:
- logger.error("Exception when calling Food security data->food_security_list_get: %s\n", e)
+ logger.info(f"Fetching page {page} from XLSForm definition")
+ responses.extend(item.to_dict() for item in api_survey.items)
+ except AttributeError:
+ responses.extend(item.to_dict() for item in api_survey)
+ time.sleep(1)
+
+ except ApiException as e:
+ logger.error("Exception when calling Household questionnaire-> %s%s\n", xls_form_id, e)
+ raise
+
df = pd.DataFrame(responses)
- return df
+ df = df.replace({np.nan: None})
+
+ questionnaire = pd.DataFrame(list(df.fields)[0])
+ self.data[xls_form_id] = questionnaire
+ return questionnaire
+
+ def get_choice_list(self, xls_form_id):
+ questionnaire = self.data[xls_form_id]
+ choiceList = pd.json_normalize(questionnaire['choiceList']).dropna()
+ choices = choiceList.explode('choices')
+ return choices
if __name__ == "__main__":
diff --git a/data_bridges_utils/labels.py b/data_bridges_utils/labels.py
new file mode 100644
index 0000000..b6c21f0
--- /dev/null
+++ b/data_bridges_utils/labels.py
@@ -0,0 +1,57 @@
+import pandas as pd
+
+def get_value_labels(df):
+ choiceList = pd.json_normalize(df['choiceList'])
+ choiceList = choiceList.rename(columns={"name": "choice_name"})
+ choiceList = choiceList.join(df["name"]).dropna()
+ choices = choiceList.explode('choices')
+
+ categories_dict = {}
+ for _, row in choices.iterrows():
+ name = row["name"]
+ choice = row["choices"]
+ if name in categories_dict:
+ categories_dict[name].update({int(choice["name"]): choice["label"]})
+ else:
+ categories_dict[name] = {int(choice["name"]): choice["label"]}
+ return categories_dict
+
+def get_column_labels(df):
+ labels_dict = {}
+
+ for _, row in df.iterrows():
+ name = row["name"]
+ label = row["label"]
+ if name in labels_dict:
+ labels_dict[name].update(label)
+ elif label == "":
+ labels_dict[name] = name
+ else:
+ labels_dict[name] = label
+ return labels_dict
+
+# Map values if int
+def map_value_labels(survey_data, questionnaire):
+ choiceList = pd.json_normalize(questionnaire['choiceList'])
+ choiceList = choiceList.rename(columns={"name": "choice_name"})
+ choiceList = choiceList.join(questionnaire["name"]).dropna()
+ choices = choiceList.explode('choices')
+
+ categories_dict = dict()
+ for _, row in choices.iterrows():
+ name = row["name"]
+ choice = row["choices"]
+ if name in categories_dict:
+ categories_dict[name].update({int(choice["name"]): choice["label"]})
+ else:
+ categories_dict[name] = {int(choice["name"]): choice["label"]}
+
+ # Map the categories to survey_data
+ survey_data_value_labels = survey_data.copy()
+ for col in survey_data_value_labels.columns:
+ if col in categories_dict:
+ category_dict = categories_dict[col]
+ survey_data_value_labels[col] = survey_data_value_labels[col].apply(lambda x: category_dict.get(x, x))
+
+ return survey_data_value_labels
+
diff --git a/data_bridges_utils/load_stata.py b/data_bridges_utils/load_stata.py
index 334e7bb..42cea23 100644
--- a/data_bridges_utils/load_stata.py
+++ b/data_bridges_utils/load_stata.py
@@ -1,18 +1,15 @@
import stata_setup
-try:
- stata_setup.config('C:/Program Files/Stata18', 'se')
- from sfi import Data, Macro, SFIToolkit, Frame, Datetime as dt
-except OSError:
- print("Stata executable not found. Please install Stata and add it to your PATH.")
+def load_stata(df, stata_path="C:/Program Files/Stata18", stata_version="se"):
+ stata_setup.config(stata_path, stata_version)
+ from sfi import Data, Macro, SFIToolkit, Frame, Datetime as dt
-def load_stata(df):
"""
Loads a Pandas DataFrame into a Stata data file format.
-
+
Args:
df (pandas.DataFrame): The DataFrame to be loaded into Stata format.
-
+
Returns:
pandas.DataFrame: The original DataFrame.
"""
@@ -20,12 +17,10 @@ def load_stata(df):
Data.setObsTotal(len(df))
for i in range(len(colnames)):
dtype = df.dtypes[i].name
- print(dtype)
# make a valid Stata variable name
varname = SFIToolkit.makeVarName(colnames[i], retainCase=True)
- print(colnames[i])
# varname = colnames[i]
- varval = df[colnames[i]].values.tolist()
+ varval = df.iloc[:, i].values.tolist() # Use .iloc to access values by position
if dtype == "int64":
Data.addVarInt(varname)
Data.store(varname, None, varval)
@@ -37,7 +32,7 @@ def load_stata(df):
Data.store(varname, None, varval)
elif dtype == "datetime64[ns]":
Data.addVarFloat(varname)
- price_dt_py = [dt.getSIF(j, '%tdCCYY-NN-DD') for j in df[colnames[i]]]
+ price_dt_py = [dt.getSIF(j, '%tdCCYY-NN-DD') for j in df.iloc[:, i]] # Use .iloc
Data.store(varname, None, price_dt_py)
Data.setVarFormat(varname, '%tdCCYY-NN-DD')
else:
@@ -49,4 +44,4 @@ def load_stata(df):
if __name__ == "__main__":
- pass
\ No newline at end of file
+ pass
diff --git a/environment.yaml b/environment.yaml
new file mode 100644
index 0000000..1f369da
--- /dev/null
+++ b/environment.yaml
@@ -0,0 +1,144 @@
+name: data_bridges
+channels:
+ - conda-forge
+ - defaults
+dependencies:
+ - annotated-types=0.7.0=pyhd8ed1ab_0
+ - archspec=0.2.3=pyhd8ed1ab_0
+ - boltons=23.0.0=py312haa95532_0
+ - brotli-python=1.0.9=py312hd77b12b_7
+ - bzip2=1.0.8=he774522_0
+ - ca-certificates=2024.2.2=h56e8100_0
+ - certifi=2024.2.2=pyhd8ed1ab_0
+ - cffi=1.16.0=py312h2bbff1b_0
+ - charset-normalizer=2.0.4=pyhd3eb1b0_0
+ - colorama=0.4.6=py312haa95532_0
+ - conda=24.5.0=py312h2e8e312_0
+ - conda-content-trust=0.2.0=py312haa95532_0
+ - conda-libmamba-solver=23.12.0=pyhd3eb1b0_1
+ - conda-package-handling=2.2.0=py312haa95532_0
+ - conda-package-streaming=0.9.0=py312haa95532_0
+ - console_shortcut_miniconda=0.1.1=haa95532_1
+ - cryptography=41.0.7=py312h89fc84f_0
+ - distro=1.8.0=py312haa95532_0
+ - et_xmlfile=1.1.0=pyhd8ed1ab_0
+ - expat=2.5.0=hd77b12b_0
+ - fmt=9.1.0=h6d14046_0
+ - frozendict=2.4.4=py312h4389bb4_0
+ - idna=3.4=py312haa95532_0
+ - importlib-metadata=7.1.0=pyha770c72_0
+ - importlib_metadata=7.1.0=hd8ed1ab_0
+ - intel-openmp=2024.1.0=h57928b3_964
+ - jsonpatch=1.32=pyhd3eb1b0_0
+ - jsonpointer=2.1=pyhd3eb1b0_0
+ - libarchive=3.6.2=hb62f4d4_2
+ - libblas=3.9.0=22_win64_mkl
+ - libcblas=3.9.0=22_win64_mkl
+ - libcurl=8.5.0=h86230a5_0
+ - libexpat=2.5.0=h63175ca_1
+ - libffi=3.4.4=hd77b12b_0
+ - libhwloc=2.9.1=h51c2c0f_0
+ - libiconv=1.16=h2bbff1b_2
+ - liblapack=3.9.0=22_win64_mkl
+ - libmamba=1.5.3=hcd6fe79_0
+ - libmambapy=1.5.3=py312h77c03ed_0
+ - libsolv=0.7.24=h23ce68f_0
+ - libsqlite=3.45.2=hcfcfb64_0
+ - libssh2=1.10.0=he2ea4bf_2
+ - libxml2=2.10.4=h0ad7f3c_1
+ - libzlib=1.2.13=hcfcfb64_5
+ - lz4-c=1.9.4=h2bbff1b_0
+ - menuinst=2.0.2=py312hd77b12b_0
+ - mkl=2024.1.0=h66d3029_692
+ - multimethod=1.9.1=pyhd8ed1ab_0
+ - mypy_extensions=1.0.0=pyha770c72_0
+ - numpy=1.26.4=py312h8753938_0
+ - openpyxl=3.1.2=py312he70551f_1
+ - openssl=3.3.0=h2466b09_3
+ - packaging=23.1=py312haa95532_0
+ - pandas=2.2.2=py312h72972c8_1
+ - pandera=0.19.3=hd8ed1ab_0
+ - pandera-base=0.19.3=pyhd8ed1ab_0
+ - pcre2=10.42=h0ff8eda_0
+ - pip=23.3.1=py312haa95532_0
+ - platformdirs=3.10.0=py312haa95532_0
+ - pluggy=1.0.0=py312haa95532_1
+ - powershell_shortcut_miniconda=0.0.1=haa95532_1
+ - pthreads-win32=2.9.1=hfa6e2cd_3
+ - pybind11-abi=4=hd3eb1b0_1
+ - pycosat=0.6.6=py312h2bbff1b_0
+ - pycparser=2.21=pyhd3eb1b0_0
+ - pydantic=2.7.1=pyhd8ed1ab_0
+ - pydantic-core=2.18.2=py312h2615798_0
+ - pysocks=1.7.1=py312haa95532_0
+ - python=3.12.2=h2628c8c_0_cpython
+ - python-dateutil=2.9.0=pyhd8ed1ab_0
+ - python-tzdata=2024.1=pyhd8ed1ab_0
+ - python_abi=3.12=4_cp312
+ - pytz=2024.1=pyhd8ed1ab_0
+ - reproc=14.2.4=hd77b12b_1
+ - reproc-cpp=14.2.4=hd77b12b_1
+ - requests=2.31.0=py312haa95532_1
+ - ruamel.yaml=0.17.21=py312h2bbff1b_0
+ - setuptools=68.2.2=py312haa95532_0
+ - six=1.16.0=pyh6c4a22f_0
+ - sqlite=3.41.2=h2bbff1b_0
+ - tbb=2021.9.0=h91493d7_0
+ - tk=8.6.13=h5226925_1
+ - tqdm=4.65.0=py312hfc267ef_0
+ - truststore=0.8.0=py312haa95532_0
+ - typeguard=4.2.1=pyhd8ed1ab_0
+ - typing-extensions=4.11.0=hd8ed1ab_0
+ - typing_extensions=4.11.0=pyha770c72_0
+ - typing_inspect=0.9.0=pyhd8ed1ab_0
+ - tzdata=2023d=h04d1e81_0
+ - ucrt=10.0.22621.0=h57928b3_0
+ - vc=14.2=h21ff451_1
+ - vc14_runtime=14.38.33130=h82b7239_18
+ - vs2015_runtime=14.38.33130=hcb4865c_18
+ - wheel=0.41.2=py312haa95532_0
+ - win_inet_pton=1.1.0=py312haa95532_0
+ - wrapt=1.16.0=py312he70551f_0
+ - xz=5.4.5=h8cc25b3_0
+ - yaml-cpp=0.8.0=hd77b12b_0
+ - zipp=3.17.0=pyhd8ed1ab_0
+ - zlib=1.2.13=hcfcfb64_5
+ - zstandard=0.19.0=py312h2bbff1b_0
+ - zstd=1.5.5=hd43e919_0
+ - pip:
+ - aiodns==3.2.0
+ - aiosmtpd==1.4.5
+ - anyio==4.3.0
+ - asttokens==2.4.1
+ - atpublic==4.1.0
+ - attrs==23.2.0
+ - decorator==5.1.1
+ - docutils==0.21.2
+ - executing==2.0.1
+ - flit==3.9.0
+ - flit-core==3.9.0
+ - git-filter-repo==2.38.0
+ - h11==0.14.0
+ - httpcore==1.0.5
+ - httpx==0.27.0
+ - ipython==8.24.0
+ - jedi==0.19.1
+ - matplotlib-inline==0.1.7
+ - parso==0.8.4
+ - prompt-toolkit==3.0.43
+ - pure-eval==0.2.2
+ - pycares==4.4.0
+ - pygments==2.18.0
+ - pyinstrument==4.6.2
+ - pystata==0.0.1
+ - python-dotenv==1.0.1
+ - pyyaml==6.0.1
+ - sniffio==1.3.1
+ - stack-data==0.6.3
+ - stata-setup==0.1.3
+ - tomli-w==1.0.0
+ - traitlets==5.14.3
+ - urllib3==2.0.7
+ - verify-email==2.4.3
+ - wcwidth==0.2.13
+ - xlsxwriter==3.2.0
diff --git a/examples/example_R.R b/examples/example_R.R
new file mode 100644
index 0000000..96b3656
--- /dev/null
+++ b/examples/example_R.R
@@ -0,0 +1,22 @@
+# Load required packages
+library(reticulate)
+library(dplyr)
+
+# Set up Python environment
+# use_python("/path/to/python/env")
+python_path <- "/path/to/python/env/python.exe"
+use_python(path.expand(python_path))
+
+# Import DataBridgesShapes class
+databridges_utils <- import("data_bridges_utils")
+DataBridgesShapes <- databridges_utils$DataBridgesShapes
+
+# Initialize DataBridges client with credentials from YAML file
+CONFIG_PATH <- "data_bridges_api_config.yaml"
+client <- DataBridgesShapes(CONFIG_PATH)
+
+# Get household data for survey id
+survey_data <- client$get_household_survey(survey_id=3329, access_type='full')
+survey_data_r <- py_to_r(survey_data)
+print(head(survey_data_r))
+
diff --git a/examples/example_STATA.do b/examples/example_STATA.do
new file mode 100644
index 0000000..89bf2bd
--- /dev/null
+++ b/examples/example_STATA.do
@@ -0,0 +1,54 @@
+python set exec "path/to/python/env"
+
+python:
+
+"""
+Read a 'full' Household dataset from Data Bridges and load it into STATA.
+Only works if user has STATA 18+ installed and added to PATH.
+"""
+
+from data_bridges_utils import DataBridgesShapes
+from data_bridges_utils.labels import get_column_labels, get_value_labels, map_value_labels
+from data_bridges_utils.load_stata import load_stata
+import numpy as np
+import stata_setup
+from sfi import Data, Macro, SFIToolkit, Frame, Datetime as dt
+
+stata_path = r"E:\Program Files\Stata18"
+stata_version = "mp"
+
+# Path to YAML file containing Data Bridges API credentials
+CONFIG_PATH = r"data_bridges_api_config.yaml"
+
+# Example dataset and questionnaire from 2023 Congo CFSVA
+CONGO_CFSVA = {
+ 'questionnaire': 1509,
+ 'dataset': 3094
+}
+
+# Initialize DataBridges client with credentials from YAML file
+client = DataBridgesShapes(CONFIG_PATH)
+
+survey_data = client.get_household_survey(survey_id=CONGO_CFSVA['dataset'], access_type='full', page_size=800)
+questionnaire = client.get_household_questionnaire(CONGO_CFSVA['questionnaire'])
+choice_list = client.get_choice_list(CONGO_CFSVA['questionnaire'])
+
+
+variable_labels = get_column_labels(questionnaire)
+# get value labels
+value_labels = get_value_labels(questionnaire)
+
+survey_data_value_labels = map_value_labels(survey_data, questionnaire)
+# mapped.replace({np.nan: None})
+
+# Export
+survey_data.to_csv("congo_cfsva_survey_data.csv", index=False)
+questionnaire.to_csv("congo_cfsva_questionnaire.csv", index=False)
+choice_list.to_csv("congo_cfsva_choice_list.csv", index=False)
+survey_data_value_labels.to_csv("congo_cfsva_mapped.csv", index=False)
+
+# Load into STATA dataframe
+ds = load_stata(survey_data_value_labels, stata_path=stata_path, stata_version=stata_version)
+
+
+end
\ No newline at end of file
diff --git a/examples/example_python.py b/examples/example_python.py
new file mode 100644
index 0000000..ca988fd
--- /dev/null
+++ b/examples/example_python.py
@@ -0,0 +1,50 @@
+"""
+Reads Household Data from Data Bridges. The script uses the DataBridgesShapes class from the data_bridges_utils module to interact with the Data Bridges API and retrieve various datasets, including:
+- Household survey data
+- GORP (Global Operational Response Plan) data
+- Exchange rates and prices for Afghanistan
+- IPC and equivalent food security data
+
+The script demonstrates how to use the DataBridgesShapes class to fetch these datasets and print the first few rows of the resulting pandas DataFrames.
+"""
+#%%
+
+from data_bridges_utils import DataBridgesShapes
+from data_bridges_utils.labels import map_value_labels
+
+CONFIG_PATH = r"data_bridges_api_config.yaml"
+
+client = DataBridgesShapes(CONFIG_PATH)
+
+#%% XLSForm definition and Household dataset
+
+CONGO_CFSVA = {
+ 'questionnaire': 1509,
+ 'dataset': 3094
+}
+# get household survey data
+survey_data = client.get_household_survey(survey_id=CONGO_CFSVA["dataset"], access_type='full')
+# get XLSForm data
+questionnaire = client.get_household_questionnaire(CONGO_CFSVA["questionnaire"])
+
+# Map the categories to survey_data
+mapped_survey_data = map_value_labels(survey_data, questionnaire)
+
+
+#%% GORP data
+# Get GORP data
+latest_data = client.get_gorp('latest')
+list_data = client.get_gorp('list')
+regional_latest_data = client.get_gorp('regional_latest')
+global_latest_data = client.get_gorp('global_latest')
+# #%
+
+#%% Market data
+exchange_rates = client.get_exchange_rates('AFG')
+prices = client.get_prices('AFG', '2022-01-01')
+
+
+#%% IPC equivalent
+food_security = client.get_food_security()
+afg_food_security = client.get_food_security("AFG", 2024)
+
diff --git a/pyproject.toml b/pyproject.toml
index e69de29..bbadce2 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -0,0 +1,31 @@
+[build-system]
+requires = ["setuptools>=61.0.0", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "data_bridges_utils"
+version = "1.0.0"
+authors = [{ name = "Alessandra Gherardelli", email = "alessandra.gherardelli@wfp.org" }, {name = "Valerio Giuffrida", email = "valerio.giuffrida@wfp.org"}]
+description = "Wrapper for Data Bridges API client"
+readme = "README.md"
+license = { file = "LICENSE" }
+classifiers = ["License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)"]
+keywords = ["VAM", "WFP", "data"]
+requires-python = ">=3.9"
+
+dependencies = [
+ 'PyYAML',
+ 'pandas>=2',
+ 'pystata',
+ 'stata-setup',
+ 'data-bridges-client @ git+https://github.com/WFP-VAM/DataBridgesAPI.git@hotfix-4.1.0',
+]
+
+[project.optional-dependencies]
+dev = ["black", "bumpver", "isort", "pip-tools", "pytest"]
+data-bridges-utils = []
+data-bridges-utils-STATA = ["stata-setup", "pystata"]
+data-bridges-utils-R = []
+
+[tool.setuptools]
+packages = ["data_bridges_utils"]
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..9c96c8b
Binary files /dev/null and b/requirements.txt differ
diff --git a/sample_R_load.R b/sample_R_load.R
deleted file mode 100644
index e9716d5..0000000
--- a/sample_R_load.R
+++ /dev/null
@@ -1 +0,0 @@
-# TODO: Load into R dataframe
diff --git a/sample_python_load.py b/sample_python_load.py
deleted file mode 100644
index cfd7ae8..0000000
--- a/sample_python_load.py
+++ /dev/null
@@ -1,18 +0,0 @@
-
-"""
-Read Household Data from Data Bridges and load it into STATA.
-Only works if user has STATA 18+ installed and added to PATH.
-"""
-from data_bridges_utils import DataBridgesShapes
-
-CONFIG_PATH = r"data_bridges_api_config.yaml"
-
-client = DataBridgesShapes(CONFIG_PATH)
-
-# Get houhold data for survey id
-survey_data = client.get_household_survey(survey_id=3329, access_type='full')
-print(survey_data.head())
-
-# TODO: other API calls, including GORP and IPC
-# food_sec = client.get_ipc_equivalent("AFG", 2023)
-# print(food_sec.head())
diff --git a/sample_stata_load.py b/sample_stata_load.py
deleted file mode 100644
index 56ca349..0000000
--- a/sample_stata_load.py
+++ /dev/null
@@ -1,17 +0,0 @@
-"""
-Read a 'full' Household dataset from Data Bridges and load it into STATA.
-Only works if user has STATA 18+ installed and added to PATH.
-"""
-
-from data_bridges_utils import DataBridgesShapes, load_stata
-
-CONFIG_PATH = r"data_bridges_api_config.yaml"
-
-# Initialize DataBridges client with credentials from YAML file
-client = DataBridgesShapes(CONFIG_PATH)
-
-# Get houhold data for survey id
-survey_data = client.get_household_survey(survey_id=3329, access_type='full')
-
-# Load into STATA dataframe
-ds = load_stata(survey_data)
\ No newline at end of file
diff --git a/setup.py b/setup.py
index 812d4f1..8d2876e 100644
--- a/setup.py
+++ b/setup.py
@@ -8,44 +8,32 @@
setup(
name='data_bridges_utils',
version='1.0.0',
- description='Utilities for working with the WFP Data Bridges API',
+ description='Wrapper for Data Bridges API client',
long_description=long_description,
long_description_content_type='text/markdown',
- url='https://github.com/your_org/data_bridges_utils',
- author='Your Name',
- author_email='your.email@example.com',
- license='MIT',
+ url='https://github.com/WFP-VAM/DataBridgesUtils',
+ author='Alessandra Gherardelli, Valerio Giuffrida',
+ author_email='alessandra.gherardelli@wfp.org, valerio.giuffrida@wfp.org',
+ license='GNU General Public License v3 or later (GPLv3+)',
classifiers=[
- 'Development Status :: 4 - Beta',
- 'Intended Audience :: Developers',
- 'License :: OSI Approved :: MIT License',
- 'Programming Language :: Python :: 3',
- 'Programming Language :: Python :: 3.6',
- 'Programming Language :: Python :: 3.7',
- 'Programming Language :: Python :: 3.8',
- 'Programming Language :: Python :: 3.9',
+ 'License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)',
],
- keywords='wfp data bridges api',
+ keywords=['VAM', 'WFP', 'data'],
packages=find_packages(exclude=['tests', 'tests.*']),
- python_requires='>=3.6',
- install_requires=[
- 'PyYAML',
- 'pandas>=2',
- 'pystata',
- 'stata-setup',
- 'data_bridges_client',
- ],
+ python_requires='>=3.9',
extras_require={
'dev': [
- 'pytest',
- 'pytest-cov',
- 'flake8',
'black',
+ 'bumpver',
'isort',
+ 'pip-tools',
+ 'pytest',
],
+ 'data-bridges-utils': [],
+ 'data-bridges-utils-STATA': [
+ 'stata-setup',
+ 'pystata',
+ ],
+ 'data-bridges-utils-R': [],
},
- project_urls={
- 'Bug Reports': 'https://github.com/your_org/data_bridges_utils/issues',
- 'Source': 'https://github.com/your_org/data_bridges_utils',
- },
-)
\ No newline at end of file
+)
diff --git a/tests/__init__.py b/tests/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/tests/test_get_data.py b/tests/test_get_data.py
new file mode 100644
index 0000000..e69de29
diff --git a/tests/test_labels.py b/tests/test_labels.py
new file mode 100644
index 0000000..e69de29
diff --git a/tests/test_load_stata.py b/tests/test_load_stata.py
new file mode 100644
index 0000000..e69de29