Dev #3

Merged
merged 51 commits into from
Jun 20, 2024
Commits
17560a8
Initiated pyproject.toml
AlexGherardelli May 28, 2024
c654692
Added requirements document and changed number of pages
AlexGherardelli May 31, 2024
58e0419
Fixed typo
AlexGherardelli May 31, 2024
d466ab1
Added IPC food security endpoint example
AlexGherardelli May 31, 2024
da553f0
Added GORP endpoint examples and handling of data
AlexGherardelli May 31, 2024
fd55d19
Updated gitignore. Testing R sample load
AlexGherardelli May 31, 2024
58d6fc7
Updated README
AlexGherardelli May 31, 2024
e17e25c
Added pyproject.toml and setup.py
AlexGherardelli May 31, 2024
3234aed
Minor Fixes to pyproject.toml
AlexGherardelli May 31, 2024
518c652
Minor fixes to setup.py
AlexGherardelli May 31, 2024
668b3bc
Minor fixes
AlexGherardelli May 31, 2024
b6ff7f1
Changed dependency URL in setup.py and pyproject.toml
AlexGherardelli May 31, 2024
c6dab8e
Minor fixes
AlexGherardelli May 31, 2024
6f247aa
Minor fixes
AlexGherardelli May 31, 2024
540e7a9
Minor fixes
AlexGherardelli May 31, 2024
241e16e
Updated README file and examples
AlexGherardelli May 31, 2024
87ae918
Typo
AlexGherardelli May 31, 2024
b03a96f
Temporary branch that uses hotfix
AlexGherardelli Jun 3, 2024
98a201c
XLSForm Responses
AlexGherardelli Jun 3, 2024
15dfa60
Updated ROADMAP
AlexGherardelli Jun 4, 2024
2ef10c1
Reverted pyproject.toml and setup.py to follow dev DataBridges client
AlexGherardelli Jun 4, 2024
f3ed862
XLSForm
AlexGherardelli Jun 4, 2024
4606e3d
Added logging
AlexGherardelli Jun 4, 2024
f5f285d
Get XLSForm data and map value labels to dataset
AlexGherardelli Jun 6, 2024
1116ca6
Improved example
AlexGherardelli Jun 6, 2024
389a2e2
Improved STATA example
AlexGherardelli Jun 6, 2024
f4eb89b
Changed STATA import
AlexGherardelli Jun 7, 2024
3d1a153
Tested STATA load
AlexGherardelli Jun 7, 2024
bd03aad
Handle Nan; improved stata import
AlexGherardelli Jun 7, 2024
ba15035
DTA export fails for dtypes
AlexGherardelli Jun 7, 2024
e65fa49
Labelling functions - some working better than others!
AlexGherardelli Jun 10, 2024
edf5d0a
Created functions to get labels
AlexGherardelli Jun 10, 2024
9360706
Merge pull request #2 from WFP-VAM/labels
AlexGherardelli Jun 10, 2024
a1b9a38
Updated installation instructions
AlexGherardelli Jun 14, 2024
ac295da
Updated setup.py
AlexGherardelli Jun 14, 2024
54e5872
Updated README file with examples
AlexGherardelli Jun 17, 2024
bb9c40f
Updated README
AlexGherardelli Jun 17, 2024
13dc302
Updated README
AlexGherardelli Jun 17, 2024
a2c5b07
Updated .gitignore
AlexGherardelli Jun 17, 2024
cd4dfed
Updated .gitignore
AlexGherardelli Jun 17, 2024
b0f0675
Moved examples in example folder
AlexGherardelli Jun 17, 2024
d92ff47
updated README
AlexGherardelli Jun 17, 2024
e80c681
Updated STATA example
AlexGherardelli Jun 17, 2024
e56ff58
Updated STATA example file
AlexGherardelli Jun 17, 2024
81efd76
Updated STATA example
AlexGherardelli Jun 17, 2024
868b158
Fix FutureWarning in load_stata
AlexGherardelli Jun 17, 2024
9eefdc5
Updated STATA example
AlexGherardelli Jun 17, 2024
72401c8
Updated STATA example
AlexGherardelli Jun 17, 2024
5d4181d
Update README.md
AlexGherardelli Jun 18, 2024
235efc8
Update README.md
AlexGherardelli Jun 18, 2024
c03ee51
Update README.md
AlexGherardelli Jun 18, 2024
9 changes: 8 additions & 1 deletion .gitignore
@@ -163,4 +163,11 @@ cython_debug/

# Custom
data_bridges_api_config.yaml
ROADMAP.md
.Rproj.user
.RData
.Rhistory
*.Rproj
*.yaml
sandbox.py
*.csv
.vscode
160 changes: 59 additions & 101 deletions LICENSE.md → LICENSE
@@ -1,21 +1,23 @@
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The GNU General Public License is a free, copyleft license for
software and other kinds of works.

The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
our General Public Licenses are intended to guarantee your freedom to
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
@@ -24,34 +26,44 @@ them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.

A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.

The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.

An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.

Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and
modification follow.
@@ -60,7 +72,7 @@ modification follow.

0. Definitions.

"This License" refers to version 3 of the GNU Affero General Public License.
"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
@@ -537,45 +549,35 @@ to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

13. Remote Network Interaction; Use with the GNU General Public License.

Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.

If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

@@ -615,47 +617,3 @@ reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.

You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.
110 changes: 110 additions & 0 deletions README.md
@@ -0,0 +1,110 @@
# Data Bridges Connect

This Python module allows you to get data from the WFP Data Bridges API, including household survey data, market prices, exchange rates, GORP (Global Operational Response Plan) data, and food security data (IPC equivalent). It is a wrapper for the [Data Bridges API Client](https://github.com/WFP-VAM/DataBridgesAPI), giving data analysts an easier way to get VAM and monitoring data in their language of choice (Python, R, or STATA).

## Installation

> NB: This is the dev version of the `data_bridges_utils` and API client packages. It is updated frequently and is not yet stable.

You can install the `data_bridges_utils` package using `pip` and the Git repository URL:

```
pip install --force-reinstall git+https://github.com/WFP-VAM/DataBridgesConnect.git@dev
```
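
To confirm that the install succeeded before running the examples, a quick check along these lines may help. It is a minimal sketch that uses only the standard library; `is_installed` is a hypothetical helper name and is not part of the package.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("data_bridges_utils"):
        print("data_bridges_utils is available")
    else:
        print("data_bridges_utils not found; re-run the pip command above")
```

Because `find_spec` only locates the module, this check does not trigger any API calls or require the configuration file to exist yet.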

## Configuration
1. Create a ```data_bridges_api_config.yaml``` file in the folder you run your code from.
2. The structure of the file is:
```
NAME: ''
VERSION : ''
KEY: ''
SECRET: ''
SCOPES:
- ''
- ''
```
3. Fill in `KEY` and `SECRET` with your actual API key and secret from the Data Bridges API, and list the scopes required for your use case under `SCOPES`.
4. (For WFP users) Credentials and scopes for the Data Bridges API can be requested by opening a ticket with the [TEC Digital Core team](https://dev.azure.com/worldfoodprogramme/Digital%20Core/_workitems). See the [documentation](https://docs.api.wfp.org/consumers/index.html#application-accounts).
5. External users can reach out to [[email protected]](mailto:[email protected]) for support on getting API credentials.
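
A missing or misspelled key in the YAML file is an easy mistake to make. The sketch below checks that the five top-level keys from the template are present before the file is handed to the client. It deliberately uses a regular expression instead of a YAML parser so that it needs only the standard library; `missing_config_keys` is a hypothetical helper, and it does not validate the values themselves.

```python
import re
from pathlib import Path

# Top-level keys taken from the configuration template
REQUIRED_KEYS = ("NAME", "VERSION", "KEY", "SECRET", "SCOPES")


def missing_config_keys(text: str) -> list[str]:
    """Return the required top-level keys that do not appear in the YAML text."""
    # A top-level key starts at column 0 and is followed by optional spaces and a colon.
    found = set(re.findall(r"^([A-Za-z_]+)[ \t]*:", text, flags=re.MULTILINE))
    return [key for key in REQUIRED_KEYS if key not in found]


if __name__ == "__main__":
    config_file = Path("data_bridges_api_config.yaml")
    if config_file.exists():
        missing = missing_config_keys(config_file.read_text(encoding="utf-8"))
        print("Missing keys:", missing or "none")
    else:
        print(f"{config_file} not found in the current folder")
```

An empty result only means the keys are present; empty strings for `KEY` or `SECRET` would still be rejected by the API.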

### Python
Run the following code to extract household survey data.

```python
from data_bridges_utils import DataBridgesShapes

CONFIG_PATH = "data_bridges_api_config.yaml"

client = DataBridgesShapes(CONFIG_PATH)

# Get household data for survey id
survey_data = client.get_household_survey(survey_id=3329, access_type='full')
print(survey_data.head())
```
A sample Python file with additional examples for other endpoints is provided in the repo.

### STATA
1. Make sure you declare where your Python instance is by setting ```python set exec "path/to/python/env"```
2. Run the following code to extract household survey data and load it into STATA as a flat dataset with value labels. Make sure to edit ```stata_path``` and ```stata_version``` to match your installation.

```stata
python set exec "path/to/python/env"

python:

"""
Read a 'base' Household dataset from Data Bridges and load it into STATA.
Only works if user has STATA 18+ installed and added to PATH.
"""

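# NOTE: get_column_labels and get_value_labels are called further down; importing them
# from the top-level package alongside map_value_labels is an assumption.
from data_bridges_utils import DataBridgesShapes, map_value_labels, get_column_labels, get_value_labels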
from data_bridges_utils.load_stata import load_stata
import stata_setup

# set installation path for STATA
stata_path = r"C:/Program Files/Stata18"
# set stata version
stata_version = "se"

stata_setup.config(stata_path, stata_version)
from sfi import Data, Macro, SFIToolkit, Frame, Datetime as dt

# Path to YAML file containing Data Bridges API credentials
CONFIG_PATH = r"data_bridges_api_config.yaml"

# Example dataset and questionnaire from 2023 Congo CFSVA
CONGO_CFSVA = {
'questionnaire': 1509,
'dataset': 3094
}

# Initialize DataBridges client with credentials from YAML file
client = DataBridgesShapes(CONFIG_PATH)

# Get household data for survey id
survey_data = client.get_household_survey(survey_id=CONGO_CFSVA["dataset"], access_type='base') # base is the standardized-only dataset
questionnaire = client.get_household_questionnaire(CONGO_CFSVA["questionnaire"])

# Map the categories to survey_data
mapped_survey_data = map_value_labels(survey_data, questionnaire)

# Get variable labels
variable_labels = get_column_labels(questionnaire)
# Get value labels
value_labels = get_value_labels(questionnaire)

# Return flat dataset with value labels
survey_data_with_value_labels = map_value_labels(survey_data, questionnaire)

# Load into STATA dataframe
ds = load_stata(survey_data_with_value_labels, stata_path, stata_version)

end
```
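
Conceptually, mapping value labels means replacing each coded answer in the dataset with the human-readable label defined for it in the questionnaire. The sketch below illustrates that idea on plain Python dictionaries. It is not how `map_value_labels` is implemented; the function name `apply_value_labels`, the column names, and the codes are all made up for illustration.

```python
def apply_value_labels(rows, value_labels):
    """Return copies of rows with coded values replaced by their labels.

    rows: list of dicts, one per survey record.
    value_labels: {column name: {code: label}}.
    Codes without a label, and columns without a mapping, are left unchanged.
    """
    labelled = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            labels = value_labels.get(column, {})
            new_row[column] = labels.get(value, value)
        labelled.append(new_row)
    return labelled


# Hypothetical records and labels
rows = [
    {"hh_id": 1, "head_sex": 0, "water_source": 2},
    {"hh_id": 2, "head_sex": 1, "water_source": 9},
]
value_labels = {
    "head_sex": {0: "Female", 1: "Male"},
    "water_source": {1: "Piped", 2: "Borehole"},
}

for record in apply_value_labels(rows, value_labels):
    print(record)
```

Leaving unmapped codes untouched, as this sketch does, keeps unexpected values visible in the output instead of silently turning them into missing data.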

## Contributing
Contributions are welcome! Please open an issue or submit a pull request if you have any improvements or bug fixes.

## License
This project is licensed under the GNU General Public License v3.0. See the LICENSE file for the full text.