Under Development: Electronic health record queries
This feature has been added to the EHR_development branch, which is not currently stable.
EHR queries work much like the YAML-enabled phenotype queries. To filter for certain individuals, a samples_filters block can be specified just as in a phenotype query. In the query section, a table must be specified; the optional tags are max_records, columns, and records_filters. We will go through these tags individually and then see an example that puts them together.
- table specifies the name of the database table to query. The tables are loaded into ukbREST with the same names given to them in the UK Biobank data showcase.
- columns specifies which columns should be returned. If not specified, ukbREST defaults to returning all columns from the table.
- records_filters is a list of filters for the rows of the EHR table. This acts much like the samples_filters section. Row filters could be entries such as diag_icd9 = 'A123' or dsource != 'HES'.
- max_records can be specified to set a limit on the number of records returned by ukbREST. By default, there is no limit.
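The rules above (one required tag, three optional ones, and their defaults) can be sketched as a small check on an already-parsed query section. This is only an illustration: normalize_ehr_section is a hypothetical helper, not part of ukbREST, and rejecting unknown tags is an assumption of the sketch rather than documented ukbREST behavior.

```python
# Hypothetical helper, not part of ukbREST. The tag names come from the
# description above; rejecting unknown tags is an assumption of this sketch.
REQUIRED_TAGS = {"table"}
OPTIONAL_TAGS = {"columns", "records_filters", "max_records"}


def normalize_ehr_section(section):
    """Check a parsed query section and fill in the documented defaults."""
    unknown = set(section) - REQUIRED_TAGS - OPTIONAL_TAGS
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    missing = REQUIRED_TAGS - set(section)
    if missing:
        raise ValueError(f"missing required tags: {sorted(missing)}")
    return {
        "table": section["table"],
        # None stands for "all columns", the documented default.
        "columns": section.get("columns"),
        # An empty list means that no row filters are applied.
        "records_filters": list(section.get("records_filters", [])),
        # None stands for "no limit", the documented default.
        "max_records": section.get("max_records"),
    }
```

Calling normalize_ehr_section({"table": "gp_clinical"}) returns a section with every optional tag set to its default, while a section without table raises ValueError.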
So an EHR query located at ~/ehr_query.yaml may look like this:
$ cat ~/ehr_query.yaml
samples_filters:
  - eid not in (select eid from withdrawals)
  - c31_0_0 = 0

ehr_query:
  table: gp_clinical
  records_filters:
    - data_provider = 2
    - event_dt > 31/12/1999
  columns:
    - event_dt
    - read_2
    - value_1
    - value_2
    - value_3
  max_records: 1000
This query requests records from gp_clinical, the table of clinical primary care events. The columns event_dt, read_2, value_1, value_2, and value_3 are requested. Only records supplied by data_provider 2 (a GB data provider; check the UK Biobank's primary care documentation for more details) and dated after December 31, 1999 are returned. The individuals are filtered to ensure that they have not withdrawn consent (eid not in (select eid from withdrawals)) and that they are female (c31_0_0 = 0). This will likely generate a large number of records, so only the first 1000 will be returned.
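As a rough mental model of how these filters combine, the same selection can be written against a few in-memory rows. This is not how ukbREST evaluates the query (it runs against the database). select_records and its inputs are invented for illustration, and the sketch assumes that all filters must hold at once and that max_records is applied after filtering.

```python
from datetime import date
from itertools import islice


def select_records(records, eligible_eids, max_records=None):
    """Mimic the example query on in-memory rows (illustrative only).

    eligible_eids plays the role of samples_filters: the individuals who
    have not withdrawn and who are female.
    """
    matching = (
        row
        for row in records
        if row["eid"] in eligible_eids             # samples_filters
        and row["data_provider"] == 2              # records_filters
        and row["event_dt"] > date(1999, 12, 31)   # records_filters
    )
    # islice(..., None) yields every match, i.e. "no limit".
    return list(islice(matching, max_records))
```

A record is kept only if its individual passes the sample filters and the record itself passes every row filter; the limit then truncates whatever remains.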
This query can be executed with curl, similarly to the phenotype queries. The YAML file and section are specified in the curl command, which accesses the resource at /ukbrest/api/v1.0/ehr:
curl -H accept:text/csv \
  "http://127.0.0.1:5000/ukbrest/api/v1.0/ehr" \
  -F file=@ehr_query.yaml \
  -F section=ehr_query \
  > my_data.csv
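For scripted use, the same request can be issued from Python. The following is a minimal sketch using only the standard library. The URL, the Accept header, and the form-field names file and section are taken from the curl command above; build_ehr_request and run_ehr_query are hypothetical helpers, not part of ukbREST.

```python
import uuid
import urllib.request
from pathlib import Path

# URL taken from the curl command above.
EHR_URL = "http://127.0.0.1:5000/ukbrest/api/v1.0/ehr"


def build_ehr_request(yaml_text, section, url=EHR_URL):
    """Build a multipart/form-data POST mirroring the curl -F options."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}",
        'Content-Disposition: form-data; name="file"; filename="ehr_query.yaml"',
        "Content-Type: application/octet-stream",
        "",
        yaml_text,
        f"--{boundary}",
        'Content-Disposition: form-data; name="section"',
        "",
        section,
        f"--{boundary}--",
        "",
    ]
    body = "\r\n".join(parts).encode("utf-8")
    headers = {
        "Accept": "text/csv",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_ehr_query(yaml_path, section, out_path):
    """Send the query and save the CSV response (needs a running ukbREST server)."""
    request = build_ehr_request(Path(yaml_path).read_text(), section)
    with urllib.request.urlopen(request) as response:
        Path(out_path).write_bytes(response.read())
```

With a server running, run_ehr_query("ehr_query.yaml", "ehr_query", "my_data.csv") would write the same file as the curl command.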