Home
These are the web-services:
- auth-token; retrieve an authentication token (for use with the other web-services).
- boadicea; calculates risks and mutation carrier probabilities for breast cancer (see code).
- ovarian; calculates risks and mutation carrier probabilities for ovarian cancer (see code).
- vcf2prs; takes a VCF file and returns a PRS (alpha and beta) for use in the boadicea web-service (see code).
The following describes auth-token and boadicea.
All requests to the web-services are made over HTTPS. An authentication token can be requested using the ‘auth-token’ web-service and added to the request authorization headers. Data is sent to and from the web-service in JSON format.
Patient-identifiable data is not needed to run the risk calculations; the client software should remove it to de-identify the data before submitting a request to the web-service.
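One way a client might de-identify a pedigree before submission is sketched below. It assumes a tab-separated BOADICEA v4 pedigree whose second line is the column header, and it treats only the 'Name' column as identifying; both are assumptions of this sketch rather than requirements stated by the web-service, and the function name `deidentify_pedigree` is hypothetical.

```python
def deidentify_pedigree(pedigree_text: str) -> str:
    """Replace every value in the 'Name' column with a neutral label (P1, P2, ...)."""
    lines = pedigree_text.splitlines()
    if len(lines) < 2:
        raise ValueError("expected a format line and a column header line")

    # Line 1 is the format line, line 2 the column header (assumed tab-separated).
    header = [col.strip() for col in lines[1].split("\t")]
    name_col = header.index("Name")  # raises ValueError if there is no Name column

    out = lines[:2]
    label_number = 0
    for line in lines[2:]:
        if not line.strip():
            continue  # skip blank lines
        label_number += 1
        fields = line.split("\t")
        fields[name_col] = f"P{label_number}"  # short alphanumeric placeholder
        out.append("\t".join(fields))
    return "\n".join(out)
```

Family and individual identifiers are left untouched here because the parent columns refer to them; a client that uses real identifiers in those columns would need to remap them consistently.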
The web-service accepts pedigree data in the CanRisk file format and in the BOADICEA pedigree data format v4. The pedigree can be sent either as a ‘pedigree_data’ field in the JSON request or posted as a file. See the example usage below.
- Obtaining Authentication Token:
  curl -k 'https://{URL}/auth-token/' \
    -d '{"username": "XYZ", "password": "ABC"}' \
    -H "Content-Type: application/json"
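The same token request can be sketched in Python with the standard library. The base URL is a placeholder, the helper names are hypothetical, and the response field name `token` is an assumption: the format of the auth-token response is not described here, so check it against an actual response.

```python
import json
import urllib.request


def build_token_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the auth-token web-service."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/auth-token/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(base_url: str, username: str, password: str) -> str:
    """Send the request and return the token from the JSON response."""
    request = build_token_request(base_url, username, password)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["token"]  # assumption: the token is returned under the key "token"
```

Unlike `curl -k`, `urllib.request.urlopen` verifies the server certificate by default.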
- Pedigree as a JSON parameter (replace <TOKEN> with the authentication token):
  curl -k -XPOST -H "Content-Type: application/json" \
    -H 'Authorization: Token <TOKEN>' -H "Accept: application/json" \
    -d '{"mut_freq": "UK", "cancer_rates": "UK", "user_id": "end_user_id",
         "pedigree_data":"BOADICEA import pedigree file format 4.0\nFamID\tName\tTarget\tIndivID\tFathID\tMothID\tSex\tMZtwin\tDead\tAge\tYob\t1stBrCa\t2ndBrCa\tOvCa\tProCa\tPanCa\tAshkn\tBRCA1t\tBRCA1r\tBRCA2t\tBRCA2r\tPALB2t\tPALB2r\tATMt\tATMr\tCHEK2t\tCHEK2r\tER\tPR\tHER2\tCK14\tCK56\nXXX1 \tF1 \t1\t1 \t3 \t2 \tF\t0\t0\t23 \t1993\t21 \t0 \t0 \t0 \t0 \t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\nXXX1 \tF2 \t0\t2 \t0 \t0 \tF\t0\t0\t55 \t1961\t0 \t0 \t0 \t0 \t0 \t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\nXXX1 \tM2 \t0\t3 \t0 \t0 \tM\t0\t0\t53 \t1963\t0 \t0 \t0 \t0 \t0 \t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0"}' \
    https://{URL}/boadicea/
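Escaping the tabs and newlines of a pedigree by hand is error-prone, so a client may prefer to let a JSON library do it. The sketch below builds the same request in Python; the function name and the base URL are hypothetical, and the default parameter values simply mirror the curl example.

```python
import json
import urllib.request


def build_boadicea_request(base_url: str, token: str, pedigree_data: str,
                           user_id: str, mut_freq: str = "UK",
                           cancer_rates: str = "UK") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the boadicea web-service."""
    payload = {
        "mut_freq": mut_freq,
        "cancer_rates": cancer_rates,
        "user_id": user_id,
        # Real tab and newline characters; json.dumps escapes them as \t and \n.
        "pedigree_data": pedigree_data,
    }
    return urllib.request.Request(
        f"{base_url}/boadicea/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Token {token}",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` and the body decoded with `json.load`.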
- Pedigree file posted as a form (leave off -H "Accept: application/json" to return as XML):
  curl -k -XPOST -F "mut_freq=UK" -F "cancer_rates=UK" \
    -F "user_id=end_user_id" \
    -F "pedigree_data=@/home/xxx/bwa4_pedigree_data.txt" \
    https://{URL}/boadicea/ -H 'Authorization: Token <TOKEN>' \
    -H "Accept: application/json"
Results are returned in JSON format. The mutation frequencies, mutation sensitivity values and cancer incidence rates used in the calculations are reported along with the version of the BOADICEA model and a timestamp. The results for each family in the input are given as an array in 'pedigree_result' and can be identified by their 'family_id'. The baseline (population), lifetime, ten-year age range (40-49) and remaining lifetime risks are given.
The remaining lifetime risks are given in the 'cancer_risks' array and the baseline risks in the 'baseline_cancer_risks' array. Lifetime and ten-year age range risks are given in the 'lifetime_cancer_risk' and 'ten_yr_cancer_risk' arrays respectively, with their baseline equivalents in 'baseline_lifetime_cancer_risk' and 'baseline_ten_yr_cancer_risk'. The mutation carrier probabilities for each gene are given in the 'mutation_probabilties' array (the key is spelled this way in the response).
{
  "version": "BOADICEA_5.0.0",
  "timestamp": "2019-01-16T14:49:39.028628",
  "mutation_frequency": {
    "UK": {
      "PALB2": 0.000575,
      "BRCA2": 0.00102,
      "CHEK2": 0.002614,
      "ATM": 0.001921,
      "BRCA1": 0.0006394
    }
  },
  "mutation_sensitivity": {
    "PALB2": 0.9,
    "BRCA2": 0.9,
    "CHEK2": 1,
    "ATM": 0.9,
    "BRCA1": 0.9
  },
  "cancer_incidence_rates": "UK",
  "pedigree_result": [
    {
      "family_id": "XXXX1",
      "proband_id": "PB",
      "cancer_risks": [
        {
          "age": 41,
          "breast cancer risk": {
            "percent": 0.3,
            "decimal": 0.003276
          },
          "ovarian cancer risk": {
            "percent": 0,
            "decimal": 0.0001412
          }
        },
        ...
        {
          "age": 80,
          "breast cancer risk": {
            "percent": 22.1,
            "decimal": 0.2205581
          },
          "ovarian cancer risk": {
            "percent": 1.9,
            "decimal": 0.0194776
          }
        }
      ],
      "baseline_cancer_risks": [
        {
          "age": 41,
          "breast cancer risk": {
            "percent": 0.1,
            "decimal": 0.0010082
          },
          "ovarian cancer risk": {
            "percent": 0,
            "decimal": 0.0000963
          }
        },
        ...
        {
          "age": 80,
          "breast cancer risk": {
            "percent": 11,
            "decimal": 0.1097168
          },
          "ovarian cancer risk": {
            "percent": 1.6,
            "decimal": 0.0163075
          }
        }
      ],
      "lifetime_cancer_risk": [
        {
          "age": 80,
          "breast cancer risk": {
            "percent": 23.7,
            "decimal": 0.237001
          },
          "ovarian cancer risk": {
            "percent": 2.2,
            "decimal": 0.021647
          }
        }
      ],
      "baseline_lifetime_cancer_risk": [
        {
          "age": 80,
          "breast cancer risk": {
            "decimal": 0.1153311,
            "percent": 11.5
          },
          "ovarian cancer risk": {
            "decimal": 0.0175416,
            "percent": 1.8
          }
        }
      ],
      "ten_yr_cancer_risk": [
        {
          "age": 50,
          "breast cancer risk": {
            "percent": 4.9,
            "decimal": 0.0485692
          },
          "ovarian cancer risk": {
            "percent": 0.2,
            "decimal": 0.0020721
          }
        }
      ],
      "baseline_ten_yr_cancer_risk": [
        {
          "age": 50,
          "breast cancer risk": {
            "decimal": 0.0170431,
            "percent": 1.7
          },
          "ovarian cancer risk": {
            "decimal": 0.0016042,
            "percent": 0.2
          }
        }
      ],
      "mutation_probabilties": [
        {
          "no mutation": {
            "percent": 95.7,
            "decimal": 0.9572
          }
        },
        {
          "BRCA1": {
            "percent": 0.7,
            "decimal": 0.0069
          }
        },
        {
          "BRCA2": {
            "percent": 1.1,
            "decimal": 0.0107
          }
        },
        {
          "PALB2": {
            "percent": 0.5,
            "decimal": 0.0048
          }
        },
        {
          "CHEK2": {
            "percent": 1.2,
            "decimal": 0.0121
          }
        },
        {
          "ATM": {
            "percent": 0.8,
            "decimal": 0.0084
          }
        }
      ]
    },
    {
      "family_id": "XXX2",
      "cancer_risks": [ ... ],
      "baseline_cancer_risks": [ ... ],
      "lifetime_cancer_risk": [ ... ],
      "baseline_lifetime_cancer_risk": [ ... ],
      "ten_yr_cancer_risk": [ ... ],
      "baseline_ten_yr_cancer_risk": [ ... ],
      "mutation_probabilties": [ ... ],
      "prs": {},
      "risk_factors": {
        "age_of_first_live_birth": "25-29",
        "age_of_menopause": "45-49",
        "alcohol_intake": "<5",
        "bmi": "18.5-<25",
        "height": ">=172.70",
        "mammographic_density": "-",
        "menarche_age": "15",
        "mht": "current e-type",
        "oral_contraception": "never",
        "parity": "2"
      }
    }
  ],
  "warnings": [
    "year of birth and age at last follow up must be specified in order for PF to be included in a calculation",
    "year of birth and age at last follow up must be specified in order for PGA to be included in a calculation"
  ]
}
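A client will usually want a few values out of this structure rather than the whole document. The sketch below shows one way to read the lifetime breast cancer risk per family and to flatten the carrier probabilities; it relies only on the keys visible in the sample response, and the function names are hypothetical.

```python
def lifetime_breast_risk_by_family(result: dict) -> dict:
    """Map each family_id to its lifetime breast cancer risk (decimal)."""
    risks = {}
    for family in result.get("pedigree_result", []):
        lifetime = family.get("lifetime_cancer_risk") or []
        if lifetime:  # the array holds a single entry in the sample response
            risks[family["family_id"]] = lifetime[0]["breast cancer risk"]["decimal"]
    return risks


def carrier_probabilities(family: dict) -> dict:
    """Flatten the list of single-key objects into {gene: decimal probability}."""
    probabilities = {}
    # The key is spelled 'mutation_probabilties' in the sample response.
    for entry in family.get("mutation_probabilties", []):
        for gene, value in entry.items():
            probabilities[gene] = value["decimal"]
    return probabilities
```

Using `.get` with an empty default keeps the helpers usable on families for which part of the result is missing.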
A different HTTP response status is returned depending on the type of error encountered. Before the calculation is run, the parameters and pedigree are validated. These are the errors that can arise:
- A missing model parameter (e.g. pedigree data, cancer rates) returns a 400 status (bad request).
- An authentication error returns a 401 status (unauthorized).
- Pedigree validation errors return a 400 status (bad request). The key in the response is one of:
  - 'Pedigree File Error' - e.g. header errors, wrong no. of columns.
  - 'Person Error' - e.g. sex of parent validation, IDs alphanumeric.
  - 'Pedigree Error' - e.g. no target, duplicate individual.
  - 'Cancer Error' - e.g. male with ovarian cancer, first breast cancer missing if second present.
  - 'Pathology Error' - e.g. pathology results not correctly specified.
  - 'Genetic Test Error' - e.g. mutation status specified if tested.
  and the value describes why validation has failed, e.g.:
  "Pedigree Error": "Pedigree (XXX) family members are not physically connected to the target: ['MOO']"
  See below for details of error messages.
- Errors from the Fortran code return a 406 status (not acceptable):
  "detail": "Error: 18"
"The first header record in the pedigree file has unexpected characters. The first header record must be 'BOADICEA import pedigree file format 4'."
"Column headers in the pedigree file contains unexpected characters. It must include the 'FamID', 'Target', 'IndivID','FathID' and 'MothID' in columns 1, 3, 4, 5 and 6 respectively."
"A data record has an unexpected number of data items. BOADICEA format 4 pedigree files should have <BOADICEA_PEDIGREE_FORMAT_FOUR_DATA_FIELDS> data items per line."
"Invalid batch file type."
"Invalid BOADICEA import format pathology test option has unexpected characters."
"Program string has unexpected value" Pedigree Error "A value in the Target data column has been set to . Target column parameters must be set to '0' or '1'."
"Individual ID appears more than once in the pedigree file."
"Pedigree has either no index or more than 1 index individuals. Only one target can be specified."
"Pedigree has unexpected number of family members pedigree_size”
"Family ID (1st data column) has been set to . Family IDs must be specified with between 1 and <MAX_LENGTH_PEDIGREE_NUMBER_STR>) non-zero number or alphanumeric characters."
"Pedigree family members are not physically connected to the target: unconnected”
"The target's year of birth has been set to . This person must be assigned a valid year of birth."
"The target's age has been set to . This person must be assigned an age."
"BOADICEA cannot compute mutation carrier probabilities because the target has a positive genetic test. Also BOADICEA cannot compute breast and ovarian cancer risks because the target is: (1) over <MAX_AGE_FOR_RISK_CALCS> years old or (2) male, or (3) an affected female who has developed contralateral breast cancer, ovarian cancer or pancreatic cancer."
"MZ twin identifier does not appear twice in the pedigree file. Only MZ twins are permitted in the pedigree, MZ triplets or quads are not allowed."
"Invalid MZ twin character . MZ twins must be identified using one of the following ASCII characters: UNIQUE_TWIN_IDS)."
"Monozygotic (MZ) twins identified with the character have different parents. MZ twins must have the same parents."
"Monozygotic (MZ) twins identified with the character have different years of birth. MZ twins must have the same year of birth."
"Monozygotic (MZ) twins identified with the character have different ages. If both MZ twins are alive, they must have the same age at last follow up."
"Monozygotic (MZ) twins identified with the character have a different sex. MZ twins must have the same sex."
"Monozygotic (MZ) twins have both had a genetic test, but the genetic test results for these individuals are different. Under these circumstances, the genetic test results must be the same."
"Maximum number of MZ twin pairs has been exceeded. Input pedigrees must have a maximum of <MAX_NUMBER_MZ_TWIN_PAIRS> MZ twin pairs."
"The sex of family member ‘’ is invalid. An individuals sex must be specified as 'M' or 'F' only."
"A name ‘’ is unspecified or is not an alphanumeric string."
"An individual identifier (IndivID column) was specified as . Individual identifiers must be alphanumeric strings with a maximum of <MAX_FAMILY_ID_STR_LENGTH> characters."
"Father identifier (, FathID column) has unexpected characters. It must be alphanumeric strings with a maximum of <MAX_FAMILY_ID_STR_LENGTH> characters"
"Mother identifier (, MothID column) has unexpected characters. It must be alphanumeric strings with a maximum of <MAX_FAMILY_ID_STR_LENGTH> characters"
"Family member name has only one parent specified. All family members must have no parents specified (i.e. they must be founders) or both parents specified."
"The mother mothid of family member is missing from the pedigree."
"The father fathid of family member is missing from the pedigree."
"The father of family member is not specified as male. All fathers in the pedigree must have sex specified as 'M'."
"The mother of family member is not specified as female. All mothers in the pedigree must have sex specified as 'F'."
"The family member has an invalid vital status (alive must be specified as '0', and dead specified as '1')"
"The age specified for family member has unexpected characters. Ages must be specified with as '0' for unknown, or in the range 1-MAX_AGE”
"The year of birth yob specified for family member is out of range. Years of birth must be in the range <MIN_YEAR_OF_BIRTH-current_year>.”
"Family member has been assigned an invalid Ashkenazi origin parameter. The Ashkenazi origin parameter must be set to '1' for Ashkenazi origin, or '0' for not Ashkenazi origin."
"Family member exeeded the maximum number of siblings <MAX_NUMBER_OF_SIBS_PER_NUCLEAR_FAMILY>."
"Family member exceeded the maximum number of siblings with the same year of birth exceeded <MAX_NUMBER_OF_SIBS_PER_NUCLEAR_FAMILY_WITH_SAME_YOB>.”
"Family member has been assigned an invalid genetic test type. It must be specified with '0' for untested, 'S' for mutation search or 'T' for direct gene test."
"Family member has been assigned an invalid genetic test result. Genetic test results must be '0' for untested, 'N' for no mutation, 'P' mutation detected."
"Family member has had a genetic test but the corresponding test result has not been specified."
"Family member has been assigned a genetic test result, but the corresponding genetic test type has not been specified."
"Invalid BOADICEA format four genetic test summary."
"Family member has been assigned an invalid test_type status. It must be 'N' for negative, 'P' for positive, or '0' for unknown."
"Family member has not developed breast cancer but has been assigned a breast cancer pathology test result (test_type). Pathology test results can only be assigned to family members who have developed breast cancer."
These are the rules used to define the breast cancer pathology representation in BOADICEA format:
RULES = "Please note the following rules for breast cancer pathology data: (1) if an individual's ER status is unspecified, no pathology information for that individual will be taken into account in the calculation; (2) if a breast cancer is ER positive, no other pathology information for that individual will be taken into account in the calculation; (3) if a breast cancer is ER negative, information on PR and HER2 for that individual will only be taken into account in the calculation if both PR and HER2 are specified; and (4) an individual's CK14 and CK5/6 status will only be taken into account in the calculation if both CK14 and CK5/6 are specified and the breast cancer is triple negative (ER negative, PR negative and HER2 negative). "
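The four rules can be expressed as a small function that reports which pathology fields would be taken into account for one individual. This is a sketch of the rules as worded above, not the service's implementation; it assumes each status is encoded as 'N' (negative), 'P' (positive) or '0' (unknown), and the function name is hypothetical.

```python
VALID_STATUS = {"N", "P", "0"}


def pathology_fields_used(er: str, pr: str, her2: str, ck14: str, ck56: str) -> set:
    """Return the names of the pathology fields that count under rules (1)-(4)."""
    for value in (er, pr, her2, ck14, ck56):
        if value not in VALID_STATUS:
            raise ValueError(f"invalid pathology status: {value!r}")

    if er == "0":
        return set()      # rule 1: ER unspecified, nothing is used
    if er == "P":
        return {"ER"}     # rule 2: ER positive, nothing else is used

    used = {"ER"}         # ER negative from here on
    if pr != "0" and her2 != "0":
        used |= {"PR", "HER2"}  # rule 3: PR and HER2 only count together
        triple_negative = pr == "N" and her2 == "N"
        if triple_negative and ck14 != "0" and ck56 != "0":
            used |= {"CK14", "CK56"}  # rule 4: both specified and triple negative
    return used
```

For example, `pathology_fields_used("N", "N", "0", "P", "P")` returns `{"ER"}`, because PR is specified without HER2 and the cancer therefore cannot be classed as triple negative.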
Below are warnings that accompany the BOADICEA results.
"Incomplete data record in the pedigree: family member has an unspecified ER status, but another pathology parameter (PR, HER2, CK14 or CK5/6) has been specified . As a result, this individual's pathology information will not be taken into account in this case."
"Incomplete data record in the pedigree: family member has a breast cancer pathology where PR status is specified but HER2 status is unspecified (or vice versa). . As a result, PR and HER2 status will not be taken into account in this case."
"Incomplete data record in the pedigree: family member has a breast cancer pathology where only CK14 or CK5/6 status has been specified. . As a result, CK14 and CK5/6 status will not be taken into account in this case."
"Incomplete data record in your pedigree: family member has a breast cancer pathology where CK14 or CK5/6 status is specified but the breast cancer pathology is not triple negative (ER negative, PR negative and HER2 negative). . As a result, CK14 and CK5/6 status will not be taken into account in this case."
"Incomplete data record in your pedigree: family member has a breast cancer pathology that is ER positive, where an additional pathology parameter (PR, HER2, CK14 or CK5/6) has been specified. . As a result, only ER positive status will be taken into account in this case."
"Family member has an age at cancer diagnosis (ctype)specified as . Age at cancer diagnosis must be set to '0' for unaffected, 'AU' for affected at unknown age, or specified with an integer in the range 1-<MAX_AGE>."
"Family member has been assigned an age at cancer diagnosis that exceeds age at last follow up. An age at cancer diagnosis must not exceed an age at last follow up."
"Family member is male but has been assigned an ovarian cancer diagnosis."
"Family member is female but has been assigned an prostate cancer diagnosis."
"Family member has had contralateral breast cancer, but the age at diagnosis of the first breast cancer is missing."
"Family member has had contralateral breast cancer, but the age at diagnosis of the first breast cancer exceeds that of the second breast cancer."
"Family member has had contralateral breast cancer, but the age at diagnosis of the first breast cancer is missing."