The constraints explained

This file explains all constraints in detail.

The tissue data

tissue(1,cancer). tissue(2,healthy). Each tissue sample, that is, each row in the data file, is either healthy or cancerous as determined by the column with header Annots. The first argument is the integer TissueID, the second one of the constants healthy or cancer.

The miRNA data

data(1,g1,high). data(1,g2,low). These facts determine the expression levels of miRNAs in tissue samples. The first argument is the integer TissueID, the second a constant MiRNA value, the third one of the constants high or low.

User Input

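upper_bound_inputs(10). This fact specifies the upper bound on the total number of inputs of the classifier. The upper bound on the number of gates is also given by the user; in this example it is 6 and appears directly in the number of gates constraint below.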

gate type 1

is_gate_type(1). This fact declares an admissible gate type and is used to bind GateType in the constraints below. The argument is the integer GateType.

upper_bound_pos_inputs(2, 2). lower_bound_pos_inputs(2, 2). These facts specify the upper and lower bounds on the number of positive inputs for a gate type; the bounds on negative inputs are specified analogously. Each fact takes two arguments: the GateType and the value of the bound.

upper_bound_gate_occurence(1, 2). This fact specifies the upper bound on the number of occurrences of a gate type in the classifier. The first argument is the GateType, the second is the value of the bound.

binding of variables

is_tissue_id(X) :- tissue(X,Y). The predicate is_tissue_id(X) is not a constraint. It is used for binding values of TissueID in other constraints.

is_mirna(Y) :- data(X,Y,Z). The predicate is_mirna(Y) is not a constraint. It is used for binding values of MiRNA in other constraints.

is_sign(positive). is_sign(negative). These facts are used to bind the values of Sign.
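To make the role of the binding predicates concrete, the facts can be mirrored in plain Python. This is only an illustrative sketch: the dictionary layout, the helper names `tissue_ids` and `mirnas`, and the expression values for tissue 2 are assumptions made for the example and are not part of the encoding.

```python
# Toy facts; the values for tissue 2 are assumed for illustration only.
tissue = {1: "cancer", 2: "healthy"}   # mirrors tissue(TissueID, Annotation)
data = {                               # mirrors data(TissueID, MiRNA, Level)
    (1, "g1"): "high", (1, "g2"): "low",
    (2, "g1"): "low", (2, "g2"): "low",
}
SIGNS = ("positive", "negative")       # mirrors is_sign/1


def tissue_ids(tissue):
    """Mirror of is_tissue_id(X) :- tissue(X,Y)."""
    return sorted(tissue)


def mirnas(data):
    """Mirror of is_mirna(Y) :- data(X,Y,Z)."""
    return sorted({mirna for (_tissue_id, mirna) in data})


print(tissue_ids(tissue), mirnas(data))
```

Both helpers are pure projections of the facts, which is why the text calls the corresponding predicates bindings rather than constraints.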

Constraints

number of gates

1 {number_of_gates(1..6)} 1. is_integer(1..6). This constraint determines the number of gates that are used in the classifier. The predicate number_of_gates is true for exactly one integer value between 1 and the upper bound given by the user (6 in this example). The predicate is_integer is used to generate all GateIDs between 1 and the chosen number of gates.

is_gate_id(GateID) :- number_of_gates(X), is_integer(GateID), GateID<=X. This constraint generates a GateID for each number between 1 and number_of_gates(X). The predicate is_gate_id is used for binding values of GateID.
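The two rules above can be read as "pick one number, then enumerate the ids up to it". A minimal Python sketch of that reading follows; the function name `gate_ids` and the default bound of 6 (taken from the example) are assumptions for illustration.

```python
def gate_ids(number_of_gates, upper_bound=6):
    """Return the GateIDs generated for one choice of number_of_gates."""
    if not 1 <= number_of_gates <= upper_bound:
        raise ValueError("number_of_gates must lie in 1..upper_bound")
    # is_integer(1..upper_bound), filtered by GateID <= number_of_gates
    return [g for g in range(1, upper_bound + 1) if g <= number_of_gates]


print(gate_ids(3))
```

Each answer set corresponds to exactly one such choice of `number_of_gates`.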

assignment of gate types

1 {gate_type(GateID, X): is_gate_type(X)} 1 :- is_gate_id(GateID). This constraint assigns a GateType to every GateID. The predicates is_gate_type(X) and is_gate_id(GateID) are just binding admissible values of GateType and GateID.

inputs for gates (EfficiencyConstraint=True)

feasible_pos_miRNA(MiRNA) :- data(TissueID, MiRNA, high), tissue(TissueID,cancer). These constraints define feasible positive and negative inputs; only the rule for positive inputs is shown. They are based on the observation that inputs to the classifier must be variables that are not constant in the data. More specifically, a positive input must be highly expressed in at least one cancer sample, and a negative input must be lowly expressed in at least one cancer sample.
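The feasibility filter amounts to a set comprehension over the data. The sketch below is a hedged illustration: the toy data are invented, and the negative case is written by analogy with the prose description rather than copied from the encoding.

```python
# Invented toy data for illustration.
tissue = {1: "cancer", 2: "healthy"}
data = {
    (1, "g1"): "high", (1, "g2"): "low", (1, "g3"): "high",
    (2, "g1"): "low", (2, "g2"): "high", (2, "g3"): "high",
}


def feasible_mirnas(tissue, data, level):
    """MiRNAs that show `level` in at least one cancer sample."""
    return {
        mirna
        for (tissue_id, mirna), observed in data.items()
        if observed == level and tissue[tissue_id] == "cancer"
    }


feasible_pos = feasible_mirnas(tissue, data, "high")  # candidates for positive inputs
feasible_neg = feasible_mirnas(tissue, data, "low")   # assumed analogue for negative inputs
print(sorted(feasible_pos), sorted(feasible_neg))
```

Restricting the choice rules to these sets shrinks the search space without excluding any useful input.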

X {gate_input(GateID, positive, MiRNA): feasible_pos_miRNA(MiRNA)} Y :- gate_type(GateID, GateType), lower_bound_pos_inputs(GateType, X), upper_bound_pos_inputs(GateType, Y). This constraint, together with its analogue for negative inputs, enforces that each GateID has an admissible number of positive and negative inputs, as bounded by the parameters of its GateType.
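A cardinality rule of the form `X { ... } Y` lets the solver pick any subset whose size lies between X and Y. The following Python sketch enumerates those subsets for one gate; the function name and the example MiRNA names are assumptions for illustration.

```python
from itertools import combinations


def candidate_input_sets(feasible, lower, upper):
    """All subsets of `feasible` with between `lower` and `upper` elements."""
    ordered = sorted(feasible)
    return [
        set(subset)
        for size in range(lower, upper + 1)
        for subset in combinations(ordered, size)
    ]


# With bounds 2..2 and three feasible MiRNAs there are three candidate sets.
for candidate in candidate_input_sets({"g1", "g2", "g3"}, 2, 2):
    print(sorted(candidate))
```

The solver does not enumerate these sets explicitly, but the answer sets it can return are restricted to exactly these choices.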

inputs for gates (EfficiencyConstraint=False)

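X {gate_input(GateID, positive, MiRNA): is_mirna(MiRNA)} Y :- gate_type(GateID, GateType), lower_bound_pos_inputs(GateType, X), upper_bound_pos_inputs(GateType, Y). This is the same constraint without the feasibility restriction: any MiRNA that occurs in the data may be chosen as an input.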

at least one input for each gate

1 {gate_input(GateID, Sign, MiRNA): is_sign(Sign), is_mirna(MiRNA)} :- is_gate_id(GateID). This is a safety constraint for the case that the lower bounds on the inputs of a gate type are 0. It forbids gates that have no inputs at all.

inputs must be unique

{gate_input(GateID,Sign,MiRNA): is_sign(Sign), is_gate_id(GateID)} 1 :- is_mirna(MiRNA). This constraint enforces that a MiRNA appears at most once in a classifier, regardless of gate and sign.

number of inputs is bounded

{gate_input(GateID,Sign,MiRNA): is_gate_id(GateID), is_sign(Sign), is_mirna(MiRNA)} X :- upper_bound_inputs(X). This constraint enforces that the total number of gate inputs is less than or equal to the bound X given by upper_bound_inputs(X).
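The uniqueness and total-bound constraints can both be checked by simple counting. The sketch below assumes a hypothetical representation of a classifier as a dictionary from GateID to a set of (Sign, MiRNA) pairs; that layout and the function names are illustrative choices, not part of the encoding.

```python
from collections import Counter


def inputs_are_unique(gate_inputs):
    """True if every MiRNA occurs at most once across all gates and signs."""
    counts = Counter(
        mirna for inputs in gate_inputs.values() for (_sign, mirna) in inputs
    )
    return all(count <= 1 for count in counts.values())


def total_inputs_bounded(gate_inputs, upper_bound_inputs):
    """True if the total number of gate inputs does not exceed the bound."""
    return sum(len(inputs) for inputs in gate_inputs.values()) <= upper_bound_inputs


ok = {1: {("positive", "g1"), ("negative", "g2")}, 2: {("positive", "g3")}}
reused = {1: {("positive", "g1")}, 2: {("negative", "g1")}}
print(inputs_are_unique(ok), inputs_are_unique(reused))
print(total_inputs_bounded(ok, 10), total_inputs_bounded(ok, 2))
```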

occurrences of gate types are bounded

{gate_type(GateID,GateType): is_gate_id(GateID)} X :- upper_bound_gate_occurence(GateType,X). This constraint enforces that the number of gates of each GateType is bounded from above by the occurrence value given by the user.
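As a rough illustration, the occurrence bound is a per-type count over the gate type assignment. The dictionary layout (GateID to GateType, GateType to bound) and the function name are assumptions for this sketch.

```python
from collections import Counter


def gate_type_occurrences_ok(gate_type, upper_bounds):
    """True if no gate type occurs more often than its upper bound."""
    counts = Counter(gate_type.values())
    return all(counts[t] <= bound for t, bound in upper_bounds.items())


bounds = {1: 2}  # mirrors upper_bound_gate_occurence(1, 2)
print(gate_type_occurrences_ok({1: 1, 2: 1, 3: 2}, bounds))
print(gate_type_occurrences_ok({1: 1, 2: 1, 3: 1}, bounds))
```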

gates fire condition

gate_fires(GateID,TissueID) :- gate_input(GateID,positive,MiRNA), data(TissueID,MiRNA,high). A gate fires for a TissueID if one of its positive inputs is highly expressed or one of its negative inputs is lowly expressed. Only the rule for positive inputs is shown.
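In other words, a gate is a disjunction over its inputs. A minimal Python sketch of the firing condition, assuming a gate is given as a set of (Sign, MiRNA) pairs and a sample as a dictionary from MiRNA to expression level (both layouts are illustrative assumptions):

```python
def gate_fires(inputs, sample):
    """True if some positive input is high or some negative input is low."""
    return any(
        (sign == "positive" and sample[mirna] == "high")
        or (sign == "negative" and sample[mirna] == "low")
        for sign, mirna in inputs
    )


sample = {"g1": "low", "g2": "low"}
print(gate_fires({("positive", "g1"), ("negative", "g2")}, sample))  # fires via g2
print(gate_fires({("positive", "g1")}, sample))                      # does not fire
```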

prediction of classifier

classifier(TissueID,healthy) :- not gate_fires(GateID, TissueID), is_gate_id(GateID), is_tissue_id(TissueID). The classifier predicts healthy for a TissueID if there is a GateID that does not fire for that tissue.

classifier(TissueID,cancer) :- not classifier(TissueID, healthy), is_tissue_id(TissueID). The classifier predicts cancer for a TissueID if it does not predict healthy for that TissueID. Note that the encoding via negation is necessary since the classifier is a conjunction of gates: it predicts cancer only if every gate fires.

consistency of classifier and data

:- tissue(TissueID,healthy), classifier(TissueID,cancer). :- tissue(TissueID,cancer), classifier(TissueID,healthy). The classifier must agree with the tissue samples.
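Taken together, the prediction rules and the two integrity constraints say that a classifier is a conjunction of gates that must reproduce every annotation. The sketch below spells this out in Python on invented toy data; the data layout and function names are assumptions for illustration.

```python
def gate_fires(inputs, sample):
    """A gate fires if a positive input is high or a negative input is low."""
    return any(
        (sign == "positive" and sample[mirna] == "high")
        or (sign == "negative" and sample[mirna] == "low")
        for sign, mirna in inputs
    )


def classify(gate_inputs, sample):
    """Predict cancer only if every gate fires (conjunction of gates)."""
    all_fire = all(gate_fires(inputs, sample) for inputs in gate_inputs.values())
    return "cancer" if all_fire else "healthy"


def is_consistent(gate_inputs, tissue, samples):
    """True if the prediction agrees with the annotation of every tissue."""
    return all(
        classify(gate_inputs, samples[tissue_id]) == label
        for tissue_id, label in tissue.items()
    )


tissue = {1: "cancer", 2: "healthy"}
samples = {1: {"g1": "high", "g2": "low"}, 2: {"g1": "low", "g2": "low"}}
good = {1: {("positive", "g1")}}   # separates the two samples
bad = {1: {("negative", "g2")}}    # fires on both samples
print(is_consistent(good, tissue, samples), is_consistent(bad, tissue, samples))
```

In the ASP encoding the inconsistent candidate is never reported: the integrity constraints remove it from the set of answer sets.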

Breaking symmetries

gate id symmetries

GateType1 <= GateType2 :- gate_type(GateID1, GateType1), gate_type(GateID2, GateType2), GateID1 <= GateID2. This is the first symmetry breaking constraint. It says that gate types must be non-decreasing in the gate id, that is, smaller gate ids are assigned to smaller gate types.
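The effect of this constraint can be seen by counting gate type assignments with and without it. The following sketch is illustrative only; the function name and the example with three gates and two gate types are assumptions.

```python
from itertools import product


def type_assignments(n_gates, gate_types, break_symmetry):
    """Enumerate tuples (type of gate 1, ..., type of gate n)."""
    result = []
    for types in product(sorted(gate_types), repeat=n_gates):
        # Non-decreasing for adjacent ids implies it for all id pairs.
        if break_symmetry and any(a > b for a, b in zip(types, types[1:])):
            continue
        result.append(types)
    return result


print(len(type_assignments(3, {1, 2}, break_symmetry=False)))
print(type_assignments(3, {1, 2}, break_symmetry=True))
```

Every multiset of gate types is still represented by exactly one assignment, so no classifier is lost at the level of gate types, only permutations of gate ids.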

gate input symmetries

MiRNA1<=MiRNA2 :- gate_type(GateID1, GateType), gate_type(GateID2, GateType), gate_input(GateID1,Sign,MiRNA1), gate_input(GateID2,Sign,MiRNA2), GateID1<=GateID2. This is the second symmetry breaking constraint. If two gates are of the same GateType then the inputs of the one with the smaller GateID should be smaller than or equal to the inputs of the one with the larger GateID, for inputs of the same Sign.

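Not sure if this constraint is correct. As written, GateID1<=GateID2 also holds when both ids refer to the same gate. In that case the rule requires both MiRNA1<=MiRNA2 and MiRNA2<=MiRNA1 for any two inputs of that gate with the same Sign, which permits at most one input per sign and gate.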
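The doubt about the second symmetry breaking constraint can be probed by evaluating the rule literally on a small example. The sketch below is a hedged illustration: the data layout and function name are assumptions, and Python's string order stands in for the solver's term order, which does not affect the outcome because any total order behaves the same way here.

```python
def violates_input_symmetry_rule(gate_type, gate_inputs):
    """Evaluate the rule as written, including the case GateID1 == GateID2."""
    for g1, t1 in gate_type.items():
        for g2, t2 in gate_type.items():
            if t1 != t2 or not g1 <= g2:
                continue
            for sign1, mirna1 in gate_inputs[g1]:
                for sign2, mirna2 in gate_inputs[g2]:
                    if sign1 == sign2 and not mirna1 <= mirna2:
                        return True
    return False


one_gate = {1: 1}
two_pos = {1: [("positive", "g1"), ("positive", "g2")]}
mixed = {1: [("positive", "g1"), ("negative", "g2")]}
print(violates_input_symmetry_rule(one_gate, two_pos))  # single gate, two positive inputs
print(violates_input_symmetry_rule(one_gate, mixed))    # single gate, one input per sign
```

A single gate with two positive inputs already violates the rule, because the pair is checked in both orders. Under this reading the constraint would conflict with gate types whose lower bound on positive inputs is 2.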