PatternQuickStart
The Pattern module in Paxtools enables searching for specific topological structures in a BioPAX model. A pattern is defined by a list of constraints. A pattern match is an array of BioPAXElement objects that satisfy those constraints.
Suppose we are looking for a structure in which a Conversion has a PhysicalEntity at its left and another PhysicalEntity at its right, both belonging to the same EntityReference. The pattern contains 4 elements (a Conversion, two PhysicalEntity objects, and an EntityReference).
Pattern p = new Pattern(Conversion.class, "Conv"); // Conversion is the initial element
p.add(ConBox.left(), "Conv", "left PE"); // left PE: The PhysicalEntity at left
p.add(ConBox.peToER(), "left PE", "ER"); // ER: The related EntityReference
p.add(ConBox.right(), "Conv", "right PE"); // right PE: The PhysicalEntity at right
p.add(ConBox.peToER(), "right PE", "ER"); // Last constraint makes sure that the last PhysicalEntity has the same EntityReference
Then we can use the Searcher class to search for this pattern in a model.
Model model = ... // Get the model
Map<BioPAXElement, List<Match>> map = Searcher.search(model, p);
In this case, the keys of the result map will all be Conversion objects, and each value will contain the list of related pattern matches. The elements of a result Match can be accessed using their labels.
Match m = ... // next Match
PhysicalEntity pe = (PhysicalEntity) m.get("left PE", p);
If you are unfamiliar with the Paxtools environment and want to do as little programming as possible, then please check PatternSearchingModel for the simplest use cases. In both cases, users have to learn how to construct a pattern, which is described in the rest of this document.
A Pattern object knows the type of the initial element. It also keeps the list of constraints, along with the elements to which each constraint applies. The add method of Pattern takes a constraint and the labels of the elements it applies to, and uses a MappedConstraint object to bundle them.
A constraint ensures a specific property, either of one object or between several objects. Each constraint has a size, i.e. the number of elements that it uses. Constraints can be generative or not. Generative constraints have a size greater than 1, and they can generate values for the last element using the previously mapped elements.
A Match simply encapsulates an array of BioPAXElement objects. Its size equals the number of elements in the related pattern. If a Match matches a pattern, it satisfies all the constraints in that pattern.
The Searcher has static methods to search for occurrences of a pattern in a given model. It can search either from an initial object or over the entire model, iterating over all qualifying objects. The searcher iterates over the list of constraints in the pattern and maintains a list of Match objects that are satisfied and/or generated by the constraints. At each step, if the last element that the current constraint needs in the Match is null, then the constraint is used to generate this element (the constraint has to be generative at this point); otherwise, the constraint is used to check whether the Match satisfies it.
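The generate-or-check loop described above can be sketched in plain Java. This is only a toy model under stated assumptions: ToySearcher, Link, and the string-based "elements" are hypothetical stand-ins, not Paxtools classes, and every constraint here has size 2. The data mirrors the quick example (a Conversion, its left and right PhysicalEntity, and a shared EntityReference).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the generate-or-check loop; NOT the Paxtools Searcher.
class ToySearcher {

    /** A size-2 constraint mapped to two positions of the match array. */
    static final class Link {
        final int from;                        // position of the already known element
        final int to;                          // position to generate or to check
        final Map<String, List<String>> edges; // hypothetical relation data

        Link(int from, int to, Map<String, List<String>> edges) {
            this.from = from;
            this.to = to;
            this.edges = edges;
        }

        List<String> generate(String source) {
            return edges.getOrDefault(source, Collections.<String>emptyList());
        }
    }

    static List<String[]> search(String start, int size, List<Link> constraints) {
        List<String[]> matches = new ArrayList<>();
        String[] first = new String[size];
        first[0] = start;
        matches.add(first);

        for (Link c : constraints) {
            List<String[]> next = new ArrayList<>();
            for (String[] m : matches) {
                List<String> candidates = c.generate(m[c.from]);
                if (m[c.to] == null) {
                    // Last element is missing: generate one new match per candidate.
                    for (String candidate : candidates) {
                        String[] copy = m.clone();
                        copy[c.to] = candidate;
                        next.add(copy);
                    }
                } else if (candidates.contains(m[c.to])) {
                    // Element already set: keep the match only if it satisfies the constraint.
                    next.add(m);
                }
            }
            matches = next;
        }
        return matches;
    }

    /** Positions: 0 = Conv, 1 = left PE, 2 = ER, 3 = right PE. */
    static List<String[]> demo() {
        Map<String, List<String>> left = new HashMap<>();
        left.put("c1", Arrays.asList("A"));
        Map<String, List<String>> right = new HashMap<>();
        right.put("c1", Arrays.asList("A_p", "B"));
        Map<String, List<String>> peToEr = new HashMap<>();
        peToEr.put("A", Arrays.asList("erA"));
        peToEr.put("A_p", Arrays.asList("erA"));
        peToEr.put("B", Arrays.asList("erB"));

        List<Link> constraints = Arrays.asList(
            new Link(0, 1, left), new Link(1, 2, peToEr),
            new Link(0, 3, right), new Link(3, 2, peToEr));
        return search("c1", 4, constraints);
    }
}
```

In this toy data, the first three constraints generate elements, while the last one only checks: the candidate with right PE "B" is dropped because its reference is "erB" rather than "erA", so ToySearcher.demo() returns the single match [c1, A, erA, A_p].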
ConBox provides static methods that prepare frequently used constraints. These are typically single-line methods that construct a specific constraint in a specific way and return it. An example is below. This constraint links from the EntityReference to the member PhysicalEntity.
public static Constraint erToPE()
{
    return new PathConstraint("EntityReference/entityReferenceOf");
}
The complete list of constraints is documented in the Javadoc. The text below aims to serve as a mini tutorial.
PathConstraint is probably the most frequently needed constraint. Its size is always 2 and it is generative. It simply encapsulates a PathAccessor and provides simple traversal. For instance, to traverse from a Control to a controlled TemplateReaction, the PathConstraint below can be used.
new PathConstraint("Control/controlled*:TemplateReaction");
This example starts from a Control, traverses controlled objects recursively while they are other controls, and outputs a TemplateReaction if it reaches one. PathConstraint makes use of the full power of PathAccessor. It is the user's responsibility to make sure that the output of the PathAccessor is a BioPAXElement; otherwise an error will occur at runtime.
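To make the recursive traversal concrete, here is a minimal plain-Java sketch of what a path such as "Control/controlled*:TemplateReaction" expresses: follow the controlled relation transitively and keep only the reached objects of the wanted type. ToyPath and Node are hypothetical stand-ins with hand-made data; the sketch does not use or reproduce PathAccessor itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model of a recursive, type-restricted traversal; NOT the Paxtools PathAccessor.
class ToyPath {

    static final class Node {
        final String name;
        final String type;
        final List<Node> controlled = new ArrayList<>(); // only Controls have entries here

        Node(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Follows 'controlled' links transitively and collects reached nodes of the wanted type. */
    static Set<String> reach(Node start, String wantedType) {
        Set<String> result = new LinkedHashSet<>();
        Set<Node> visited = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(start);
        visited.add(start);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            for (Node next : current.controlled) {
                if (visited.add(next)) {
                    if (next.type.equals(wantedType)) {
                        result.add(next.name);
                    }
                    stack.push(next); // keeps going only if 'next' controls something itself
                }
            }
        }
        return result;
    }

    /** ctrl1 controls ctrl2 and conv1; ctrl2 controls tr1. */
    static Set<String> demo() {
        Node ctrl1 = new Node("ctrl1", "Control");
        Node ctrl2 = new Node("ctrl2", "Control");
        Node conv1 = new Node("conv1", "Conversion");
        Node tr1 = new Node("tr1", "TemplateReaction");
        ctrl1.controlled.add(ctrl2);
        ctrl1.controlled.add(conv1);
        ctrl2.controlled.add(tr1);
        return reach(ctrl1, "TemplateReaction");
    }
}
```

With this data, only tr1 is returned: conv1 is reached but filtered out by the type restriction, and tr1 is found through the intermediate control ctrl2.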
MultiPathConstraint provides a way to aggregate multiple PathAccessors. The following example reaches all generic equivalents of a PhysicalEntity, traversing through possible multi-level nesting. The first path string traverses toward more specific elements, and the second path string traverses toward more general elements, both recursively.
new MultiPathConstraint("PhysicalEntity/memberPhysicalEntity*", "PhysicalEntity/memberPhysicalEntityOf*");
This example, however, reaches only the equivalent objects, excluding the seed object. If including the seed is desired, then the constraint should be wrapped in a SelfOrThis constraint.
new SelfOrThis(new MultiPathConstraint("PhysicalEntity/memberPhysicalEntity*", "PhysicalEntity/memberPhysicalEntityOf*"));
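The difference between the two constraints above can be illustrated with a small plain-Java sketch. It assumes hypothetical member and memberOf maps in place of the memberPhysicalEntity and memberPhysicalEntityOf properties; ToyGenerics and its methods are illustrative names, not Paxtools API.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of aggregating two recursive paths, with and without the seed.
class ToyGenerics {

    /** Everything reachable from the seed by following the edges one or more times. */
    static Set<String> closure(String seed, Map<String, List<String>> edges) {
        Set<String> reached = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(seed);
        while (!todo.isEmpty()) {
            for (String next : edges.getOrDefault(todo.pop(), Collections.<String>emptyList())) {
                if (reached.add(next)) {
                    todo.push(next);
                }
            }
        }
        return reached;
    }

    /** Union of both traversal directions, excluding the seed itself. */
    static Set<String> multiPath(String seed, Map<String, List<String>> members,
                                 Map<String, List<String>> memberOf) {
        Set<String> result = closure(seed, members);
        result.addAll(closure(seed, memberOf));
        result.remove(seed); // only matters if the data contains a cycle
        return result;
    }

    /** Same candidates plus the seed, as a self-inclusive wrapper would give. */
    static Set<String> selfOrThis(String seed, Set<String> generated) {
        Set<String> result = new TreeSet<>(generated);
        result.add(seed);
        return result;
    }

    /** Generic G has members A and B; A has the more specific member A1. */
    static Set<String> demo(boolean includeSeed) {
        Map<String, List<String>> members = new HashMap<>();
        members.put("G", Arrays.asList("A", "B"));
        members.put("A", Arrays.asList("A1"));
        Map<String, List<String>> memberOf = new HashMap<>();
        memberOf.put("A", Arrays.asList("G"));
        memberOf.put("B", Arrays.asList("G"));
        memberOf.put("A1", Arrays.asList("A"));

        Set<String> equivalents = multiPath("A", members, memberOf);
        return includeSeed ? selfOrThis("A", equivalents) : equivalents;
    }
}
```

Starting from A, demo(false) gives [A1, G] and demo(true) gives [A, A1, G]. Note that the sibling B is not reached in either case, because each path only moves in one direction.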
The Field constraint is used for constraining the values of an object's fields. It uses a PathAccessor to access the related field and ensures that the given value is among the found values. The following constraint makes sure that AKT1 is among the object's names.
new Field("Named/name", "AKT1");
Optionally, Field can use another object in the pattern as the required value. This option is set in the constructor. In that case it works like the PathConstraint, but it cannot be used as a generative constraint.
new Field("Interaction/participant", Field.USE_SECOND_ARG);
The above constraint has size 2 and checks whether the object at the second position is a participant of the Interaction at the first position.
If an empty field is desired, this can be set in the constructor as well. The constraint below makes sure that the Interaction has no participants.
new Field("Interaction/participant", Field.EMPTY);
The Size constraint encapsulates any generative constraint and ensures that the number of candidates it generates is equal to, less than, or greater than a certain size. The constraint below makes sure that the physical entity participates in fewer than 100 conversions. This can be useful for excluding high-degree molecules from the pattern. Size is always non-generative.
new Size(new PathConstraint("PhysicalEntity/participantOf:Conversion"), 100, Size.Type.LESS);
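The idea of wrapping a generative constraint and only counting its candidates can be sketched as follows. This is a simplified illustration, not the Paxtools Size class: ToySize, its Type enum, and the participation data are hypothetical, and the wrapped "constraint" is just a function from an element to its candidates.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of a size check around a generative constraint; NOT the Paxtools Size class.
class ToySize {

    enum Type { EQUAL, LESS, GREATER }

    /** True if the number of generated candidates compares to 'limit' as 'type' demands. */
    static <T> boolean satisfies(Function<T, Collection<T>> generator, T element,
                                 int limit, Type type) {
        int count = generator.apply(element).size();
        switch (type) {
            case EQUAL:
                return count == limit;
            case LESS:
                return count < limit;
            default:
                return count > limit;
        }
    }

    /** Hypothetical data: the conversions each molecule participates in. */
    static boolean participatesInFewerThan(String molecule, int limit) {
        Map<String, Collection<String>> participantOf = new HashMap<>();
        participantOf.put("ATP", Arrays.asList("conv1", "conv2", "conv3"));
        participantOf.put("AKT1", Arrays.asList("conv1"));
        Function<String, Collection<String>> generator =
            m -> participantOf.getOrDefault(m, Collections.<String>emptyList());
        return satisfies(generator, molecule, limit, Type.LESS);
    }
}
```

The check never returns the candidates, only a yes or no answer, which is why such a wrapper cannot be generative. Here participatesInFewerThan("ATP", 2) is false and participatesInFewerThan("AKT1", 2) is true, which is the kind of filter one would use to drop ubiquitous molecules.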
The constraints AND, OR, XOR, and NOT apply logical operators to any kind of constraint. The AND, OR, and XOR constraints take an array of MappedConstraint objects in the constructor. The indexes used in those MappedConstraint objects map to the element array that is sent to the logical constraint. The example below should clarify this. Imagine we have two different Control objects in the pattern, and we want the next object to be a Conversion directly controlled by both of them.
Pattern p = ... // Assume ctrl1 and ctrl2 are two different Control objects. And we are generating the Conversion conv.
Constraint cont2Conv = new PathConstraint("Control/controlled:Conversion");
Constraint c = new AND(new MappedConstraint(cont2Conv, 0, 2),
new MappedConstraint(cont2Conv, 1, 2));
p.add(c, "ctrl1", "ctrl2", "conv");
In this example, the indexes 0, 1, and 2 in the MappedConstraint objects correspond to the elements ctrl1, ctrl2, and conv, respectively. AND and OR constraints are generative only when all the member constraints are generative.
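The two levels of index mapping can be easier to follow in a small plain-Java sketch. It assumes, as an illustration only, that a generative AND yields the candidates shared by all of its members; ToyAnd, Mapped, and the data are hypothetical and do not reproduce the Paxtools implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of index mapping inside a logical AND; NOT the Paxtools AND class.
class ToyAnd {

    /** A size-2 generative constraint plus indexes into the array that AND receives. */
    static final class Mapped {
        final Map<String, Set<String>> edges; // hypothetical relation data
        final int from;                       // local index of the known element
        final int to;                         // local index of the generated element

        Mapped(Map<String, Set<String>> edges, int from, int to) {
            this.edges = edges;
            this.from = from;
            this.to = to;
        }
    }

    /** Picks the elements that AND is mapped to out of the whole pattern match. */
    static String[] project(String[] match, int... patternIndexes) {
        String[] local = new String[patternIndexes.length];
        for (int i = 0; i < patternIndexes.length; i++) {
            local[i] = match[patternIndexes[i]];
        }
        return local;
    }

    /** Candidates for the last local element: those generated by every member. */
    static Set<String> generate(String[] local, Mapped... members) {
        Set<String> shared = null;
        for (Mapped m : members) {
            Set<String> candidates =
                m.edges.getOrDefault(local[m.from], Collections.<String>emptySet());
            if (shared == null) {
                shared = new TreeSet<>(candidates);
            } else {
                shared.retainAll(candidates);
            }
        }
        return shared == null ? new TreeSet<String>() : shared;
    }

    static Set<String> demo() {
        Map<String, Set<String>> controlled = new HashMap<>();
        controlled.put("ctrl1", new HashSet<>(Arrays.asList("convA", "convB")));
        controlled.put("ctrl2", new HashSet<>(Arrays.asList("convB", "convC")));

        // Whole match: positions 1 and 3 hold the controls, position 4 is still empty.
        String[] match = {"pe", "ctrl1", "other", "ctrl2", null};
        // AND is mapped to pattern positions 1, 3, 4, which become local indexes 0, 1, 2.
        String[] local = project(match, 1, 3, 4);
        return generate(local, new Mapped(controlled, 0, 2), new Mapped(controlled, 1, 2));
    }
}
```

The member indexes (0, 2) and (1, 2) never refer to positions in the whole match, only to the three elements handed to the AND. With this data the only conversion controlled by both controls is convB, so demo() returns [convB].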
The NOT constraint is used for negating other constraints. NOT cannot be generative. The constraint below makes sure the Control (first element) does not control the Conversion (second element).
new NOT(new PathConstraint("Control/controlled*:Conversion"));
When we reach a Conversion from a participant PhysicalEntity while constructing a pattern, we can continue towards the other side or towards the same side of the conversion using the ConversionSide constraint. We can rewrite the example from the A Quick Example section of PatternQuickStart, this time starting from a PhysicalEntity instead of the Conversion.
Pattern p = new Pattern(PhysicalEntity.class, "PE"); // PhysicalEntity is the initial element in a pattern match
p.add(new PathConstraint("PhysicalEntity/participantOf:Conversion"), "PE", "Conv"); // Conv: The Conversion
p.add(new ConversionSide(ConversionSide.Type.OTHER_SIDE), "PE", "Conv", "other PE"); // other PE: The PhysicalEntity at the other side of the Conversion
p.add(ConBox.peToER(), "PE", "ER"); // ER: The EntityReference of the initial PhysicalEntity
p.add(ConBox.peToER(), "other PE", "ER"); // Last constraint makes sure that the last PhysicalEntity has the same EntityReference
Sometimes we need to construct the pattern using the inputs and outputs of a Conversion, but they are not explicitly defined. The left and right fields of the Conversion should be mapped to inputs and outputs according to the direction of the Conversion. The ParticipatesInConv constraint can help with this mapping. The pattern below is similar to the one above, but makes sure that the first PhysicalEntity is an input and the second is an output.
Pattern p = new Pattern(PhysicalEntity.class, "input PE"); // PhysicalEntity is the initial element in a pattern match
p.add(new ParticipatesInConv(RelType.INPUT), "input PE", "Conv"); // Conv: The Conversion
p.add(new ConversionSide(ConversionSide.Type.OTHER_SIDE), "input PE", "Conv", "output PE"); // output PE: The output PhysicalEntity
p.add(ConBox.peToER(), "input PE", "ER"); // ER: The EntityReference of the initial PhysicalEntity
p.add(ConBox.peToER(), "output PE", "ER"); // Last constraint makes sure that the last PhysicalEntity has the same EntityReference
The second argument of ParticipatesInConv, a boolean, controls whether reversible conversions are treated as if they were left-to-right. If we set it to false, then all the participants of a reversible conversion are treated as both input and output.
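The mapping from left and right to inputs can be sketched as a small plain-Java function. This is a simplified reading of the rule stated above, with hypothetical names (ToyDirection, Direction, inputs), a toy direction enum, and the assumption that a direction is always given; it is not the Paxtools implementation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of mapping left/right participants to inputs; NOT the Paxtools constraint.
class ToyDirection {

    enum Direction { LEFT_TO_RIGHT, RIGHT_TO_LEFT, REVERSIBLE }

    /** Returns the participants that count as inputs of the conversion. */
    static Set<String> inputs(Set<String> left, Set<String> right, Direction direction,
                              boolean treatReversibleAsLeftToRight) {
        Set<String> result = new LinkedHashSet<>();
        if (direction == Direction.RIGHT_TO_LEFT) {
            // The reaction runs from right to left, so the right side is consumed.
            result.addAll(right);
        } else if (direction == Direction.REVERSIBLE && !treatReversibleAsLeftToRight) {
            // Both sides act as input (and, symmetrically, as output).
            result.addAll(left);
            result.addAll(right);
        } else {
            // LEFT_TO_RIGHT, or REVERSIBLE handled as if it were left-to-right.
            result.addAll(left);
        }
        return result;
    }
}
```

The outputs follow by symmetry: swap the roles of left and right. Passing false for the flag is the permissive choice, since every participant of a reversible conversion then qualifies as an input.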
When we want to traverse a Control --> Conversion --> PhysicalEntity path and we care whether the PhysicalEntity is an input or an output of the Conversion, we can use the Participant constraint to enforce it.
Pattern p = ... // Assume "ctrl" is a Control and "conv" is a controlled Conversion. We want an output PhysicalEntity for the label "out PE".
p.add(new Participant(RelType.OUTPUT, true), "ctrl", "conv", "out PE");
While traversing from a PhysicalEntity to an Interaction, we often also want to traverse complex-member relations and generic relationships to reach the Interaction. Assume we are looking for the Control objects of which PhysicalEntity A is a controller. Sometimes A is a member of a Complex, and this molecular complex is the controller. Sometimes A has a parent generic PhysicalEntity, and that one is the controller. And at other times these relations are recursively mixed. The LinkedPE constraint is built for handling such cases. It is a generative constraint of size 2 that traverses related physical entities recursively in the user-specified direction. The direction only matters for complex-member relations, and can be either TO_COMPLEX or TO_MEMBER.
Assume we want to search for Conversions that modify a PhysicalEntity of an EntityReference.
Pattern p = new Pattern(EntityReference.class, "ER"); // Start from an EntityReference
p.add(ConBox.erToPE(), "ER", "simple PE1"); // Get a related PhysicalEntity
p.add(new LinkedPE(LinkedPE.Type.TO_COMPLEX), "simple PE1", "PE1"); // Include parent complexes and all equivalent generics recursively
p.add(new ParticipatesInConv(RelType.INPUT), "PE1", "Conv"); // Get to the Conversion
p.add(new ConversionSide(ConversionSide.Type.OTHER_SIDE), "PE1", "Conv", "PE2"); // Get to the PhysicalEntity at the other side
p.add(new Equality(false), "PE1", "PE2"); // Make sure that the PhysicalEntity objects on the two sides are different
p.add(new LinkedPE(LinkedPE.Type.TO_MEMBER), "PE2", "simple PE2"); // Include complex members and all equivalent generics recursively
p.add(ConBox.peToER(), "simple PE2", "ER"); // Make sure that the last PhysicalEntity has the same EntityReference
The LinkedPE constraint also generates the seed object, so the linked PhysicalEntity can be the same PhysicalEntity.
If the current framework does not have the constraint you want, consider implementing it yourself. This is best done by extending the ConstraintAdapter class. If it is a generative constraint, override the generate method; otherwise, override the satisfies method.
For instance, the generative constraint below gets the Conversion in which the PhysicalEntity is a participant, provided the Conversion is not reversible.
Constraint con = new ConstraintAdapter(2)
{
    @Override
    public Collection<BioPAXElement> generate(Match match, int... ind)
    {
        Collection<BioPAXElement> set = new HashSet<BioPAXElement>();
        PhysicalEntity pe = (PhysicalEntity) match.get(ind[0]);
        for (Interaction inter : pe.getParticipantOf())
        {
            if (inter instanceof Conversion)
            {
                Conversion conv = (Conversion) inter;
                if (conv.getConversionDirection() != null &&
                    conv.getConversionDirection() != ConversionDirectionType.REVERSIBLE)
                {
                    set.add(conv);
                }
            }
        }
        return set;
    }
};
Note that another way of doing this is to use the three constraints below together.
p.add(new PathConstraint("PhysicalEntity/participantOf:Conversion"), "PE", "Conv");
p.add(new NOT(new Field("Conversion/conversionDirection", null, Field.EMPTY)), "Conv");
p.add(new NOT(new Field("Conversion/conversionDirection", Field.Operation.INTERSECT, ConversionDirectionType.REVERSIBLE)), "Conv");
And the non-generative constraint below makes sure that two Pathway objects have overlapping content (without handling nested pathways).
Constraint con = new ConstraintAdapter(2)
{
    @Override
    public boolean satisfies(Match match, int... ind)
    {
        Pathway p1 = (Pathway) match.get(ind[0]);
        Pathway p2 = (Pathway) match.get(ind[1]);
        for (Process process : p1.getPathwayComponent())
        {
            if (p2.getPathwayComponent().contains(process))
                return true;
        }
        return false;
    }
};
We can reuse existing patterns while constructing another pattern. This is good for modularizing the pattern structure and for avoiding code duplication. The add method of Pattern also accepts another pattern and copies the constraint list of that parameter pattern. Elements with the same label are mapped to each other. The first element of the parameter pattern should always be mapped to an existing element in the current pattern.
Assume we have already defined the pattern below, which contains a Conversion and a PhysicalEntity at its right.
Pattern p1 = new Pattern(Conversion.class, "Conv");
p1.add(ConBox.right(), "Conv", "right PE");
Then we can reuse it while defining a pattern like "Controller PhysicalEntity -- Control -- Conversion -- Right PhysicalEntity", where the two PhysicalEntity objects should be different.
Pattern p2 = new Pattern(PhysicalEntity.class, "controller");
p2.add(ConBox.peToControl(), "controller", "control");
p2.add(ConBox.controlToConv(), "control", "Conv");
p2.add(p1);
p2.add(new Equality(false), "controller", "right PE");
We could add p1 to p2 because the first element of p1 (Conv) exists in pattern p2. Otherwise an exception would occur.