-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Saul inference: moving beyond LBJava's inference #401
base: master
Are you sure you want to change the base?
Conversation
Do you have any documentation that we can start reading from there? |
Yes, see the changes. |
ok, I'll do. |
the inference starts from the head object. This function finds the objects of type `INPUT_TYPE` which are | ||
connected to the target object of type `HEAD_TYPE`. If we don't define `filter`, by default it returns all | ||
objects connected to `HEAD_TYPE`. The filter is useful for the `JointTraining` when we go over all | ||
global objects and generate all contained object that serve as examples for the basic classifiers involved in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
In Saul, the constraints are defined for the assignments to class labels. | ||
A constraint classifiers is a classifier that predicts the class labels with regard to the specified constraints. | ||
In Saul, the constraints are defined for the assignments to class labels. In what follows we outine the details of operators | ||
which help us define the constraints. Before jumping into the details, note that you have to have the folling import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: following
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
|
||
In the above definition, `on` and `is` are keywords. | ||
|
||
Here different variations of this basic, but there are different variations to it: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: didn't understand this sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol fixed.
| `ForAll` | For **all** the elements in the collection it applies the constraints. In other words, the constrain should hold for **all** elements of the collection. | `textAnnotationNode.ForAll { x: TextAnnotation => Some-Constraint-On-x }` | | ||
| `Exists` | The constrain should hold for **at least one** element of the collection. | `textAnnotationNode.Exists { x: TextAnnotation => Some-Constraint-On-x }` | | ||
| `AtLest(k: Int)` | The constrain should hold for **at least `k`** elements of the collection. | `textAnnotationNode.AtLeast(2) { x: TextAnnotation => Some-Constraint-On-x }` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: AtLeast(k: Int)
in the first column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
There are just the definitions of the operations. If you want to see real examples of the operators in actions see [the definitions of constraints for ER-example](https://github.com/IllinoisCogComp/saul/blob/master/saul-examples/src/main/scala/edu/illinois/cs/cogcomp/saulexamples/nlp/EntityRelation/EntityRelationConstraints.scala). | ||
|
||
**Tip:** Note whenever the constrained inference is infeasible (i.e. the constraints are overlly tight), we use the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: overly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
@@ -19,48 +19,48 @@ object ClassifierUtils extends Logging { | |||
def apply[T <: AnyRef](c: (Learnable[T], Iterable[T])*) = { | |||
c.foreach { | |||
case (learner, trainInstances) => | |||
logger.info(evalSeparator) | |||
logger.info("Training " + learner.getClassSimpleNameForClassifier) | |||
println(evalSeparator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the change from logger.info
to println
intentional? No strong opinion, just wanted to confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it because it was messing up the formatting of the output results. But now in retrospect, I think only the ones related to testing was needed. I returned back the rest.
val instanceIsInvolvedInConstraint = instancesInvolved.exists { set => | ||
set.exists { | ||
case x: T => x == t | ||
case everythingElse => false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You can use _
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
case (singleConstraint, ins) => | ||
ins union getInstancesInvolved(singleConstraint).asInstanceOf[Set[Any]] | ||
} | ||
case c: AtMost[_, _] => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, I miss something here, but could you explain a bit: all these have the same body? AtMost, AtLeast, forAll...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Merged them into one by adding an extra type.
Applied the comments. Let me know if you have any other comments. |
object TestConstraintClassifier extends ConstrainedClassifier[String, String] { | ||
override def subjectTo = None | ||
override val solverType = OJAlgo | ||
override lazy val onClassifier = TestClassifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change the onClassifier
to baseClassifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for this rename.
object OrgConstrainedClassifier extends ConstrainedClassifier[ConllRawToken, ConllRelation] { | ||
override lazy val onClassifier = EntityRelationClassifiers.OrganizationClassifier | ||
override def pathToHead = Some(-EntityRelationDataModel.pairTo2ndArg) | ||
override def subjectTo = Some(EntityRelationConstraints.relationArgumentConstraints) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I want to say subjectTo = Some(worksForConstraint)
? what will be the syntax? workForContatint needs an input parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another relevant comment, how can I just express true
or false
i.e. constant expressions as constraints. Please add these to the documentation also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I want to say subjectTo = Some(worksForConstraint)? what will be the syntax? workForContatint needs an input parameter.
Regarding the first questions (as also mentioned in the documentation), the definition of the constraints starts with Node
and use ForEach
operator. If you want to define "worksForConstraints", instead of having it as a function:
def worksForConstraint(x: ConllRelation) = {
(WorksForClassifier on x isTrue) ==> ((PersonClassifier on x.e1 isTrue) and (OrganizationClassifier on x.e2 isTrue))
}
it should be written as
def worksForConstraint = EntityRelationDataModel.pairs.ForEach { x: ConllRelation =>
(WorksForClassifier on x isTrue) ==> ((PersonClassifier on x.e1 isTrue) and (OrganizationClassifier on x.e2 isTrue))
}
and then you can do subjectTo = Some(worksForConstraint)
....
Another relevant comment, how can I just express true or false i.e. constant expressions as constraints. Please add these to the documentation also.
You can't (and you shouldn't) define constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not mapping the case with constant True
to the case with no constraints and the case with constant False
to an infeasible solution
message? I think it would more expressive and robust to cover all possible logical expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not mapping the case with constant True to the case with no constraints and the case with constant False to an infeasible solution message? I think it would more expressive and robust to cover all possible logical expressions.
I can almost surely guarantee that you never need to use constant True
or constant False
. Can you come up with an example that either of these constants are necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my view, it is a matter of completeness of the representation. And those are the basic cases that your system should not fail to address those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something is never needed why is it a "basic case" and "matter of completeness"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you give me a concrete example that it's needed, I won't be convinced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give me a concrete application example that a hypothesis which says h=True
is useful? Why the most general hypothesish=True
is discussed and mentioned at all!!
Your argument looks like this to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something is never needed why is it a "basic case" and "matter of completeness?"
I missed this sentence, now I see it. I think the confusion comes from where we use first-order logic inside a different formalism that is ILP, therefore, it seems we can look at it only very practically without thinking if our constraint representation is even sound. From my view when you talk about logical expressions in any context the first expressions to think of are True
and False
and if I can not represent these, I am not sure what I am talking about then. We probably need to ask a third or more opinions on this. I do not have concrete examples.
For Foreach, I think the documentation is not clear enough, I can not see easily that I alway need to start writing any kind of constraint with Foreach expression, maybe you should say this above the table. Also, why this should be the case? I guess having ForAll and Foreach with two different semantics here will be confusing, from the logical expressions perspective. Unless this is technically impossible to change, I would suggest changing this. |
Yes there is a conceptual difference.
For |
Update Badge example to use new Constraint convention.
Update the Saul Inference PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I agree with @kordjamshidi 's comment about the ambiguity of ForEach
and ForAll
. ForEach
is implemented as a quantifier that you can apply to nodes.
But ForAll
is implemented as a conjunction of constraint clauses. If we plan to keep this convenience notation, we should rename to avoid confusion.
Ideally, we should be able to do node.ForAll
as a quantifier.
|----------|------------|---------|---| | ||
| `ForEach` | This operator works only on `Node`s. For each single instance in the node. This is often times one of the starting points for defining constraints. So if you are defining using a constrained classifier with head type `HEAD_TYPE`, we the definition of the constraint have to start with the node corresponding to this type. | `textAnnotationNode.ForEach { x: TextAnnotation => Some-Constraint-On-X }` | | ||
| `ForAll` | For **all** the elements in the collection it applies the constraints. In other words, the constrain should hold for **all** elements of the collection. | `textAnnotationNode.ForAll { x: TextAnnotation => Some-Constraint-On-x }` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor typo: constrain
-> constraint
in this and the next 4 lines.
object TestConstraintClassifier extends ConstrainedClassifier[String, String] { | ||
override def subjectTo = None | ||
override val solverType = OJAlgo | ||
override lazy val onClassifier = TestClassifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for this rename.
} | ||
|
||
/** find all the instances used in the definiton of the constraint. This is used in caching the results of inference */ | ||
private def getInstancesInvolved(constraint: Constraint[_]): Set[_] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getInstancesInvolved
and getClassifiersInvolved
methods do not depend on ConstrainedClassifer (They don't need to be here). We should add these methods to the trait/abstract class for Constraint and have them implemented there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What abstract class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant adding them to the trait for Constraint[T]
here https://github.com/danyaljj/saul-1/blob/addingSaulInference/saul-core/src/main/scala/edu/illinois/cs/cogcomp/saul/classifier/infer/Constraints.scala#L114
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see; that is doable, but not sure if we would gain significant anything from that.....
@bhargav I think the introduction of |
Updated the documentation to make the confusion between |
Overall the change look good. I want to do a test on the JVM memory usage after these changes. I'll do a comparative run on some of our examples. ETA: 1-2 days. |
Items to finish:
Results on ER (L+I):
Exactly the same numbers in the readme file: