;
; group-by.scm -- Demo of the GroupLink
;
; This example demonstrates the use of the GroupLink to collect search
; results into groupings whose members are similar to one another in
; some way. It is inspired by the SQL `GROUP BY` statement, and works
; in a similar fashion.
;
; Background: Search results are always presented as sets of trees.
; These trees are the result of grounding the set of search clauses.
; If a search has two or more variables in it, it can be interesting to
; obtain groupings where all results in a group have exactly the same
; grounding for one chosen variable.
;
; Such a grouping can be thought of as a connected (hyper-)graph. All
; trees in the group will share the Atom that grounds the grouped
; variable. They are connected through that Atom. Different groupings
; will be disjoint from one another, as they don't share that Atom.
; They may, of course, be connected in other ways, just not through
; the grouping Atom/variable.
;
; Another way of thinking about groupings is that the grouping Atom
; forms a "kernel". Many similar trees are all attached to this kernel.
; These trees differ in various ways from one another, but they have
; this kernel in common. In this respect, they are all similar.
;
; A third way of thinking of groupings is as "local for-all clauses".
; Thus, for all members in a group, the property specified in the
; grouping kernel holds. In this sense, GroupLink is a "local" version
; of AlwaysLink. The AlwaysLink asks that all search results must have
; in common the specified clause, or, equivalently, that there must be
; one and only one group. The GroupLink relaxes this demand for there
; to be only one, and presents several groupings, as these occur.
;
; This demo will use a single, simple grouping variable. Multiple
; variables and complex terms can be used to define a grouping kernel;
; this demo shows only the simplest case.
(use-modules (opencog) (opencog exec))
; Create a collection of trees to search over.
; The structure should be obvious, but a few points bear emphasis:
;
; * All of these trees are connected to one another. Most of them
;   have the Atom `(Predicate "property")` in common; the one
;   exception, the "le grande foobar" edge, shares `(Item "shapes")`.
;   Thus, the below specifies a single connected hypergraph.
;
; * There are two groupings of three that are evident: the colors and
;   the shapes. Ignoring the common `(Predicate "property")`, these
;   groupings are obviously disjoint: none of the colors are shapes,
;   and vice versa. Each grouping forms a connected sub-hypergraph:
;   the Atom `(Item "colors")` is shared by all of the color trees,
;   and the Atom `(Item "shapes")` by all of the shape trees.
;
; The goal of the search query will be to define a search pattern
; that can find the colors and the shapes, and group these results
; together.
(Edge (Predicate "property") (List (Item "green") (Item "colors")))
(Edge (Predicate "property") (List (Item "brown") (Item "colors")))
(Edge (Predicate "property") (List (Item "black") (Item "colors")))
(Edge (Predicate "property") (List (Item "round") (Item "shapes")))
(Edge (Predicate "property") (List (Item "square") (Item "shapes")))
(Edge (Predicate "property") (List (Item "trident") (Item "shapes")))
(Edge (Predicate "le grande foobar") (List (Item "blob") (Item "shapes")))
(Edge (Predicate "property") (List (Item "vague") (Item "cloudy")))
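;
; Aside: a minimal plain-Scheme sketch of the grouping idea, offered
; only as an illustration. It does not use the AtomSpace at all, and
; the names `property-pairs-sketch` and `group-by-second` are
; hypothetical helpers invented for this aside. The pairs mirror the
; "property" edges above as (member . group); the "le grande foobar"
; edge is left out, since it does not carry the "property" predicate.
(use-modules (srfi srfi-1))

(define property-pairs-sketch
  '(("green" . "colors") ("brown" . "colors") ("black" . "colors")
    ("round" . "shapes") ("square" . "shapes") ("trident" . "shapes")
    ("vague" . "cloudy")))

; Return a list of (group member ...) lists, one per distinct group
; name, in order of first appearance. This is the plain-list analogue
; of SQL `GROUP BY` on the second column.
(define (group-by-second pairs)
  (map
    (lambda (key)
      (cons key
        (map car
          (filter (lambda (pr) (equal? (cdr pr) key)) pairs))))
    (delete-duplicates (map cdr pairs))))

(format #t "Sketch of the groupings, as plain lists:\n~A\n"
  (group-by-second property-pairs-sketch))
; Note that the sketch fixes an ordering of groups and members; no
; such ordering should be assumed for the results of the actual
; queries below.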
; Define a query that will look for relations having "property", and
; group them together by commonality in the second position of the
; property: group them by commonality in `(Variable "$Y")`.
(define grp-query
  (Query
    ; Variable declarations (optional)
    (VariableList (Variable "$X") (Variable "$Y"))
    ; Use an AndLink to unite all the search clauses
    (And
      ; The "property" must be present in the AtomSpace.
      (Present
        (Edge (Predicate "property")
          (List (Variable "$X") (Variable "$Y"))))
      ; The search results will be grouped together by having
      ; a common value of $Y
      (Group (Variable "$Y")))
    ; The QueryLink is a kind of rewrite-rule; the variable
    ; groundings can be used to create new structures. For this
    ; demo, some nonsense Implication & Evaluation links are
    ; created. You don't want to use Implication like this in
    ; practice, but visually, it works for this demo.
    (Evaluation (Concept "things that go together")
      (Implication (Variable "$Y") (Variable "$X")))))
; Perform the query, and put the results in a scheme object.
(define query-results (cog-execute! grp-query))
; Print a report.
(format #t "There are ~A results.\n" (length (cog-value->list query-results)))
(format #t "The query results are:\n~A\n" query-results)
; -------------------------------------------------------------
; Part the second.
; This is a variant of the above, perhaps showing more clearly and
; distinctly the grouping effect.
(define grp-set
  (Query
    (VariableList (Variable "$X") (Variable "$Y"))
    (And
      (Present
        (Edge (Predicate "property")
          (List (Variable "$X") (Variable "$Y"))))
      (Group (Variable "$Y")))
    (Variable "$X")))
(define set-results (cog-execute! grp-set))
(format #t "The groupings are:\n~A\n" set-results)
; Groupings can be interesting when their sizes can be constrained.
; The IntervalLink can be used to do this. In the below, the interval
; runs from 2 to 5, so groups with fewer than 2 members will not be
; provided.
;
; The IntervalLink behaves as in other contexts. Setting the upper
; limit to -1 is interpreted as no upper bound.
(define grp-range
  (Query
    (VariableList (Variable "$X") (Variable "$Y"))
    (And
      (Present
        (Edge (Predicate "property")
          (List (Variable "$X") (Variable "$Y"))))
      (Group
        (Variable "$Y")
        (Interval (Number 2) (Number 5))))
    (Variable "$X")))
(define range-results (cog-execute! grp-range))
(format #t "The groupings are:\n~A\n" range-results)
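;
; Aside: a minimal plain-Scheme sketch of the size constraint, for
; illustration only. It does not use the AtomSpace; `sketch-groups`,
; `size-in-interval?` and `keep-groups-in-interval` are hypothetical
; names invented here. The sketch assumes that both interval bounds
; are inclusive, and follows the convention stated above that a
; negative upper limit means no upper bound.
(define sketch-groups
  '(("colors" "green" "brown" "black")
    ("shapes" "round" "square" "trident")
    ("cloudy" "vague")))

; True if n lies within [lo, hi], or within [lo, infinity) if hi < 0.
(define (size-in-interval? n lo hi)
  (and (<= lo n) (or (< hi 0) (<= n hi))))

; Keep only those (group member ...) lists whose member count lies
; in the interval. The count is taken on the members, before any
; rewriting of the group.
(define (keep-groups-in-interval groups lo hi)
  (filter
    (lambda (grp) (size-in-interval? (length (cdr grp)) lo hi))
    groups))

(format #t "Sketch of groups with 2 to 5 members:\n~A\n"
  (keep-groups-in-interval sketch-groups 2 5))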
; The grouping size constraint applies to the group before the rewrite,
; and not after. Below, only the group name is reported, and since there
; is only one name per group, all group members collapse down to this
; one name. In general, one is interested in the size of the group
; before the collapse, not after. Thus, names are reported for those
; groups with two or more members.
(define grp-collapse
  (Query
    (VariableList (Variable "$X") (Variable "$Y"))
    (And
      (Present
        (Edge (Predicate "property")
          (List (Variable "$X") (Variable "$Y"))))
      (Group
        (Variable "$Y")
        (Interval (Number 2) (Number 5))))
    (Variable "$Y")))
(define collapse-results (cog-execute! grp-collapse))
(format #t "The group names are:\n~A\n" collapse-results)
; ------------ That's All, Folks! The End. ------------------