
Network Filtering and Querying

Bryan Fox edited this page Dec 14, 2018 · 10 revisions

Querying and filtering

Filtering and querying a network are core requirements in Network Canvas and Server. Network Canvas uses filtering to determine which nodes should be shown in node panels, and uses querying for skip logic. Server uses querying during the export process to determine which networks to include in an export, and filtering to determine which data within those networks to include (or exclude).

Querying and filtering are related, but separate operations.

  • Filtering a network applies one or more logical constraints to network entities (alters and edges) and returns the subset of the network that satisfies these constraints. Importantly, filtering retains the validity of the network model: the returned network will not contain orphaned or partially disconnected edges, since these are not phenomenologically valid.

  • Querying similarly applies one or more logical constraints to the network, but adds a "top level" constraint. Unlike the filter operation, a query returns true or false depending on whether this top level constraint is satisfied. For example, if we wish to skip a stage when no edges of a given type exist in our network, we would construct a rule (see below) to target that edge type, and evaluate the outcome in terms of Boolean logic.
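The difference between the two operations can be illustrated with a minimal sketch. The network shape (`nodes`, `edges`, `from`, `to`) and the function names `filterNetwork` and `queryNetwork` are assumptions made for illustration, not the actual Network Canvas or Server API.

```javascript
// Hypothetical network shape: nodes carry an id and a type; edges reference node ids.
const network = {
  nodes: [
    { id: 1, type: 'person', attributes: { age: 30 } },
    { id: 2, type: 'person', attributes: { age: 12 } },
    { id: 3, type: 'venue', attributes: {} },
  ],
  edges: [
    { from: 1, to: 2, type: 'friend' },
    { from: 1, to: 3, type: 'visits' },
  ],
};

// Filter: keep nodes that pass the predicate, then drop any edge that lost an
// endpoint, so the returned network never contains orphaned edges.
function filterNetwork(net, keepNode) {
  const nodes = net.nodes.filter(keepNode);
  const ids = new Set(nodes.map((node) => node.id));
  const edges = net.edges.filter((edge) => ids.has(edge.from) && ids.has(edge.to));
  return { nodes, edges };
}

// Query: reduce the whole network to a single boolean via a top level test.
function queryNetwork(net, topLevelTest) {
  return topLevelTest(net);
}

// Filter result: a sub-network without the under-18 person (and without the edge to them).
const adults = filterNetwork(
  network,
  (node) => node.type !== 'person' || node.attributes.age >= 18,
);

// Query result: a boolean that skip logic could act on.
const hasFriendEdges = queryNetwork(
  network,
  (net) => net.edges.some((edge) => edge.type === 'friend'),
);
```

In this sketch a stage would be skipped when `hasFriendEdges` is false, while `adults` is itself a valid network that could be passed on to a node panel or an export.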

Rules

Filtering and querying operations both consist of one or more logical constraints, called rules. These target each of the basic entity types in our network model: ego, alters, and edges.

  • An egoRule operates only on the node representing the participant, and is only available during querying operations.
  • An alterRule operates on alters, and requires a 'type' to be specified. It will remove nodes that do not satisfy the criteria.
  • An edgeRule operates on edges, and requires a 'type' to be specified. It will remove nodes not connected to an edge that satisfies the rule criteria.
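The edge rule behaviour described above can be sketched as follows. The rule object and `applyEdgeRule` are hypothetical names, and the choice to keep every edge whose endpoints both survive is an assumption based on the requirement that a filtered network contains no orphaned edges.

```javascript
// Hypothetical network: three people, two edge types.
const network = {
  nodes: [
    { id: 1, type: 'person' },
    { id: 2, type: 'person' },
    { id: 3, type: 'person' },
  ],
  edges: [
    { from: 1, to: 2, type: 'friend' },
    { from: 2, to: 3, type: 'colleague' },
  ],
};

// Hypothetical rule shape: an edge rule must name the edge type it targets.
const edgeRule = { type: 'friend' };

function applyEdgeRule(net, rule) {
  // Edges that satisfy the rule criteria.
  const matching = net.edges.filter((edge) => edge.type === rule.type);
  // Ids of every node attached to at least one matching edge.
  const connected = new Set(matching.flatMap((edge) => [edge.from, edge.to]));
  // Remove nodes that are not connected to a matching edge.
  const nodes = net.nodes.filter((node) => connected.has(node.id));
  // Drop edges that lost an endpoint so the result stays a valid network.
  const edges = net.edges.filter(
    (edge) => connected.has(edge.from) && connected.has(edge.to),
  );
  return { nodes, edges };
}

const friendsOnly = applyEdgeRule(network, edgeRule);
```

Here node 3 is removed because it has no 'friend' edge, and the 'colleague' edge disappears with it because one of its endpoints is gone.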

Eight operators are available for use with these rules:

  • GREATER_THAN
  • LESS_THAN
  • GREATER_THAN_OR_EQUAL
  • LESS_THAN_OR_EQUAL
  • EXACTLY
  • NOT
  • EXISTS
  • NOT_EXISTS

Note that some of these operators test an attribute of the entity, whereas others apply to the entity itself. For example, GREATER_THAN, LESS_THAN, GREATER_THAN_OR_EQUAL, LESS_THAN_OR_EQUAL, EXACTLY, and NOT can be used to test the value of a node, edge, or ego attribute. In contrast, EXISTS and NOT_EXISTS can be used to test for a node or edge type, regardless of other attributes.

Using the entity type in combination with an operator allows the construction of simple logical tests such as: "alters of type 'Person' with attribute 'age' greater than or equal to 18". Rules are mapping functions, and as such return a copy of the network.
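As a rough illustration of how an entity type, an attribute, and an operator combine into a rule, consider the sketch below. The `operators` table, the rule fields, and `applyAlterRule` are assumptions for illustration; in particular, the per-node reading of EXISTS and NOT_EXISTS is only one plausible interpretation.

```javascript
// Hypothetical operator table: each entry compares an attribute value to a rule value.
const operators = {
  GREATER_THAN: (a, b) => a > b,
  LESS_THAN: (a, b) => a < b,
  GREATER_THAN_OR_EQUAL: (a, b) => a >= b,
  LESS_THAN_OR_EQUAL: (a, b) => a <= b,
  EXACTLY: (a, b) => a === b,
  NOT: (a, b) => a !== b,
};

function matchesRule(node, rule) {
  // EXISTS and NOT_EXISTS look only at the entity type (assumed semantics).
  if (rule.operator === 'EXISTS') return node.type === rule.type;
  if (rule.operator === 'NOT_EXISTS') return node.type !== rule.type;
  // All other operators test an attribute on entities of the given type.
  return node.type === rule.type
    && operators[rule.operator](node.attributes[rule.attribute], rule.value);
}

// A rule maps a network to a new network; the input is not mutated.
function applyAlterRule(net, rule) {
  const nodes = net.nodes.filter((node) => matchesRule(node, rule));
  const ids = new Set(nodes.map((node) => node.id));
  const edges = net.edges.filter((edge) => ids.has(edge.from) && ids.has(edge.to));
  return { nodes, edges };
}

// "alters of type 'Person' with attribute 'age' greater than or equal to 18"
const adultRule = {
  type: 'Person',
  attribute: 'age',
  operator: 'GREATER_THAN_OR_EQUAL',
  value: 18,
};

const network = {
  nodes: [
    { id: 1, type: 'Person', attributes: { age: 30 } },
    { id: 2, type: 'Person', attributes: { age: 12 } },
  ],
  edges: [{ from: 1, to: 2, type: 'friend' }],
};

const adultsOnly = applyAlterRule(network, adultRule);
```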

Joining multiple rules together

Rules are combined using one of two joining methods: AND and OR.

These methods behave much as you might expect. In the context of a filter operation:

  • OR signifies that the results of the individual rules should be combined in the returned network. If a node or edge matches the criteria of any individual rule, it will be included in the overall network.
  • AND signifies that nodes and edges in the returned network should satisfy all rule constraints.
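The two joining methods can be sketched for a filter as a union (OR) or an intersection (AND) of what each rule matches. Rules are reduced to plain node predicates here, and `joinRules` is a hypothetical helper, not part of any published API.

```javascript
const nodes = [
  { id: 1, type: 'person', attributes: { age: 30, smoker: true } },
  { id: 2, type: 'person', attributes: { age: 12, smoker: false } },
  { id: 3, type: 'person', attributes: { age: 45, smoker: false } },
];

// Two hypothetical rules expressed as node predicates.
const isAdult = (node) => node.attributes.age >= 18;
const isSmoker = (node) => node.attributes.smoker === true;

// OR keeps a node that matches any rule; AND keeps only nodes that match every rule.
function joinRules(rules, join) {
  return (node) => (join === 'AND'
    ? rules.every((rule) => rule(node))
    : rules.some((rule) => rule(node)));
}

const either = nodes.filter(joinRules([isAdult, isSmoker], 'OR')).map((node) => node.id);
const both = nodes.filter(joinRules([isAdult, isSmoker], 'AND')).map((node) => node.id);
// either contains ids 1 and 3; both contains only id 1
```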

Querying

Rules behave fundamentally differently in the context of a query operation.

As mentioned previously, a querying operation returns a boolean value depending on whether the rules applied to the network model satisfy a "top level constraint". The value of the top level constraint is determined by the "truthiness" of the individual rules themselves, along with the joining method.

To accomplish this, each rule, when used in a query, has an additional operator that determines how its truthiness is evaluated. For illustration, refer back to the example rule mentioned above:

alters of type 'Person' with attribute 'age' greater than or equal to 18

When used within a filter, this rule will return a subset of a network that satisfies this constraint. When used within a query, we must describe how to interpret the result of this rule in terms of Boolean logic. To do this, we evaluate the result using dedicated query operators:

  • COUNT
  • COUNT_NOT
  • COUNT_GREATER_THAN
  • COUNT_GREATER_THAN_OR_EQUAL
  • COUNT_LESS_THAN
  • COUNT_LESS_THAN_OR_EQUAL

Using these operators, we can reduce the outcome of the rule to a true or false value. For example:

Count of alters of type 'Person' with attribute 'age' greater than or equal to 18 is greater than 0
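The reduction from a rule result to a boolean might look like the sketch below. It assumes that COUNT means "count equals the given value" and COUNT_NOT means "count does not equal it"; the source does not spell these semantics out, and `evaluateQueryRule` is a hypothetical name.

```javascript
// Assumed semantics for the query operators: compare the match count to a value.
const countOperators = {
  COUNT: (count, value) => count === value,
  COUNT_NOT: (count, value) => count !== value,
  COUNT_GREATER_THAN: (count, value) => count > value,
  COUNT_GREATER_THAN_OR_EQUAL: (count, value) => count >= value,
  COUNT_LESS_THAN: (count, value) => count < value,
  COUNT_LESS_THAN_OR_EQUAL: (count, value) => count <= value,
};

// Count the entities that satisfy the rule, then apply the query operator.
function evaluateQueryRule(nodes, predicate, countOperator, value) {
  const count = nodes.filter(predicate).length;
  return countOperators[countOperator](count, value);
}

const alters = [
  { type: 'Person', attributes: { age: 30 } },
  { type: 'Person', attributes: { age: 12 } },
];

// alters of type 'Person' with attribute 'age' greater than or equal to 18
const isAdultPerson = (node) => node.type === 'Person' && node.attributes.age >= 18;

// "Count of ... is greater than 0"
const anyAdults = evaluateQueryRule(alters, isAdultPerson, 'COUNT_GREATER_THAN', 0);
// "Count of ... is exactly 2" (under the assumed meaning of COUNT)
const exactlyTwoAdults = evaluateQueryRule(alters, isAdultPerson, 'COUNT', 2);
```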

The role of joining operators in queries

The joining operators govern how multiple rules are evaluated together to determine the overall boolean value that the query returns. If rules are joined by AND statements, and all individual rules evaluate to true, so will the top level constraint (for example: true && true === true). Conversely, if any individual rules evaluate to false, so too will the top level constraint (true && false === false).

If rules are joined by OR statements, any individual rule that evaluates to true will cause the top level constraint to also evaluate to true (true || false || false === true).

To provide a worked example of the relationship between rules and the top level constraint (✅ = evaluates to true, ❌ = evaluates to false):

Query 1 (overall result ✅): Ego with attribute gender = male ❌ OR (Alter of type person with attribute gender = male) count > 0 ✅

Query 2 (overall result ❌): Ego with attribute gender = male ❌ AND (Alter of type person with attribute gender = male) count > 0 ✅
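The two worked queries can be reproduced with a short sketch. The data and the `evaluateQuery` helper are hypothetical; the ego here is assumed not to be male, so the ego rule is false while the alter rule is true.

```javascript
// Hypothetical data chosen so the ego rule fails and the alter rule passes.
const ego = { attributes: { gender: 'female' } };
const alters = [
  { type: 'person', attributes: { gender: 'male' } },
  { type: 'person', attributes: { gender: 'female' } },
];

// Each query rule reduces to a boolean.
const egoIsMale = () => ego.attributes.gender === 'male';
const maleAlterCountAboveZero = () => alters
  .filter((alter) => alter.type === 'person' && alter.attributes.gender === 'male')
  .length > 0;

// The joining method decides how the individual results are combined.
function evaluateQuery(rules, join) {
  const results = rules.map((rule) => rule());
  return join === 'AND' ? results.every(Boolean) : results.some(Boolean);
}

const query1 = evaluateQuery([egoIsMale, maleAlterCountAboveZero], 'OR'); // true
const query2 = evaluateQuery([egoIsMale, maleAlterCountAboveZero], 'AND'); // false
```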

Advanced functionality

If more complex logic is required, such as injecting an ego attribute into a subsequent alter or edge rule, or performing a "network traversal", a custom query can be written in vanilla JavaScript.
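As a hedged illustration of what such a custom query could look like, the sketch below injects an ego attribute into an alter test. The function name, its signature, and the network shape are all assumptions; the source does not describe the interface a custom query must implement.

```javascript
// Hypothetical network shape with a dedicated ego entry.
const network = {
  ego: { attributes: { gender: 'female', city: 'Chicago' } },
  nodes: [
    { id: 1, type: 'person', attributes: { gender: 'male', city: 'Chicago' } },
    { id: 2, type: 'person', attributes: { gender: 'male', city: 'Boston' } },
  ],
  edges: [],
};

// Custom query: is there at least one alter of the given type whose attribute
// value equals the ego's value for the same attribute?
function hasAlterSharingEgoAttribute(net, type, attribute) {
  const egoValue = net.ego.attributes[attribute];
  return net.nodes.some(
    (node) => node.type === type && node.attributes[attribute] === egoValue,
  );
}

const sharesCity = hasAlterSharingEgoAttribute(network, 'person', 'city');
const sharesGender = hasAlterSharingEgoAttribute(network, 'person', 'gender');
```

A comparison like this cannot be written as a fixed-value rule, because the value being tested against is only known once the ego's data has been read.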