This repository has been archived by the owner on Dec 14, 2023. It is now read-only.
Saul Pwanson edited this page Jan 4, 2022 · 4 revisions

Tips and tricks for writing Ibis expressions

  • After .join()s, refer to columns (for filters, aggregates, etc.) from the join expression instead of from the base tables. (This requires a .materialize() so the join expression knows which columns it has available.)
  • lambda expressions are more general and robust than pandas expressions; they will always work, and never operate on a huge table accidentally.
  • group_by() returns a GroupedTableExpr, which is not a TableExpr, so the usual table operations aren't accessible on it; it needs an aggregate() call first. [Maybe GroupedTableExpr should inherit from TableExpr with a default behavior like .distinct() if no other aggregations are specified.]
  • .mutate() is just sugar for .projection().

Possible pitfalls

t[t.a] == t['a']
t[['a']] != t['a']
t[['a']] == t.projection(['a']) == t.select(['a'])
t['a', 'b'] == t.projection(['a', 'b'])
  • Ibis uses bitwise operators for "and" (&), "or" (|), and "not" (~)
    • Python's and/or bind more loosely than comparison operators like ==, but & and | bind more tightly
    • therefore comparisons combined with & and | must always be wrapped in parentheses, e.g. (t.a == 1) & (t.b == 2)
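The precedence pitfall can be checked without Ibis at all. The sketch below uses only Python's standard `ast` module to show how the parser groups each spelling; the helper names `top_level_node` and `first_comparator` are made up for this illustration, and `a` and `b` stand in for column expressions such as `t.a` and `t.b`.

```python
import ast


def top_level_node(source: str) -> str:
    """Return the type name of the outermost AST node of an expression."""
    return type(ast.parse(source, mode="eval").body).__name__


def first_comparator(source: str) -> str:
    """For a comparison, return the type name of its first right-hand operand."""
    node = ast.parse(source, mode="eval").body
    return type(node.comparators[0]).__name__


# Without parentheses, & binds before ==, so Python sees a chained
# comparison: a == (1 & b) == 2. This is not a conjunction of two filters.
unparenthesized = top_level_node("a == 1 & b == 2")
swallowed_operand = first_comparator("a == 1 & b == 2")

# With parentheses, each comparison is built first and & combines them.
parenthesized = top_level_node("(a == 1) & (b == 2)")

# The keyword `and` binds more loosely than ==, so it groups as intended,
# but it cannot be overloaded, which is why Ibis relies on & instead.
keyword_and = top_level_node("a == 1 and b == 2")

print(unparenthesized, swallowed_operand, parenthesized, keyword_and)
```

The unparenthesized form parses as a single `Compare` node whose first comparator is the `BinOp` `1 & b`, while the parenthesized form is a `BinOp` over two separate comparisons, which is the shape a combined predicate needs.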