add docs/tests/codecov CI and related stubs (#2)
jrevels authored Jun 14, 2021
1 parent 036523c commit 560222d
Showing 13 changed files with 216 additions and 25 deletions.
58 changes: 58 additions & 0 deletions .github/workflows/CI.yml
@@ -0,0 +1,58 @@
name: CI
on:
  push:
    branches:
      - main
    tags:
      - v*
  pull_request:
jobs:
  test:
    name: Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        version:
          - '1'
          - '1.3'
        os:
          - ubuntu-latest
        arch:
          - x64
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - uses: julia-actions/setup-julia@v1
        with:
          version: ${{ matrix.version }}
          arch: ${{ matrix.arch }}
      - uses: actions/cache@v2
        with:
          path: ~/.julia/artifacts
          key: ${{ runner.os }}-test-artifacts-${{ hashFiles('**/Project.toml') }}
          restore-keys: ${{ runner.os }}-test-artifacts
      - uses: julia-actions/julia-buildpkg@v1
      - uses: julia-actions/julia-runtest@v1
      - uses: julia-actions/julia-processcoverage@v1
      - uses: codecov/codecov-action@v1
        with:
          file: lcov.info
  docs:
    name: Documentation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: julia-actions/setup-julia@v1
        with:
          version: '1'
      - run: |
          julia --project=docs -e '
            using Pkg
            Pkg.develop(PackageSpec(path=pwd()))
            Pkg.instantiate()'
      - run: julia --project=docs docs/make.jl
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
14 changes: 14 additions & 0 deletions .github/workflows/TagBot.yml
@@ -0,0 +1,14 @@
name: TagBot
on:
  issue_comment:
    types:
      - created
  workflow_dispatch:
jobs:
  TagBot:
    if: github.event_name == 'workflow_dispatch' || github.actor == 'JuliaTagBot'
    runs-on: ubuntu-latest
    steps:
      - uses: JuliaRegistries/TagBot@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
9 changes: 8 additions & 1 deletion Project.toml
@@ -9,4 +9,11 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

[compat]
Arrow = "1.5"
Tables = "1.4"
Tables = "1.4"
julia = "1.3"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test"]
9 changes: 6 additions & 3 deletions README.md
@@ -1,9 +1,12 @@
# Legolas.jl

[![CI](https://github.com/beacon-biosignals/Legolas.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/beacon-biosignals/Legolas.jl/actions/workflows/CI.yml)
[![codecov](https://codecov.io/gh/beacon-biosignals/Legolas.jl/branch/master/graph/badge.svg?token=D0bcI0Rtsw)](https://codecov.io/gh/beacon-biosignals/Legolas.jl)
[![](https://img.shields.io/badge/docs-stable-blue.svg)](https://beacon-biosignals.github.io/Legolas.jl/stable)
[![](https://img.shields.io/badge/docs-dev-blue.svg)](https://beacon-biosignals.github.io/Legolas.jl/dev)

*wield `.arrow`s with style*

Legolas.jl is a Julia package that provides opinionated utilities for constructing, reading, writing, and validating Arrow tables against extensible, versioned, user-specified schemas.

Currently WIP.

NOTE TO BEACON EMPLOYEES: This repository is intended to be open-sourced directly; please don't include private/internal Beacon content in commits/issues/etc.
[Take The Tour](https://github.com/beacon-biosignals/Legolas.jl/tree/master/examples/tour.jl)
1 change: 1 addition & 0 deletions codecov.yml
@@ -0,0 +1 @@
comment: off
6 changes: 6 additions & 0 deletions docs/Project.toml
@@ -0,0 +1,6 @@
[deps]
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Legolas = "741b9549-f6ed-4911-9fbf-4a1c0c97f0cd"

[compat]
Documenter = "0.24"
10 changes: 10 additions & 0 deletions docs/make.jl
@@ -0,0 +1,10 @@
using Legolas
using Documenter

makedocs(modules=[Legolas],
         sitename="Legolas",
         authors="Beacon Biosignals, Inc.",
         pages=["API Documentation" => "index.md",
                "Tips For Schema Authors" => "schema.md"])

deploydocs(repo="github.com/beacon-biosignals/Legolas.jl.git", push_preview=true)
38 changes: 38 additions & 0 deletions docs/src/index.md
@@ -0,0 +1,38 @@
# API Documentation

If you're a newcomer to Legolas.jl, please familiarize yourself with it via the [tour](https://github.com/beacon-biosignals/Legolas.jl/blob/master/examples/tour.jl) before diving into this documentation.

```@meta
CurrentModule = Legolas
```

## Legolas `Schema`s and `Row`s

```@docs
Legolas.@row
Legolas.Row
Legolas.Schema
Legolas.is_valid_schema_name
Legolas.schema_name
Legolas.schema_version
Legolas.schema_qualified_string
Legolas.schema_parent
Legolas.transform
```

## Validating/Writing/Reading Legolas Tables

```@docs
Legolas.validate
Legolas.write
Legolas.read
```

## Utilities

```@docs
Legolas.lift
Legolas.assign_to_table_metadata!
Legolas.gather
Legolas.materialize
```
12 changes: 12 additions & 0 deletions docs/src/schema.md
@@ -0,0 +1,12 @@
# Tips for Schema Authors

If you're a newcomer to Legolas.jl, please familiarize yourself with it via the [tour](https://github.com/beacon-biosignals/Legolas.jl/blob/master/examples/tour.jl) before diving into this documentation.

TODO: cover the following items:

- Legolas.jl's Simple Integer Versioning: You Break It, You Bump It
- forward/backward compatibility via allowing `missing` columns when possible
- avoid bumping schema versions by handling the deprecation path in the constructor
- prefer idempotency in field expressions when possible
- prefer Liskov substitutability when possible
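
The idempotency and `missing`-tolerance items above can be illustrated without any Legolas-specific API. The following is a minimal sketch in plain Julia, not the package's documented approach: `normalize_label` is a hypothetical helper (it does not exist in Legolas.jl) standing in for the kind of expression a schema author might apply to a field. It passes `missing` through untouched and is idempotent, so re-applying it to an already-normalized value changes nothing.

```julia
# Hypothetical field transformation; an illustration only, not part of Legolas.jl.
# `missing` passes through untouched, so an absent value stays absent.
normalize_label(x::Missing) = missing
normalize_label(x::AbstractString) = lowercase(strip(x))

once = normalize_label("  Sleep-Stage  ")
twice = normalize_label(once)

# Idempotency: applying the transformation a second time is a no-op.
@assert once == twice
@assert ismissing(normalize_label(normalize_label(missing)))
```

A transformation with this property could plausibly be applied both when a row is first constructed and again when it is read back, without the two passes disagreeing.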

21 changes: 1 addition & 20 deletions examples/tour.jl
@@ -174,23 +174,4 @@ Arrow.setmetadata!(invalid, Dict("legolas_schema_qualified" => "my-child-schema@
# these functions are relatively agnostic to the types of provided path arguments. Generally, as long as a
# given `path` supports `Base.read(path)::Vector{UInt8}`, `Base.write(path, bytes::Vector{UInt8})`, and
# `mkpath(dirname(path))`, then `path` will work as an argument to `Legolas.read`/`Legolas.write`. At some
# point, we'd like to make similar upstream improvements to Arrow.jl to render its API more path-type-agnostic.

#####
##### Simple Integer Versioning: You Break It, You Bump It
#####
# TODO

#####
##### Tips For Schema Design
#####
# TODO: Cover the following:
#
# - forward/backward compatibility via allowing `missing` columns when possible
# - avoid bumping schema versions by handling the deprecation path in the constructor
# - prefer idempotency in field expressions when possible
# - prefer Liskov substitutability when possible

#####
##### Miscellaneous Utilities
#####
# point, we'd like to make similar upstream improvements to Arrow.jl to render its API more path-type-agnostic.
38 changes: 37 additions & 1 deletion src/rows.jl
@@ -4,15 +4,27 @@

const ALLOWED_SCHEMA_NAME_CHARACTERS = Char['-', '.', 'a':'z'..., '0':'9'...]

"""
TODO
"""
is_valid_schema_name(x::AbstractString) = all(i -> i in ALLOWED_SCHEMA_NAME_CHARACTERS, x)

"""
TODO
"""
struct Schema{name,version} end

"""
TODO
"""
function Schema(name::AbstractString, version::Integer)
is_valid_schema_name(name) || throw(ArgumentError("TODO"))
return Schema{Symbol(name),version}()
end

"""
TODO
"""
function Schema(str::AbstractString)
x = split(first(split(str, '>', limit=2)), '@')
if length(x) == 2
@@ -23,15 +35,27 @@ function Schema(str::AbstractString)
throw(ArgumentError("TODO"))
end

"""
TODO
"""
@inline schema_version(::Type{<:Schema{name,version}}) where {name,version} = version
@inline schema_version(schema::Schema) = schema_version(typeof(schema))

"""
TODO
"""
@inline schema_name(::Type{<:Schema{name}}) where {name} = name
@inline schema_name(schema::Schema) = schema_name(typeof(schema))

"""
TODO
"""
@inline schema_parent(::Type{<:Schema}) = nothing
@inline schema_parent(schema::Schema) = schema_parent(typeof(schema))

"""
TODO
"""
function schema_qualified_string end

# Note that there exist very clean generic implementations of `transform`/`validate`:
@@ -53,10 +77,16 @@ function schema_qualified_string end
# unnecessarily for schemas with a few ancestors, while the "hardcoded" versions
# generated by the current implementation of the `@row` macro (see below) do not.

"""
TODO
"""
function transform end

function _transform end

"""
TODO
"""
function validate end

function _validate end
@@ -77,6 +107,9 @@ Base.show(io::IO, schema::Schema) = print(io, "Schema(\"$(schema_name(schema))@$
##### Row
#####

"""
TODO
"""
struct Row{S<:Schema,F} <: Tables.AbstractRow
schema::S
fields::F
@@ -115,6 +148,9 @@ end

_parse_schema_expr(str::AbstractString) = Schema(str), nothing

"""
TODO
"""
macro row(schema_expr, fields...)
schema, parent = _parse_schema_expr(schema_expr)
isnothing(schema) && throw(ArgumentError("`@row` schema argument must be of the form `\"name@X\"` or `\"name@X\" > \"parent@Y\"`. Received: $schema_expr"))
@@ -148,7 +184,7 @@ macro row(schema_expr, fields...)

function Legolas._transform(::$schema_type; $([Expr(:kw, f, :missing) for f in field_names]...), other...)
$(map(esc, fields)...)
return (; $(field_names...), other...)
return (; $([Expr(:kw, f, f) for f in field_names]...), other...)
end

function Legolas._validate(tables_schema::Tables.Schema, legolas_schema::$schema_type)
24 changes: 24 additions & 0 deletions src/tables.jl
@@ -4,6 +4,9 @@ const LEGOLAS_SCHEMA_QUALIFIED_METADATA_KEY = "legolas_schema_qualified"
##### validate tables
#####

"""
TODO
"""
function validate(table, legolas_schema::Schema)
columns = Tables.columns(table)
Tables.rowcount(columns) > 0 || return nothing
@@ -21,6 +24,9 @@ function validate(table, legolas_schema::Schema)
return nothing
end

"""
TODO
"""
function validate(table)
metadata = Arrow.getmetadata(table)
(metadata isa Dict && haskey(metadata, LEGOLAS_SCHEMA_QUALIFIED_METADATA_KEY)) || throw(ArgumentError("`$LEGOLAS_SCHEMA_QUALIFIED_METADATA_KEY` field not found in Arrow table metadata"))
@@ -32,12 +38,18 @@ end
##### read/write tables
#####

"""
TODO
"""
function read(path; validate::Bool=true)
table = read_arrow(path)
validate && Legolas.validate(table)
return table
end

"""
TODO
"""
function write(io_or_path, table, schema::Schema; validate::Bool=true, kwargs...)
# This `Tables.columns` call is unfortunately necessary; ref https://github.com/JuliaData/Arrow.jl/issues/211
# It is also the case that `Tables.schema(Tables.columns(table))` is more likely to return a `Tables.Schema`
@@ -50,6 +62,9 @@ function write(io_or_path, table, schema::Schema; validate::Bool=true, kwargs...
return table
end

"""
TODO
"""
function tobuffer(args...; kwargs...)
io = IOBuffer()
Legolas.write(io, args...; kwargs...)
@@ -81,6 +96,11 @@ write_arrow(path, table; kwargs...) = (io = IOBuffer(); write_arrow(io, table; k
#####
# TODO: upstream to Arrow.jl?

"""
TODO
Note that we intend to eventually migrate this function from Legolas.jl to a more appropriate package.
"""
function assign_to_table_metadata!(table, pairs)
m = Arrow.getmetadata(table)
if !(m isa Dict)
@@ -134,6 +154,8 @@ subtable. The default definition is sufficient for `DataFrames` tables.
Note that this function may internally call `Tables.columns` on each input table, so
it may be slower and/or require more memory if `any(!Tables.columnaccess, tables)`.
Note that we intend to eventually migrate this function from Legolas.jl to a more appropriate package.
"""
function gather(column_name, tables::Vararg{Any,N};
extract=((cols, idxs) -> view(cols, idxs, :))) where {N}
@@ -164,5 +186,7 @@ julia> materialized = Onda.materialize(items);
julia> @time foreach(identity, (nested_structure for nested_structure in materialized.nested_structures));
0.000014 seconds (2 allocations: 80 bytes)
```
Note that we intend to eventually migrate this function from Legolas.jl to a more appropriate package.
"""
materialize(table) = map(collect, Tables.columntable(table))
1 change: 1 addition & 0 deletions test/runtests.jl
@@ -0,0 +1 @@
include(joinpath(dirname(@__DIR__), "examples", "tour.jl"))
