Skip to content

foundator/schema

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 

Repository files navigation

schema

This is a WIP design document

A schema for defining serialization formats.

The schema consists of 7 fixed-size types, 3 variable-sized types and two kinds of user defined types:

Schema Description
bool a boolean, either false or true
int8 8bit signed twos complement integer
int16 16bit signed twos complement integer
int32 32bit signed twos complement integer
int64 64bit signed twos complement integer
float32 single precision floating point number
float64 double precision floating point number
string utf8 variable-length string
binary zero or more raw bytes of binary data
T* zero or more elements of type T (aka list or array)
T? zero or one element of type T (aka nullable or optional)
struct a user-defined record type, possibly generic (aka product type)
enum a user-defined tagged union type, possibly generic (aka sum type)

While we only need the last entry in this table, the rest of the types are added to make it easier to understand how the types map to the various programming languages and serialization formats.

The * and ? modifiers can be combined to make an optional list of elements, but the modifiers can only appear at top-level in a type. This makes them map better to languages that have limited type systems. To nest these modifiers, an intermediate struct or enum is required.

The syntax for the schema is concise but C-like. Here's how you could define a person:

struct Person {
    Age : int32
    Name : string
    Email : string?
}

A team might contain a list of members:

struct Team {
    Name : string
    Members : Person*
}

The alignment of a widget may be one of three values, like an enum:

enum Alignment {
    Top
    Middle
    Bottom
}

The syntax tree for a small language may need values attached to each tag:

enum Expression {
    Constant(Value : int32)
    Add(X : Expression, Y : Expression)
    Multiply(X : Expression, Y : Expression)
}

A binary tree might as well be generic:

enum Tree<T> {
    Leaf(Value : T)
    Branch(Left : Tree<T>, Right : Tree<T>)
}

The compiler enforces that built-in types and keywords are lowercase, while user-defined types start with an upper-case letter.

There is a binary format and a JSON format.

{
    "schema": "http://example.com/schema/1.1",
    "type": "Tree<Team>",
    "value": ["Leaf", {
        "Name": "Mojos",
        "Members": [
            {
                "Age": 21,
                "Name": "Johanna"
            }
        ]
    }]
}

About

A schema for defining serialization formats.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published