Additional examples that demonstrate handling of complex types.
This example unnests JSON records that are batched together under a single key at the root level. The ComplexType configs are used to persist the individual student records as separate rows in Pinot.
{
"students": [
{
"firstName": "Jane",
"id": "100",
"scores": {
"physics": 91,
"chemistry": 93,
"maths": 99
}
},
{
"firstName": "John",
"id": "101",
"scores": {
"physics": 97,
"chemistry": 98,
"maths": 99
}
},
{
"firstName": "Jen",
"id": "102",
"scores": {
"physics": 96,
"chemistry": 95,
"maths": 100
}
}
]
}
The Pinot schema for this example would look as follows.
{
"schemaName": "students001",
"enableColumnBasedNullHandling": false,
"dimensionFieldSpecs": [
{
"name": "students.firstName",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
},
{
"name": "students.id",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
},
{
"name": "students.scores",
"dataType": "JSON",
"notNull": false,
"fieldType": "DIMENSION"
}
],
"dateTimeFieldSpecs": [
{
"name": "ts",
"fieldType": "DATE_TIME",
"dataType": "LONG",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:MILLISECONDS"
}
],
"metricFieldSpecs": []
}
The Pinot table configuration for this schema would look as follows.
{
"ingestionConfig": {
"complexTypeConfig": {
"fieldsToUnnest": [
"students"
]
}
}
}
After ingestion, the student records appear as separate rows in Pinot. Note that the nested field "scores" is captured as a JSON field.
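To make the expected row shape concrete, the following Python sketch mimics the effect of unnesting "students" on the sample record. It is only an illustration of the resulting rows, not Pinot's implementation; the helper name `unnest_root_array` is hypothetical, and serializing "scores" with `json.dumps` is an assumption chosen to mirror the JSON-typed "students.scores" column in the schema above.

```python
import json


def unnest_root_array(record, field):
    """Return one flat row per element of record[field], prefixing keys with the field name."""
    rows = []
    for element in record[field]:
        row = {}
        for key, value in element.items():
            # Assumption for this sketch: a nested map is kept as a JSON string,
            # matching the JSON-typed "students.scores" column.
            if isinstance(value, dict):
                value = json.dumps(value)
            row[f"{field}.{key}"] = value
        rows.append(row)
    return rows


record = {
    "students": [
        {"firstName": "Jane", "id": "100",
         "scores": {"physics": 91, "chemistry": 93, "maths": 99}},
        {"firstName": "John", "id": "101",
         "scores": {"physics": 97, "chemistry": 98, "maths": 99}},
        {"firstName": "Jen", "id": "102",
         "scores": {"physics": 96, "chemistry": 95, "maths": 100}},
    ]
}

rows = unnest_root_array(record, "students")
for row in rows:
    print(row)  # one row per student
```

Each printed row carries the keys "students.firstName", "students.id", and "students.scores", which line up with the dimension columns declared in the schema.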
This example unnests the sibling collections "student" and "teacher".
{
"student": [
{
"name": "John"
},
{
"name": "Jane"
}
],
"teacher": [
{
"physics": "Kim"
},
{
"chemistry": "Lu"
},
{
"maths": "Walsh"
}
]
}
{
"schemaName": "students002",
"enableColumnBasedNullHandling": false,
"dimensionFieldSpecs": [
{
"name": "student.name",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
},
{
"name": "teacher.physics",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
},
{
"name": "teacher.chemistry",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
},
{
"name": "teacher.maths",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
}
]
}
"complexTypeConfig": {
"fieldsToUnnest": [
"student",
"teacher"
]
}
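The Python sketch below illustrates one plausible outcome of unnesting two sibling collections. It assumes the fields in "fieldsToUnnest" are applied one after another, so every student row is combined with every teacher element; that ordering and the helper name `unnest` are assumptions of this sketch rather than a statement of Pinot's internals.

```python
def unnest(rows, field):
    """Replace each row with one row per element of row[field], prefixing element keys."""
    out = []
    for row in rows:
        elements = row.get(field)
        if not isinstance(elements, list):
            out.append(row)  # nothing to unnest for this row
            continue
        for element in elements:
            new_row = {k: v for k, v in row.items() if k != field}
            for key, value in element.items():
                new_row[f"{field}.{key}"] = value
            out.append(new_row)
    return out


record = {
    "student": [{"name": "John"}, {"name": "Jane"}],
    "teacher": [{"physics": "Kim"}, {"chemistry": "Lu"}, {"maths": "Walsh"}],
}

rows = [record]
# Assumption: the fields are unnested sequentially, in the listed order.
for field in ["student", "teacher"]:
    rows = unnest(rows, field)

for row in rows:
    print(row)
```

Under that assumption the two students and three teacher entries yield six rows, each holding "student.name" plus exactly one of the "teacher.*" columns; the other teacher columns would be null for that row.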
This example unnests the nested collection "students.grades".
{
"students": [
{
"name": "Jane",
"grades": [
{
"physics": "A+"
},
{
"maths": "A-"
}
]
},
{
"name": "John",
"grades": [
{
"physics": "B+"
},
{
"maths": "B-"
}
]
}
]
}
{
"schemaName": "students003",
"enableColumnBasedNullHandling": false,
"dimensionFieldSpecs": [
{
"name": "students.name",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
},
{
"name": "students.grades.physics",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
},
{
"name": "students.grades.maths",
"dataType": "STRING",
"fieldType": "DIMENSION",
"notNull": false
}
]
}
"complexTypeConfig": {
"fieldsToUnnest": [
"students",
"students.grades"
]
}
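As a rough illustration of why both "students" and "students.grades" are listed, the Python sketch below unnests the outer collection first and then the inner one, which by then is addressed by its prefixed name. The helper `unnest` is hypothetical and only approximates the resulting rows.

```python
def unnest(rows, field):
    """Replace each row with one row per element of row[field], prefixing element keys."""
    out = []
    for row in rows:
        for element in row[field]:
            new_row = {k: v for k, v in row.items() if k != field}
            for key, value in element.items():
                new_row[f"{field}.{key}"] = value
            out.append(new_row)
    return out


record = {
    "students": [
        {"name": "Jane", "grades": [{"physics": "A+"}, {"maths": "A-"}]},
        {"name": "John", "grades": [{"physics": "B+"}, {"maths": "B-"}]},
    ]
}

# Step 1: one row per student; "grades" becomes "students.grades".
rows = unnest([record], "students")
# Step 2: one row per grade entry within each student row.
rows = unnest(rows, "students.grades")

for row in rows:
    print(row)
```

In this sketch each student contributes two rows, one carrying "students.grades.physics" and one carrying "students.grades.maths", alongside "students.name".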
This example unnests the array "finalExam", which is located within the array "students".
{
"students": [
{
"name": "John",
"grades": {
"finalExam": [
{
"physics": "A+"
},
{
"maths": "A-"
}
]
}
},
{
"name": "Jane",
"grades": {
"finalExam": [
{
"physics": "B+"
},
{
"maths": "B-"
}
]
}
}
]
}
{
"schemaName": "students004",
"enableColumnBasedNullHandling": false,
"dimensionFieldSpecs": [
{
"name": "students.name",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
},
{
"name": "students.grades.finalExam.physics",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
},
{
"name": "students.grades.finalExam.maths",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
}
]
}
"complexTypeConfig": {
"fieldsToUnnest": [
"students",
"students.grades.finalExam"
]
}
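Here "finalExam" sits inside the map "grades", so its path after flattening is "students.grades.finalExam". The Python sketch below approximates that with two hypothetical helpers, `flatten` for nested maps and `unnest` for collections; it illustrates the column naming only and is not Pinot's implementation.

```python
def flatten(row):
    """Expand nested dicts into dot-separated keys; lists are left untouched."""
    flat = {}
    for key, value in row.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten(value).items():
                flat[f"{key}.{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


def unnest(rows, field):
    """Replace each row with one row per element of row[field], flattening each element."""
    out = []
    for row in rows:
        for element in row[field]:
            new_row = {k: v for k, v in row.items() if k != field}
            new_row.update(flatten({field: element}))
            out.append(new_row)
    return out


record = {
    "students": [
        {"name": "John",
         "grades": {"finalExam": [{"physics": "A+"}, {"maths": "A-"}]}},
        {"name": "Jane",
         "grades": {"finalExam": [{"physics": "B+"}, {"maths": "B-"}]}},
    ]
}

rows = unnest([flatten(record)], "students")
rows = unnest(rows, "students.grades.finalExam")

for row in rows:
    print(row)
```

The resulting keys, such as "students.grades.finalExam.physics", match the column names used in the schema above.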
In this example, the inner collection "grades" is converted into a multi-value string column.
{
"students": [
{
"name": "John",
"grades": [
{
"physics": "A+"
},
{
"maths": "A"
}
]
},
{
"name": "Jane",
"grades": [
{
"physics": "B+"
},
{
"maths": "B-"
}
]
}
]
}
{
"schemaName": "students005",
"enableColumnBasedNullHandling": false,
"dimensionFieldSpecs": [
{
"name": "students.name",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
},
{
"name": "students.grades",
"dataType": "STRING",
"notNull": false,
"isSingleValue": false,
"fieldType": "DIMENSION"
}
]
}
"complexTypeConfig": {
"fieldsToUnnest": [
"students"
]
}
In this example, the array of primitives "extra_curricular" is converted to a JSON string.
{
"students": [
{
"name": "John",
"extra_curricular": [
"piano", "soccer"
]
},
{
"name": "Jane",
"extra_curricular": [
"violin", "music"
]
}
]
}
{
"schemaName": "students006",
"enableColumnBasedNullHandling": false,
"dimensionFieldSpecs": [
{
"name": "students.name",
"dataType": "STRING",
"notNull": false,
"fieldType": "DIMENSION"
},
{
"name": "students.extra_curricular",
"dataType": "JSON",
"notNull": false,
"fieldType": "DIMENSION"
}
]
}
"complexTypeConfig": {
"fieldsToUnnest": [
"students"
],
"collectionNotUnnestedToJson": "ALL"
}
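The Python sketch below approximates the effect of setting "collectionNotUnnestedToJson" to "ALL": after "students" is unnested, any collection left in a row is serialized to a JSON string. The helper name `unnest_with_json_collections` is hypothetical, and the exact string formatting Pinot produces may differ from `json.dumps`.

```python
import json


def unnest_with_json_collections(record, field):
    """Unnest record[field] and serialize any remaining list to a JSON string."""
    rows = []
    for element in record[field]:
        row = {}
        for key, value in element.items():
            # Sketch of the "ALL" setting: lists that are not unnested,
            # including lists of primitives, become JSON strings.
            if isinstance(value, list):
                value = json.dumps(value)
            row[f"{field}.{key}"] = value
        rows.append(row)
    return rows


record = {
    "students": [
        {"name": "John", "extra_curricular": ["piano", "soccer"]},
        {"name": "Jane", "extra_curricular": ["violin", "music"]},
    ]
}

rows = unnest_with_json_collections(record, "students")
for row in rows:
    print(row)
```

Each row then holds "students.extra_curricular" as a single string value, which is why the schema declares that column with the JSON data type rather than as a multi-value column.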