Daniel Burrell edited this page Jun 2, 2020 · 4 revisions

Key Functions

  • long getLatestSchemaPublicationTime() Returns the time at which the latest schema was published. The "changed" HTTP method can use this value to answer "no changes" when the caller supplies matching information. The value should be derived from the latest schema version stored in the doc store.

  • void checkNewSchema()

    • Records in the audit log that an attempt to fetch the schema was made (time, outcome).

    • Fetches the schema.

      • If changes exist:
        • write the schema itself to S3
        • write the time the change was discovered to DynamoDB
        • write the file reference
        • write the new latestSchemaTime for future calls
        • trigger the Delta Analysis
        • record that the schema was discovered by schematf (other schemas may have been imported, for example)
        • record the date the entry was added to the database (this differs from the discovery time, because discovering is not the same as writing, and some entries may have been imported at a particular time)
        • record the order of this schema: an orderable field such that sorting by it places the schemas in change order. It may be derived from the schema itself, or made up in the case of imports.
      • If not, do nothing.
  • long getPriorSchema(long id) Returns the id of the schema immediately prior to the schema identified by id.

  • void doAnalysis(long current) Triggered when there is a difference in the schema. Calls getPriorSchema to identify the schema id prior to the current one; the two are referred to as id current and id prev. Carries out analysis on the delta between the schemas with id current and id prev, then records the analysis in a database of delta analyses, together with the time it was recorded and the schemas being compared. More than one record may end up existing for the same current and prev pair; this is acceptable as long as later records carry later dates. We pick the record with the max date, and earlier analyses should be pruned. In all cases, calls the availability tracker to update what the latest schema is (unless this was already done further up; this is an open design decision).

  • void getDiff(long current) Calls getPriorSchema to identify the schema id prior to the current one, then looks up the diff for these two schemas. Returns empty if no diff exists.

  • void getSchema() Gets the latest known schema locally.
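
The functions above could fit together roughly as in the following in-memory sketch. It makes several assumptions that are not part of the design: a TreeMap stands in for S3 and DynamoDB, the schema id doubles as the orderable field, getPriorSchema returns -1 when there is no prior schema, and checkNewSchema takes the fetched schema and timestamps as parameters so that no network access is needed. The Entry class and its field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SchemaTracker {
    // One stored schema version. Field names are illustrative only.
    static final class Entry {
        final String body;
        final long discoveredAt;
        final String source;

        Entry(String body, long discoveredAt, String source) {
            this.body = body;
            this.discoveredAt = discoveredAt;
            this.source = source;
        }
    }

    // Keyed by the orderable field; this sketch reuses the schema id for it.
    private final TreeMap<Long, Entry> schemas = new TreeMap<>();
    final List<String> auditLog = new ArrayList<>();
    final List<long[]> analyses = new ArrayList<>(); // {prev, current} pairs
    private long latestSchemaTime = -1L;

    long getLatestSchemaPublicationTime() {
        return latestSchemaTime;
    }

    // Returns the id ordered immediately before the given id, or -1 if none.
    long getPriorSchema(long id) {
        Long prior = schemas.lowerKey(id);
        return prior == null ? -1L : prior;
    }

    void checkNewSchema(String fetchedBody, long publishedAt, long now) {
        boolean changed = schemas.isEmpty()
                || !schemas.lastEntry().getValue().body.equals(fetchedBody);
        // The attempt is always audited, whether or not anything changed.
        auditLog.add(now + " " + (changed ? "changed" : "unchanged"));
        if (!changed) {
            return;
        }
        long id = schemas.isEmpty() ? 1L : schemas.lastKey() + 1;
        schemas.put(id, new Entry(fetchedBody, now, "schematf"));
        latestSchemaTime = publishedAt;
        doAnalysis(id);
    }

    // Stand-in for the delta analysis: only records which pair was compared.
    void doAnalysis(long current) {
        long prev = getPriorSchema(current);
        if (prev != -1L) {
            analyses.add(new long[] {prev, current});
        }
    }
}
```

Calling checkNewSchema twice with the same body adds two audit entries but stores only one schema; a third call with a different body stores a second schema and triggers one analysis of the pair (1, 2).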

Let people write bullshit comments on the schema if they want to; leave this to "discuss" or some such thing. Let the AWS caching layer deal with caching results. This is probably CloudFront, and it will probably only work if you have a proper RESTful API.
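
As a rough illustration of how the "changed" method and a caching layer might interact, the sketch below decides between a full response and a "no changes" response by comparing a timestamp supplied by the caller with the latest publication time. Mapping "no changes" to HTTP 304, the class and method names, and the use of a Cache-Control max-age header are all assumptions made for this sketch, not decisions recorded in the design.

```java
class ChangedEndpoint {
    static final int OK = 200;
    static final int NOT_MODIFIED = 304;

    // Returns 304 when the caller's known time is at least as recent as the
    // latest publication time, and 200 otherwise.
    static int changedStatus(long clientKnownTime, long latestPublicationTime) {
        return clientKnownTime >= latestPublicationTime ? NOT_MODIFIED : OK;
    }

    // Builds a Cache-Control header value that lets a shared cache keep the
    // response for the given number of seconds.
    static String cacheControl(int maxAgeSeconds) {
        return "public, max-age=" + maxAgeSeconds;
    }
}
```

With a design along these lines, the value from getLatestSchemaPublicationTime would be passed as latestPublicationTime, and the cache lifetime would bound how stale a cached answer could be.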
