Proposal for variation selector - suitable for i18n etc. #44

rjelliffe · 2022-07-11T10:08:29Z

Basic Variants, using i18n as example

Two new elements, /sch:schema/sch:variant and sch:variant/sch:enable, are introduced. A variant is a test (like an assertion test) that can be used to enable (or strip out) any element. As with sch:param, you supply a value for it as a parameter to the schema.

<sch:variant name="LANGUAGE"  of="sch:diagnostics"  default=" 'en' " >
   <sch:p>The language of diagnostics may be selected by specifying a parameter "LANGUAGE" with
   the language code, such as "en" or "jp". All other diagnostics are removed. If there are no diagnostics
   with the appropriate language, then a default with no language is used. If no "LANGUAGE" is specified
   then no diagnostics are enabled. </sch:p>

   <sch:variable name="diagnostic-elements" select="/sch:schema/sch:diagnostics" />
   
   <sch:variable name="diagnostic-element-candidates" select="$diagnostic-elements[ @xml:lang= $LANGUAGE]" />

   <sch:enable test="if ($diagnostic-element-candidates) 
                                   then ($diagnostic-element-candidates = . )
                                   else ($diagnostic-elements[not(@xml:lang)] = .)" />
</sch:variant>

Our document also imports various files with sch:diagnostics for each language, such as

<sch:diagnostics id="d1" xml:lang="en-AU">
   <sch:diagnostic>What's that, Skip? Helicopter crash near Cabbage Tree Creek?</sch:diagnostic>
    ...
</sch:diagnostics>

So our sch:variant element:

is called "LANGUAGE" and defines a kind of schema parameter that is supplied externally and scoped to the sch:variant element;
applies to every sch:diagnostics element that is found (every node matching self::sch:diagnostics);
has a variable that collects all sch:diagnostics elements in the schema (because, in this case, we want a default if none match);
has another variable that collects the diagnostics for our particular language;
has an enable element whose test is evaluated at each sch:diagnostics element to enable or disable (strip) it.

This variant selector allows very simple selection. The details can be discussed. Note that in this case, @id is not used at all.
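
As a rough illustration of the selection rule above, suppose the schema ends up with three sch:diagnostics blocks after imports. The ids and message texts here are invented for this sketch, and the comments assume that the enable test is read as "this element is one of the candidates" (node identity rather than string-value equality), which is one of the details still open for discussion.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">

   <!-- LANGUAGE='en': enabled.  LANGUAGE='jp' or 'fr': stripped. -->
   <sch:diagnostics xml:lang="en">
      <sch:diagnostic id="d-title-en">The title is missing.</sch:diagnostic>
   </sch:diagnostics>

   <!-- LANGUAGE='jp': enabled.  LANGUAGE='en' or 'fr': stripped. -->
   <sch:diagnostics xml:lang="jp">
      <sch:diagnostic id="d-title-jp">タイトルがありません。</sch:diagnostic>
   </sch:diagnostics>

   <!-- No xml:lang: enabled only when no block matches, e.g. LANGUAGE='fr'. -->
   <sch:diagnostics>
      <sch:diagnostic id="d-title-default">Title missing.</sch:diagnostic>
   </sch:diagnostics>

</sch:schema>
```

With LANGUAGE set to 'fr' there are no candidates, so the else branch of the test applies and only the block without xml:lang survives.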

Cleaner files

A two-stage declaration is less distracting: the main schema only has a top-level

   <sch:import href="internationalized-diagnostics-list.sch"/>

This imported file in turn has

 <sch:schema ... > 
       <sch:variant name="LANGUAGE"  of="sch:diagnostics"  >
             ...  <!-- unchanged from above -->
       </sch:variant>

      <sch:import href="diagnostics-en.sch" />
      <sch:import href="diagnostics-jp.sch" />
      <sch:import href="diagnostics-de.sch" />
      <sch:import href="diagnostics-cs.sch" />

</sch:schema>

So the operation here is just inclusion, with scoped variant selection. One implementation method would be for the sch:import processor to perform the enabling/disabling itself, so that it passes only the enabled diagnostics up to the importing schema. Or it could mark the disabled sch:diagnostics with some implementation-specific attribute. There is no need for this to be a costly feature (e.g. one that is recalculated every time an assertion is checked).

Smart imports

We would really prefer not to import files we don't need at all, so that they are never even parsed. We can use the variant mechanism to do this: in our internationalized-diagnostics-list.sch file we change our variant declaration to:

<sch:schema ...>
    <sch:variant name="LANGUAGE"  of="sch:import" >
        <sch:enable test="ends-with(@href, concat('-', $LANGUAGE, '.sch'))" />
   </sch:variant>
    
      <sch:import href="diagnostics-en.sch" />
      <sch:import href="diagnostics-jp.sch" />
      <sch:import href="diagnostics-de.sch" />
      <sch:import href="diagnostics-cs.sch" />
</sch:schema>
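
For instance, if the schema is invoked with LANGUAGE set to 'de', the list file would effectively reduce to the sketch below. This assumes that disabled sch:import elements are stripped before they are resolved and that the sch:variant declaration itself is consumed in the process; neither point is settled by the proposal.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <!-- concat('-', 'de', '.sch') gives '-de.sch'; only this @href ends with it -->
   <sch:import href="diagnostics-de.sch" />
</sch:schema>
```

The files for en, jp and cs are then never opened, which is the point of applying the variant to sch:import rather than to sch:diagnostics.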

Other scenarios:

Severity-level selection

With a parameter, this mechanism can be used to enable or disable any part of the schema. For example, we could have

   <sch:param name="SEVERITY" select="'#ALL'" />

     <sch:variant name="SEVERITY" default=" '#ALL' "  of="sch:assert | sch:report[@role]" >
        <sch:enable test=".[$SEVERITY = ''
               or $SEVERITY = '#ALL'
               or ($SEVERITY = 'error' and @role='error')
               or ($SEVERITY = 'warning' and (@role='error' or @role='warning'))
               or ($SEVERITY = 'info' and (@role='error' or @role='warning' or @role='info'))]" />
     </sch:variant>

This allows the caller to control which level of severity is tested and reported, using a schema parameter (e.g. a command-line parameter or invocation parameter). In this case, an sch:assert with no @role will be disabled if SEVERITY="error", but an sch:report with no @role will not be affected by this variant, because @of only matches sch:report elements that have a @role.
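
To make the effect concrete, here is a small pattern annotated with the outcome for each value of SEVERITY. The pattern, its context and its messages are invented for illustration, and the comments assume the enable test is read with boolean "or" between its alternatives.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <sch:pattern id="invoice-checks">
      <sch:rule context="invoice">
         <!-- role='error': enabled for 'error', 'warning', 'info', '' and '#ALL' -->
         <sch:assert role="error" test="@id">An invoice must have an id.</sch:assert>

         <!-- role='warning': stripped for 'error'; enabled for 'warning', 'info', '' and '#ALL' -->
         <sch:assert role="warning" test="@date">An invoice should have a date.</sch:assert>

         <!-- assert without @role: matched by @of, so enabled only for '' or '#ALL' -->
         <sch:assert test="total">An invoice needs a total.</sch:assert>

         <!-- report without @role: not matched by @of, so never touched by the variant -->
         <sch:report test="count(line) &gt; 100">Unusually long invoice.</sch:report>
      </sch:rule>
   </sch:pattern>
</sch:schema>
```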

If we wanted to have rule-level control we could have

 <sch:param name="SEVERITY" select="'#ALL'" />

     <sch:variant name="severity-level-of-assertions"  of="sch:assert | sch:report" >
        <sch:enable test=".[$SEVERITY = ''
               or $SEVERITY = '#ALL'
               or ($SEVERITY = 'error' and (@role='error' or parent::sch:rule/@role='error'))
               or ($SEVERITY = 'warning' and (@role='error' or @role='warning' or parent::sch:rule[@role='error' or @role='warning']))
               or ($SEVERITY = 'info' and (@role='error' or @role='warning' or @role='info'
                                             or parent::sch:rule[@role='error' or @role='warning' or @role='info']))]" />
     </sch:variant>

Fallback to database implementation

I have seen XPaths in assertions or variables that drop into Java to make database connections. With variants, the same schema could support both Java and ODBC, or whatever else is needed.

<sch:schema ...> 

     <sch:variant name="USE-JAVA"  of="sch:let[contains(@select, 'java:')]" >
        <sch:enable test="$USE-JAVA = 'yes' " />
     </sch:variant>

     <sch:variant name="USE-ODBC"  of="sch:let[contains(@select, 'odbc:')]" >
        <sch:enable test="$USE-ODBC= 'yes' " />
     </sch:variant>

     <sch:let id="database_access" select="java:blahblah" />
     <sch:let id="database_access" select="odbc:blahblah" />

</sch:schema>

Design rationale

We don't want to mark targets, as it makes for a lot of markup if we have to annotate each assertion, for example. But we can use any markup already in the schema: ids, roles, flags, text in XPaths, and so on. (Because foreign attributes are allowed in Schematron, they could be used when we do want a variant to enable or disable elements that have no convenient markup.)
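
A minimal sketch of the foreign-attribute case follows. The ex: namespace, the ex:group attribute, the EXPERIMENTAL variant and the book pattern are all hypothetical names made up for this illustration; nothing here is defined by Schematron or by the proposal.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:ex="http://example.org/ns/variant-hints">

   <!-- Select targets by the hypothetical foreign attribute ex:group -->
   <sch:variant name="EXPERIMENTAL" of="sch:assert[@ex:group = 'experimental']">
      <sch:enable test="$EXPERIMENTAL = 'yes'" />
   </sch:variant>

   <sch:pattern id="book-checks">
      <sch:rule context="book">
         <!-- No ex:group, so this assertion is outside the variant's reach -->
         <sch:assert test="title">A book must have a title.</sch:assert>
         <!-- Kept only when the schema is invoked with EXPERIMENTAL='yes' -->
         <sch:assert ex:group="experimental" test="isbn">A book should have an ISBN.</sch:assert>
      </sch:rule>
   </sch:pattern>
</sch:schema>
```

Only the assertions that genuinely lack usable markup need the extra attribute, so the annotation burden stays small.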

This, like macros, is a general feature that can perform lots of tasks. It can be implemented in a pipeline, at inclusion time, at compile time, or at run time. Indeed, it may be that some mix of these is optimal.

Include time versus run-time parameters

As with sch:param, we may need some other control mechanism to determine at which stage of a pipeline the parameter is needed. For example, for database access the variant is known at deployment time, so the unused alternative can be stripped out of the code. But severity level might be varied at run time without recompilation, so it may be better served by wrapping the assertion in a conditional statement. An implementation might provide an invocation parameter that identifies any variants that are to be resolved at run time.
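
As a sketch of what "wrapping the assertion in a conditional statement" could look like, assume an implementation that compiles Schematron to XSLT 1.0. The stylesheet below is hand-written for illustration, not the output of any real compiler; the invoice context, the mode name and the message text are invented.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <!-- Run-time parameter; the compiled schema is not regenerated when it changes -->
   <xsl:param name="SEVERITY" select="'#ALL'" />

   <xsl:template match="/">
      <xsl:apply-templates select="//invoice" mode="invoice-checks" />
   </xsl:template>

   <!-- Stands in for: <sch:assert role="warning" test="@date">...</sch:assert> -->
   <xsl:template match="invoice" mode="invoice-checks">
      <!-- Guard: a 'warning' assertion runs for '', '#ALL', 'warning' and 'info', not for 'error' -->
      <xsl:if test="$SEVERITY = '' or $SEVERITY = '#ALL'
                    or $SEVERITY = 'warning' or $SEVERITY = 'info'">
         <!-- The assertion itself: report when the test fails -->
         <xsl:if test="not(@date)">
            <xsl:text>warning: An invoice should have a date.&#10;</xsl:text>
         </xsl:if>
      </xsl:if>
   </xsl:template>
</xsl:stylesheet>
```

The guard costs one string comparison per fired rule, whereas a deployment-time variant such as database access would simply be absent from the compiled code.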

I do not think this information belongs in a schema.

Connection to phase

A phase can select a variant.

   <sch:phase id="start">
           <sch:active pattern="p1">...</sch:active>
          <sch:active variant="SEVERITY"  value="warning"  />
  </sch:phase>

The phase would override any value given on the command line.

Conceptual: is a variant a conceptual object or a practical one?

A variant is at least a practical object, such as selecting the database provision, or selecting the diagnostics by language, or the severity level.

However, it could also be conceptual. First, because it ties into phases, which certainly can be conceptual. But also because it allows a different construction of the schema based on its characteristics.

Let's take as an example Akoma Ntoso, the schema for national laws, or (probably) HL7. A Schematron schema can be made for it, but it is designed to be subsetted. So you may keep the kitchen-sink XSD and use the Schematron just for the particular dialect of a region, or even for each particular sub-use (legislation, regulation, treaty, etc.). You may combine some of these into phases so that the phase indicates which one.

What variants allow is a different way of ruling things in or out of a larger Schematron schema. For example, you might declare a schema-level variant for "legislation" that enables assertions for legislation metadata (or, rather, which disables rules for regulation metadata) and one for "regulation" that does the reverse.

     <sch:variant name="LEGISLATION"  of="sch:rule[contains(normalize-space(@context), 'metadata/legislation-number')]" >
        <sch:enable test=" $LEGISLATION = 'yes' " />
     </sch:variant>

     <sch:variant name="REGULATION"  of="sch:rule[contains(normalize-space(@context), 'metadata/regulation-number')]" >
        <sch:enable test="$REGULATION='yes' " />
     </sch:variant>

Now it might be that this is perfectly well handled without variants by, say, having

<sch:rule context="metadata/legislation-number[ $I-AM-LEGISLATION-PARAM = 'yes' ]">

where the rule is explicitly enabled. But doing this at the individual assertion level means adding a lot of boilerplate, which increases the likelihood of error. So I don't think it is feasible.

Possibility: Variants that are detected in the document

Optionally, we could even make variants be based on the incoming document.

<sch:variant name="XHTML" of="sch:phase">
     <sch:enable document-test=" /*[contains(namespace-uri(), 'xhtml')] and @id='xhtml-constraints' " />
</sch:variant>

This introduces an alternative attribute @document-test. If the incoming document uses the xhtml namespace, then the phase "xhtml-constraints" is enabled.

I can see three implementation methods, among several.

Dynamic. Before looking at the incoming parameter for selecting the phase, the implementation tries each

sch:variant[contains(@of, 'sch:phase')][sch:enable/@document-test]

and, if none match, uses the supplied one. The generated code for every pattern has a conditional that allows run-time selection of phase etc.

Static. The document is first read in, and the test applied. The result then becomes information used to include and compile the Schematron schema into executable code.
Semi-static. The inclusions are performed statically. At run time, the phase variants are tested, and the result is used to trim down the schema into executable code, which is then run.

AndrewSales added the i18n Issues relating to internationalization label Jun 19, 2023
