The advanced workflow component Rule-based XML validation validates a document in XML format against a set of rules.
A suggestion for implementation can be found here.
The component has a single parameter:
The validation rules must be set up in a Schematron file (.sch) in the validation section of the Library.
When you insert this component, the matching conditions are inserted with it, as shown below:
An example of a rule-based validation can be found here.
You can then insert workflow components in the valid and invalid subtrees to set up what should happen depending on the validation result, and report that result with any of these workflow components:
1. Rule based validation to log.
2. Rule based validation report to text attachment.
3. Rule based validation report to XML attachment.
Validation rules
The rules must be described in the standard ISO Schematron format.
For the Schematron files to be accessible in InterFormNG2 workflows, they must be uploaded to the library under "Validation rules" and have the file extension .sch.
The basic structure of a Schematron file is this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="XPATH">
      <assert test="XPATH">ERROR_DESCRIPTION</assert>
    </rule>
  </pattern>
</schema>
You can add as many patterns as you like. The purpose of a pattern is simply to group a set of rules. Each pattern can have multiple rules and each rule can have multiple asserts.
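As a sketch of that nesting, the skeleton below groups rules into two patterns. The XPATH_A, XPATH_TEST_1 and ERROR_DESCRIPTION_1 style names are placeholders for illustration only, not values from a real document.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- First pattern: two rules, the first one with two asserts -->
  <pattern>
    <rule context="XPATH_A">
      <assert test="XPATH_TEST_1">ERROR_DESCRIPTION_1</assert>
      <assert test="XPATH_TEST_2">ERROR_DESCRIPTION_2</assert>
    </rule>
    <rule context="XPATH_B">
      <assert test="XPATH_TEST_3">ERROR_DESCRIPTION_3</assert>
    </rule>
  </pattern>
  <!-- Second pattern: groups a separate set of rules -->
  <pattern>
    <rule context="XPATH_C">
      <assert test="XPATH_TEST_4">ERROR_DESCRIPTION_4</assert>
    </rule>
  </pattern>
</schema>
```

In ISO Schematron, a node is checked by at most one rule per pattern (the first rule whose context matches it), so two rules that target the same nodes are best placed in separate patterns.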
Let us try to validate this XML document:
<?xml version="1.0" encoding="UTF-8"?>
<Persons>
  <Person Title="Mr">
    <Name>John Doe</Name>
    <Gender>Male</Gender>
    <CustomerId>1000</CustomerId>
    <Email>jd@example.com</Email>
  </Person>
  <Person Title="Mr">
    <Name>Michael Smith</Name>
    <Gender>Male</Gender>
    <CustomerId>1001</CustomerId>
    <Initials>JS</Initials>
    <Email>js@example.com</Email>
  </Person>
  <Person Title="Mrs">
    <Name>Jane Doe</Name>
    <Gender>Female</Gender>
    <CustomerId>1099</CustomerId>
    <Initials>JD</Initials>
  </Person>
</Persons>
When creating a rule, the context attribute must be an XPath expression that identifies the node-set that we want to validate.
To validate every person element, the context must be "Persons/Person", like this:
<rule context="Persons/Person">
</rule>
Now we can add the rules for Person elements. In each assert, the test attribute must be an XPath expression that evaluates to true when the Person element is valid. The expressions can use any XPath 2.0 function that is valid in XSLT. The text of the assert is the error message shown in the report if the expression evaluates to false.
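A minimal sketch of a rule with a single assert is shown here. The Name check is an illustrative assumption and is not part of the ruleset used later in this section.

```xml
<rule context="Persons/Person">
  <!-- The test is evaluated once for each Person matched by the context. -->
  <!-- It is true when that Person has a Name child; otherwise the text is reported. -->
  <assert test="Name">Each Person must have a Name element</assert>
</rule>
```

Because the test is evaluated relative to the context node, the path Name refers to a child of the current Person, not to every Name in the document.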
Common rule patterns
Validate existence of an attribute. To check if an attribute exists, do:
test="@ATTRIBUTE-NAME"
Validate existence of an element. To check if an element exists, do:
test="ELEMENT-NAME"
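Applied to the Persons document, the two existence checks could look like the sketch below. The third assert is an added assumption: it shows one way to also reject an element that exists but is empty, since an existence test is true even for empty content.

```xml
<rule context="Persons/Person">
  <!-- Attribute existence: true when the Title attribute is present -->
  <assert test="@Title">Person must have a Title attribute</assert>
  <!-- Element existence: true when at least one Email child exists -->
  <assert test="Email">Person must have an Email element</assert>
  <!-- Non-empty content: a string is true only when it is not empty -->
  <assert test="normalize-space(Name)">Name must not be empty</assert>
</rule>
```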
To validate the number of characters in a string, use the XPath function "string-length" in a logical expression.
Because < is a reserved character in XML, "less than" (<) can be written as "lt" and "less than or equal" (<=) can be written as "le".
For instance, to check that the text in an element is 10 characters or fewer:
test="string-length(ELEMENT-NAME) le 10"
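Placed inside a rule, the length check might look like this sketch. The limit of 40 and the choice of the Name element are assumptions for illustration. The second assert shows the alternative of keeping the <= operator and escaping the < as an entity.

```xml
<rule context="Persons/Person">
  <!-- Using the "le" operator: no escaping needed -->
  <assert test="string-length(Name) le 40">Name must be 40 characters or fewer</assert>
  <!-- Same check with "<=", where the < is escaped as &lt; -->
  <assert test="string-length(Name) &lt;= 40">Name must be 40 characters or fewer</assert>
</rule>
```

The > character does not need to be escaped in an attribute value, so tests using > or >= can be written directly.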
To check that the content of an element is a numeric value, use the XPath function "number":
test="number(ELEMENT-NAME)"
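One caveat with this test: number() returns NaN for non-numeric content, which counts as false, but the value 0 also counts as false. The sketch below shows two possible variants that accept 0. The first relies on NaN never being equal to itself; the second uses the XPath 2.0 function matches and is stricter, accepting digits only. Which variant fits depends on the data, so treat both as suggestions.

```xml
<rule context="Persons/Person">
  <!-- False only when CustomerId is missing or not a number (NaN is not equal to NaN) -->
  <assert test="number(CustomerId) = number(CustomerId)">The element CustomerId must be numeric</assert>
  <!-- Stricter: one or more digits, no sign, decimals or exponent -->
  <assert test="matches(CustomerId, '^[0-9]+$')">The element CustomerId must contain digits only</assert>
</rule>
```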
To validate the boundaries of a numeric value, use a logic expression like this:
test="ELEMENT-NAME >= 1000"
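A lower and an upper bound can be combined in one test, as in this sketch. The upper bound of 9999 is an assumption for illustration. The values are wrapped in number() because, in XPath 2.0, the operators ge and le compare untyped element content as a string rather than converting it to a number.

```xml
<rule context="Persons/Person">
  <!-- Both bounds must hold; a missing or non-numeric CustomerId gives NaN and fails -->
  <assert test="number(CustomerId) ge 1000 and number(CustomerId) le 9999">CustomerId must be between 1000 and 9999</assert>
</rule>
```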
Example
Below is an example of a complete ruleset for the Persons XML example above:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="Persons/Person">
      <assert test="@Title">The element Person must have a Title attribute</assert>
      <assert test="number(CustomerId)">The element CustomerId must be numeric</assert>
      <assert test="number(CustomerId) >= 1000">The element CustomerId must be at least 1000</assert>
      <assert test="@Title='Mr' or @Title='Mrs' or @Title='Ms'">Title must be Mr, Mrs or Ms</assert>
      <assert test="string-length(Initials) le 3">Initials must be no more than 3 characters</assert>
      <assert test="not(Email) or contains(Email,'@')">Email address must contain a @</assert>
    </rule>
    <rule context="Persons">
      <assert test="count(Person) > 0">The document must contain at least one person</assert>
    </rule>
  </pattern>
</schema>
Tool support
If you need tool support to help with the authoring of Schematron files, some editors are available, for instance OxygenXML: https://www.oxygenxml.com/xml_editor.html
Error report
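When a Schematron validation is executed, a report in XML format is generated. The report contains all of the failed test cases and displays the defined error message for each of them.
This is an example of a report for a failed validation of a Persons XML file. The Persons document shown earlier satisfies every rule in the ruleset, so this report evidently comes from a variant of that file with errors in the second and third Person elements.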
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
                        xmlns:iso="http://purl.oclc.org/dsdl/schematron"
                        xmlns:schold="http://www.ascc.net/xml/schematron"
                        xmlns:xhtml="http://www.w3.org/1999/xhtml"
                        xmlns:xs="http://www.w3.org/2001/XMLSchema"
                        schemaVersion=""
                        title="">
  <svrl:active-pattern document=""/>
  <svrl:fired-rule context="Persons"/>
  <svrl:fired-rule context="Persons/Person"/>
  <svrl:fired-rule context="Persons/Person"/>
  <svrl:failed-assert test="number(CustomerId)" location="/Persons/Person[2]">
    <svrl:text>The element CustomerId must be numeric</svrl:text>
  </svrl:failed-assert>
  <svrl:failed-assert test="number(CustomerId) ge 1000" location="/Persons/Person[2]">
    <svrl:text>The element CustomerId must be at least 1000</svrl:text>
  </svrl:failed-assert>
  <svrl:failed-assert test="@Title='Mr' or @Title='Mrs' or @Title='Ms'"
                      location="/Persons/Person[2]">
    <svrl:text>Title must be Mr, Mrs or Ms</svrl:text>
  </svrl:failed-assert>
  <svrl:failed-assert test="not(Email) or contains(Email,'@')" location="/Persons/Person[2]">
    <svrl:text>Email address must contain a @</svrl:text>
  </svrl:failed-assert>
  <svrl:fired-rule context="Persons/Person"/>
  <svrl:failed-assert test="@Title" location="/Persons/Person[3]">
    <svrl:text>The element Person must have a Title attribute</svrl:text>
  </svrl:failed-assert>
  <svrl:failed-assert test="number(CustomerId) ge 1000" location="/Persons/Person[3]">
    <svrl:text>The element CustomerId must be at least 1000</svrl:text>
  </svrl:failed-assert>
  <svrl:failed-assert test="@Title='Mr' or @Title='Mrs' or @Title='Ms'"
                      location="/Persons/Person[3]">
    <svrl:text>Title must be Mr, Mrs or Ms</svrl:text>
  </svrl:failed-assert>
  <svrl:failed-assert test="string-length(Initials) le 3" location="/Persons/Person[3]">
    <svrl:text>Initials must be no more than 3 characters</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>