Simplifying XML Schema: effortless handling of nondeterministic regular expressions
G. Bex, W. Gelade, W. Martens, and F. Neven. SIGMOD '09: Proceedings of the 35th SIGMOD international conference on Management of data, pages 731-744 (2009)
Abstract
Whether beloved or despised, XML Schema is momentarily the only industrially accepted schema language for XML and is unlikely to become obsolete any time soon. Nevertheless, many nontransparent restrictions unnecessarily complicate the design of XSDs. For instance, complex content models in XML Schema are constrained by the infamous unique particle attribution (UPA) constraint. In formal language theoretic terms, this constraint restricts content models to deterministic regular expressions. As the latter constitute a semantic notion and no simple corresponding syntactical characterization is known, it is very difficult for non-expert users to understand exactly when and why content models do or do not violate UPA. In the present paper, we therefore investigate solutions to relieve users from the burden of UPA by automatically transforming nondeterministic expressions into concise deterministic ones defining the same language or constituting good approximations. The presented techniques facilitate XSD construction by reducing the design task at hand more towards the complexity of the modeling task. In addition, our algorithms can serve as a plug-in for any model management tool which supports export to XML Schema format.
%0 Conference Paper
%1 BGM09
%A Bex, Geert Jan
%A Gelade, Wouter
%A Martens, Wim
%A Neven, Frank
%D 2009
%B SIGMOD '09 Proceedings of the 35th SIGMOD international conference on Management of data
%K database expressions nondeterministic regular xml
%P 731-744
%T Simplifying XML Schema: effortless handling of nondeterministic regular expressions
%U http://portal.acm.org/citation.cfm?id=1559845.1559922
%X Whether beloved or despised, XML Schema is momentarily the only industrially accepted schema language for XML and is unlikely to become obsolete any time soon. Nevertheless, many nontransparent restrictions unnecessarily complicate the design of XSDs. For instance, complex content models in XML Schema are constrained by the infamous unique particle attribution (UPA) constraint. In formal language theoretic terms, this constraint restricts content models to deterministic regular expressions. As the latter constitute a semantic notion and no simple corresponding syntactical characterization is known, it is very difficult for non-expert users to understand exactly when and why content models do or do not violate UPA. In the present paper, we therefore investigate solutions to relieve users from the burden of UPA by automatically transforming nondeterministic expressions into concise deterministic ones defining the same language or constituting good approximations. The presented techniques facilitate XSD construction by reducing the design task at hand more towards the complexity of the modeling task. In addition, our algorithms can serve as a plug-in for any model management tool which supports export to XML Schema format.
@inproceedings{BGM09,
abstract = {Whether beloved or despised, XML Schema is momentarily the only industrially accepted schema language for XML and is unlikely to become obsolete any time soon. Nevertheless, many nontransparent restrictions unnecessarily complicate the design of XSDs. For instance, complex content models in XML Schema are constrained by the infamous unique particle attribution (UPA) constraint. In formal language theoretic terms, this constraint restricts content models to deterministic regular expressions. As the latter constitute a semantic notion and no simple corresponding syntactical characterization is known, it is very difficult for non-expert users to understand exactly when and why content models do or do not violate UPA. In the present paper, we therefore investigate solutions to relieve users from the burden of UPA by automatically transforming nondeterministic expressions into concise deterministic ones defining the same language or constituting good approximations. The presented techniques facilitate XSD construction by reducing the design task at hand more towards the complexity of the modeling task. In addition, our algorithms can serve as a plug-in for any model management tool which supports export to XML Schema format.},
added-at = {2010-11-03T15:41:53.000+0100},
author = {Bex, Geert Jan and Gelade, Wouter and Martens, Wim and Neven, Frank},
biburl = {https://www.bibsonomy.org/bibtex/2ace4d8ed0c9bf428ccbd08abd893cb1b/malte.wunsch},
description = {Simplifying XML schema},
interhash = {8ff9ff3f00840895e3f7923fdfa34e70},
intrahash = {ace4d8ed0c9bf428ccbd08abd893cb1b},
booktitle = {SIGMOD '09 Proceedings of the 35th SIGMOD international conference on Management of data},
keywords = {database expressions nondeterministic regular xml},
pages = {731--744},
timestamp = {2010-11-03T15:42:57.000+0100},
title = {Simplifying {XML} {S}chema: effortless handling of nondeterministic regular expressions},
url = {http://portal.acm.org/citation.cfm?id=1559845.1559922},
year = 2009
}