Simplifying XML schema: effortless handling of nondeterministic regular expressions

SIGMOD '09: Proceedings of the 35th SIGMOD International Conference on Management of Data (2009)

Abstract

Whether beloved or despised, XML Schema is currently the only industrially accepted schema language for XML and is unlikely to become obsolete any time soon. Nevertheless, many nontransparent restrictions unnecessarily complicate the design of XSDs. For instance, complex content models in XML Schema are constrained by the infamous unique particle attribution (UPA) constraint. In formal language theoretic terms, this constraint restricts content models to deterministic regular expressions. As the latter constitute a semantic notion and no simple corresponding syntactic characterization is known, it is very difficult for non-expert users to understand exactly when and why content models do or do not violate UPA. In the present paper, we therefore investigate solutions to relieve users from the burden of UPA by automatically transforming nondeterministic expressions into concise deterministic ones defining the same language or constituting good approximations. The presented techniques facilitate XSD construction by bringing the complexity of the design task closer to that of the modeling task. In addition, our algorithms can serve as a plug-in for any model management tool which supports export to XML Schema format.
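
To make the notion of a deterministic regular expression more concrete, the following minimal sketch tests an expression for determinism using the standard position-automaton (Glushkov) characterization: an expression is deterministic when no two distinct positions carrying the same symbol can compete as the next match. This is not the transformation algorithm of the paper, only an illustration of the property that UPA demands. The tuple-based expression encoding and the function names `glushkov` and `is_deterministic` are assumptions made for this example, and numeric occurrence bounds (minOccurs/maxOccurs other than `?` and `*`) are not modeled.

```python
from itertools import count

# Hypothetical encoding of content models as nested tuples:
#   ("sym", "a")      a single element name
#   ("cat", r1, r2)   sequence
#   ("alt", r1, r2)   choice
#   ("star", r)       zero or more
#   ("opt", r)        zero or one


def glushkov(expr):
    """Return (first, follow, symbol_of) for the position-marked expression."""
    symbol_of = {}   # position -> element name
    follow = {}      # position -> positions that may be matched next
    fresh = count()

    def walk(node):
        """Return (nullable, first, last) of a sub-expression."""
        kind = node[0]
        if kind == "sym":
            pos = next(fresh)
            symbol_of[pos] = node[1]
            follow[pos] = set()
            return False, {pos}, {pos}
        if kind in ("star", "opt"):
            _, first, last = walk(node[1])
            if kind == "star":
                # After a last position, the loop may restart at a first one.
                for p in last:
                    follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            for p in l1:
                follow[p] |= f2
            return (n1 and n2,
                    (f1 | f2) if n1 else f1,
                    (l1 | l2) if n2 else l2)
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = walk(expr)
    return first, follow, symbol_of


def is_deterministic(expr):
    """True if no candidate set contains two positions with the same symbol."""
    first, follow, symbol_of = glushkov(expr)
    for candidates in [first, *follow.values()]:
        symbols = [symbol_of[p] for p in candidates]
        if len(symbols) != len(set(symbols)):
            return False
    return True


a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")

# (a, b) | (a, c): on reading "a" it is unclear which particle is meant.
ambiguous = ("alt", ("cat", a, b), ("cat", a, c))
# a, (b | c): the same language, written deterministically.
factored = ("cat", a, ("alt", b, c))

print(is_deterministic(ambiguous))  # False
print(is_deterministic(factored))   # True
```

The two example expressions accept the same documents, yet only the second one passes the check, which is the kind of rewriting the abstract describes automating.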

