Saturday, February 5, 2011

Generating XML Schema with schemagen and Groovy

I have previously blogged on several utilitarian tools that are provided with the Java SE 6 HotSpot SDK, such as jstack and javap. In this post, I focus on another tool in the same $JAVA_HOME/bin (or %JAVA_HOME%\bin) directory: schemagen. Although schemagen is typically used in conjunction with web services and/or JAXB, it can be useful in other contexts as well. Specifically, it offers an easy way to create a starting-point XML Schema Definition (XSD) for someone who is more comfortable with Java than with XML Schema.

We'll begin with a simple Java class called Person to demonstrate the utility of schemagen. This is shown in the next code listing.

package dustin.examples;

public class Person
{
   private String lastName;

   private String firstName;

   private char middleInitial;

   private String identifier;

   /**
    * No-arguments constructor required for 'schemagen' to create XSD from
    * this Java class.  Without this "no-arg default constructor," this error
    * message will be displayed when 'schemagen' is attempted against it:
    *
    *      error: dustin.examples.Person does not have a no-arg default
    *      constructor.
    */
   public Person() {}

   public Person(final String newLastName, final String newFirstName)
   {
      this.lastName = newLastName;
      this.firstName = newFirstName;
   }

   public Person(
      final String newLastName,
      final String newFirstName,
      final char newMiddleInitial)
   {
      this.lastName = newLastName;
      this.firstName = newFirstName;
      this.middleInitial = newMiddleInitial;
   }

   public String getLastName()
   {
      return this.lastName;
   }

   public void setLastName(final String newLastName)
   {
      this.lastName = newLastName;
   }

   public String getFirstName()
   {
      return this.firstName;
   }

   public void setFirstName(final String newFirstName)
   {
      this.firstName = newFirstName;
   }

   public char getMiddleInitial()
   {
      return this.middleInitial;
   }
}

The class above is very simple, but it is adequate for a first example of employing schemagen. As the comment on the no-arguments constructor states, a constructor without arguments (sometimes called a "default constructor") must be available in the class. Because this class declares other constructors, the no-args constructor must be specified explicitly. I also intentionally provided get/set (accessor/mutator) methods for two of the fields (lastName and firstName), only an accessor for one field (middleInitial), and neither for the remaining field (identifier). This demonstrates that schemagen includes one of these private fields in the generated schema only when both its get and set methods are present.
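The rule about get/set pairs can be approximated with the standard java.beans.Introspector. The following is a minimal sketch, not how schemagen itself is implemented: the ReadWriteProperties class and its readWritePropertyNames method are hypothetical names of my own, and the nested Person is a trimmed-down stand-in for the class above. It assumes that "has both a getter and a setter" is a fair proxy for what ends up in the XSD when all fields are private.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of schemagen): lists read/write bean properties. */
final class ReadWriteProperties
{
   /** Trimmed-down stand-in for the Person class shown earlier. */
   public static class Person
   {
      private String lastName;
      private String firstName;
      private char middleInitial;
      private String identifier;

      public Person() {}

      public String getLastName() { return this.lastName; }
      public void setLastName(final String newLastName) { this.lastName = newLastName; }
      public String getFirstName() { return this.firstName; }
      public void setFirstName(final String newFirstName) { this.firstName = newFirstName; }
      public char getMiddleInitial() { return this.middleInitial; }
   }

   /** Returns the sorted names of properties that have both a getter and a setter. */
   static List<String> readWritePropertyNames(final Class<?> beanClass)
   {
      final List<String> names = new ArrayList<>();
      try
      {
         // Stop at Object so that getClass() is not reported as a property.
         for (PropertyDescriptor property
              : Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors())
         {
            if (property.getReadMethod() != null && property.getWriteMethod() != null)
            {
               names.add(property.getName());
            }
         }
      }
      catch (IntrospectionException introspectionEx)
      {
         throw new IllegalStateException("Unable to introspect " + beanClass, introspectionEx);
      }
      Collections.sort(names);
      return names;
   }

   public static void main(final String[] arguments)
   {
      // Prints [firstName, lastName]
      System.out.println(readWritePropertyNames(Person.class));
   }
}
```

The middleInitial property is skipped because it has only an accessor, and identifier is skipped because it has neither method.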

The next screen snapshot demonstrates the simplest use of schemagen. The XML schema file (.xsd) is generated with the default name of schema1.xsd (there is currently no way to control this name directly with schemagen) and is placed in the directory from which the schemagen command is run (the output location can be dictated with the -d option).
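Because the schema1.xsd name cannot be chosen through schemagen, one workaround is to rename the file after the tool has run. The following is a minimal sketch of that post-processing step, assuming a Java 7 or later runtime for java.nio.file; the SchemaRenamer class and its renameGeneratedSchema method are hypothetical names of my own and are not part of schemagen.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not part of schemagen): renames the default output file. */
final class SchemaRenamer
{
   /** Default file name that schemagen gives the generated schema. */
   static final String DEFAULT_SCHEMA_NAME = "schema1.xsd";

   /**
    * Renames schema1.xsd in the given directory to the new file name,
    * replacing any existing file of that name, and returns the new path.
    */
   static Path renameGeneratedSchema(final Path outputDirectory, final String newFileName)
   {
      final Path generated = outputDirectory.resolve(DEFAULT_SCHEMA_NAME);
      final Path renamed = outputDirectory.resolve(newFileName);
      try
      {
         return Files.move(generated, renamed, StandardCopyOption.REPLACE_EXISTING);
      }
      catch (IOException ioEx)
      {
         throw new UncheckedIOException(ioEx);
      }
   }

   public static void main(final String[] arguments)
   {
      if (arguments.length != 2)
      {
         System.err.println("Usage: SchemaRenamer <outputDirectory> <newFileName>");
         return;
      }
      System.out.println(renameGeneratedSchema(Paths.get(arguments[0]), arguments[1]));
   }
}
```

The first argument would be the directory passed to schemagen's -d option, or the current directory if that option was not used.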


The generated XSD is shown next.

schema1.xsd
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" minOccurs="0"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
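
Since the point of this exercise is a starting-point XSD that will probably be edited by hand, it can be handy to confirm that the file still loads as a schema. The following is a minimal sketch using the JDK's javax.xml.validation API. The SchemaCheck class and its isLoadableSchema method are hypothetical names of my own, and the XSD text is embedded as a string only to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical helper (not part of schemagen): checks that XSD text loads as a schema. */
final class SchemaCheck
{
   /** Copy of the schema1.xsd content generated for the Person class. */
   static final String PERSON_XSD =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
      + "<xs:schema version=\"1.0\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n"
      + "  <xs:complexType name=\"person\">\n"
      + "    <xs:sequence>\n"
      + "      <xs:element name=\"firstName\" type=\"xs:string\" minOccurs=\"0\"/>\n"
      + "      <xs:element name=\"lastName\" type=\"xs:string\" minOccurs=\"0\"/>\n"
      + "    </xs:sequence>\n"
      + "  </xs:complexType>\n"
      + "</xs:schema>\n";

   /** Returns true if the provided text can be compiled as a W3C XML Schema. */
   static boolean isLoadableSchema(final String xsdText)
   {
      final SchemaFactory factory =
         SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
      try
      {
         factory.newSchema(new StreamSource(new StringReader(xsdText)));
         return true;
      }
      catch (SAXException saxEx)
      {
         // Malformed XML or an invalid schema construct lands here.
         return false;
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Loadable schema? " + isLoadableSchema(PERSON_XSD));
   }
}
```

Note that the generated schema declares only a named complex type and no global element, so one would likely add an element declaration before validating instance documents against it.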

This is pretty convenient, but it is even easier with Groovy. Suppose that one wanted to generate an XSD using schemagen and did not care about or need the original Java class. The following very simple Groovy class could be used. Very little effort is required to write it, but its compiled .class file can be used with schemagen.

package dustin.examples;

public class Person2
{
   String lastName;

   String firstName;

   char middleInitial;

   String identifier;
}

When the above Groovy class is compiled with groovyc, its resulting Person2.class file can be viewed through another useful tool (javap) located in the same directory as schemagen. This is shown in the next screen snapshot. The most important observation is that get/set methods have been automatically generated by Groovy.


When the groovyc-generated .class file is run through schemagen, the XSD is generated and is shown next.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person2">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" minOccurs="0"/>
      <xs:element name="identifier" type="xs:string" minOccurs="0"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
      <xs:element name="middleInitial" type="xs:unsignedShort"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Because I did not explicitly state that Groovy's automatic get/set methods should not be applied, all attributes are represented in the XSD. That is very little Groovy, but it yields an XSD nonetheless.
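
The middleInitial element in the XSD above differs from the others in two ways: its type is xs:unsignedShort and it lacks minOccurs="0". My understanding is that JAXB's default binding maps the primitive char to xs:unsignedShort, and that a primitive cannot be null and so is treated as required. The following small sketch (the CharRange class is a hypothetical name of my own) simply shows that a Java char covers the same 0 to 65535 range as xs:unsignedShort.

```java
/** Hypothetical illustration: a Java char spans the same range as xs:unsignedShort. */
final class CharRange
{
   /** Smallest numeric value that a char can hold. */
   static int minimumCharValue()
   {
      return Character.MIN_VALUE;
   }

   /** Largest numeric value that a char can hold. */
   static int maximumCharValue()
   {
      return Character.MAX_VALUE;
   }

   public static void main(final String[] arguments)
   {
      // Prints: char range: 0 to 65535
      System.out.println("char range: " + minimumCharValue() + " to " + maximumCharValue());
   }
}
```

If a single-character string is preferred in the schema, declaring the field as String (as was done for the other fields) is one way to get xs:string instead.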

It is interesting to see what happens when the attributes of the Groovy class are untyped. The next Groovy class listing does not explicitly type the class attributes.

package dustin.examples;

public class Person2
{
   def lastName;

   def firstName;

   def middleInitial;

   def identifier;
}

When schemagen is run against the above class with untyped attributes, the output XSD looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person2">
    <xs:sequence>
      <xs:element name="firstName" type="xs:anyType" minOccurs="0"/>
      <xs:element name="identifier" type="xs:anyType" minOccurs="0"/>
      <xs:element name="lastName" type="xs:anyType" minOccurs="0"/>
      <xs:element name="middleInitial" type="xs:anyType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Not surprisingly, the Groovy class with the untyped attributes leads to an XSD with elements of anyType. It is remarkably easy to generate a schema with schemagen from a Groovy class, but what if I don't want an attribute of the class to be part of the generated schema? Explicitly declaring an attribute as private tells Groovy not to generate get/set methods for it automatically, and hence schemagen will not generate an XSD element for that attribute. The next Groovy class declares two attributes as private, and it is followed by the XSD that results from running schemagen against the compiled Groovy class.

package dustin.examples;

public class Person2
{
   String lastName;

   String firstName;

   /** private modifier prevents auto Groovy set/get methods */
   private String middleInitial;

   /** private modifier prevents auto Groovy set/get methods */
   private String identifier;
}

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person2">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" minOccurs="0"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Groovy makes it really easy to generate an XSD. The Groovy code required to do so is barely more than a list of attributes and their data types.


Conclusion

The schemagen tool is most commonly used in conjunction with web services and with JAXB, but I have also found it highly useful in several instances where I needed to create a "quick and dirty" XSD file for a variety of purposes. Taking advantage of Groovy's automatically generated set/get methods and other Groovy conciseness makes it really easy to generate a simple XSD.
