A first look at Spring-Batch, part 2

Written on: October 11, 2008

In my first post about Spring-Batch, I described a Hello-World application in detail and discussed the plumbing that has to be wired up in the Spring beans configuration file. In this second post I take it one step further by introducing the concepts of tokenizer and field-set mapper.

I will reuse as much as possible from the previous project, by copying it. Because the reuse is by copy, you can download and study the projects independently, without any compile-time or runtime dependencies between them (in contrast to the Spring-Batch samples, which come as one big heap of code).

The Application

The application reads person data from a file, creates a Person object for each record and prints it out. It is still a toy application, but it lets you concentrate on the key concepts of tokenizer and mapper.

The Input File

Let’s start with the input data. It is in CSV (Character Separated Values) format and located at the top of the class path.

Name;Street;PostCode;City
Anna Conda;Hacker street 17;12345;Javaville
Sham Poo;Reboot lane 5;67890;Perlvillage
Sandy Shoes;Desert town street 11;98765;Cobolburgh

The Domain Class

The overall objective is to transform each record in the CSV file into a Person object. Here is the Person class for easy reference. It uses the ToStringBuilder from the Jakarta Commons Lang project.

package com.ribomation.tutorial;
import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.lang.builder.ToStringStyle;

public class Person {
    private String      name, street, postCode, city;
    public String toString() {
        return ToStringBuilder.reflectionToString(this, ToStringStyle.SHORT_PREFIX_STYLE);
    }
    //. . .getters and setters. . .
}

Tokenizers and Mappers

The file above is a so-called flat file: each row is a record and each record is subdivided into fields, here separated by a semicolon. We will use a FlatFileItemReader to read from our (class path) resource. The reader reads one line at a time and needs a means to break each line into fields. This is the task of a tokenizer; in our case we use a DelimitedLineTokenizer and tell it to split the fields around the semicolon.
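
To make the tokenizer idea concrete, here is a minimal plain-Java sketch of what splitting a line around a semicolon amounts to. It is not the Spring-Batch implementation: the class SimpleTokenizerSketch and its tokenize method are names made up for this illustration, and the sketch ignores things a real tokenizer may handle, such as quoted fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring-Batch: it only mimics the idea
// of breaking one line of a flat file into its fields.
class SimpleTokenizerSketch {

    // Splits the line around the delimiter and returns the fields in order.
    static List<String> tokenize(String line, String delimiter) {
        // Pattern.quote treats the delimiter literally (not as a regex);
        // the limit -1 keeps empty trailing fields instead of dropping them.
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        List<String> fields = tokenize("Anna Conda;Hacker street 17;12345;Javaville", ";");
        for (int i = 0; i < fields.size(); i++) {
            System.out.println(i + ": " + fields.get(i));
        }
    }
}
```

The positions printed here correspond to the indexes a mapper can use when it reads fields by position.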

A tokenizer takes a string and returns a FieldSet. The next step is to convert the field-set into a business object: a FieldSetMapper takes a field-set and returns a fresh object. The mapper is something you often have to implement yourself, although in many cases you can get away with a BeanWrapperFieldSetMapper. I don’t want to use too much magic at once, so here is a very straightforward mapper class.

package com.ribomation.tutorial;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.mapping.FieldSet;

public class PersonMapper implements FieldSetMapper {
    public Object mapLine(FieldSet fs) {
        Person  p   = new Person();
        int     idx = 0;
        p.setName    ( fs.readString(idx++) );
        p.setStreet  ( fs.readString(idx++) );
        p.setPostCode( fs.readString(idx++) );
        p.setCity    ( fs.readString(idx++) );
        return p;
    }
}

The Spring Beans

Now we know enough to configure the reader in the Spring beans config. As I said initially, I reuse (by copy) as much as possible from the first hello spring-batch application, so I will only show the new or changed XML snippet.

    <bean id="inputFile" class="org.springframework.core.io.ClassPathResource">
        <constructor-arg value="/names.csv"/>
    </bean>

    <bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" ref="inputFile"/>
        <property name="firstLineIsHeader" value="true"/>
        <property name="lineTokenizer">
            <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                <property name="delimiter" value=";"/>
            </bean>
        </property>
        <property name="fieldSetMapper">
            <bean class="com.ribomation.tutorial.PersonMapper"/>
        </property>
    </bean>

Running the Application

Now we are ready to build and execute. The new POM is almost the same as the previous one, except that the artifact name is ‘HelloSpringBatch2’ (plus a new dependency on commons-lang). Build the application with

mvn package

Finally run the application with the command below. We are reusing the LogWriter from the first application, which just prints out the supplied object using its toString() method.

HelloSpringBatch-2> java -jar target\HelloSpringBatch2-1.0.jar hello-spring-batch.xml helloJob
[SimpleJobLauncher] No TaskExecutor has been set, defaulting to synchronous executor.
[SimpleStepFactoryBean] Setting commit interval to default value (1)
[SimpleJobLauncher] Job: [SimpleJob: [name=helloJob]] launched with the following parameters: [{}{}{}{}]
[LogWriter] Person[name=Anna Conda,street=Hacker street 17,postCode=12345,city=Javaville]
[LogWriter] Person[name=Sham Poo,street=Reboot lane 5,postCode=67890,city=Perlvillage]
[LogWriter] Person[name=Sandy Shoes,street=Desert town street 11,postCode=98765,city=Cobolburgh]
[SimpleJobLauncher] Job: [SimpleJob: [name=helloJob]] completed successfully with the following parameters: [{}{}{}{}]

A Small Variation

I mentioned above that we can get away without implementing a mapper class when it’s easy to map fields to bean properties. So let’s do just that.

A BeanWrapperFieldSetMapper needs to know the destination class (or use a prototype bean) and the name of each field. My own mapper class above intentionally used indexing instead of field names. Look at the input file again; the first line contains the field names

Name;Street;PostCode;City
Anna Conda;Hacker street 17;12345;Javaville
. . .

and the reader configuration contains an instruction to interpret the first line as the field name definition

    <bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="firstLineIsHeader" value="true"/>
        <property name="fieldSetMapper"    ref="mapper"/>
        . . .

That’s all we need to give BeanWrapperFieldSetMapper sufficient information to perform its duties.

    <bean id="mapper" class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
        <property name="targetType" value="com.ribomation.tutorial.Person"/>
    </bean>
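
Conceptually, name-based mapping pairs each field name with a bean property of the same name. The sketch below illustrates the idea with plain reflection. It is only an illustration under assumptions, not how BeanWrapperFieldSetMapper is actually implemented: the class NameBasedMappingSketch, its map method and the trimmed-down Person stand-in are hypothetical, and the real mapper’s name-matching rules may differ.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of name-based mapping; not Spring-Batch code.
class NameBasedMappingSketch {

    // Stand-in for the Person class of this post, trimmed to two properties.
    public static class Person {
        private String name, city;
        public void setName(String name) { this.name = name; }
        public void setCity(String city) { this.city = city; }
        public String getName() { return name; }
        public String getCity() { return city; }
    }

    // Creates a new object of the given type and copies every named field
    // into the String setter with the matching name ("Name" -> setName).
    static <T> T map(Map<String, String> fields, Class<T> type) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> field : fields.entrySet()) {
                String key = field.getKey();
                String setter = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
                Method method = type.getMethod(setter, String.class);
                method.invoke(target, field.getValue());
            }
            return target;
        } catch (ReflectiveOperationException e) {
            // Assumption of this sketch: a field without a matching setter is an error.
            throw new IllegalStateException("Cannot map fields to " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Name", "Anna Conda");
        fields.put("City", "Javaville");
        Person p = map(fields, Person.class);
        System.out.println(p.getName() + " lives in " + p.getCity());
    }
}
```

The point of the sketch is that once the field names are known, no hand-written mapper is needed: the names alone decide which property receives which value.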

Don’t forget to recompile/package (mvn package). The execution will produce the exact same output as before.

HelloSpringBatch-2>java -jar target\HelloSpringBatch2-1.0.jar hello-spring-batch.xml helloJob
[SimpleJobLauncher] No TaskExecutor has been set, defaulting to synchronous executor.
[SimpleStepFactoryBean] Setting commit interval to default value (1)
[SimpleJobLauncher] Job: [SimpleJob: [name=helloJob]] launched with the following parameters: [{}{}{}{}]
[LogWriter] Person[name=Anna Conda,street=Hacker street 17,postCode=12345,city=Javaville]
[LogWriter] Person[name=Sham Poo,street=Reboot lane 5,postCode=67890,city=Perlvillage]
[LogWriter] Person[name=Sandy Shoes,street=Desert town street 11,postCode=98765,city=Cobolburgh]
[SimpleJobLauncher] Job: [SimpleJob: [name=helloJob]] completed successfully with the following parameters: [{}{}{}{}]

This new version uses only two hand-written classes: the domain class Person and the writer class LogWriter. The rest is pure configuration.

A Second Variation

Let’s twist this application a second time by changing the output format. Instead of using our LogWriter, let Spring-Batch convert the output to XML.

We will use another bean called xmlWriter, of type StaxEventItemWriter. This bean needs an XML serializer, an output resource and the name of the XML root tag. The serializer uses a marshaller, which is an abstraction over different object/XML (O/X) mapping tools. We will use XStream, a lightweight XML tool.

Here are all additions to the beans configuration file. The xmlWriter reference should replace the LogWriter reference in the helloStep configuration.

    <bean id="xmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
        <property name="rootTagName"     value="persons"/>
        <property name="serializer"      ref="xmlSerializer"/>
        <property name="overwriteOutput" value="true"/>
        <property name="resource"        ref="xmlOutputFile"/>
    </bean>

    <bean id="xmlSerializer" class="org.springframework.batch.item.xml.oxm.MarshallingEventWriterSerializer">
        <constructor-arg>
            <bean class="org.springframework.oxm.xstream.XStreamMarshaller"/>
        </constructor-arg>
    </bean>

    <bean id="xmlOutputFile" class="org.springframework.core.io.FileSystemResource">
        <constructor-arg value="persons.xml"/>
    </bean>

The output now goes to a file named ‘persons.xml’ in the current directory. Don’t forget to rebuild and run it as before. The contents of the produced file are

<?xml version="1.0" encoding="UTF-8" ?>
<persons>
    <com.ribomation.tutorial.Person>
        <name>Anna Conda</name>
        <street>Hacker street 17</street>
        <postCode>12345</postCode>
        <city>Javaville</city>
    </com.ribomation.tutorial.Person>
    <com.ribomation.tutorial.Person>
        <name>Sham Poo</name>
        <street>Reboot lane 5</street>
        <postCode>67890</postCode>
        <city>Perlvillage</city>
    </com.ribomation.tutorial.Person>
    <com.ribomation.tutorial.Person>
        <name>Sandy Shoes</name>
        <street>Desert town street 11</street>
        <postCode>98765</postCode>
        <city>Cobolburgh</city>
    </com.ribomation.tutorial.Person>
</persons>
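
For a feeling of what happens underneath, here is a small plain-Java sketch that writes similar XML with the StAX API bundled with the JDK. It uses XMLStreamWriter directly rather than StaxEventItemWriter and XStream, so it is a swapped-in technique for illustration only; the class StaxPersonsSketch, the toXml method and the short ‘person’ element name are assumptions of this sketch.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration of streaming XML output with StAX; not Spring-Batch code.
class StaxPersonsSketch {

    // Writes one <person> element per row inside the given root tag.
    // Each row is expected to hold {name, city}.
    static String toXml(String rootTag, String[][] persons) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement(rootTag);
            for (String[] person : persons) {
                xml.writeStartElement("person");
                writeField(xml, "name", person[0]);
                writeField(xml, "city", person[1]);
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    // Writes <tag>value</tag>; the writer escapes special characters in the value.
    private static void writeField(XMLStreamWriter xml, String tag, String value)
            throws XMLStreamException {
        xml.writeStartElement(tag);
        xml.writeCharacters(value);
        xml.writeEndElement();
    }

    public static void main(String[] args) {
        String[][] persons = { { "Anna Conda", "Javaville" }, { "Sham Poo", "Perlvillage" } };
        System.out.println(toXml("persons", persons));
    }
}
```

Like the item writer, the sketch emits the root tag once and then one element per item, which is why the root tag name is configured separately from the serializer.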

This concludes my second post about Spring-Batch.

10 Comments

  1. Hila says:

    @pitabas:
    try to uncomment the last line and comment the second last line in the helloStep bean, so it would look like this:

    <!–

    –>

  2. Hila says:

    @pitabas:
    try to uncomment the last line and comment the second last line in the “helloStep” bean

  3. Doug says:

    WOW Thank you so much for this, I have been having a tough time finding good spring batch examples. I am curious, is it possible to output html instead of XML?

  4. Mansoor says:

    Thanks for the post.
    I have question here, can we invoke reader bean directly with out launching a job?

    • jens says:

      Yes, all beans are spring-beans, which means they can be wired the spring way and invoked accordingly.

  5. pitabas says:

    Hi,
    I am getting the below error, please suggest me to solve the problem

    [SimpleJobLauncher] No TaskExecutor has been set, defaulting to synchronous executor.
    [SimpleStepFactoryBean] Setting commit interval to default value (1)
    [SimpleJobLauncher] Job: [SimpleJob: [name=helloJob]] launched with the following parameters: [{}{}{}{}]
    [AbstractStep] Encountered an error executing the step: class javax.xml.stream.FactoryConfigurationError: Provider com.bea.xml.stream.XMLOutputFactoryBase not found

    Thanks,
    Pitabas

  6. jens says:

    @Rahul:
    Thanks for spotting the typo. I have now updated the post.

  7. Rahul says:

    Pom.xml can not have bean Bean declaration.Its mistake in your article.

  8. Javabee says:

    Hi

    Thanks for detailed explanation.

    I am trying to do a vice versa of example, i.e trying to convert details from XML file to csv file.
    I am not sure how to set the mapper for XML file and read them for flat file.

    Could you please provide some details or possible share the code.

    Thanks

  9. Axel says:

    Is it possible to subscribe to the updates on this site?
