CSV parsing can be performed automatically, without any additional meta
information. However, users sometimes want to change the meta names. Using
an interface description file, the user can define the header values, the
field names, or both.
The steps to rename CSV metadata are:
- Create an interface description file.
- Specify the description file in the configuration:
Chapter | Section | Key | Value
System | source interface name | DescriptionFile | Filename
DTD for interface description file
The interface description file must be in XML format and must be valid
against the DTD definition file InterfaceSpecCSV.dtd:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT InterfaceSpec ( Header?, Records?) >
<!-- Root element -->
<!ATTLIST InterfaceSpec Name NMTOKEN #REQUIRED >
<!ELEMENT Header ( Field+ ) >
<!-- File header -->
<!ATTLIST Header Name NMTOKEN #IMPLIED >
<!ELEMENT Records ( Record+ ) >
<!-- Part which contains the Content Fields -->
<!ELEMENT Record ( Field+ ) >
<!-- Structure description for a single record type. -->
<!ATTLIST Record Name NMTOKEN #REQUIRED >
<!ELEMENT Field EMPTY >
<!-- Specification of a single field -->
<!ATTLIST Field Name CDATA #REQUIRED >
<!ATTLIST Field Format ( alpha | blank | const | date | num ) #IMPLIED >
<!ATTLIST Field Value CDATA #IMPLIED >
<!-- Necessary if <Format>="const". Specifies the constant value. -->
<!ATTLIST Field DateFormat NMTOKEN #IMPLIED >
<!-- Necessary if <Format>="date". Format strings like "yyMMdd" should be used which obey the java.text.SimpleDateFormat conventions. -->
<!ATTLIST Field DecimalPoint ( comma | dot ) #IMPLIED >
<!-- Necessary if <Format>="num" and float values may be specified. -->
<!ATTLIST Field Length NMTOKEN #IMPLIED >
Interface description example
The following example shows how the sample CSV format from the top of
the page can be modified by an interface description.
The CSV file:
Name, Rank, Location
John Doe, Software Engineer, Munich
"Powers, Mary", Senior Software Engineer, London
A suitable interface description file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE InterfaceSpec SYSTEM "InterfaceSpecCSV.dtd">
<InterfaceSpec Name="CSVFullTest">
  <Header Name="employeesHeader">
    <Field Name="Full name"/>
    <Field Name="Status in company"/>
    <Field Name="Working location"/>
  </Header>
  <Records>
    <Record Name="employee">
      <Field Name="name"/>
      <Field Name="status"/>
      <Field Name="location"/>
    </Record>
  </Records>
</InterfaceSpec>
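To see which names such a description file carries, it can be read with any XML parser. The following is a minimal sketch in Python, not the xBus implementation: the function name `read_names` is hypothetical, the XML declaration and DOCTYPE line are left out so that no DTD lookup is involved, and only the first `Record` element is inspected.

```python
import xml.etree.ElementTree as ET

# The example description file from above, without the XML declaration and
# the DOCTYPE line (assumption: this sketch does not validate against the DTD).
SPEC = """<InterfaceSpec Name="CSVFullTest">
  <Header Name="employeesHeader">
    <Field Name="Full name"/>
    <Field Name="Status in company"/>
    <Field Name="Working location"/>
  </Header>
  <Records>
    <Record Name="employee">
      <Field Name="name"/>
      <Field Name="status"/>
      <Field Name="location"/>
    </Record>
  </Records>
</InterfaceSpec>"""


def read_names(spec_xml):
    """Return (header_names, field_names); an entry is None if its section is absent."""
    root = ET.fromstring(spec_xml)

    header = root.find("Header")
    header_names = None
    if header is not None:
        header_names = [field.get("Name") for field in header.findall("Field")]

    field_names = None
    records = root.find("Records")
    if records is not None:
        first_record = records.find("Record")
        if first_record is not None:
            field_names = [f.get("Name") for f in first_record.findall("Field")]

    return header_names, field_names


header_names, field_names = read_names(SPEC)
print(header_names)
print(field_names)
```

Because both `Header` and `Records` are optional in the DTD, the sketch returns `None` for a missing section instead of failing.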
Using the CSV data and the description file, the input values are
parsed into an XML representation. This XML representation is used
internally by the xBus and is the starting point for further processing.
The parsed data in XML format:
<?xml version="1.0" encoding="UTF-8"?>
<EmployeesInCSV>
  <Header>
    <Heading>Full name</Heading>
    <Heading>Status in company</Heading>
    <Heading>Working location</Heading>
  </Header>
  <Records>
    <Record>
      <name>John Doe</name>
      <status>Software Engineer</status>
      <location>Munich</location>
    </Record>
    <Record>
      <name>Powers, Mary</name>
      <status>Senior Software Engineer</status>
      <location>London</location>
    </Record>
  </Records>
</EmployeesInCSV>
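The mapping from the CSV rows to this XML structure can be illustrated with a short Python sketch. It is an illustration under assumptions, not the xBus parser: the function `csv_to_xml` and its parameters are hypothetical, the root element name is passed in by hand, and the first CSV line is assumed to be a header that is replaced by the headings from the description file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The sample CSV data; the quoted field contains a comma.
CSV_TEXT = (
    "Name, Rank, Location\n"
    "John Doe, Software Engineer, Munich\n"
    '"Powers, Mary", Senior Software Engineer, London\n'
)


def csv_to_xml(csv_text, headings, tags, root_name="EmployeesInCSV"):
    """Build an XML tree with the given headings and per-field tag names."""
    # skipinitialspace drops the blank that follows each comma in the sample.
    rows = list(csv.reader(io.StringIO(csv_text), skipinitialspace=True))

    root = ET.Element(root_name)
    header = ET.SubElement(root, "Header")
    for heading in headings:
        ET.SubElement(header, "Heading").text = heading

    records = ET.SubElement(root, "Records")
    for row in rows[1:]:  # assumption: the first row is the CSV header
        record = ET.SubElement(records, "Record")
        for tag, value in zip(tags, row):
            ET.SubElement(record, tag).text = value
    return root


tree = csv_to_xml(
    CSV_TEXT,
    headings=["Full name", "Status in company", "Working location"],
    tags=["name", "status", "location"],
)
print(ET.tostring(tree, encoding="unicode"))
```

The quoted value `"Powers, Mary"` stays a single field, which is why the second record still has exactly three entries.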
Processing details:
When an interface description file is used, a check is made that the
number of fields per record in the description file equals the number of
fields per record in the data file. An error occurs if they differ.
Only names that are valid XML tag names are allowed in the record section
of the description file. If a field is given an invalid name in the
description file, the tag "field" is used in the XML representation
instead.
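These two rules can be sketched as follows. This is a simplified illustration, not the xBus code: `check_field_count` and `safe_tag` are hypothetical names, and the tag name test only accepts ASCII letters, digits, underscore, dot and hyphen, which is narrower than what XML actually permits.

```python
import re

# Simplified pattern for a valid tag name (assumption: ASCII only).
_TAG_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def check_field_count(described_fields, row):
    """Raise an error if description file and data row disagree on field count."""
    if len(described_fields) != len(row):
        raise ValueError(
            "description file defines %d fields, record has %d"
            % (len(described_fields), len(row))
        )


def safe_tag(name):
    """Return the name if it is usable as a tag, otherwise the fallback "field"."""
    return name if _TAG_NAME.fullmatch(name) else "field"


check_field_count(["name", "status", "location"],
                  ["John Doe", "Software Engineer", "Munich"])
print(safe_tag("status"))             # a valid tag name is kept
print(safe_tag("Status in company"))  # contains blanks, so "field" is used
```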
The mechanism for selecting header values and tag names is fairly complex.
The following table gives an overview of which names are taken, depending
on the content of the data file and the description file:
CSV file | Description file | Header information taken from | Tag names of entries taken from
has header | no description file | CSV header | CSV header
has header | does not contain any information | CSV header | CSV header
has header | contains only Header information | Description file | Header information in description file
has header | contains only Records information | CSV header | Records information in description file
has header | contains Header and Records information | Description file | Records information in description file
does not contain header | no description file | No header | Tag name = "field"
does not contain header | does not contain any information | No header | Tag name = "field"
does not contain header | contains only Header information | Description file | Header information in description file
does not contain header | contains only Records information | No header | Records information in description file
does not contain header | contains Header and Records information | Description file | Records information in description file
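Read as a whole, the table appears to follow a simple priority rule: the description file wins over the CSV header, and for tag names the Records section wins over the Header section. The Python sketch below encodes that reading; it is an interpretation of the table, not code taken from xBus, and the function name `select_names` and its boolean parameters are hypothetical. A missing description file and one without any information are both expressed as two `False` flags.

```python
def select_names(csv_has_header, spec_has_header, spec_has_records):
    """Return (source of header information, source of tag names)."""
    # Header values: description file first, then the CSV header, else none.
    if spec_has_header:
        header_source = "description file"
    elif csv_has_header:
        header_source = "CSV header"
    else:
        header_source = "no header"

    # Tag names: Records section first, then Header section, then CSV header,
    # and finally the fixed fallback tag "field".
    if spec_has_records:
        tag_source = "Records information in description file"
    elif spec_has_header:
        tag_source = "Header information in description file"
    elif csv_has_header:
        tag_source = "CSV header"
    else:
        tag_source = 'tag name "field"'

    return header_source, tag_source


# CSV file with header, description file with only Records information
print(select_names(True, False, True))
```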