BizTalk 2004: Delimited flat file schema with multiple record types

Post by Matthew Ro » Tue, 01 Jun 2004 11:40:59


Greetings:

I have a delimited flat file being produced by a legacy system that
contains a variety of different record types. I need to create a flat file
schema so that I can map these records to SQL Server stored procedures using
the SQL Adapter and enter their data into a SQL Server database. I've been
through the BizTalk 2004 documentation and the public BizTalk Server
newsgroups and have not found any information that will help me.

Here is a sample extract from the files I need to process:

20040508,"000175",5
"000175",20040508,"XD9",4,"",$0.00,$0.00,""
"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00

Here are some significant characteristics of the file format:

1) The file format uses CrLf as its record delimiter and comma as its
field delimiter.
2) The first record in the file includes the date in yyyyMMdd format, a
data source identifier (a numeric identifier that gives context to the data
in the file) and an integer listing the total number of records in the file.
3) Every other record begins with the data source identifier (the same
value as the second field in the first record) as the first field, followed
by the date (the same value as the first field in the first record).
4) The third field in all records after the first is a string that
identifies the record type. There are four record types listed in the sample
extract above; there are over 50 in the actual file format I need to handle.
5) Each record type has its own well-defined schema, with specific
fields being included for each record type.
6) Other than the first record, the records in the file can appear in
any order. There is nothing to say that an AA1 record would appear before a
ZZ9 record, for example.
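
The characteristics listed above can be checked mechanically outside BizTalk. The following is a minimal Python sketch, not a BizTalk artifact: the function name check_extract is hypothetical, and it assumes that the count in the header record includes the header itself (which is what the sample suggests, but the post does not say so explicitly).

```python
import csv
import io

SAMPLE = (
    '20040508,"000175",5\r\n'
    '"000175",20040508,"XD9",4,"",$0.00,$0.00,""\r\n'
    '"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46\r\n'
    '"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00\r\n'
    '"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00\r\n'
)


def check_extract(text):
    """Return (problems, record_types) for the text of one extract file."""
    # newline="" leaves the CrLf record delimiters for the csv module to handle.
    rows = [r for r in csv.reader(io.StringIO(text, newline="")) if r]
    if not rows or len(rows[0]) != 3:
        return ["missing or malformed header record"], []

    date, source_id, count = rows[0]
    problems = []
    # Assumption: the header count includes the header record itself.
    if not count.isdigit() or int(count) != len(rows):
        problems.append(f"header count {count!r} does not match {len(rows)} records")

    record_types = []
    for number, row in enumerate(rows[1:], start=2):
        if len(row) < 3:
            problems.append(f"record {number}: fewer than 3 fields")
            continue
        if row[0] != source_id:
            problems.append(f"record {number}: data source {row[0]!r} differs from header")
        if row[1] != date:
            problems.append(f"record {number}: date {row[1]!r} differs from header")
        record_types.append(row[2])  # third field identifies the record type
    return problems, record_types


problems, record_types = check_extract(SAMPLE)
print(problems)
print(record_types)
```

The record type list comes back in file order, which is the property that makes the format awkward for a strictly sequenced schema.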

The only thing in the BizTalk schema editor that I have found that looks
even remotely helpful is the Tag Identifier node property. ("You can use the
Tag Identifier property to specify the tag within a delimited record" sounds
helpful, right?) However, the BizTalk Server documentation also includes this text:
"Unlike tags in positional records, tags in delimited records must occur at
the beginning of the delimited record and are automatically never included
in the data when the record is translated to its equivalent XML format."
That sounds a lot LESS helpful. Because the record identifier in my files is
not the first field in the record, and because the data in the first two
fields will vary from file to file, it does not appear that I can use this
property without doing a lot of extra legwork.

The only approach that looks like it will solve my problem is to write a
custom pipeline component to be executed in the Decode pipeline stage, and
to have that custom component rewrite the incoming file stream to place the
record identifier at the beginning of each record, so that the flat file
disassembler will find this information in the location where it can process
it. With this approach, the file stream for the example above will look like
this when it gets processed by the flat file disassembler:

20040508,"000175",5
XD9,"000175",20040508,"XD9",4,"",$0.00,$0.00,""
XM8,"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
XM9,"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
XM8,"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
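
The text transformation described above can be sketched in a few lines. This Python sketch only illustrates the rewriting logic; an actual BizTalk pipeline component would be a .NET class operating on the message stream, which is not shown here. The function name prepend_record_tags and its parameters are hypothetical, and the sketch assumes that no quoted field contains an embedded line break.

```python
import csv

SAMPLE = (
    '20040508,"000175",5\r\n'
    '"000175",20040508,"XD9",4,"",$0.00,$0.00,""\r\n'
    '"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46\r\n'
)


def prepend_record_tags(text, tag_field=2, header_tag=None, newline="\r\n"):
    """Copy each detail record's type field to the front of the record.

    tag_field is the zero-based index of the record type (the third field).
    header_tag, if given, is prepended to the first record; otherwise the
    header is passed through unchanged.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ""

    header = lines[0] if header_tag is None else header_tag + "," + lines[0]
    out = [header]
    for number, line in enumerate(lines[1:], start=2):
        # Parse the line only to find the tag; the original text is kept intact.
        fields = next(csv.reader([line]))
        if len(fields) <= tag_field:
            raise ValueError(f"record {number} has no field {tag_field + 1}")
        out.append(fields[tag_field] + "," + line)
    return newline.join(out) + newline


print(prepend_record_tags(SAMPLE), end="")
```

Passing header_tag="HDR" also tags the first record, which may be convenient if the header is given its own Tag Identifier in the schema.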
 
 
 

Post by Matthew Ro » Tue, 01 Jun 2004 23:31:03

I realize it's probably bad form to post follow-ups to one's own messages, but I've been doing some additional research in an attempt to avoid unpleasant surprises down the road. I still need to solve the problem listed in the first message in this thread, but I believe I've found another one waiting to be uncovered.

If I manually update my sample input document instance to include the Tag Identifier values for each row, I can get the instance to validate successfully, but only if the detail rows are in sequence (which they will not be in the real documents I need to process) or if there is only one record of each type. I have been experimenting - unsuccessfully - with the Group Order Type property of the document root node, which seems to be what controls this part of the document schema. This is what the BizTalk Server documentation says about this property:

Allowed Values:
All: Specifies the element group as an all group. All groups allow their child elements to appear zero (0) or one (1) time, and in any order, in instance messages. Restrictions apply; see remarks for more information.
Choice: Specifies the element group as a choice group. A choice group allows only one of its child elements to appear in instance messages.
Sequence: Specifies the element group as a sequence group. Sequence groups require that child elements in instance messages appear in the same order as defined in the schema. This is the default value.

None of these values appears to do what I need. Currently, using "Sequence" as the Group Order Type property value, I can parse this document instance (where the detail records appear in the same order as they are defined in the schema):

HDR,20040508,"000175",5
XD9,"000175",20040508,"XD9",4,"",$0.00,$0.00,""
XM8,"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
XM8,"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
XM9,"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00

but cannot parse this one (where the detail records appear in the pseudo-random order in which they will appear in "live" documents):

HDR,20040508,"000175",5
XD9,"000175",20040508,"XD9",4,"",$0.00,$0.00,""
XM8,"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
XM9,"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
XM8,"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
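
The difference between the two instances can be made concrete with a simplified model of the ordering rules. This Python sketch is not BizTalk's parser: it assumes the schema defines the records in the order HDR, XD9, XM8, XM9, that every record type is optional and may repeat, and the function names are hypothetical. It only shows why a strict sequence accepts the first instance and rejects the second.

```python
SCHEMA_ORDER = ["HDR", "XD9", "XM8", "XM9"]  # assumed definition order in the schema

ORDERED = "HDR,...\nXD9,...\nXM8,...\nXM8,...\nXM9,...\n"
SHUFFLED = "HDR,...\nXD9,...\nXM8,...\nXM9,...\nXM8,...\n"


def tags_of(text):
    """Return the leading tag of every non-empty record."""
    return [line.split(",", 1)[0] for line in text.splitlines() if line.strip()]


def valid_as_sequence(tags, schema_order):
    """Record types may repeat, but must appear in schema order."""
    position = 0
    for tag in tags:
        # Move forward through the schema until this tag is found;
        # the position never moves backwards.
        while position < len(schema_order) and schema_order[position] != tag:
            position += 1
        if position == len(schema_order):
            return False
    return True


def valid_in_any_order(tags, schema_order):
    """Header first, then any known detail record type in any order."""
    details = set(schema_order[1:])
    return bool(tags) and tags[0] == schema_order[0] and all(t in details for t in tags[1:])


for name, text in (("ordered", ORDERED), ("shuffled", SHUFFLED)):
    tags = tags_of(text)
    print(name, valid_as_sequence(tags, SCHEMA_ORDER), valid_in_any_order(tags, SCHEMA_ORDER))
```

In the second instance the trailing XM8 arrives after the model has already moved on to XM9, so the sequence check fails even though every record is individually well formed.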

I have been unable to find *any* information about parsing this type of flat file using BizTalk Server 2004, which has been very frustrating. I have, however, found EDI samples in the BizTalk SDK that appear to do what I need. The X124010850Schema.xsd EDI schema from the sample in %InstallFolder%\EDI\Adapter\Getting Started with EDI\Visual Studio Projects\Getting Started with EDI\Session 1 seems to handle records in arbitrary order, but I cannot see how it does so. This EDI schema has a Group Order Type value of "Sequence", yet in the sample instance documents included with the EDI sample application the various child nodes (such as PER and PID) appear repeatedly.

Is there a way to support this type of flat file in BizTalk Server 2004?
