Wednesday, August 12, 2009

Validate a text file based on LINQ



This article shows how to validate a text file using LINQ.


A flat file is often used as a data file passed between systems. For other systems to accept the file, it has to follow the data format they expect. Beyond the data format, some business rules may also need to be checked before the file is sent out. In this article, I'll build a LINQ-based validation engine that checks whether a text file meets all the requirements I defined for a system to process it.

Rules to validate

A valid file should have a HEADER record, a FOOTER record, and some CONTENT records. The first 3 characters of each record describe its type. In this case, I am going to use 111 for the HEADER record, 999 for the FOOTER record, and 222 for CONTENT records. We could use 4-6 characters to describe sub-record information. However, I am only going to demo the HEADER, FOOTER, and CONTENT records here. File example:

111this is header record 123-45-6789 123456789 123-45-6789 999totalrecord5

In the above file, here are some basic rules I want to validate:

  1. The file needs to have a HEADER record with a fixed length (16) --> raise an error message when not met.
  2. The file needs to have one or more CONTENT records, each with a fixed length (40) --> raise an error message when not met.
  3. Positions 4 to 23 of a CONTENT record hold the email information, which needs to follow the email format --> raise a warning when not met.
  4. Positions 24 to 40 of a CONTENT record hold the SSN information, which needs to look like xxx-xx-xxxx (x is a digit) --> raise a warning when not met.
  5. The file needs to have a FOOTER record --> raise an error message when not met.
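The five rules above can be sketched as LINQ queries over the lines of the file. This is a minimal illustration, not the article's actual implementation: the class name `File1Rules`, the method `Check`, the message texts, and both regular expressions are assumptions. It also assumes that positions are 1-based and that fields shorter than their slot are padded with spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class File1Rules
{
    // Assumed patterns: the article does not show the expressions it uses.
    private static readonly Regex EmailPattern = new Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$");
    private static readonly Regex SsnPattern = new Regex(@"^[0-9]{3}-[0-9]{2}-[0-9]{4}$");

    public static List<string> Check(IEnumerable<string> lines)
    {
        // Pair every record with its 1-based line number so messages can point at it.
        var records = lines.Select((text, i) => new { Text = text, LineNo = i + 1 }).ToList();
        var headers = records.Where(r => r.Text.StartsWith("111", StringComparison.Ordinal)).ToList();
        var contents = records.Where(r => r.Text.StartsWith("222", StringComparison.Ordinal)).ToList();
        var messages = new List<string>();

        // Rule 1: a HEADER record must exist and be 16 characters long.
        if (headers.Count == 0) messages.Add("ERROR: HEADER record is missing");
        messages.AddRange(headers.Where(r => r.Text.Length != 16)
            .Select(r => "ERROR line " + r.LineNo + ": HEADER record must be 16 characters"));

        // Rule 2: at least one CONTENT record, each 40 characters long.
        if (contents.Count == 0) messages.Add("ERROR: no CONTENT records found");
        messages.AddRange(contents.Where(r => r.Text.Length != 40)
            .Select(r => "ERROR line " + r.LineNo + ": CONTENT record must be 40 characters"));

        // Rules 3 and 4 read fixed positions, so they only run on correctly sized records.
        var sized = contents.Where(r => r.Text.Length == 40).ToList();

        // Rule 3: positions 4-23 (zero-based offset 3, length 20) hold the email.
        messages.AddRange(sized.Where(r => !EmailPattern.IsMatch(r.Text.Substring(3, 20).Trim()))
            .Select(r => "WARNING line " + r.LineNo + ": invalid email format"));

        // Rule 4: positions 24-40 (zero-based offset 23, length 17) hold the SSN.
        messages.AddRange(sized.Where(r => !SsnPattern.IsMatch(r.Text.Substring(23, 17).Trim()))
            .Select(r => "WARNING line " + r.LineNo + ": SSN must look like xxx-xx-xxxx"));

        // Rule 5: a FOOTER record must exist.
        if (!records.Any(r => r.Text.StartsWith("999", StringComparison.Ordinal)))
            messages.Add("ERROR: FOOTER record is missing");

        return messages;
    }
}
```

Calling `File1Rules.Check(File.ReadLines(path))` would return an empty list for a valid file. Messages come back grouped by rule rather than in line order, so a caller that wants line order would need to sort them.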

Validation engine classes

Since the validation engine needs to validate multiple text file formats, I am going to create an abstract class (BaseValidator) to structure the basic functionality. A child class (File1Validator) will implement those methods based on its own rules. An event handler in BaseValidator will pass information through to the client when an error occurs.
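A rough sketch of that structure might look like the following. Only the class names `BaseValidator` and `File1Validator` and the three `Validate...` method names come from this article. The event name `ValidationFailed`, the `ValidationEventArgs` type, the `Report` helper, and the simplified rule bodies are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event payload; the article does not show its actual type.
public class ValidationEventArgs : EventArgs
{
    public ValidationEventArgs(string severity, string message)
    {
        Severity = severity;
        Message = message;
    }

    public string Severity { get; }
    public string Message { get; }
}

public abstract class BaseValidator
{
    protected BaseValidator(IEnumerable<string> lines)
    {
        Lines = lines.ToList();
    }

    // The records of the file being validated.
    protected IReadOnlyList<string> Lines { get; }

    // Clients subscribe here to receive errors and warnings (assumed name).
    public event EventHandler<ValidationEventArgs> ValidationFailed;

    // Raises the event if anyone is listening.
    protected void Report(string severity, string message)
    {
        ValidationFailed?.Invoke(this, new ValidationEventArgs(severity, message));
    }

    // Each file format supplies its own rules for these three steps.
    public abstract void ValidateHeader();
    public abstract void ValidateContent();
    public abstract void ValidateFooter();
}

public class File1Validator : BaseValidator
{
    public File1Validator(IEnumerable<string> lines) : base(lines) { }

    // Simplified presence checks only; real rules would also check lengths and formats.
    public override void ValidateHeader()
    {
        if (!Lines.Any(l => l.StartsWith("111", StringComparison.Ordinal)))
            Report("Error", "HEADER record is missing");
    }

    public override void ValidateContent()
    {
        if (!Lines.Any(l => l.StartsWith("222", StringComparison.Ordinal)))
            Report("Error", "No CONTENT records found");
    }

    public override void ValidateFooter()
    {
        if (!Lines.Any(l => l.StartsWith("999", StringComparison.Ordinal)))
            Report("Error", "FOOTER record is missing");
    }
}
```

A client would create a `File1Validator`, attach a handler to `ValidationFailed`, and then run the validation steps. Using an event keeps the validator unaware of how messages are displayed or logged.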

Class diagram:


Validation engine implementation


BaseValidator is an abstract class. Its Validate() method drives the whole validation process: invoking Validate() always executes ValidateHeader(), ValidateContent(), and ValidateFooter(). A child class can override Validate() to add more steps.
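The call flow described here could be sketched as follows. Error reporting is left out to keep the focus on call order. The `Trace` list and the extra `ValidateRecordCount()` step are hypothetical, added only to make the order of execution visible.

```csharp
using System;
using System.Collections.Generic;

public abstract class BaseValidator
{
    // Records which steps ran; exists only to make the call order visible in this sketch.
    public List<string> Trace { get; } = new List<string>();

    // Drives the whole process; virtual so a child class can extend it.
    public virtual void Validate()
    {
        ValidateHeader();
        ValidateContent();
        ValidateFooter();
    }

    public abstract void ValidateHeader();
    public abstract void ValidateContent();
    public abstract void ValidateFooter();
}

public class File1Validator : BaseValidator
{
    public override void ValidateHeader() { Trace.Add("header"); }
    public override void ValidateContent() { Trace.Add("content"); }
    public override void ValidateFooter() { Trace.Add("footer"); }

    // Runs the three standard steps first, then one more.
    public override void Validate()
    {
        base.Validate();
        ValidateRecordCount(); // hypothetical extra step, not from the article
    }

    private void ValidateRecordCount() { Trace.Add("record-count"); }
}
```

Calling `base.Validate()` first keeps the standard header, content, and footer order intact, so a child class only has to describe what it adds.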

