Tuesday, July 7, 2009

Searching, Modifying, and Encoding Text

Searching, Modifying, and Encoding Text

Introduction

The purpose of this article is to help the developer to enhance the text building capabilities of a .NET Framework application via the System.Text namespace. Processing text is one of the most common programming tasks. User input is typically in text format, and it might need to be validated, sanitized, and reformatted. Also, as a developer, you might need to process text files generated from a legacy system to extract important data. These legacy systems often use nonstandard encoding techniques. Moreover, the majority of software development companies maintain large legacy code bases. Finally, a developer might need to output text files in specific formats to input data in a legacy system.

Forming Regular Expressions – The Basics

As I stated earlier, developers often need to process text. For example, you might to process text from a user to remove or replace special characters. A regular expression is a set of characters that can be compared to a string to determine whether the string meets specified format requirements. That is, regular expressions are simply arrangements of text that describe a pattern to match within some input text. Simple variants of regular expressions pop up everywhere. The MS_DOS command ‘dir *.cs’ employs a very simple pattern matching technique much like what regular expressions can do. The asterisk followed by the period and the two letters is a simple wildcard pattern. To use regular expressions for pattern matching, we will create a simple console application named TestRegExp that accepts two strings as input and determines whether the first string (a regular expression) matches the second string. Here is some basic code. The regular expression will not make sense at all, but it should by the end of the article:


See full detail: http://www.codeproject.com/KB/cs/Patterns.aspx

No comments: