
The Microsoft .NET Framework, which you can use with any .NET programming language such as C# (C sharp) or Visual Basic.NET, has solid support for regular expressions. The documentation of the regular expression classes is very poor, however. Read on to learn how to use regular expressions in your .NET applications. In the text below, I will use VB.NET syntax to explain the various classes. After the text, you will find a complete application written in C# that illustrates in great detail how to use regular expressions. I recommend that you download the source code, read it, and play with the application. That will give you a clear idea of how to use regexes in your own applications.
As you can see in the regular expression flavor comparison, .NET's regex flavor is very feature-rich. The only noteworthy feature that's lacking is possessive quantifiers.
There are no differences in the regex flavor supported by .NET versions 1.x, 2.0 and 3.0, except for one feature added in .NET 2.0: character class subtraction. It works exactly the way it does in XML Schema regular expressions. The XML Schema standard first defined this feature and its syntax.
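As a rough illustration of character class subtraction, the C# sketch below uses the class [a-z-[aeiou]], which matches any lowercase letter except a vowel. The class name ClassSubtractionSketch and its method names are made up for this example, and it assumes .NET 2.0 or later.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class ClassSubtractionSketch
{
    // [a-z-[aeiou]] means: a-z, minus the characters in the nested class.
    private static readonly Regex Consonant = new Regex("[a-z-[aeiou]]");
    private static readonly Regex OnlyConsonant = new Regex("^[a-z-[aeiou]]$");

    // True if the subject is exactly one lowercase consonant.
    public static bool IsLowercaseConsonant(string subject)
    {
        return OnlyConsonant.IsMatch(subject);
    }

    // Replaces every lowercase consonant with an underscore.
    public static string MaskConsonants(string subject)
    {
        return Consonant.Replace(subject, "_");
    }
}
```

With these definitions, MaskConsonants("regex") should return "_e_e_", since only the vowels survive the subtraction.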
The regex classes are located in the namespace System.Text.RegularExpressions. To make them available, place Imports System.Text.RegularExpressions at the start of your source code.
The Regex class is the one you use to compile a regular expression. For efficiency, regular expressions are compiled into an internal format. If you plan to use the same regular expression repeatedly, construct a Regex object as follows: Dim RegexObj As Regex = New Regex("regularexpression"). You can then call RegexObj.IsMatch("subject") to check whether the regular expression matches the subject string. The Regex constructor accepts an optional second parameter of type RegexOptions. You could specify RegexOptions.IgnoreCase as the final parameter to make the regex case insensitive. Other options are RegexOptions.Singleline, which causes the dot to match newlines, and RegexOptions.Multiline, which causes the caret and dollar to match at embedded newlines in the subject string.
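The same steps in C# might look like the sketch below. It constructs a few Regex objects once and reuses them, and shows the effect of the three options mentioned above. The class name, field names and patterns are invented for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class RegexOptionsSketch
{
    // Constructed once and reused for every call.
    private static readonly Regex Plain = new Regex("hello");
    private static readonly Regex NoCase = new Regex("hello", RegexOptions.IgnoreCase);

    // Without Singleline the dot does not match a newline.
    private static readonly Regex DotDefault = new Regex("a.b");
    private static readonly Regex DotAll = new Regex("a.b", RegexOptions.Singleline);

    // Without Multiline the caret only matches at the start of the string.
    private static readonly Regex StartOfString = new Regex("^two");
    private static readonly Regex StartOfLine = new Regex("^two", RegexOptions.Multiline);

    public static bool MatchesPlain(string s) { return Plain.IsMatch(s); }
    public static bool MatchesNoCase(string s) { return NoCase.IsMatch(s); }
    public static bool MatchesDotDefault(string s) { return DotDefault.IsMatch(s); }
    public static bool MatchesDotAll(string s) { return DotAll.IsMatch(s); }
    public static bool MatchesStartOfString(string s) { return StartOfString.IsMatch(s); }
    public static bool MatchesStartOfLine(string s) { return StartOfLine.IsMatch(s); }
}
```

For the subject "Say HELLO", MatchesPlain returns false and MatchesNoCase returns true. For "a\nb", only MatchesDotAll succeeds, and for "one\ntwo", only MatchesStartOfLine succeeds.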
Call RegexObj.Replace("subject", "replacement") to perform a search-and-replace using the regex on the subject string, replacing all matches with the replacement string. In the replacement string, you can use $& to insert the entire regex match into the replacement text. You can use $1, $2, $3, etc. to insert the text matched between capturing parentheses into the replacement text. Use $$ to insert a single dollar sign into the replacement text. To replace with the first backreference immediately followed by the digit 9, use ${1}9. If you type $19, and there are fewer than 19 backreferences, the $19 will be interpreted as literal text, and appear in the result string as such. To insert the text from a named capturing group, use ${name}. Improper use of the $ sign may produce an undesirable result string, but will never cause an exception to be raised.
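The replacement tokens described above can be tried out with a small C# sketch such as the following. The key=value pattern, the class name and the method names are assumptions chosen for the example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class ReplacementSketch
{
    // Two numbered capturing groups around an equals sign.
    private static readonly Regex Pair = new Regex(@"(\w+)=(\w+)");

    // The same pattern with named capturing groups.
    private static readonly Regex NamedPair = new Regex(@"(?<key>\w+)=(?<value>\w+)");

    // $1 and $2 insert the text matched by the numbered groups.
    public static string Swap(string s) { return Pair.Replace(s, "$2=$1"); }

    // $& inserts the entire regex match.
    public static string Bracket(string s) { return Pair.Replace(s, "[$&]"); }

    // ${1}9 inserts group 1 followed by a literal digit 9.
    public static string GroupThenNine(string s) { return Pair.Replace(s, "${1}9"); }

    // $19 refers to a group that does not exist here, so it stays literal.
    public static string Unresolved(string s) { return Pair.Replace(s, "$19"); }

    // $$ inserts a single dollar sign, followed here by group 2.
    public static string DollarValue(string s) { return Pair.Replace(s, "$$$2"); }

    // ${name} inserts the text matched by a named group.
    public static string SwapNamed(string s) { return NamedPair.Replace(s, "${value}=${key}"); }
}
```

Given the subject "a=b", Swap returns "b=a", Bracket returns "[a=b]", GroupThenNine returns "a9", Unresolved returns the literal "$19", and DollarValue returns "$b".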
RegexObj.Split("Subject") splits the subject string along regex matches, returning an array of strings. The array contains the text between the regex matches. If the regex contains capturing parentheses, the text matched by them is also included in the array. If you want the entire regex matches to be included in the array, simply place round brackets around the entire regular expression when instantiating RegexObj.
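To see the difference that capturing parentheses make when splitting, consider this C# sketch. It splits on commas, once with a plain regex and once with the whole regex wrapped in round brackets. The class and method names are invented for the example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SplitSketch
{
    // No capturing group: only the text between the matches is returned.
    private static readonly Regex Comma = new Regex(",");

    // The whole regex is a capturing group: the matches are returned too.
    private static readonly Regex CapturedComma = new Regex("(,)");

    public static string[] WithoutDelimiters(string subject)
    {
        return Comma.Split(subject);
    }

    public static string[] WithDelimiters(string subject)
    {
        return CapturedComma.Split(subject);
    }
}
```

For the subject "a,b,c", WithoutDelimiters yields the three elements a, b and c, while WithDelimiters yields five elements: a, a comma, b, a comma, and c.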
The Regex class also contains several static methods that allow you to use regular expressions without instantiating a Regex object. This reduces the amount of code you have to write, and is appropriate if the same regular expression is used only once or is seldom reused. Note that member overloading is used a lot in the Regex class. Each static method has the same name as its non-static counterpart, but a different parameter list.
Regex.IsMatch("subject", "regex") checks if the regular expression matches the subject string. Regex.Replace("subject", "regex", "replacement") performs a search-and-replace. Regex.Split("subject", "regex") splits the subject string into an array of strings as described above. All these methods accept an optional additional parameter of type RegexOptions, like the constructor.
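In C#, the static calls could be sketched as follows. Note the parameter order: the subject string comes first and the regular expression second. The patterns and the names in StaticCallsSketch are made up for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class StaticCallsSketch
{
    // Regex.IsMatch(subject, regex)
    public static bool HasDigit(string subject)
    {
        return Regex.IsMatch(subject, @"\d");
    }

    // Regex.Replace(subject, regex, replacement)
    public static string MaskDigits(string subject)
    {
        return Regex.Replace(subject, @"\d", "#");
    }

    // Regex.Split(subject, regex)
    public static string[] SplitOnSemicolons(string subject)
    {
        return Regex.Split(subject, @"\s*;\s*");
    }

    // The optional RegexOptions parameter comes last.
    public static bool MentionsRegex(string subject)
    {
        return Regex.IsMatch(subject, "regex", RegexOptions.IgnoreCase);
    }
}
```

No Regex object is created explicitly here, which keeps one-off checks short. For a pattern used in a tight loop, the instance approach described earlier is likely the better fit.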
If you want more information about the regex match, call Regex.Match() to construct a Match object. If you instantiated a Regex object, use Dim MatchObj as Match = RegexObj.Match("subject"). If not, use the static version: Dim MatchObj as Match = Regex.Match("subject", "regex").
Either way, you will get an object of class Match that holds the details about the first regex match in the subject string. MatchObj.Success indicates if there actually was a match. If so, use MatchObj.Value to get the contents of the match, MatchObj.Length for the length of the match, and MatchObj.Index for the start of the match in the subject string. The start of the match is zero-based, so it effectively counts the number of characters in the subject string to the left of the match.
If the regular expression contains capturing parentheses, use the MatchObj.Groups collection. MatchObj.Groups.Count indicates the number of groups, including the zeroth group, which is the entire regex match. A regex with three pairs of capturing parentheses therefore yields a count of 4. MatchObj.Groups(3).Value gets the text matched by the third pair of round brackets. MatchObj.Groups(3).Length and MatchObj.Groups(3).Index get the length of the text matched by the group and its index in the subject string, relative to the start of the subject string. MatchObj.Groups("name") gets the details of the named group "name".
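A C# sketch of inspecting a Match object and its groups follows. C# indexes the Groups collection with square brackets where VB.NET uses parentheses. The date pattern, the group names year, month and day, and the class name are assumptions made for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class MatchDetailsSketch
{
    // Three named groups; they are also numbered 1, 2 and 3 from left to right.
    private static readonly Regex Date =
        new Regex(@"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})");

    // Returns the details of the first date found in the subject string.
    public static Match FirstDate(string subject)
    {
        return Date.Match(subject);
    }

    // Builds a short description from the match and its third group.
    public static string Describe(string subject)
    {
        Match m = Date.Match(subject);
        if (!m.Success)
        {
            return "no match";
        }
        Group day = m.Groups[3];
        return m.Value + " at " + m.Index + ", day " + day.Value + " at " + day.Index;
    }
}
```

For the subject "Due 2008-06-28.", the match starts at index 4 and has length 10, Groups.Count is 4, and Groups[3] and Groups["day"] both hold "28" at index 12.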
To find the next match of the regular expression in the same subject string, call MatchObj.NextMatch(), which returns a new Match object containing the results for the second match attempt. Assign the result back to the variable (MatchObj = MatchObj.NextMatch()) and repeat until MatchObj.Success is False.
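Iterating over all matches with NextMatch() could be sketched in C# like this. The helper name AllMatches and its "index:value" output format are inventions for the example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class NextMatchSketch
{
    // Collects every match as "index:value", in order of appearance.
    public static List<string> AllMatches(string subject, string pattern)
    {
        List<string> results = new List<string>();
        Match m = Regex.Match(subject, pattern);
        while (m.Success)
        {
            results.Add(m.Index + ":" + m.Value);
            // NextMatch() returns a new Match object; reassign it to advance.
            m = m.NextMatch();
        }
        return results;
    }
}
```

Calling AllMatches("a1 bb22 ccc333", @"\d+") produces the three entries "1:1", "5:22" and "11:333". The loop ends when NextMatch() returns a Match whose Success property is false.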
Note that after calling RegexObj.Match(), the resulting Match object is independent from RegexObj. This means you can work with several Match objects created by the same Regex object simultaneously.
In literal C# strings, as well as in C++ and many other .NET languages, the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. Written as a C# string, this regular expression becomes "\\\\". That's right: 4 backslashes to match a single one.
The regex \w matches a word character. As a C# string, this is written as "\\w".
To make your code more readable, you should use C# verbatim strings. In a verbatim string, a backslash is an ordinary character. This allows you to write the regular expression in your C# code as you would write it in a tool like RegexBuddy or PowerGREP, or as the user would type it into your application. The regex to match a backslash is written as @"\\" when using C# verbatim strings. The backslash is still an escape character in the regular expression, so you still need to double it. But doubling is better than quadrupling. To match a word character, use the verbatim string @"\w".
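The following C# sketch compares the regular and verbatim spellings of the two patterns discussed above and uses the backslash regex in a replacement. The class name, the method names and the path-conversion task are chosen only for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class VerbatimStringSketch
{
    // Both literals hold the same two characters: backslash, backslash.
    public static bool SameBackslashPattern()
    {
        return "\\\\" == @"\\";
    }

    // Both literals hold the same two characters: backslash, w.
    public static bool SameWordPattern()
    {
        return "\\w" == @"\w";
    }

    // The regex \\ matches one backslash; each one is replaced with a slash.
    public static string ToForwardSlashes(string path)
    {
        return Regex.Replace(path, @"\\", "/");
    }
}
```

For example, ToForwardSlashes(@"C:\temp\file.txt") returns "C:/temp/file.txt". The verbatim form keeps the pattern identical to what you would type into a regex tool.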
To really get to grips with the regex support of the Microsoft .NET Framework, I recommend that you study the demo application I created. It is written in C#. The demo is fairly simple, so you should understand the source code even if you do not use C# yourself. The demo code has lots of comments that clearly indicate what my code does, why I coded it that way, and which other options you have. The demo code also catches all exceptions that may be thrown by the various methods, something I did not explain above.
The demo application covers every aspect of the System.Text.RegularExpressions package. You can use it to learn how to use the package, and to quickly test regular expressions while coding.
Read the source code in your web browser
Download the demo application and source code
The book Mastering Regular Expressions explains everything you want to know, and don't want to know, about regular expressions, including the regex features that are unique to .NET. It also has an excellent chapter on .NET's System.Text.RegularExpressions namespace, explaining the various Regex classes far better than Microsoft's documentation, with plenty of VB.NET example code and some C# code showing more advanced techniques.
Page URL: http://www.Regular-Expressions.info/dotnet.html
Page last updated: 28 June 2008
Site last updated: 23 December 2008
Copyright © 2003-2008 Jan Goyvaerts. All rights reserved.