Using Regex Class in ASP.NET

The Regex class represents a regular expression. It is immutable (read-only: it can't be changed after an instance is created, so a Regex instance is completely defined in the class constructor) and thread safe. The Regex class is located in the System.Text.RegularExpressions namespace.

The Regex class is used in two basic ways: by creating a Regex instance and calling its instance methods, or by calling static methods of the Regex class directly. When the same pattern is used in many places, calling a static method is usually the better-performing choice, because static calls reuse expressions from an internal cache instead of constructing a new Regex instance each time.

The Regex class has these members:

CacheSize static property

CacheSize is a static property that represents the maximum number of compiled regular expressions in the current static cache. The default value is 15. You can increase this value if needed, but be aware that a large cache requires more memory and can slow down your application. The CacheSize property only has an effect when static methods of the Regex class are used. It is not recommended to turn the cache off completely (with CacheSize = 0), especially if you use the same regular expression many times.

Options read-only property

Options is a read-only property. It returns the options from the RegexOptions enumeration that were passed to the Regex constructor. You can find the complete list and the meaning of each option in the .Net Regular Expressions Syntax article.

RightToLeft read-only property

RightToLeft is a read-only property. It returns true if the Regex searches from right to left, which is useful for languages that are read from right to left. RightToLeft returns true if RegexOptions.RightToLeft was used in the Regex constructor.

CompileToAssembly method

CompileToAssembly is a static method that compiles regular expressions to an assembly on disk. You can then use this assembly like any other .Net assembly: add a reference to it in your application and call its methods from code. A compiled regular expression executes faster (commonly between 30% and 300% faster, depending on the expression and the amount of text; the improvement can be even 6x if you analyze gigabytes of text).
Also, there is no initial compilation as there is when RegexOptions.Compiled is used, so expressions compiled to an assembly start faster and execute faster too. Be aware that expressions compiled to an assembly can't be changed at run-time.

Escape method

The Escape method converts an input string to a string with escaped metacharacters. In practice, Escape just adds a backslash ( \ ) character before the metacharacters \, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space. Escaped this way, metacharacters are interpreted as literals. This method is useful when you work with a string entered dynamically by a user, when you don't know in advance which characters the regular expression could contain. If the user enters a metacharacter, Escape converts it to a literal.

GetGroupNames method

The GetGroupNames method returns a string array that contains the names of all capturing groups. Unnamed groups get an indexed numeric name like 1, 2, 3, etc.

GroupNameFromNumber method

The GroupNameFromNumber method returns the group's name (as a string) for a given group number.

GroupNumberFromName method

The GroupNumberFromName method returns the group's number for a given group name.

IsMatch method

The IsMatch method returns true or false, depending on whether the expression found the pattern in the given text. This method is useful when we don't want to capture data, but only want to know if the pattern exists in the text. A common use of IsMatch is data validation.

Match method

The Match method searches the text and returns only the first result, as an instance of the Match class (System.Text.RegularExpressions.Match). Don't be confused that the Regex.Match() method and the Match class have the same name; just remember that the Regex.Match method returns an instance of the Match class. The Match class has a Success property (boolean) that can be used to check whether the RegEx engine found a result in the text. Use the Match.Value property to get the actual result as a string.
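As a minimal sketch of IsMatch and Match, the following C# example checks whether a text contains a number and then reads the first one. The digits pattern and the names MatchSketch, ContainsNumber and FirstNumberOrNull are hypothetical, chosen only for illustration; they are not taken from the original code of this article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchSketch
{
    // Hypothetical pattern for illustration: one or more digits.
    private const string DigitsPattern = @"\d+";

    // IsMatch only reports whether the pattern occurs anywhere in the text.
    public static bool ContainsNumber(string text)
    {
        return Regex.IsMatch(text, DigitsPattern);
    }

    // Match returns the first result; Success tells whether anything was found.
    public static string FirstNumberOrNull(string text)
    {
        Match m = Regex.Match(text, DigitsPattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ContainsNumber("Order 66 shipped"));     // True
        Console.WriteLine(FirstNumberOrNull("Order 66, item 7"));  // 66
        Console.WriteLine(FirstNumberOrNull("no digits") == null); // True
    }
}
```

Checking Success before reading Value matters because a failed match still returns a Match object, just an empty one.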
The Match method returns only the first result, so if you want all results, use the Matches method, which returns a collection of Match objects in the form of a MatchCollection object.

Matches method

The Matches method always searches the complete text and returns all results as a MatchCollection object. MatchCollection contains a collection of Match objects. MatchCollection is read-only (immutable) and has no public constructor. To access single Match objects, iterate through the members with a loop, like foreach [ C# ] or For Each [ VB.NET ].

Replace method

The Replace method is used in search-and-replace scenarios. It requires two patterns. The first is a regular expression that finds which strings in the text should be replaced (search), and the second is a replacement pattern that is used to build the replacement strings (replace). The connection between the first and the second pattern is made with backreferences such as $1.

Split method

The Split method splits text into strings and returns a string array. The positions (delimiters) for splitting are defined by the regular expression, so unlike other methods, Split returns the strings that are not matched by the regular expression. If the count parameter is used, you can set the maximum number of strings in the returned array. The startat parameter specifies the starting position in the text where splitting begins. If the delimiter is an empty string, the method splits the text into single characters. If the delimiter appears at the start or at the end of the text, an empty string is added at the start or end of the resulting string array.

ToString method

The ToString method returns the regular expression specified in the Regex constructor. Since the Regex class is read-only, the regular expression given in the constructor cannot be changed later.

Unescape method

As the opposite of the Escape method, the Unescape method unescapes escaped characters in text (for example, it replaces \* with *, and converts the escape sequence \n to a newline character).
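A small sketch of the Escape and Unescape round trip may help here. The sample strings and the names EscapeSketch and ContainsLiteral are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSketch
{
    // Builds a pattern that matches the user's text literally.
    public static bool ContainsLiteral(string text, string userInput)
    {
        string pattern = Regex.Escape(userInput);
        return Regex.IsMatch(text, pattern);
    }

    public static void Main()
    {
        string escaped = Regex.Escape("1+1=2?");
        Console.WriteLine(escaped);                 // 1\+1=2\?
        Console.WriteLine(Regex.Unescape(escaped)); // 1+1=2?

        // Without Escape, "+" and "?" act as quantifiers, so the literal text is not found.
        Console.WriteLine(ContainsLiteral("is 1+1=2? yes", "1+1=2?")); // True
        Console.WriteLine(Regex.IsMatch("is 1+1=2? yes", "1+1=2?"));   // False

        // Unescape also converts escape sequences: \n becomes a newline character.
        Console.WriteLine(Regex.Unescape(@"a\nb") == "a\nb");          // True
    }
}
```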
Notice that the Escape method escapes only metacharacters, while the Unescape method unescapes metacharacters and any other escaped character too.

Common uses of Regex class

The Regex class is commonly used for these tasks: extracting data from text, data validation, search and replace, and splitting strings. There are online examples of these four methods; you can Test .Net Regular Expressions on your own data. Let's see how each of these tasks can be done.

Using Regex to extract data from text

Regular expressions are a great tool for extracting valuable information from large amounts of text. Of course, it is possible to get this data using classic string manipulation, but any more complex problem demands a long series of System.String method calls, like IndexOf, Substring, Trim etc. A much simpler and cleaner solution, and one that is faster to implement, is to use a regular expression and extract the data with the Regex.Match or Regex.Matches method. Regex.Match returns only one instance of the Match class, which represents the first result that the RegEx engine found in the text. Regex.Matches returns a MatchCollection object, which is a collection of Match objects. Regex.Matches returns all results in the given range.

Let's see how it works in an example. We'll try to find all URLs in a given HTML (which is useful if you are building a web spider):

[ C# ]
/// <summary>

[ VB.NET ]
''' <summary>

Don't forget to import the System.Text.RegularExpressions namespace. You can try the Extract Data online application to test if your regular expression extracts data correctly.

Extracting data using regular expressions groups

Sometimes we don't need the complete result. For example, if we need the text between the <title> and </title> tags, that is easily achievable with the expression <title>.*?</title>, but the result will contain the tags "<title>" and "</title>" too. Although this could be solved using look-around groups, a much simpler solution is to use the GroupCollection from the Groups property of the Match class. Groups can return the value of a single group by its name or index. Let's see how Groups work in a simple example.
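As a rough illustration of the idea, the sketch below wraps the wanted text in a named group t, written (?<t>...), and reads it through Match.Groups. The class name TitleSketch, the method GetTitle and the chosen RegexOptions are assumptions for this sketch, not the article's original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleSketch
{
    // Named group "t" captures only the text between the tags.
    // Options chosen for illustration: tags in any letter case,
    // and "." is allowed to span line breaks.
    private static readonly Regex TitleRegex = new Regex(
        @"<title>(?<t>.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string GetTitle(string html)
    {
        Match m = TitleRegex.Match(html);
        return m.Success ? m.Groups["t"].Value : null;
    }

    public static void Main()
    {
        string html = "<html><head><title>Using Regex</title></head></html>";
        // The whole match still includes the tags.
        Console.WriteLine(TitleRegex.Match(html).Value); // <title>Using Regex</title>
        // The named group holds only the text between them.
        Console.WriteLine(GetTitle(html));               // Using Regex
    }
}
```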
Here is the code that reads the string inside the title tag of HTML using the named group t:

[ C# ]
private void findTextBetweenTitleTags(string HTMLCode)

[ VB.NET ]
Private Sub findTextBetweenTitleTags(ByVal HTMLCode As String)

Data validation with Regex class

A very common use of regular expressions is data validation. ASP.NET provides the RegularExpressionValidator control, which is useful for validating user input on web forms. For other tasks we can use the Regex.IsMatch method. IsMatch doesn't provide match values like the Match or Matches methods. It just returns a true or false value that tells us whether the regular expression matched the given string. A .Net data validation code example that checks if the entered string is a valid e-mail address could look like this:

[ C# ]
/// <summary>

[ VB.NET ]
''' <summary>

There is a Data Validation online application where you can test your regular expression to see if data validation works well.

String replace with Regex

In a search-and-replace scenario, we need two expressions. The first expression is used to find what should be replaced, and the second expression is used to build the replacement strings. To relate groups in the first and second expression we'll use backreferences. Here is an example function that looks for valid URLs in the input text and converts them to clickable <a> tags, which is useful for forums, customer support applications etc.:

[ C# ]
private string getTextWithLinks(string InputText)

[ VB.NET ]
Private Function getTextWithLinks(ByVal InputText As String) As String

As you see in the second expression (string ATagExpression), backreferences can be referenced more than once. In this case the $1 backreference is used two times to create clickable links. Here is the Search and Replace online application to test expressions in your scenario.

Using Regex to split string

You can use the Regex.Split method to split a given string into a string array.
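A minimal sketch of Regex.Split used to break text into words could look like the following. The class name SplitSketch, the method GetWords and the sample strings are hypothetical, and the delimiter pattern \W+ (one or more non-word characters) is one reasonable choice rather than the only one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitSketch
{
    // Delimiter: one or more non-word characters
    // (spaces, commas, semicolons, new lines and so on).
    public static string[] GetWords(string plainText)
    {
        return Regex.Split(plainText, @"\W+");
    }

    public static void Main()
    {
        string[] words = GetWords("one, two;three\nfour");
        Console.WriteLine(string.Join("|", words)); // one|two|three|four

        // A delimiter at the end of the text produces a trailing empty string.
        string[] trailing = GetWords("one two.");
        Console.WriteLine(trailing.Length);         // 3
    }
}
```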
In this case, the regular expression defines the delimiter, so the RegEx engine returns the parts of the text that the regular expression didn't match. In theory, Regex.Split is something like an inverted Regex.Matches method. In practice, the choice depends on what is easier to define: the wanted or the unwanted data in the text. Use Regex.Match or Regex.Matches if it is easier to write a regular expression that matches the wanted data; otherwise use Regex.Split if it is simpler to write an expression that matches the unwanted data.

For example, let's say you want to split text into single words. The delimiter could be a single space or multiple spaces, but also a comma, semicolon, new line, etc., or even all of these together. Regular expressions offer a pretty short and simple solution for this problem:

[ C# ]
private string[] getWordsFromText(string PlainText)

[ VB.NET ]
Private Function getWordsFromText(ByVal PlainText As String) As String()

Notice that we could use data extraction with the Regex.Matches method to get the same results. In that case, the regular expression would define the wanted strings (whole words) instead of the unwanted strings (the delimiter) used in Regex.Split. Because of that, the regular expression would be different too. Instead of "\W+", the expression used with Regex.Matches would be "\w+". You can try the Split String web form to test if your regular expression defines the delimiter correctly.

Conclusion

Regular expressions are not the solution for every problem. In some cases, for example if you need to get data from an XML file, the .Net Framework offers specialized classes in the System.Xml namespace. Although you can use regular expressions to extract data from an XML file, that will probably take more effort than writing a simple XPath query. Also, if the problem is very simple and you can extract the data from text with one or two String.Substring calls, do that instead.

On the .Net Regular Expressions Syntax page you can find a summary of the rules used by the regular expression language in the .Net Framework.

Happy coding!