Sei sulla pagina 1di 8

10 Java Regular Expression Examples You Should Know

Published: November 18, 2009 , Updated: January 23, 2010 , Author: mkyong

Regular expression is an art of the programing, its hard to debug , learn and understand, but the powerful features are still attract many developers to code regular expression. Lets explore the following 10 practical regular expression ~ enjoy :)

1. Username Regular Expression Pattern


^[a-z0-9_-]{3,15}$ ^ [a-z0-9_-] underscore , hyphen {3,15} $ # Start of the line # Match characters and symbols in the list, a-z, 0-9 , # Length at least 3 characters and maximum length of 15 # End of the line

==> See the explanation and example here

2. Password Regular Expression Pattern


((?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%]).{6,20}) ( (?=.*\d) (?=.*[a-z]) (?=.*[A-Z]) (?=.*[@#$%]) . {6,20} ) # Start of group # must contains one digit from 0-9 # must contains one lowercase characters # must contains one uppercase characters # must contains one special symbols in the list "@#$%" # match anything with previous condition checking # length at least 6 characters and maximum of 20 # End of group

==> See the explanation and example here

3. Hexadecimal Color Code Regular Expression Pattern


^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$ ^ #start of the line # # must constains a "#" symbols ( # start of group #1 [A-Fa-f0-9]{6} # any strings in the list, with length of 6 | # ..or [A-Fa-f0-9]{3} # any strings in the list, with length of 3 ) # end of group #1 $ #end of the line

==> See the explanation and example here

4. Email Regular Expression Pattern


^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+ (\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$ ^ #start of the line [_A-Za-z0-9-]+ # must start with string in the bracket [ ], must contains one or more (+) ( # start of group #1 \\.[_A-Za-z0-9-]+ # follow by a dot "." and string in the bracket [ ], must contains one or more (+) )* # end of group #1, this group is optional (*) @ # must contains a "@" symbol [A-Za-z0-9]+ # follow by string in the bracket [ ], must contains one or more (+) ( # start of group #2 - first level TLD checking \\.[A-Za-z0-9]+ # follow by a dot "." and string in the bracket [ ], must contains one or more (+) )* # end of group #2, this group is optional (*) ( # start of group #3 - second level TLD checking \\.[A-Za-z]{2,} # follow by a dot "." and string in the bracket [ ], with minimum length of 2 ) # end of group #3 $ #end of the line

==> See the explanation and example here

5. Image File Extension Regular Expression Pattern


([^\s]+(\.(?i)(jpg|png|gif|bmp))$) ( [^\s]+ ( \. (?i) ( jpg | png | gif | bmp ) ) $ ) # # #Start of the group #1 # must contains one or more anything (except white space) start of the group #2 # follow by a dot "." # ignore the case sensitive checking # start of the group #3 # contains characters "jpg" # ..or # contains characters "png" # ..or # contains characters "gif" # ..or # contains characters "bmp" # end of the group #3 end of the group #2 # end of the string #end of the group #1

==> See the explanation and example here

6. IP Address Regular Expression Pattern


^([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\. ([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])$ ^ #start of the line # start of group #1 [01]?\\d\\d? # Can be one or two digits. If three digits appear, it must start either 0 or 1 # e.g ([0-9], [0-9][0-9],[0-1][0-9][0-9]) | # ...or 2[0-4]\\d # start with 2, follow by 0-4 and end with any digit (2[0-4][0-9]) | # ...or 25[0-5] # start with 2, follow by 5 and end with 0-5 (25[0-5]) ) # end of group #2 \. # follow by a dot "." .... # repeat with 3 time (3x) $ #end of the line (

==> See the explanation and example here

7. Time Format Regular Expression Pattern Time in 12-Hour Format Regular Expression Pattern
(1[012]|[1-9]):[0-5][0-9](\\s)?(?i)(am|pm) ( 1[012] | [1-9] ) : [0-5][0-9] (\\s)? (?i) (am|pm) #start of group #1 # start with 10, 11, 12 # or # start with 1,2,...9 #end of group #1 # follow by a semi colon (:) # follow by 0..5 and 0..9, which means 00 to 59 # follow by a white space (optional) # next checking is case insensitive # follow by am or pm

==> See the explanation and example here

Time in 24-Hour Format Regular Expression Pattern


([01]?[0-9]|2[0-3]):[0-5][0-9] ( [01]?[0-9] | 2[0-3] ) : [0-5][0-9] #start of group #1 # start with 0-9,1-9,00-09,10-19 # or # start with 20-23 #end of group #1 # follow by a semi colon (:) # follow by 0..5 and 0..9, which means 00 to 59

==> See the explanation and example here

8. Date Format (dd/mm/yyyy) Regular Expression Pattern


(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[012])/((19|20)\\d\\d) ( #start of group #1 01-09 or 1-9 # ..or # 10-19 or 20-29 # ..or # 30, 31 ) #end of group #1 / # follow by a "/" ( # start of group #2 0?[1-9] # 01-09 or 1-9 | # ..or 1[012] # 10,11,12 ) # end of group #2 / # follow by a "/" ( # start of group #3 (19|20)\\d\\d # 19[0-9][0-9] or 20[0-9][0-9] ) # end of group #3 0?[1-9] | [12][0-9] | 3[01] #

==> See the explanation and example here

9. HTML tag Regular Expression Pattern


<("[^"]*"|'[^']*'|[^'">])*> < ( "[^"]*" | '[^']*' | [^'">] ) * > #start with opening tag "<" # start of group #1 # only two double quotes are allow - "string" # ..or # only two single quotes are allow - 'string' # ..or # cant contains one single quotes, double quotes and ">" # end of group #1 # 0 or more #end with closing tag ">"

==> See the explanation and example here

10. HTML links Regular Expression Pattern HTML A tag Regular Expression Pattern
(?i)<a([^>]+)>(.+?)</a> ( #start of group #1

?i ) <a ( [^>]+ ) > (.+?) </a>

# all checking are case insensive #end of group #1 #start with "<a" # start of group #2 # anything except (">"), at least one character # end of group #2 # follow by ">" # match anything # end with "</a>

Extract HTML link Regular Expression Pattern


\s*(?i)href\s*=\s*(\"([^"]*\")|'[^']*'|([^'">\s]+)); \s* (?i) href \s*=\s* ( "([^"]*") | # '[^']*' | ([^'">]+) ) #can start with whitespace # all checking are case insensive # follow by "href" word # allows spaces on either side of the equal sign, # start of group #1 # only two double quotes are allow - "string" ..or # only two single quotes are allow - 'string' # ..or # cant contains one single / double quotes and ">" # end of group #1

==> See the explanation and example here

Here is a simple Java Link extractor to extract the href value from 1st pattern, and use 2nd pattern to extract the link from 1st pattern value. Of course with some logic as below.

Java Regular Expression Example


package com.mkyong.regex; import java.util.Vector; import java.util.regex.Matcher; import java.util.regex.Pattern; public class HTMLLinkExtrator{ private Pattern patternTag, patternLink; private Matcher matcherTag, matcherLink; private static final String HTML_A_TAG_PATTERN = "(?i)<a([^>]+)>(.+?)</a>"; private static final String HTML_A_HREF_TAG_PATTERN =

"\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))"; public HTMLLinkExtrator(){ patternTag = Pattern.compile(HTML_A_TAG_PATTERN); patternLink = Pattern.compile(HTML_A_HREF_TAG_PATTERN); } /** * Validate html with regular expression * @param html html content for validation * @return Vector links and link text */ public Vector<HtmlLink> grabHTMLLinks(final String html){ Vector<HtmlLink> result = new Vector<HtmlLink>(); matcherTag = patternTag.matcher(html); while(matcherTag.find()){ String href = matcherTag.group(1); //href String linkText = matcherTag.group(2); //link text matcherLink = patternLink.matcher(href); while(matcherLink.find()){ String link = matcherLink.group(1); //link result.add(new HtmlLink(link, linkText)); } } return result; } class HtmlLink { String link; String linkText; HtmlLink(String link, String linkText){ this.link = link; this.linkText = linkText; } @Override public String toString() { return new StringBuffer("Link : ") .append(this.link) .append(" Link Text : ") .append(this.linkText).toString(); } } }

Unit Test HTMLLinkExtratorTest

package com.mkyong.regex; import java.util.Vector; import org.testng.Assert; import org.testng.annotations.*; import com.mkyong.regex.HTMLLinkExtrator.HtmlLink; /** * HTML link extrator Testing * @author mkyong * */ public class HTMLLinkExtratorTest { private HTMLLinkExtrator htmlLinkExtrator; @BeforeClass public void initData(){ htmlLinkExtrator = new HTMLLinkExtrator(); } @DataProvider public Object[][] HTMLContentProvider() { return new Object[][]{ new Object[] {"abc hahaha <a href='http://www.google.com'>google</a>"}, new Object[] {"abc hahaha <a HREF='http://www.google.com'>google</a>"}, new Object[] {"abc hahaha <A HREF='http://www.google.com'>google</A> , " + "abc hahaha <A HREF='http://www.google.com' target='_blank'>google</A>"}, new Object[] {"abc hahaha <A HREF='http://www.google.com' target='_blank'>google</A>"}, new Object[] {"abc hahaha <A target='_blank' HREF='http://www.google.com'>google</A>"}, new Object[] {"abc hahaha <a HREF=http://www.google.com>google</a>"}, }; } @Test(dataProvider = "HTMLContentProvider") public void ValidHTMLLinkTest(String html) { Vector<HtmlLink> links = htmlLinkExtrator.grabHTMLLinks(html); Assert.assertTrue(links.size()!=0); for(int i=0; i<links.size() ; i++){ HtmlLink htmlLinks = links.get(i); System.out.println(htmlLinks); } } }

Unit Test Result


[Parser] Running: E:\workspace\mkyong\temp-testng-customsuite.xml Link : 'http://www.google.com' Link Text : google Link : 'http://www.google.com' Link Text : google

Link : 'http://www.google.com' Link Text : google Link : 'http://www.google.com' Link Text : google Link : 'http://www.google.com' Link Text : google Link : 'http://www.google.com' Link Text : google Link : http://www.google.com Link Text : google PASSED: ValidHTMLLinkTest("abc hahaha <a href='http://www.google.com'>google</a>") PASSED: ValidHTMLLinkTest("abc hahaha <a HREF='http://www.google.com'>google</a>") PASSED: ValidHTMLLinkTest("abc hahaha <A HREF='http://www.google.com'>google</A> , abc hahaha <A HREF='http://www.google.com' target='_blank'>google</A>") PASSED: ValidHTMLLinkTest("abc hahaha <A HREF='http://www.google.com' target='_blank'>google</A>") PASSED: ValidHTMLLinkTest("abc hahaha <A target='_blank' HREF='http://www.google.com'>google</A>") PASSED: ValidHTMLLinkTest("abc hahaha <a HREF=http://www.google.com>google</a>") =============================================== com.mkyong.regex.HTMLLinkExtratorTest Tests run: 6, Failures: 0, Skips: 0 =============================================== =============================================== mkyong Total tests run: 6, Failures: 0, Skips: 0 ===============================================

Potrebbero piacerti anche