8 Regular Expressions You Should Know

2019獨角獸企業重金招聘Python工程師標準>>> hot3.png

Regular expressions are a language of their own. When you learn a new programming language, they're this little sub-language that makes no sense at first glance. Many times you have to read another tutorial, article, or book just to understand the "simple" pattern described. Today, we'll review eight regular expressions that you should know for your next coding project.

Background Info on Regular Expressions

This is what Wikipedia has to say about them:

In computing, regular expressions provide a concise and flexible means for identifying strings of text of interest, such as particular characters, words, or patterns of characters. Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor, a program that either serves as a parser generator or examines text and identifies parts that match the provided specification.

Now, that doesn't really tell me much about the actual patterns. The regexes I'll be going over today contains characters such as \w, \s, \1, and many others that represent something totally different from what they look like.

If you'd like to learn a little about regular expressions before you continue reading this article, I'd suggest watching the Regular Expressions for Dummies screencast series.

The eight regular expressions we'll be going over today will allow you to match a(n): username, password, email, hex value (like #fff or #000), slug, URL, IP address, and an HTML tag. As the list goes down, the regular expressions get more and more confusing. The pictures for each regex in the beginning are easy to follow, but the last four are more easily understood by reading the explanation.

The key thing to remember about regular expressions is that they are almost read forwards and backwards at the same time. This sentence will make more sense when we talk about matching HTML tags.

Note: The delimiters used in the regular expressions are forward slashes, "/". Each pattern begins and ends with a delimiter. If a forward slash appears in a regex, we must escape it with a backslash: "\/".

1. Matching a Username

Matching a username

Pattern:

Description:

We begin by telling the parser to find the beginning of the string (^), followed by any lowercase letter (a-z), number (0-9), an underscore, or a hyphen. Next, {3,16} makes sure that are at least 3 of those characters, but no more than 16. Finally, we want the end of the string ($).

String that matches:

my-us3r_n4m3

String that doesn't match:

th1s1s-wayt00_l0ngt0beausername (too long)

2. Matching a Password

Matching a password

Pattern:

Description:

Matching a password is very similar to matching a username. The only difference is that instead of 3 to 16 letters, numbers, underscores, or hyphens, we want 6 to 18 of them ({6,18}).

String that matches:

myp4ssw0rd

String that doesn't match:

mypa$$w0rd (contains a dollar sign)

3. Matching a Hex Value

Matching a hex valud

Pattern:

Description:

We begin by telling the parser to find the beginning of the string (^). Next, a number sign is optional because it is followed a question mark. The question mark tells the parser that the preceding character — in this case a number sign — is optional, but to be "greedy" and capture it if it's there. Next, inside the first group (first group of parentheses), we can have two different situations. The first is any lowercase letter between a and f or a number six times. The vertical bar tells us that we can also have three lowercase letters between a and f or numbers instead. Finally, we want the end of the string ($).

The reason that I put the six character before is that parser will capture a hex value like #ffffff. If I had reversed it so that the three characters came first, the parser would only pick up #fff and not the other three f's.

String that matches:

#a3c113

String that doesn't match:

#4d82h4 (contains the letter h)

4. Matching a Slug

Matching a slug

Pattern:

Description:

You will be using this regex if you ever have to work with mod_rewrite and pretty URL's. We begin by telling the parser to find the beginning of the string (^), followed by one or more (the plus sign) letters, numbers, or hyphens. Finally, we want the end of the string ($).

String that matches:

my-title-here

String that doesn't match:

my_title_here (contains underscores)

5. Matching an Email

Matching an email

Pattern:

Description:

We begin by telling the parser to find the beginning of the string (^). Inside the first group, we match one or more lowercase letters, numbers, underscores, dots, or hyphens. I have escaped the dot because a non-escaped dot means any character. Directly after that, there must be an at sign. Next is the domain name which must be: one or more lowercase letters, numbers, underscores, dots, or hyphens. Then another (escaped) dot, with the extension being two to six letters or dots. I have 2 to 6 because of the country specific TLD's (.ny.us or .co.uk). Finally, we want the end of the string ($).

String that matches:

john@doe.com

String that doesn't match:

john@doe.something (TLD is too long)

6. Matching a URL

Matching a url

Pattern:

Description:

This regex is almost like taking the ending part of the above regex, slapping it between "http://" and some file structure at the end. It sounds a lot simpler than it really is. To start off, we search for the beginning of the line with the caret.

The first capturing group is all option. It allows the URL to begin with "http://", "https://", or neither of them. I have a question mark after the s to allow URL's that have http or https. In order to make this entire group optional, I just added a question mark to the end of it.

Next is the domain name: one or more numbers, letters, dots, or hypens followed by another dot then two to six letters or dots. The following section is the optional files and directories. Inside the group, we want to match any number of forward slashes, letters, numbers, underscores, spaces, dots, or hyphens. Then we say that this group can be matched as many times as we want. Pretty much this allows multiple directories to be matched along with a file at the end. I have used the star instead of the question mark because the star says zero or more, not zero or one. If a question mark was to be used there, only one file/directory would be able to be matched.

Then a trailing slash is matched, but it can be optional. Finally we end with the end of the line.

String that matches:

http://net.tutsplus.com/about

String that doesn't match:

http://google.com/some/file!.html (contains an exclamation point)

7. Matching an IP Address

Matching an IP address

Pattern:

Description:

Now, I'm not going to lie, I didn't write this regex; I got it from here. Now, that doesn't mean that I can't rip it apart character for character.

The first capture group really isn't a captured group because

was placed inside which tells the parser to not capture this group (more on this in the last regex). We also want this non-captured group to be repeated three times — the {3} at the end of the group. This group contains another group, a subgroup, and a literal dot. The parser looks for a match in the subgroup then a dot to move on.

The subgroup is also another non-capture group. It's just a bunch of character sets (things inside brackets): the string "25" followed by a number between 0 and 5; or the string "2" and a number between 0 and 4 and any number; or an optional zero or one followed by two numbers, with the second being optional.

After we match three of those, it's onto the next non-capturing group. This one wants: the string "25" followed by a number between 0 and 5; or the string "2" with a number between 0 and 4 and another number at the end; or an optional zero or one followed by two numbers, with the second being optional.

We end this confusing regex with the end of the string.

String that matches:

73.60.124.136 (no, that is not my IP address :P)

String that doesn't match:

256.60.124.136 (the first group must be "25" and a number between zero and five)

Advertisement

8. Matching an HTML Tag

Matching an HTML tag

Pattern:

Description:

One of the more useful regexes on the list. It matches any HTML tag with the content inside. As usually, we begin with the start of the line.

First comes the tag's name. It must be one or more letters long. This is the first capture group, it comes in handy when we have to grab the closing tag. The next thing are the tag's attributes. This is any character but a greater than sign (>). Since this is optional, but I want to match more than one character, the star is used. The plus sign makes up the attribute and value, and the star says as many attributes as you want.

Next comes the third non-capture group. Inside, it will contain either a greater than sign, some content, and a closing tag; or some spaces, a forward slash, and a greater than sign. The first option looks for a greater than sign followed by any number of characters, and the closing tag. \1 is used which represents the content that was captured in the first capturing group. In this case it was the tag's name. Now, if that couldn't be matched we want to look for a self closing tag (like an img, br, or hr tag). This needs to have one or more spaces followed by "/>".

The regex is ended with the end of the line.

String that matches:

<a href="http://net.tutsplus.com/">Nettuts+</a>

String that doesn't match:

<img src="img.jpg" alt="My image>" /> (attributes can't contain greater than signs)

Conclusion

I hope that you have grasped the ideas behind regular expressions a little bit better. Hopefully you'll be using these regexes in future projects! Many times you won't need to decipher a regex character by character, but sometimes if you do this it helps you learn. Just remember, don't be afraid of regular expressions, they might not seem it, but they make your life a lot easier. Just try and pull out a tag's name from a string without regular expressions! ;)

  • Follow us on Twitter, or subscribe to the NETTUTS RSS Feed for more daily web development tuts and articles.


Advertisement

轉載于:https://my.oschina.net/u/2363463/blog/635777

本文來自互聯網用戶投稿,該文觀點僅代表作者本人,不代表本站立場。本站僅提供信息存儲空間服務,不擁有所有權,不承擔相關法律責任。
如若轉載,請注明出處:http://www.pswp.cn/news/459017.shtml
繁體地址,請注明出處:http://hk.pswp.cn/news/459017.shtml
英文地址,請注明出處:http://en.pswp.cn/news/459017.shtml

如若內容造成侵權/違法違規/事實不符,請聯系多彩編程網進行投訴反饋email:809451989@qq.com,一經查實,立即刪除!

相關文章

poj 3278 catch that cow BFS(基礎水)

Catch That CowTime Limit: 2000MS Memory Limit: 65536KTotal Submissions: 61826 Accepted: 19329Description Farmer John has been informed of the location of a fugitive cow and wants to catch her immediately. He starts at a point N (0 ≤ N ≤ 100,000) on a num…

制作已編譯的html幫助文件

http://www.cnblogs.com/cm186man/archive/2008/03/10/1098896.html引用 HTML幫助文檔從結構上來看可分為兩個部分&#xff0c;運行器和文檔內容。它的一個好處是能使幫助文檔跨平臺運行&#xff0c;只要有不同平臺上的運行器和瀏覽器&#xff0c;幫助文檔不再需要重新編制&…

matlab %3c handle,volume browser (updated).htm 源代碼在線查看 - Matlab顯式三維地震數據的源代碼 資源下載 蟲蟲電子下載站...

Comments: any comments on this error:??? Error using > timesIntegers can only be combined with integers of the same class, or scalar doubles.Error in > interp3>linear at 368 F (( arg4(ndx).*(1-t) arg4(ndx1).*t ).*(1-s) ...Error in > inter…

PHPer轉戰Android的學習過程以及Android學習

原文作者&#xff1a; eoeadmin原文地址&#xff1a; http://my.eoe.cn/shuhai/archive/19684.html--------------------------------------------這篇文章主要寫了一個PHP程序猿是如何轉戰學習Android的。 第一步&#xff1a;直接跨過java的學習&#xff0c;原因有我之前看過畢…

SQL中實現截取字符串的函數

SQL中實現截取字符串的函數 如果想實現從數據庫中取數據時截取一個字段下的內容或者截取一串字符串&#xff0c;則能夠實現這種效果的函數有Left&#xff0c;Right&#xff0c;SubString三個函數。1.Left函數&#xff1a;Left&#xff08;character_expression , integer_expre…

php時區設置問題,PHP 的時區設置問題_PHP教程

裝上PHP5后你會發現這樣的問題&#xff1a;你也許會發現&#xff0c;輸出的時間和你現在的時間是不相同的。原因是假如你不在程序或配置文件中設置你的服務器當地時區的話&#xff0c;PHP所取的時間是格林威治標準時間&#xff0c;所以和你當地的時間會有出入。格林威治標準時間…

HDU 5392 BC #51

就是求最大公倍數&#xff0c;但要用分解質因子求。 自己寫的WA到爆。。。。 #include<iostream> #include<stdio.h> #include<math.h>#include<algorithm>using namespace std;#define rd(x) scanf("%d",&x) #define rd2(x,y) scanf(&q…

下拉框——把一個select框中選中內容移到另一個select框中遇到的問題

在使用jQuery實現把一個select框中選中內容移到另一個select框中功能時遇到了一個問題&#xff0c;就是點擊按鈕時內容可以到另一個select框中&#xff0c;但是到了另一個select框中的內容卻很快閃退回原來的select框中&#xff0c;代碼如下&#xff1a; <select class"…

windows API 串口編程參考

*************************************************** 更多精彩&#xff0c;歡迎進入&#xff1a;http://shop115376623.taobao.com *************************************************** &#xff08;一&#xff09;Windows API串口通信編程概述 Windows環境下的串口編程與D…

jquery post php返回html,jquery ajax post 提交數據,返回的是當前網頁的html?

代碼如下用戶名:密 碼:$("#login_submit").bind("click",function(){var type "post";var url "index.php?madmin&cindex&acheckLogin";var formArrays $("#admin_form").serializeArray();var requestData …

[轉]Android Studio系列教程六--Gradle多渠道打包

轉自&#xff1a;http://www.stormzhang.com/devtools/2015/01/15/android-studio-tutorial6/ Android Studio系列教程六--Gradle多渠道打包 2015 年 01 月 15 日 devtools本文為個人原創&#xff0c;歡迎轉載&#xff0c;但請務必在明顯位置注明出處&#xff01; 由于國內Andr…

服務器上裝filezilla server后,本地的ftp客戶端連接不上去

公司一臺服務器&#xff0c;上面裝了filezilla server后&#xff0c;按平常配置好了&#xff0c;但是在本地用FTP客戶端不管怎么連接都連接不上&#xff0c;本地FTP客戶端總提示連接失敗&#xff0c;遠程filezilla server的界面也沒有提示有人連接&#xff0c; 仔細看了一下&am…

數據結構與算法之堆與堆排序

在數據結構中&#xff0c;堆其實就是一棵完全二叉樹。我們知道內存中也有一塊叫做堆的存儲區域&#xff0c;但是這與數據結構中的堆是完全不同的概念。在數據結構中&#xff0c;堆分為大根堆和小根堆&#xff0c;大根堆就是根結點的關鍵字大于等于任一個子節點的關鍵字&#xf…

非法操作 login.php,閱文游戲中心 h5游戲接入wiki

閱文游戲中心《h5游戲 CP接口規范》接口要求規范游戲方接口說明&#xff1a;游戲方需按照規范提供&#xff0c;閱文進行調用閱文接口說明&#xff1a;閱文提供&#xff0c;游戲方調用參數 time 為Unix 時間戳(January 1 1970 00:00:00 GMT 起的秒數) &#xff0c;單位為秒編碼統…

串口通信與編程:串口基礎知識

*************************************************** 更多精彩&#xff0c;歡迎進入&#xff1a;http://shop115376623.taobao.com *************************************************** 串口是串行接口&#xff08;serial port&#xff09;的簡稱&#xff0c;也稱為串行通信…

jmeter上傳文件搞了一天,才搞定,沒高人幫忙效率就是低,趕緊記下來,以備后用...

jmeter上傳文件搞了一天&#xff0c;才搞定&#xff0c;沒高人幫忙效率就是低&#xff0c;趕緊記下來&#xff0c;以備后用 先用谷歌瀏覽器抓包&#xff0c;抓到的包類似這樣&#xff1a; 在jmeter里添加一個http請求&#xff0c;配置好參數&#xff0c;方法&#xff0c;端口&a…

自定義dialog

2019獨角獸企業重金招聘Python工程師標準>>> R.layout.layout_insert_dialog自定義布局 View mViewLayoutInflater.from(MainActivity.this).inflate(R.layout.layout_insert_dialog, null); AlertDialog.Builder dialognew AlertDialog.Builder (MainActivity.this…

js unescape 對應php的函數,php實現Javascript的escape和unescape函數

由于需要用到php調用js文件&#xff0c;在網上找了相關的資料&#xff0c;并改寫了相關的方法。php實現 Javascript的escape函數方法&#xff1a;function escape($str) {preg_match_all("/[/xc2-/xdf][/x80-/xbf]|[/xe0-/xef][/x80-/xbf]{2}|[/xf0-/xff][/x80-/xbf]{3}|[…

字符數組,字符串、數字轉化

<p style"margin-top: 5px; margin-bottom: 5px; padding-top: 0px; padding-bottom: 0px; line-height: 26px; word-wrap: break-word; color: rgb(102, 102, 102); font-family: 宋體, Arial; font-size: 16px;">//****************************************…

PE文件RV轉FOA及FOA轉RVA

/************************************************************************/ /* 功能:虛擬內存相對地址和文件偏移的轉換 參數&#xff1a;stRVA&#xff1a; 虛擬內存相對偏移地址 lpFileBuf: 文件起始地址 返回&#xff1a;轉換后的文件偏移地址 */ /*****************…