<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A good url regular expression? (repost)</title>
	<atom:link href="http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/feed/" rel="self" type="application/rss+xml" />
	<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/</link>
	<description>thoughts.each { &#38;:propagandise }</description>
	<lastBuildDate>Sun, 14 Mar 2010 17:21:13 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Alexey Novikov</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-906</link>
		<dc:creator>Alexey Novikov</dc:creator>
		<pubDate>Sun, 14 Mar 2010 17:21:13 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-906</guid>
		<description>By the way, there can be slashes after the anchor symbol (octothorpe, hash, or sharp), for example in some Gmail links. So if we do not want to mark such URLs as invalid, we should include the slash in the last part of this regexp:

&lt;code&gt;
(?#Anchor)(?:#(?:[-\w~!$+&#124;/.,*:;=]&#124;%[a-f\d]{2})*)?$
&lt;/code&gt;</description>
		<content:encoded><![CDATA[<p>By the way, there can be slashes after the anchor symbol (octothorpe, hash, or sharp), for example in some Gmail links. So if we do not want to mark such URLs as invalid, we should include the slash in the last part of this regexp:</p>
<p><code><br />
(?#Anchor)(?:#(?:[-\w~!$+|/.,*:;=]|%[a-f\d]{2})*)?$<br />
</code></p>
]]></content:encoded>
	</item>
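<!--
A minimal Python sketch of the anchor tweak suggested in comment-906, assuming each space inside the feed's copy of the pattern was originally a "+". The names FRAGMENT, FRAGMENT_NO_SLASH and fragment_ok, the IGNORECASE flag, and the sample URL are illustrative assumptions, not part of the original comment.

```python
import re

# Anchor part of the pattern with "/" added to the character class.
# IGNORECASE is an assumption so that %2F and %2f are both accepted.
FRAGMENT = re.compile(r"#(?:[-\w~!$+|/.,*:;=]|%[a-f\d]{2})*", re.IGNORECASE)

# The same class without "/" for comparison.
FRAGMENT_NO_SLASH = re.compile(r"#(?:[-\w~!$+|.,*:;=]|%[a-f\d]{2})*", re.IGNORECASE)


def fragment_ok(url, pattern=FRAGMENT):
    """Return True if everything from the first '#' to the end matches pattern."""
    hash_at = url.find("#")
    if hash_at == -1:
        return True  # no fragment at all: nothing to reject
    return pattern.fullmatch(url[hash_at:]) is not None


sample = "https://mail.example.com/mail/#inbox/12ab"  # hypothetical Gmail-style link
print(fragment_ok(sample))                     # accepted once "/" is in the class
print(fragment_ok(sample, FRAGMENT_NO_SLASH))  # rejected without "/"
```

Checking only the fragment keeps the example short; in the full expression the same character class simply replaces the one in the last group.
-->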
	<item>
		<title>By: Alexey Novikov</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-905</link>
		<dc:creator>Alexey Novikov</dc:creator>
		<pubDate>Sun, 14 Mar 2010 10:09:25 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-905</guid>
		<description>I&#039;ve enhanced it. It now works with IP addresses too,
even if you write the octets with leading zeros, e.g. 1.01.001.000
&lt;code&gt;
^(?#Protocol)(?:(?:ht&#124;f)tp(?:s?)\:\/\/&#124;~\/&#124;\/)?(?#Username:Password)(?:\w+:\w+@)?((?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com&#124;org&#124;net&#124;gov&#124;mil&#124;biz&#124;info&#124;mobi&#124;name&#124;aero&#124;jobs&#124;museum&#124;travel&#124;[a-z]{2}))&#124;(?#IP)((\b25[0-5]\b&#124;\b[2][0-4][0-9]\b&#124;\b[0-1]?[0-9]?[0-9]\b)(\.(\b25[0-5]\b&#124;\b[2][0-4][0-9]\b&#124;\b[0-1]?[0-9]?[0-9]\b)){3}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+&#124;.,=]&#124;%[a-f\d]{2})+)+&#124;\/)+&#124;\?&#124;#)?(?#Query)(?:(?:\?(?:[-\w~!$+&#124;.,*:]&#124;%[a-f\d]{2})+=?(?:[-\w~!$+&#124;.,*:;=]&#124;%[a-f\d]{2})*)(?:&amp;(?:[-\w~!$+&#124;.,*:]&#124;%[a-f\d]{2})+=?(?:[-\w~!$+&#124;.,*:;=]&#124;%[a-f\d]{2})*)*)*(?#Anchor)(?:#(?:[-\w~!$+&#124;.,*:;=]&#124;%[a-f\d]{2})*)?$
&lt;/code&gt;</description>
		<content:encoded><![CDATA[<p>I&#8217;ve enhanced it. It now works with IP addresses too,<br />
even if you write the octets with leading zeros, e.g. 1.01.001.000<br />
<code><br />
^(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?((?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))|(?#IP)((\b25[0-5]\b|\b[2][0-4][0-9]\b|\b[0-1]?[0-9]?[0-9]\b)(\.(\b25[0-5]\b|\b[2][0-4][0-9]\b|\b[0-1]?[0-9]?[0-9]\b)){3}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d]{2})+=?(?:[-\w~!$+|.,*:;=]|%[a-f\d]{2})*)(?:&amp;(?:[-\w~!$+|.,*:]|%[a-f\d]{2})+=?(?:[-\w~!$+|.,*:;=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:#(?:[-\w~!$+|.,*:;=]|%[a-f\d]{2})*)?$<br />
</code></p>
]]></content:encoded>
	</item>
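<!--
A small Python sketch of the IP branch described in comment-905, assuming it is checked on its own with fullmatch instead of the \b word boundaries used in the long expression. The names OCTET, IPV4 and is_ipv4 are illustrative, not from the comment.

```python
import re

# One octet: 250 to 255, 200 to 249, or one to three digits starting with 0 or 1.
# Leading zeros such as "01" and "001" are accepted on purpose.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])"

# Four octets separated by dots.
IPV4 = re.compile(r"{0}(?:\.{0}){{3}}".format(OCTET))


def is_ipv4(text):
    """Return True if text is a dotted quad with every octet in 0..255."""
    return IPV4.fullmatch(text) is not None


print(is_ipv4("1.01.001.000"))  # True: leading zeros are tolerated
print(is_ipv4("192.168.0.1"))   # True
print(is_ipv4("256.1.1.1"))     # False: 256 is out of range
```

Whether leading zeros should be accepted is a design choice; some tools read a leading zero as octal, so a stricter validator might reject them.
-->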
	<item>
		<title>By: Mina</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-897</link>
		<dc:creator>Mina</dc:creator>
		<pubDate>Tue, 23 Feb 2010 07:45:46 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-897</guid>
		<description>Thanks a lot for this wonderful article</description>
		<content:encoded><![CDATA[<p>Thanks a lot for this wonderful article</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: zerkms</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-896</link>
		<dc:creator>zerkms</dc:creator>
		<pubDate>Tue, 16 Feb 2010 00:42:29 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-896</guid>
		<description>In my last comment I meant &quot;&amp; amp;&quot; (without the space)</description>
		<content:encoded><![CDATA[<p>In my last comment I meant &#8220;&amp; amp;&#8221; (without the space)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: zerkms</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-895</link>
		<dc:creator>zerkms</dc:creator>
		<pubDate>Tue, 16 Feb 2010 00:40:11 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-895</guid>
		<description>What about &quot;&amp;&quot; in URLs?</description>
		<content:encoded><![CDATA[<p>What about &#8220;&amp;&#8221; in URLs?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rene</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-894</link>
		<dc:creator>Rene</dc:creator>
		<pubDate>Mon, 15 Feb 2010 19:12:15 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-894</guid>
		<description>I tried it with PHP and it works flawlessly. I tried about 8 different ones from regexlibrary.com and this one beat them all. I&#039;m just wondering if it will work with JS; I&#039;ll keep you guys posted...</description>
		<content:encoded><![CDATA[<p>I tried it with PHP and it works flawlessly. I tried about 8 different ones from regexlibrary.com and this one beat them all. I&#8217;m just wondering if it will work with JS; I&#8217;ll keep you guys posted&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: matapult</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-892</link>
		<dc:creator>matapult</dc:creator>
		<pubDate>Mon, 08 Feb 2010 10:55:37 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-892</guid>
		<description>Nope, didn&#039;t work for me. I tried this ultra-basic implementation in python


---------------------------------------------------------------
#!/usr/bin/env python

import re
line =&quot;this is an example of a url that someone might write in some text http://www.google.com test test test&quot;
urls = []

urls.append( re.findall( r&quot;^(?#Protocol)(?:(?:ht&#124;f)tp(?:s?)\:\/\/&#124;~\/&#124;\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com&#124;org&#124;net&#124;gov&#124;mil&#124;biz&#124;info&#124;mobi&#124;name&#124;aero&#124;jobs&#124;museum&#124;travel&#124;[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+&#124;.,=]&#124;%[a-f\d]{2})+)+&#124;\/)+&#124;\?&#124;#)?(?#Query)(?:(?:\?(?:[-\w~!$+&#124;.,*:]&#124;%[a-f\d]{2})+=?(?:[-\w~!$+&#124;.,*:=]&#124;%[a-f\d]{2})*)(?:&amp;(?:[-\w~!$+&#124;.,*:]&#124;%[a-f\d]{2})+=?(?:[-\w~!$+&#124;.,*:=]&#124;%[a-f\d]{2})*)*)*(?#Anchor)(?:#(?:[-\w~!$+&#124;.,*:=]&#124;%[a-f\d]{2})*)?$&quot;, line ) )

print urls
---------------------------------------------------

and it didn&#039;t find anything, not even the basic google.com :(</description>
		<content:encoded><![CDATA[<p>Nope, didn&#8217;t work for me. I tried this ultra-basic implementation in python</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<br />
#!/usr/bin/env python</p>
<p>import re<br />
line =&#8221;this is an example of a url that someone might write in some text <a href="http://www.google.com" rel="nofollow">http://www.google.com</a> test test test&#8221;<br />
urls = []</p>
<p>urls.append( re.findall( r&#8221;^(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d]{2})+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&amp;(?:[-\w~!$+|.,*:]|%[a-f\d]{2})+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?$&#8221;, line ) )</p>
<p>print urls<br />
&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
<p>and it didn&#8217;t find anything, not even the basic google.com <img src='http://flanders.co.nz/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
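<!--
One likely reason the script in comment-892 finds nothing: the expression starts with ^ and ends with $, so it can only match when the whole string is a URL, not when a URL sits inside a sentence. The sketch below shows the difference in Python. URL_BODY is a deliberately simplified stand-in and an assumption, not the long expression from the post.

```python
import re

line = ("this is an example of a url that someone might write in some text "
        "http://www.google.com test test test")

# Simplified stand-in for the long pattern (an assumption, not the original regex).
URL_BODY = r"(?:(?:ht|f)tps?://)(?:[-\w]+\.)+[a-z]{2,}(?::\d{1,5})?(?:/[-\w~!$+|.,=%/]*)?"

anchored = re.compile("^" + URL_BODY + "$")  # validates a whole string
unanchored = re.compile(URL_BODY)            # extracts URLs from running text

print(anchored.findall(line))                     # []
print(unanchored.findall(line))                   # ['http://www.google.com']
print(anchored.findall("http://www.google.com"))  # ['http://www.google.com']
```

The anchored form suits validating a single form field; for pulling URLs out of free text the anchors have to go, and the pattern then usually needs to be stricter about where a match may start and end.
-->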
	<item>
		<title>By: J. Benitez</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-891</link>
		<dc:creator>J. Benitez</dc:creator>
		<pubDate>Sun, 07 Feb 2010 04:16:46 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-891</guid>
		<description>There&#039;s no browser that can handle it</description>
		<content:encoded><![CDATA[<p>There&#8217;s no browser that can handle it</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: petrelevich</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-890</link>
		<dc:creator>petrelevich</dc:creator>
		<pubDate>Fri, 05 Feb 2010 20:08:00 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-890</guid>
		<description>What would you say about this?

^(?:(?:(?:http[s]?):\/\/)&#124;(?:www.))(?:[-_0-9a-z]+.)+[-_0-9a-z]{2,4}[:0-9]*[\/]*$</description>
		<content:encoded><![CDATA[<p>What would you say about this?</p>
<p>^(?:(?:(?:http[s]?):\/\/)|(?:www.))(?:[-_0-9a-z]+.)+[-_0-9a-z]{2,4}[:0-9]*[\/]*$</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: smb</title>
		<link>http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/comment-page-1/#comment-888</link>
		<dc:creator>smb</dc:creator>
		<pubDate>Thu, 07 Jan 2010 16:37:21 +0000</pubDate>
		<guid isPermaLink="false">http://flanders.co.nz/?p=366#comment-888</guid>
		<description>Don&#039;t worry about Russian and Asian characters. Before you validate a URL, you have to convert it to Punycode, which only uses A-Z, 0-9 and -. So “中国” never appears in the URL you validate.</description>
		<content:encoded><![CDATA[<p>Don&#8217;t worry about Russian and Asian characters. Before you validate a URL, you have to convert it to Punycode, which only uses A-Z, 0-9 and -. So “中国” never appears in the URL you validate.</p>
]]></content:encoded>
	</item>
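<!--
A rough Python sketch of the "convert first, validate second" idea from comment-888, assuming only the host name is converted and that Python's built-in "idna" codec is an acceptable converter. The names ASCII_HOST and host_to_ascii and the sample host are hypothetical.

```python
import re

# ASCII-only host check: letters, digits and hyphens in dot-separated labels.
ASCII_HOST = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z0-9-]{2,}$", re.IGNORECASE)


def host_to_ascii(host):
    """Convert a possibly internationalized host name to its ASCII form."""
    # The built-in "idna" codec converts each non-ASCII label to Punycode.
    return host.encode("idna").decode("ascii")


host = "中国.example"  # hypothetical host name
ascii_host = host_to_ascii(host)
print(ASCII_HOST.match(host) is not None)        # False: raw form has non-ASCII characters
print(ASCII_HOST.match(ascii_host) is not None)  # True: converted form is plain ASCII
```

The path and query parts of a URL are handled differently (percent-encoding), so this covers the host name only.
-->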
</channel>
</rss>
