Simple Way to Validate Links

Validate URLs before you accept them in your scripts.

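<?php
// Returns true when $link starts with http://, false otherwise.
function validLink($link) {
    // ^ anchors the match to the start of the string; \/ escapes each
    // forward slash because / is also the pattern delimiter.
    if(preg_match("/^http:\/\//", $link)) {
        return true;
    } else {
        return false;
    }
}
?>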

This simple little function checks a link for the http:// prefix. It could be taken a step further and checked against a list of accepted protocols (http, ftp, feed, etc.).
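As a rough sketch of that "step further", the function below accepts any scheme from a list. The name validLinkScheme() and the default list of schemes are assumptions made for illustration; they are not part of the original function.

```php
<?php
// Hypothetical extension of validLink(): the function name and the default
// scheme list are assumptions, not part of the original tutorial.
function validLinkScheme($link, array $schemes = array('http', 'https', 'ftp', 'feed')) {
    // Quote each scheme so it is safe inside the pattern, then join with |.
    $quoted = array();
    foreach ($schemes as $scheme) {
        $quoted[] = preg_quote($scheme, '/');
    }
    $pattern = '/^(' . implode('|', $quoted) . '):\/\//i';

    // preg_match() returns 1 on a match, 0 on no match and false on error.
    return preg_match($pattern, $link) === 1;
}

var_dump(validLinkScheme('ftp://example.com/file.txt'));   // bool(true)
var_dump(validLinkScheme('mailto:someone@example.com'));   // bool(false)
```

Passing the list as a parameter keeps the pattern in one place, so a script that should only accept http and https can call validLinkScheme($link, array('http', 'https')).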

Most of the magic is done in the preg_match function. We check for http://, but because / is also the pattern delimiter here, each forward slash inside the pattern must be escaped with a backslash (\/).
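The escaping is only needed because / is doing double duty as the delimiter. Here is a small sketch of two ways around it: picking a different delimiter, or letting preg_quote() do the escaping. The function name validLinkAltDelimiter() is made up for this example, and the pattern is anchored with ^ so the prefix has to come first.

```php
<?php
// Same idea as validLink(), written with # as the pattern delimiter so the
// forward slashes need no escaping. The function name is hypothetical.
function validLinkAltDelimiter($link) {
    return preg_match('#^http://#', $link) === 1;
}

// preg_quote() escapes a literal string for use inside a pattern; the second
// argument names the delimiter so that character gets escaped as well.
$pattern = '/^' . preg_quote('http://', '/') . '/';

var_dump(validLinkAltDelimiter('http://example.com'));        // bool(true)
var_dump(preg_match($pattern, 'http://example.com') === 1);   // bool(true)
```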

That's all there is to this function. If you have ideas or possible expansions, let me know; maybe they could be added to this tutorial.

Comments: Simple Way to Validate Links

 entity_azirius  Sat Jan 6, 2007 10:55 pm  
Pretty cool, the preg_match() function is pretty wicked! Is there a way to make sure it has a .com/.co.uk/etc?
 Jhecht  Sun Jun 10, 2007 5:47 pm  
Something like this usually is what people do(if it doesn't work, keep in mind i just wrote this RIGHT now)

$preg = "/http:\/\/([a-z0-9]{2,}\-?\.[a-z]{2,3})(\.?[a-z]{2,3})?/i";

I haven't tested that... Maybe Matt knows a better one (considering I'm not too good at regex, it takes me a while to get it right).
 fqa@vardagsrummet.mine.nu  Sat Dec 1, 2007 12:49 am  
how about this i made yesterday;

THE CODE - one liner
[PHP]$urlregex = "^(https?|ftp)\:\/\/([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?\$";
if (eregi($urlregex, $url)) {echo "good";} else {echo "bad";}[/PHP]

(OPTIONAL: READ BELOW FOR EXPLANATION)

it will validate all these types of urls
[PHP]
// valid urls
$url = "https://user:pass@www.somewhere.com:8080/login.php?do=login&style=%23#pagetop";
$url = "http://user@www.somewhere.com/#pagetop";
$url = "https://somewhere.com/index.html";
$url = "ftp://user:****@somewhere.com:21/";
$url = "http://somewhere.com/index.html/"; //this is valid!!
[/PHP]

THE CODE - broken into section for easy editing and understanding:
[PHP]
// SCHEME
$urlregex = "^(https?|ftp)\:\/\/";

// USER AND PASS (optional)
$urlregex .= "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?";

// HOSTNAME OR IP
$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*"; // http://x = allowed (ex. http://localhost, http://routerlogin)
//$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)+"; // http://x.x = minimum
//$urlregex .= "([a-z0-9+\$_-]+\.)*[a-z0-9+\$_-]{2,3}"; // http://x.xx(x) = minimum
//use only one of the above

// PORT (optional)
$urlregex .= "(\:[0-9]{2,5})?";
// PATH (optional)
$urlregex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";
// GET Query (optional)
$urlregex .= "(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?";
// ANCHOR (optional)
$urlregex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?\$";

// check
if (eregi($urlregex, $url)) {echo "good";} else {echo "bad";}

[/PHP]

all the lines in the code above can be safely removed (except for hostname) if you don't want to allow some URL segment (if you don't want getqueries in your urls, just comment the respective $urlregex .= ....) - but do not reorder them.
the "(optional)" states that the part MAY exist, but url will be valid even if it doesn't contain the part (see the valid urls above).

syntax:
[code]scheme :// [user[:pass]@] hostname [port] [/path] [?getquery] [anchor][/code]
-taking into account allowed safe characters
-assuming .. (dot dot) is never allowed in hostname or path
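
To see how the segments in that syntax line map onto a real URL, here is a small sketch using PHP's built-in parse_url(). It is not part of the regex above, and it only splits a URL into parts; it does not validate anything. The sample URL is the first one from the valid list.

```php
<?php
// parse_url() is a PHP built-in; the sample URL is one of the valid URLs above.
$url   = 'https://user:pass@www.somewhere.com:8080/login.php?do=login&style=%23#pagetop';
$parts = parse_url($url);

// Keys follow PHP's naming: scheme, user, pass, host, port, path, query, fragment.
// "query" corresponds to the getquery segment and "fragment" to the anchor.
foreach (array('scheme', 'user', 'pass', 'host', 'port', 'path', 'query', 'fragment') as $key) {
    echo str_pad($key, 9), isset($parts[$key]) ? $parts[$key] : '(not present)', "\n";
}
```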

FEEDBACK IS APPRECIATED
 Eric  Wed Mar 19, 2008 2:00 pm  
Quite nice, i'd needed a good way to validate URL's! the last comment is very good! tnx!
 Toby Wallis  Fri Mar 28, 2008 7:37 am  
Nice... does it handle non-tlds like .co.uk, .net.uk, .cn etc.?

===Toby===
 mkeefe  Fri Mar 28, 2008 1:44 pm  
The tld can be anything given this example, however I recommend something more advanced in order to truly be secure.
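
One possible "more advanced" route, offered only as a sketch, is to lean on PHP's built-in filter_var() with FILTER_VALIDATE_URL and then restrict the scheme yourself. The helper name isAcceptableUrl() and the allowed-scheme list are assumptions for illustration, and a syntax check like this is no substitute for proper output escaping.

```php
<?php
// Hypothetical helper; the name and the allowed-scheme list are assumptions.
function isAcceptableUrl($url, array $allowed = array('http', 'https')) {
    // FILTER_VALIDATE_URL returns the URL on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // The filter accepts schemes other than http, so restrict it explicitly.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme) && in_array(strtolower($scheme), $allowed, true);
}

var_dump(isAcceptableUrl('https://somewhere.com/index.html'));   // bool(true)
var_dump(isAcceptableUrl('ftp://somewhere.com/'));               // bool(false)
var_dump(isAcceptableUrl('somewhere.com'));                      // bool(false)
```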
 Sam  Tue Apr 14, 2009 1:08 pm  
Good job fqa at vardagsrummet.mine.nu !
tx for sharing
 Josh  Wed Sep 30, 2009 3:14 pm  
Thanks for that. I've been working on that very issue.

See it in action at http://www.MyWeddingHosting.com/blog.php

Josh
 Adidz  Fri Mar 26, 2010 5:53 am  
@fqa: what about an url like this:
http://www-google-com
http://www_google_com
 Jon  Thu Jun 17, 2010 12:21 am  
I've been trying to get this to work with preg_match():

// SCHEME
$urlregex = "^(https?|ftp)\:\/\/";

// USER AND PASS (optional)
$urlregex .= "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?";

// HOSTNAME OR IP
$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*"; // http://x = allowed (ex. http://localhost, http://routerlogin)
//$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)+"; // http://x.x = minimum
//$urlregex .= "([a-z0-9+\$_-]+\.)*[a-z0-9+\$_-]{2,3}"; // http://x.xx(x) = minimum
//use only one of the above

// PORT (optional)
$urlregex .= "(\:[0-9]{2,5})?";
// PATH (optional)
$urlregex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";
// GET Query (optional)
$urlregex .= "(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?";
// ANCHOR (optional)
$urlregex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?\$";

// check
if (eregi($urlregex, $url)) {echo "good";} else {echo "bad";}

The problem seems to be that this is POSIX and for preg it needs to be PCRE. I'm having quite a time with it. Anyone know how to fix it?
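
A possible way to port fqa's pattern to preg_match(), as a sketch rather than a tested drop-in: PCRE needs delimiters around the pattern, and the case-insensitivity that eregi provided becomes the i modifier. The function name validUrlPcre(), the choice of ~ as delimiter (it does not occur in the pattern) and the D modifier are assumptions of this example. Single quotes are used so PHP leaves the $ signs alone.

```php
<?php
// Sketch of fqa's POSIX pattern rewritten for PCRE. The function name, the
// ~ delimiter and the D modifier are choices made for this example.
function validUrlPcre($url) {
    // SCHEME
    $re  = '^(https?|ftp)://';
    // USER AND PASS (optional)
    $re .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?';
    // HOSTNAME OR IP
    $re .= '[a-z0-9+$_-]+(\.[a-z0-9+$_-]+)*';
    // PORT (optional)
    $re .= '(:[0-9]{2,5})?';
    // PATH (optional)
    $re .= '(/([a-z0-9+$_-]\.?)+)*/?';
    // GET query (optional)
    $re .= '(\?[a-z+&$_.-][a-z0-9;:@/&%=+$_.-]*)?';
    // ANCHOR (optional)
    $re .= '(#[a-z_.-][a-z0-9+$_.-]*)?$';

    // i = case-insensitive (what eregi did); D = $ matches only at the very
    // end of the string, not before a trailing newline.
    return preg_match('~' . $re . '~iD', $url) === 1;
}

var_dump(validUrlPcre('https://somewhere.com/index.html'));   // bool(true)
var_dump(validUrlPcre('somewhere.com'));                      // bool(false)
```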
 Lucas  Fri Aug 20, 2010 2:13 pm  
eregi will be deprecated in PHP 6 I think, so best probably to not use it for new projects.
 Tony  Thu Sep 23, 2010 10:37 am  
This is a bit too simple - just having http:// doesn't mean it's a valid url - I use the following...

function validateURL($url){
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/';
return preg_match($pattern, $url);
}
 W.M.Z  Thu Jan 20, 2011 3:55 am  
nice example dude. Simple is better
 Richard Cummings  Thu Mar 31, 2011 1:01 pm  
These are all great examples. I thank everyone for their contributions!
 fqa  Sun May 8, 2011 7:01 am  
wow is this still alive? my contribution is probably NOT the way i would do it today.

@ Adidz Fri Mar 26, 2010 5:53 am
sure, why not, just enable the:
$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*"; // http://x = allowed (ex. http://localhost, http://routerlogin)
instead of the others

@Jon Thu Jun 17, 2010 12:21 am
start new. maybe look into:

@Tony Thu Sep 23, 2010 10:37 am
what does it cover?