When and How to Validate your PHP variables

Posted: July 28, 2011 in PHP, Programming, Tutorial, Web Development

Sanitizing and validating user data to make it safe for processing is a very hot topic today, and not at all an exact science. It is very important, because many websites out there are susceptible to all sorts of attacks simply because they never sanitize or verify the data they get from a user.

Now, this is a huge topic, and truly beyond the scope of a single blog post, but I am going to go through the simplest cases which should help most amateur or new web developers protect their site. To make sense out of this tutorial, you should have a basic understanding of:

  • basic PHP syntax
  • how to get input into the $_GET and $_POST superglobals
  • basic string manipulation
Validation is important because it rejects input that doesn’t make any sense before the rest of your script ever touches it. This can make sanitizing much simpler, because by that point you have a rough idea of what to expect. Since validation usually comes first in PHP scripts, I will go over it first.

Note: I will be going over the PHP side of things, and won’t be talking about creating the forms on the front end. That is simple enough, and you should know how to do it before reading any further. I will also only be going over validation here. My next post will cover sanitization to complete the tutorial.

We don’t need no.. Validation!

So what is validation exactly? Well, consider this: let’s say we have a simple user signup form. This form accepts an email address. We can validate this particular part of the form by making sure that it’s an actual email address and not some garbage. If it isn’t a valid email address, we can stop the script from inserting stuff into our database, and let the user know through an error message that the email is invalid!

Now, validating an email address may seem like a no-brainer, and most secure sites on the internet already do this. The question becomes how? Well, for emails, simple string checks (like checking the position of the @ character, or making sure there are no random characters that don’t belong in an email address) can work, but the best way is to use what are called regular expressions, or regex. Regex itself is a very broad topic, so I won’t really be going over it beyond a simple explanation. So what is regex? Well, at its base it’s a mathematical concept for matching strings against a pattern. You can look at a tutorial for it here: http://www.regular-expressions.info/tutorial.html
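
To give an idea of what the “simple string checks” mentioned above could look like, here is a minimal sketch that uses only PHP’s string functions. The function name looksRoughlyLikeEmail is made up for this example, and the rules it applies (exactly one @, and a dot somewhere in the middle of the part after it) are assumptions picked for illustration. It is deliberately rough: it would still let through things like spaces.

```php
<?php
// Hypothetical helper: a rough pre-check built from plain string functions.
function looksRoughlyLikeEmail(string $input): bool
{
    $at = strpos($input, '@');

    // There must be an @, and it can't be the very first character.
    if ($at === false || $at === 0) {
        return false;
    }

    // Only one @ is allowed.
    if (substr_count($input, '@') !== 1) {
        return false;
    }

    // The part after the @ needs a dot that is neither its first nor its last character.
    $domain = (string) substr($input, $at + 1);
    $dot = strrpos($domain, '.');

    return $dot !== false && $dot > 0 && $dot < strlen($domain) - 1;
}

// Sample values stand in for $_POST['email'].
var_dump(looksRoughlyLikeEmail('someone@example.com')); // bool(true)
var_dump(looksRoughlyLikeEmail('not an email'));        // bool(false)
```

Even this small set of rules takes a fair amount of code, which is part of why a single regex pattern is usually the more convenient tool.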

So, what kind of pattern do we need for an email address? Well, luckily for us and most new web developers, the pattern to match an email address is a commonly used one, and so a quick Google search will find us the answer! Consider the following:


// This is the pattern we will use to determine if the input is an email address.
// It may look scary, but don't worry, you don't have to change it or understand it to use it!
// It's a good idea to read through a regex tutorial so it at least makes a little bit of sense, though.
$pattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';

// The input from our form (an empty string if the field was not submitted)
$email = isset($_POST['email']) ? $_POST['email'] : '';

if (!preg_match($pattern, $email)) {
    echo "Your email was invalid! Please press the back button and enter a valid email address.";
    exit();
}

So in this code we created a pattern that matches an email address. Then we pulled the submitted email address out of the $_POST superglobal array and tested it against the pattern. If it matches, no error pops up and the script carries on, but if it doesn’t, the user is given an error message and told to re-enter the email address. This is validation, folks! A very simple implementation, but validation nonetheless. The pattern may seem scary, but don’t worry about it. We won’t ever really need to change it, and I will go over a simpler regex pattern with a more comprehensive description further down.
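
As an alternative to a hand-written pattern, PHP also ships with a built-in email check: filter_var() with the FILTER_VALIDATE_EMAIL filter. The sketch below wraps it in a function and runs it side by side with the pattern from this post. The function names isValidEmail and matchesTutorialPattern are made up for this example, and the sample addresses are placeholders. One difference worth knowing about is that the pattern from this post does not allow a + in the part before the @, so an address like first+tag@example.com fails the regex check.

```php
<?php
// Hypothetical wrapper around PHP's built-in email filter.
function isValidEmail(string $input): bool
{
    // filter_var() returns the filtered value on success and false on failure.
    return filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
}

// The pattern from this post, wrapped in a function for comparison.
function matchesTutorialPattern(string $input): bool
{
    $pattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';

    return preg_match($pattern, $input) === 1;
}

// Sample values stand in for $_POST['email'].
$candidates = array('someone@example.com', 'not an email', 'first+tag@example.com');

foreach ($candidates as $candidate) {
    printf(
        "%s: filter=%s pattern=%s\n",
        $candidate,
        var_export(isValidEmail($candidate), true),
        var_export(matchesTutorialPattern($candidate), true)
    );
}
```

Which one to use is a judgment call: the regex makes the rules visible and easy to tweak, while the built-in filter saves you from maintaining the pattern yourself.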

OK, so we know how to validate email addresses. What about other information? Well, besides email addresses, validating other data depends on what you expect, and what rules you have given your users for that specific field. So for example, on one site a username may have to be at least 6 characters, while on another it may have to be at least 4. Some accept only numbers and letters, while others allow dots and such. Let’s go over some of the most common ways to validate other fields to give you an idea of what to do, and then you can adapt what you have learned to your specific site.

Username/Password minimum length

This is a very common and very easy type of validation. For this, we will use the strlen function. Consider the following:


$min_uname_len = 5; // usernames must be at least 5 characters long
$min_pword_len = 4; // passwords must be at least 4 characters long

// Fall back to an empty string if a field was not submitted
$username = isset($_POST['username']) ? $_POST['username'] : '';
$password = isset($_POST['password']) ? $_POST['password'] : '';

if (strlen($username) < $min_uname_len) {
    // error!
    echo "Usernames must be at least 5 characters long. Please go back and try again!";
    exit();
}
if (strlen($password) < $min_pword_len) {
    // error!
    echo "Passwords must be at least 4 characters long. Please go back and try again!";
    exit();
}

In the code above, we simply use the strlen function to make sure that the password and username are at least the minimum required lengths in order to continue. If they aren’t, like with the email validation, we show the user an error, and stop the execution of the script.
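
If you would rather tell the user about every problem at once instead of stopping at the first one, the same strlen checks can be collected into a list of errors. This is only a sketch: the function name validateLengths and its default minimums are assumptions made for this example, and the sample values stand in for what you would normally read from $_POST.

```php
<?php
// Hypothetical helper: returns a list of error messages (empty when everything is fine).
function validateLengths(string $username, string $password, int $minUname = 5, int $minPword = 4): array
{
    $errors = array();

    // strlen() counts bytes, which is the same as characters for plain ASCII input.
    if (strlen($username) < $minUname) {
        $errors[] = "Usernames must be at least $minUname characters long.";
    }
    if (strlen($password) < $minPword) {
        $errors[] = "Passwords must be at least $minPword characters long.";
    }

    return $errors;
}

// Sample values stand in for $_POST['username'] and $_POST['password'].
$errors = validateLengths('bob', 'abc');

foreach ($errors as $error) {
    echo $error, "\n";
}
```

An empty array means both fields passed, so the calling script can simply check whether $errors is empty before it continues.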

Restricted Characters

This is another very common type of validation. It consists of not letting the user use certain characters in their input, or of rejecting everything except alphanumeric characters. This, like the above, can be different from site to site, but I will go over a simple case, and you can try to figure out how to adapt it to your specific site.

Here I will use regex again, but it will be a much simpler pattern than the one we used with email. I will try to explain it so that you can alter it to fit your needs. Consider the following:


// We want to reject every character except numbers, letters, and underscores in the username
$pattern = '/^[a-zA-Z0-9_]+$/';
$username = isset($_POST['username']) ? $_POST['username'] : '';

if (!preg_match($pattern, $username)) {
    echo "You can only use numbers, letters and underscores in your username!";
    exit();
}

So what does that regex mean? Well, the / characters are simply delimiters, which PHP requires. / is the most commonly used delimiter, but you can use almost any non-alphanumeric character you want (if you use a delimiter that may appear in your pattern, you have to escape it in the pattern. Since / doesn’t appear at all in our pattern, we don’t have to escape anything, and our regex looks cleaner!).

The [ and ] characters signify that we are using a character class, which matches any single one of the characters listed inside it. The inside of the character class is pretty straightforward. We use what’s called a range of characters: the first is lowercase a to lowercase z (a-z as we have written it), the second is the uppercase letters, and the last is the numbers. After the three ranges, we also stick an underscore at the end so that underscores are allowed in the username. The + after the class means “one or more of these characters”, and the ^ and $ pin the pattern to the start and the end of the string. Without those, preg_match would be happy as soon as it found a single allowed character anywhere in the input. With them, the whole username has to be made up of allowed characters, so if the string contains anything that isn’t in the class, it will not match. Not too scary, right?
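
One detail worth checking whenever you adapt a character class pattern: preg_match succeeds if the pattern is found anywhere in the string, so a bare class on its own only demands one allowed character. The short sketch below compares a bare class with an anchored one on the same input. The variable names and sample usernames are made up for this example.

```php
<?php
// A bare character class: satisfied by a single allowed character anywhere in the input.
$unanchored = '/[a-zA-Z0-9_]/';

// Anchored with ^ and $, and repeated with +: every character must be allowed.
$anchored = '/^[a-zA-Z0-9_]+$/';

// Sample values stand in for $_POST['username'].
$badName = 'bad name!';
$goodName = 'good_name1';

var_dump(preg_match($unanchored, $badName)); // int(1), the "b" alone is enough to match
var_dump(preg_match($anchored, $badName));   // int(0), the space and "!" are not allowed
var_dump(preg_match($anchored, $goodName));  // int(1)
```

So for a “only these characters are allowed” rule, the anchored form is the one that actually enforces it.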

And like the rest of the validation code blocks, we show the user an error and exit the script if the username isn’t valid. Regular expressions are an extremely valuable tool, and can be made to match many different types of strings. Here we showed how to match based on which characters you allow, but you could easily alter the pattern to list the characters you don’t allow instead. I will let you do some research to find out how (doing your own research and experimentation is important and a great learning exercise!). If you are having a hard time figuring it out, then shoot me a comment or email and I will try to help out!
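
If you want something to compare your own attempt against, here is one possible way to flip the check around. Putting a ^ right after the opening [ negates the character class, so it matches any single character that is NOT listed. The function name containsDisallowedCharacters is made up for this example, and this is just a sketch of one approach, not the only answer.

```php
<?php
// Hypothetical helper: true if the input contains anything other than
// letters, numbers, and underscores.
function containsDisallowedCharacters(string $input): bool
{
    // [^...] is a negated class: it matches one character that is not in the list.
    // No anchors are needed here, because finding one bad character anywhere is enough.
    return preg_match('/[^a-zA-Z0-9_]/', $input) === 1;
}

// Sample values stand in for $_POST['username'].
var_dump(containsDisallowedCharacters('good_name1')); // bool(false)
var_dump(containsDisallowedCharacters('bad name!'));  // bool(true)
```

Note that an empty string contains no disallowed characters, so it passes this check. Pair it with a minimum length check if empty usernames should be rejected.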

The second part of this article can be read here: http://blackscorner.me/2011/08/04/when-and-how-to-sanitize-your-php-variables/

If you have any questions/comments, please don’t be afraid to share! Hope you enjoyed reading this tutorial and learned from it!
