In this tutorial you will learn how to write a function that works like strip_tags(), except that it doesn't delete unwanted HTML. Instead, it converts it to harmless HTML equivalents: &lt; for < and &gt; for >.
Let's start with the function.
function replace_tags($string) {
    $total_args = func_num_args();
    $args = func_get_args();
    $tags = array();
    // Start at 1 to skip $string, which is argument 0.
    for ($i = 1; $i < $total_args; $i++) {
        $tags[] = $args[$i];
    }
    // Escape everything first: < becomes &lt; and > becomes &gt;.
    $string = htmlentities($string);
    foreach ($tags as $tag) {
        // Escaped opening tag, lazily matched contents, escaped closing tag.
        $regexp = "&lt;\s?" . $tag . "(.*?)&gt;(.+?)&lt;/\s?" . $tag . "\s?&gt;";
        // Put the real angle brackets back for this allowed tag only.
        $string = preg_replace("#" . $regexp . "#i", '<' . $tag . '\\1>\\2</' . $tag . '>', $string);
    }
    return $string;
}
To use this function you could write something like this:
$string = replace_tags($tmp_str, 'a', 'u', 'b', 'i');
This allows the a (anchor), u (underline), b (bold), and i (italics) tags. The function adapts to however many arguments you pass to it; just make sure the first argument is the string you wish to modify.
First, let's explain the func_num_args() and func_get_args() functions. The func_num_args() function returns the total number of arguments passed to the function you call it from. That count includes $string, which sits at index 0, so the for loop starts at 1 rather than the usual 0 and stops before $total_args. This way it visits only the tag names.
When you call func_get_args() inside a function, it returns an array of every argument the caller passed, including $string at index 0. In this case, the for loop copies the remaining entries, the tag names, into an array called $tags.
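To see what the two functions report, here is a minimal sketch. The show_args() helper is hypothetical and exists only for this illustration; it collects the extra arguments the same way the tutorial's loop does.

```php
<?php
// Hypothetical helper, used only to show what the two functions report.
function show_args($string) {
    $count = func_num_args();   // total number of arguments, including $string
    $args  = func_get_args();   // array of every argument, $string at index 0
    $tags  = array();
    for ($i = 1; $i < $count; $i++) {
        $tags[] = $args[$i];    // keep only the arguments after $string
    }
    return array($count, $tags);
}

list($count, $tags) = show_args('some text', 'a', 'b');
echo $count, "\n";              // 3
echo implode(',', $tags), "\n"; // a,b
```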
Once you've stored the tags in the array, it's time to escape the string and then run a regular expression for each tag. The htmlentities() function converts all potentially harmful HTML characters into harmless equivalents, such as &lt; for < and &gt; for >, as explained above.
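As a quick illustration of the escaping step, the sketch below runs htmlentities() on a made-up input string. Note that the ampersand is escaped as well.

```php
<?php
// Sample input invented for this illustration.
$input   = '<b>bold</b> & <script>alert(1)</script>';
$escaped = htmlentities($input);

echo $escaped, "\n";
// &lt;b&gt;bold&lt;/b&gt; &amp; &lt;script&gt;alert(1)&lt;/script&gt;
```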
After we've converted all the tags to &lt; and &gt;, it's time to convert back those we want to show up. Let's look at the regular expression piece by piece.
\s matches a single whitespace character of any kind (space, tab, or newline). This covers people who like to write their HTML like this:
< tag >
That would be valid and would be accepted by this function. Notice, however, that the whitespace match (\s) has a ? attached to the end of it. This means the whitespace isn't required, but one whitespace character is accepted if it is there. You can view it as being optional.
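The effect of \s? can be checked with a small sketch. The pattern here is a cut-down, assumed version of the opening-tag part of the expression, hard-coded for the b tag and applied to already-escaped input.

```php
<?php
// Cut-down opening-tag pattern for the b tag (an assumption for this demo).
$pattern = '#&lt;\s?b&gt;#i';

$none = preg_match($pattern, '&lt;b&gt;');   // escaped "<b>"
$one  = preg_match($pattern, '&lt; b&gt;');  // escaped "< b>"
$two  = preg_match($pattern, '&lt;  b&gt;'); // escaped "<  b>", two spaces

var_dump($none); // int(1)
var_dump($one);  // int(1)
var_dump($two);  // int(0): \s? allows at most one whitespace character
```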
You'll also notice the (.*?). The .* matches any characters zero or more times, and the trailing ? makes the match lazy, so it stops at the first &gt; instead of running on to the last one. It is there for HTML tags that carry more than just the tag name. For instance, the a tag has href, target, and so on, and iframe is another example. However, you may want to strip the style attribute from the tag so people can't inject CSS into your HTML, but that's up to you.
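The difference the ? makes after .* shows up as soon as a string holds more than one tag. This sketch uses made-up escaped input and two simplified patterns that match only the opening tag.

```php
<?php
// Escaped input with two links, invented for this illustration.
$escaped = '&lt;a href=x&gt;one&lt;/a&gt; and &lt;a href=y&gt;two&lt;/a&gt;';

preg_match('#&lt;a(.*?)&gt;#', $escaped, $lazy);   // lazy: stops at the first &gt;
preg_match('#&lt;a(.*)&gt;#', $escaped, $greedy);  // greedy: runs to the last &gt;

echo $lazy[1], "\n";   // " href=x"
echo $greedy[1], "\n"; // everything up to the final &gt; in the string
```

Whatever (.*?) captures is written back into the tag unchanged, so event-handler attributes such as onclick pass through along with href or target.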
The same optional whitespace match (\s?) also appears in the closing tag, after the slash and before the final &gt;.
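Putting the pieces together, here is a compact sketch of the whole flow. replace_tags_demo() is a hypothetical variant written for this example: it assumes the same regular expression as the tutorial, with a lazy (.+?) for the tag contents, and uses array_slice() in place of the for loop.

```php
<?php
// Hypothetical compact variant of replace_tags(), for demonstration only.
function replace_tags_demo($string) {
    $tags = array_slice(func_get_args(), 1); // everything after $string
    $string = htmlentities($string);         // escape all markup first
    foreach ($tags as $tag) {
        $regexp = '&lt;\s?' . $tag . '(.*?)&gt;(.+?)&lt;/\s?' . $tag . '\s?&gt;';
        // Restore real angle brackets for this allowed tag only.
        $string = preg_replace('#' . $regexp . '#i', '<' . $tag . '\\1>\\2</' . $tag . '>', $string);
    }
    return $string;
}

$result = replace_tags_demo('<b>bold</b> <script>alert(1)</script>', 'b', 'i');
echo $result, "\n";
// <b>bold</b> &lt;script&gt;alert(1)&lt;/script&gt;
```

The b tag comes back as real markup, while the script tag stays escaped because it was not in the list of allowed tags.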
That's it. After that, the string has been converted the way you like it, and HTML is no longer a threat to the layout of your page. If you want to include <img> tags, you may want to read my resizing images tutorial, which also provides a comprehensive function that is open for modification.
Enjoy!