PHP: replace

Pilot-Doofy

Member since: Sep. 13, 2003

Member Level 37 Musician

PHP: replace_tags() 2006-02-27 17:02:39

In this tutorial you will learn how to write a function that works like strip_tags(), except that instead of deleting unwanted HTML it converts it to harmless HTML entities: &lt; for < and &gt; for >.

Let's start with the function.

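function replace_tags($string) {
    // Valid argument indexes run from 0 to func_num_args() - 1, so no + 1 here.
    $total_args = func_num_args();
    $args = func_get_args();
    $tags = array();
    // Start at 1 to skip $string; every later argument is an allowed tag name.
    for ($i = 1; $i < $total_args; $i++) {
        $tags[] = $args[$i];
    }

    // Escape all HTML, so every < becomes &lt; and every > becomes &gt;.
    $string = htmlentities($string);

    foreach ($tags as $tag) {
        // Escaped opening tag, contents, then escaped closing tag.
        $regexp = "\&lt\;\s?" . $tag . "(.*?)\&gt\;(.+)\&lt\;/\s?" . $tag . "\s?\&gt\;";
        $string = preg_replace("#" . $regexp . "#i", '<' . $tag . '\\1>\\2</' . $tag . '>', $string);
    }

    return $string;
}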

To use this function you could use something like this:

$string = replace_tags($tmp_str, 'a', 'u', 'b', 'i');

This allows the anchor, underline, bold, and italics tags. The function adapts to however many arguments you pass to it; just make sure the first argument is the string you wish to modify.

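Firstly, let's explain the func_num_args() and func_get_args() functions. The func_num_args() function returns the total number of arguments passed to the function it is called from. The first argument, at index 0, is $string, so the for loop starts at 1 rather than the usual 0 to skip over it. The last argument sits at index func_num_args() - 1, so the loop should stop before func_num_args(); adding 1 to that bound would read one slot past the end of the list and hand the regular expression an empty tag name.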

When you call func_get_args() inside a function, it returns an array of every argument the caller passed, including $string. In this case, the for loop copies everything after the first argument into an array called $tags.
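If the argument handling is easier to follow on its own, here is a minimal sketch of just that part. The names collect_tags() and collect_tags_variadic() are made up for this example and are not part of the tutorial's function; the variadic form assumes PHP 5.6 or later.

```php
<?php
// Hypothetical helper: returns every argument after the first one.
function collect_tags($string) {
    $args = func_get_args();   // all arguments, including $string at index 0
    $tags = array();
    // Start at 1 to skip $string; stop before func_num_args().
    for ($i = 1; $i < func_num_args(); $i++) {
        $tags[] = $args[$i];
    }
    return $tags;
}

// Same idea with variadic syntax (assumes PHP 5.6+).
function collect_tags_variadic($string, ...$tags) {
    return $tags;
}

$tags = collect_tags('some text', 'a', 'u', 'b', 'i');
// $tags is array('a', 'u', 'b', 'i')
```

Either form leaves $string out of the tag list, which is the whole point of starting the loop at 1.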

Once you've stored the tags in the array, it's time to retrieve them and run a regular expression for each one. First, though, the htmlentities() function converts all potentially harmful HTML into harmless equivalents, such as &lt; for < and &gt; for >, as explained above.
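As a quick illustration of the escaping step, this sketch runs htmlentities() on a made-up input string. Nothing here is specific to the tutorial's function.

```php
<?php
// Sample input chosen for this example only.
$input   = '<b>bold</b> & <script>alert(1)</script>';
$escaped = htmlentities($input);
// $escaped is: &lt;b&gt;bold&lt;/b&gt; &amp; &lt;script&gt;alert(1)&lt;/script&gt;
```

Note that the & is escaped too, so the browser shows the text exactly as it was typed.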

After we've converted all the tags to &lt; and &gt;, it's time to convert back the ones we want to show up. Let's look at the regular expression piece by piece.

\s matches a single whitespace character of any kind (space, tab, or newline). This covers people who like to write their HTML like this:

< tag >

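That would be valid and would be accepted by this function. Notice that the whitespace catch (\s) has a ? attached to the end of it. This means zero or one: the whitespace isn't required, but a single whitespace character will be accepted. You can view it as being optional. (The space before the > in the example is picked up by the (.*?) group described next.)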
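To see what the optional whitespace does, here is a small sketch using a cut-down pattern for the b tag only. The pattern is a simplification for this example, not the full expression from the function.

```php
<?php
// Simplified pattern: escaped <, at most one whitespace character, then b and escaped >.
$pattern = '#&lt;\s?b&gt;#i';

$tight  = preg_match($pattern, '&lt;b&gt;');     // 1: no whitespace
$spaced = preg_match($pattern, '&lt; b&gt;');    // 1: one space
$wide   = preg_match($pattern, '&lt;   b&gt;');  // 0: \s? allows one character at most
```

If you wanted to accept any amount of whitespace, \s* would be the quantifier to use instead of \s?.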

You'll also notice the (.*?). The .* matches any character 0 or more times, and the ? after the * makes the match lazy: it stops at the first &gt; it can find rather than the last. The reason this group is there is for HTML tags that need more than just the tag name; for instance, the A tag has href, target, etc., and the iframe tag is another example. However, you may want to strip the STYLE attribute from the tag so people can't inject CSS into the page, but that's up to you.

The closing tag also allows one optional whitespace character after the slash and another before the final &gt;.
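The difference the ? makes after .* is easiest to see side by side. This sketch uses a made-up, already escaped string with two links in it.

```php
<?php
// Example text as it would look after htmlentities().
$text = '&lt;a href=one&gt;first&lt;/a&gt; and &lt;a href=two&gt;second&lt;/a&gt;';

preg_match('#&lt;a(.*?)&gt;#', $text, $lazy);    // lazy: stops at the first &gt;
preg_match('#&lt;a(.*)&gt;#', $text, $greedy);   // greedy: runs to the last &gt;

// $lazy[1] is ' href=one'
// $greedy[1] swallows everything up to the final closing tag
```

The lazy version captures just the attributes of the first tag, which is what the function needs.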

That's it. After that, the string has been converted how you like it and HTML is no longer a threat to the layout of your page. If you want to include <img> tags you may want to read up on my resizing images tutorial, which also provides a comprehensive function that is open for modification.
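Putting the pieces together, here is a compact sketch of the whole thing with a sample call. It assumes the loop bound is func_num_args() with no 1 added, uses array_slice() as a shortcut for the for loop, and leaves out the unneeded backslashes before & and ; in the pattern. The input string is made up for the example.

```php
<?php
// Compact restatement of the tutorial's function under the assumptions above.
function replace_tags($string) {
    $tags = array_slice(func_get_args(), 1);   // everything after $string

    $string = htmlentities($string);           // escape all HTML first

    foreach ($tags as $tag) {
        // Escaped opening tag, contents, escaped closing tag.
        $regexp = '&lt;\s?' . $tag . '(.*?)&gt;(.+)&lt;/\s?' . $tag . '\s?&gt;';
        $string = preg_replace('#' . $regexp . '#i', '<' . $tag . '\\1>\\2</' . $tag . '>', $string);
    }

    return $string;
}

$result = replace_tags('<b>bold</b> <script>alert(1)</script>', 'b', 'i');
// $result is: <b>bold</b> &lt;script&gt;alert(1)&lt;/script&gt;
```

The b tag comes back as real HTML, while the script tag stays escaped because it was never listed.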

Enjoy!

Merkd.com - It Pays to Play

Earn real money by betting and investing; or, sponsor, challenge, compete,

recruit, communicate, network, earn money playing games, and much more.

…

liljim

Member since: Dec. 16, 1999

Staff Level 30 Blank Slate

Response to PHP: replace_tags() 2006-02-27 18:06:11

Why are you escaping ampersands and semi-colons in the expression? If you want to get this to work correctly with tags that have quotes in them, you'll have to replace &quot; with ". Having said that, you have to be very careful with stuff like this... Take, for example, the following input:

[a href="http://www.google.com" onclick="for(i=0;i<100000000000;i++){alert('blah');}"]whatever[/a]

Harmless, but irritating nonetheless.

…

Pilot-Doofy

Member since: Sep. 13, 2003

Member Level 37 Musician

Response to PHP: replace_tags() 2006-02-27 18:08:03

Oh crap, I forgot I escaped those. I was writing it while working on English homework and got tired of looking up Greek writers. :-P Anyway, I didn't feel like safeguarding the code against every possibility, as you explained.

I told the user they may want to check the other input contained in (.*?) for malicious code. As you further explained and gave an example of, it can be irritating sometimes.
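For anyone who does want to check that extra input, here is a rough sketch of one way to start. The helper name clean_attributes() is made up, it only handles double-quoted event handlers, and it is nowhere near a complete filter (it does nothing about javascript: URLs, for example), so treat it as a starting point under those assumptions.

```php
<?php
// Hypothetical helper: tidy the attribute text captured by (.*?).
function clean_attributes($attrs) {
    // htmlentities() turned " into &quot;; turn it back for the attribute text only.
    $attrs = str_replace('&quot;', '"', $attrs);
    // Drop event-handler attributes such as onclick="..." (double-quoted values only).
    return preg_replace('#\s+on\w+\s*=\s*"[^"]*"#i', '', $attrs);
}

$attrs = ' href=&quot;http://www.google.com&quot; onclick=&quot;alert(1)&quot;';
$clean = clean_attributes($attrs);
// $clean is ' href="http://www.google.com"'
```

You could run something like this on the first captured group, for instance through preg_replace_callback(), before the tag is put back together.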

…

Claxor

Member since: Oct. 21, 2005

Member Level 12 Blank Slate

Response to PHP: replace_tags() 2006-02-28 14:01:31

One thing to note is that some hosts don't allow the preg functions because they don't have the PCRE library installed, so you have to use ereg instead. And as ereg doesn't support lazy quantifiers, that could be a bit of a problem :/

…
