Why would I need a language filter?
Do we really need to answer this? Actually, yes. Too many times I have visited web sites that let users post directly to a page with no filtering. Do you like your web site being cached by search engines with nothing but obscenities? I would say no. Anyone who displays text that comes directly from an end user should use a language filter.
What will be covered in this tutorial?
I will run you through the basic setup of writing a function to handle your obscenities. There are two ways to go about this, and we will cover both. One uses a simple array; the other uses an outside source, such as a text file or database, filled with curse words ready to compare. Personally I would use either an array or a file, so those are the two we will cover.
Array Based Language Filter
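<?php
function language_filter($string) {
    // Words to censor; entries padded with spaces match only the standalone word.
    $obscenities = array("curse","word"," foul ","language");
    // Start with an empty mask so the .= below never touches an undefined variable.
    $stars = "";
    foreach ($obscenities as $curse_word) {
        // Case-insensitive check: does the string contain this word?
        if (stristr(trim($string),$curse_word) !== false) {
            // Build one star per character of the curse word.
            $length = strlen($curse_word);
            for ($i = 1; $i <= $length; $i++) {
                $stars .= "*";
            }
            // str_ireplace() stands in for eregi_replace(), which was removed in PHP 7.
            $string = str_ireplace($curse_word,$stars,trim($string));
            // Clear the mask before checking the next word.
            $stars = "";
        }
    }
    return $string;
}
?>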
The code above is the entire function. Pretty small, but very powerful. Let's break down what is happening here. First we name our function whatever we want; I have appropriately named it language_filter(). Next we pass the function any arguments it will need. In this case we need the string we want to clean, hence the variable $string.
I have not filled the array with real curse words, for cleanliness' sake. What we do is set up a simple array containing all the obscenities we do not want to show up on our web site. You set up an array using the array() function, placing the words inside it surrounded by quotes and separated by commas. Pretty straightforward. After we have established our array, we start a foreach loop that will cycle through each curse word listed in the array.
Next we check whether the obscenity is contained within the string passed to the function, using stristr(), which searches without regard to case. If it is, we continue; otherwise we skip the process for that obscenity and move on to the next one. If the curse word is found in the string, we set up our censorship routine: we get the length of the curse word and start a for() loop that builds a string with one star for each letter.
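The star-building loop described above can also be written with PHP's built-in str_repeat(). This is only a small sketch of that alternative; the helper name make_stars() is made up for illustration and is not part of the tutorial's function.

```php
<?php
// Hypothetical helper (not from the original tutorial): returns one star
// for every character in the word, including any padding spaces.
function make_stars($curse_word) {
    return str_repeat("*", strlen($curse_word));
}

print make_stars("curse");
// Prints: *****
?>
```

Either approach gives the same mask; str_repeat() simply saves the loop and the temporary counter.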
After we have built the correct number of stars, we need to replace the word within the string. To do this we use a function built into PHP called eregi_replace(). It is the same as ereg_replace() except that it is case-insensitive. Note that the ereg family was deprecated in PHP 5.3 and removed in PHP 7.0; on current versions, str_ireplace() performs the same case-insensitive replacement for plain words. Basically, what it does is replace the curse word with the censored version made up entirely of stars. After we have censored the specific word, we clear the $stars variable and start the process again. The process loops through every word in the array until it has completed the entire list. After all the words in the array have been checked, the newly formatted string is returned from the function.
*NOTE: Notice that some of the words in the array have a space in front of and behind them. This is no mistake. Some curse words appear inside other words that are not curse words. Using the example above, let's say we want to block the word foul, but only the word itself and not words like foulplay (not a proper word, but used for clarity's sake). We need the space before and after to block only the standalone word. Keep in mind that the spaces are part of the match, so they are replaced with stars as well, and a padded word will not match at the very start or end of the string or right next to punctuation. This will take some experimenting on your part, but it is needed.
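If the space-padding trick proves too fiddly, one possible alternative is a regular expression with word boundaries. The sketch below is not the tutorial's method; it assumes the word list is passed in as a second argument, and the function name filter_whole_words() is hypothetical.

```php
<?php
// Sketch only: filter_whole_words() is a made-up name for illustration.
function filter_whole_words($string, $obscenities) {
    foreach ($obscenities as $curse_word) {
        // preg_quote() escapes any regex metacharacters in the word;
        // \b matches a word boundary and the i flag ignores case.
        $pattern = '/\b' . preg_quote($curse_word, '/') . '\b/i';
        $stars = str_repeat("*", strlen($curse_word));
        $string = preg_replace($pattern, $stars, $string);
    }
    return $string;
}

print filter_whole_words("Foul weather, not foulplay.", array("foul"));
// Prints: **** weather, not foulplay.
?>
```

With this approach the words in the list need no padding spaces, the surrounding spaces and punctuation are left alone, and a word at the very start or end of the string is still caught.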
Text File Based Language Filter
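<?php
function language_filter($string) {
    // One word per line. FILE_IGNORE_NEW_LINES strips the line ending from each
    // entry but keeps padding spaces; FILE_SKIP_EMPTY_LINES ignores blank lines.
    $obscenities = @file("path/to/your/file/foul_language.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($obscenities === false) {
        // The word list could not be read, so return the string unfiltered.
        return $string;
    }
    // Start with an empty mask so the .= below never touches an undefined variable.
    $stars = "";
    foreach ($obscenities as $curse_word) {
        // Case-insensitive check: does the string contain this word?
        if (stristr(trim($string),$curse_word) !== false) {
            // Build one star per character of the curse word.
            $length = strlen($curse_word);
            for ($i = 1; $i <= $length; $i++) {
                $stars .= "*";
            }
            // str_ireplace() stands in for eregi_replace(), which was removed in PHP 7.
            $string = str_ireplace($curse_word,$stars,trim($string));
            // Clear the mask before checking the next word.
            $stars = "";
        }
    }
    return $string;
}
?>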
The above code is exactly the same as the array-based version, except that we load our obscenities into an array from a formatted text file. What is the difference? Not much, except preference and neatness within your code: your array could become quite large and look unwieldy. If you use the text file method, make sure each word is on its own line. Note that file() keeps the line ending on each entry unless you pass the FILE_IGNORE_NEW_LINES flag, and a word with a line ending attached will never match. Do not trim() the words, though, or they will lose any necessary padding spaces.
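To see how the word list comes out of the file, here is a small sketch that writes a throwaway list to a temporary file and reads it back. The helper name load_obscenities() and the temporary file are assumptions made for the example; in practice you would point it at your own word list.

```php
<?php
// Sketch only: load_obscenities() is a hypothetical helper name.
function load_obscenities($path) {
    // FILE_IGNORE_NEW_LINES drops the line ending from each entry but keeps
    // padding spaces; FILE_SKIP_EMPTY_LINES ignores blank lines.
    $words = @file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    // file() returns false on failure; fall back to an empty list.
    return ($words === false) ? array() : $words;
}

// Build a throwaway word list so the example runs on its own.
$path = tempnam(sys_get_temp_dir(), "foul");
file_put_contents($path, "curse\n foul \n\nlanguage\n");
$words = load_obscenities($path);
unlink($path);

print count($words);
// Prints: 3
?>
```

The second entry comes back as " foul " with its spaces intact, which is why the flags are used here rather than trim().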
Calling Your Function
<?php
$string = "Curse words are not always foul in their language.";
print language_filter($string);
//Would print: ***** ****s are not always******in their ********.
?>
Since the function returns a value, we can either print it directly or save it to another variable for later use. Both are acceptable.
Great, you're done. I hope you learned something.