Fight Spam - conceal your emails

Have you ever wondered how spammers got your email address, and why you receive heaps of spam? A likely source is the address published on your web site. Bulk mail advertisers use crawlers similar to those a search engine uses to index your site contents, and harvest email addresses from the static content.
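
To see why a plain address is so easy to collect, here is a minimal sketch of what a harvester might do with a page it has downloaded. The function name harvest_addresses and the regular expression are assumptions made for illustration; they are not taken from any real harvesting tool:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical harvester: return every substring of the page
# that looks like an email address.
sub harvest_addresses {
    my ($html) = @_;
    my @found = $html =~ /([\w.\-]+\@(?:[\w\-]+\.)+[A-Za-z]{2,})/g;
    return @found;
}

my $page = '<p>Contact: <a href="mailto:myuser@mydomain.com">write to us</a></p>';
print "$_\n" for harvest_addresses($page);   # prints myuser@mydomain.com
```

A pattern this simple only works when the "@" sign and the dots appear literally in the page source.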

There is a simple way to hide your email addresses: write the email link using HTML character references, so that it does not look like an email address in the page source. Your browser will still decode it and display the correct mailto: link, but harvesting programs that scan the raw HTML for address patterns will not recognize it. For example, instead of the typical tag:

<a href="mailto:myuser@mydomain.com">

You can use something less revealing, such as:

<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;myuser&#64;mydomain&#46;com">
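
The encoded form can be produced mechanically: each character is replaced by "&#", its character code, and ";". The following is a minimal sketch of that idea; the helper names encode_entities_all and conceal_mailto are hypothetical and exist only for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: encode every character of a string as a
# decimal HTML character reference (&#NNN;).
sub encode_entities_all {
    my ($text) = @_;
    return join '', map { '&#' . ord($_) . ';' } split //, $text;
}

# Encode the whole "mailto:" prefix, but only the "@" sign and the
# dots of the address itself, as in the tag shown above.
sub conceal_mailto {
    my ($address) = @_;
    (my $encoded = $address) =~ s/([\@.])/'&#' . ord($1) . ';'/ge;
    return encode_entities_all('mailto:') . $encoded;
}

print '<a href="', conceal_mailto('myuser@mydomain.com'), '">', "\n";
```

Run on myuser@mydomain.com, this prints the same encoded tag as the example above.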

Here is a script that searches all the static HTML files under a directory and replaces the mail links with the encoded format. The script rewrites the files it changes in place. You can save it as "convertmails.cgi", correct $Path to point to your web root, and run it in your browser:

			
#!/usr/bin/perl -w
my $Path  = '/home/mydomain/www';

print "Content-type: text/plain\n\n" if $ENV{'HTTP_USER_AGENT'};
my $Found = _ProcessDir($Path);
print "$Found files processed\n";

sub _ProcessDir {
  my ($x);
  my $Path  = shift;
  my $Found = shift || 0;
  # Skip directories that cannot be opened
  opendir(my $dh, $Path) or return $Found;
  my @files = grep(!/^\.\.?$/, readdir($dh));
  closedir($dh);
  ITEM: foreach $x(@files) {
    $x =~ /(.*)/;
    $x = "$Path/$1";
    next ITEM if -l $x || ! -w $x;
    if (-d $x) {$Found =_ProcessDir($x, $Found)}
    else {
      next ITEM if -z $x || $x !~ /\.html?$/i; # skip zero-size and non-HTML files
      $Found++ if _ProcessFile($x);
    }
  }
  return $Found;
}
sub _ProcessFile {
  my ($x, $y, $FData);
  my $File = shift || return 0;
  open(my $in, '<', $File) or return 0;
  while (<$in>) {$FData .= $_}   # read the whole file into memory
  close($in);
  my $Found;
  $Found += $FData =~ s{(mailto:)?([a-zA-Z](([\w\-]+\.)*[\w\-]+)*@([\w\-]+\.)+[\w\-]+)} {
    $y = $1;
    $x = $2;
    $x =~ s/\./&#46;/oxg;
    $x =~ s/@/&#64;/oxg;
    $x = '&#109;&#97;&#105;&#108;&#116;&#111;&#58;'.$x if $y;
    $x;
  }geosi;
  if ($Found) {
    open(my $out, '>', $File) or return 0;
    print $out $FData;
    close($out);
    print "$File\n";
  }
  return $Found;
}
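
Since the script overwrites the files it changes, it can be useful to preview the substitution on a string first. The sketch below applies a similar replacement in memory. The function name conceal_in_string is hypothetical, and the pattern is a simplified variant of the one in the script, so the two may differ on unusual addresses:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: encode the addresses found in a string and
# return the new text together with the number of addresses rewritten.
sub conceal_in_string {
    my ($text) = @_;
    my $count = 0;
    $text =~ s{(mailto:)?([a-zA-Z][\w.\-]*\@(?:[\w\-]+\.)+[\w\-]+)}{
        # Copy the captures before the nested substitutions overwrite them
        my ($prefix, $address) = ($1, $2);
        $address =~ s/\./&#46;/g;
        $address =~ s/\@/&#64;/g;
        $count++;
        ($prefix ? '&#109;&#97;&#105;&#108;&#116;&#111;&#58;' : '') . $address;
    }gei;
    return ($text, $count);
}

my ($out, $n) = conceal_in_string(
    '<a href="mailto:myuser@mydomain.com">myuser@mydomain.com</a>');
print "$n address(es) rewritten\n$out\n";
```

Both the link target and the visible link text are rewritten here, because the pattern also matches addresses that have no mailto: prefix.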