Keep your email secure (and what just doesn’t work)

The best way to keep your email address safe from evil spam bots is some kind of JavaScript obfuscation, which unfortunately isn’t always available. There are plenty of alternatives, though.

People usually replace the ‘@’ with a short substitute like ‘{at}’ or ‘bij’ (Dutch for ‘at’). This just doesn’t help.

Any programmer with a bit of regex knowledge can write a program that scans for domain names and interprets every small bit of text in front of them as an @ sign.
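To illustrate, here is a minimal sketch of such a scanner in Python. The pattern, the tiny TLD list and the function name `harvest` are assumptions made up for this example, not code taken from a real bot.

```python
import re

# Hypothetical harvester: "<local part> <short separator> <domain>.<tld>".
# The separator is an '@', a bracketed word of 1-3 letters such as {at},
# or a bare word of 1-3 letters surrounded by whitespace such as " bij ".
PATTERN = re.compile(
    r"([\w.-]+)"                                        # local part
    r"\s*(?:@|[\[({<]\s*\w{1,3}\s*[\])}>]|\s\w{1,3}\s)\s*"  # separator
    r"([\w-]+(?:\.[\w-]+)*\.(?:com|net|org|nl))\b"      # domain, illustrative TLD list
)

def harvest(text):
    """Return every address the pattern can reconstruct from text."""
    return [m.group(1) + "@" + m.group(2) for m in PATTERN.finditer(text)]

for sample in ("bas.westerbaan{at}gmail.com",
               "bas.westerbaan (at) gmail.com",
               "bas.westerbaan bij gmail.com"):
    print(harvest(sample))
```

A pattern this loose also produces false positives on ordinary sentences, but a harvester loses nothing by sending mail to a few addresses that don't exist.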

Some smarter people also replace the dot. This works, unless your email host uses an easily recognizable TLD (.com) or domain name (gmail.com).

Putting ‘SPAM’ in your email address, as in some.personREMOVETHISFORSPAM@foo.bar, is also easily filtered out.
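A rough sketch of how little it takes, assuming the filler is a run of four or more capital letters and the real address is lowercase (the helper name `strip_filler` is hypothetical):

```python
import re

def strip_filler(address):
    # Assumption: anti-spam filler is written as a run of 4+ capital letters.
    return re.sub(r"[A-Z]{4,}", "", address)

print(strip_filler("some.personREMOVETHISFORSPAM@foo.bar"))  # some.person@foo.bar
```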

The best thing is to come up with something outside the box.

For instance, my email address is X@Y, where:
X = bas.westerbaan
Y = gmail.com
I’ve also got an email address on w-nz.com, namely bas.westerbaan.

Or even maybe w-nz.com@bas.

Update on the anti-email-harvester mailto links

In the previous post I described a simple yet effective method to fend off the ever cleverer email-harvesting spam bots.

I’ve made a little update to the algorithm: it now uses only one number per character and applies a cascading incremental XOR transform.

Python code for the algorithm itself:

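def alphaicx(s):
    # XOR every character with a cascade value taken from the previous
    # *output* character.
    ret = ""
    cascvalue = 0
    for i in range(len(s)):
        ret = ret + chr(ord(s[i]) ^ cascvalue)
        cascvalue = (ord(ret[i]) + 1) % 255
    return ret

def betaicx(s):
    # Same XOR step, but the cascade value comes from the previous *input*
    # character: ord(ret[i]) ^ cascvalue is just ord(s[i]) again.
    ret = ""
    cascvalue = 0
    for i in range(len(s)):
        ret = ret + chr(ord(s[i]) ^ cascvalue)
        cascvalue = ((ord(ret[i]) ^ cascvalue) + 1) % 255
    return ret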

I designed the algorithm in Python. Python is great for that kind of stuff.

As you can see there are two functions: anything you encode with alphaicx can be decoded with betaicx, and vice versa. betaicx creates tougher code, though. This encryption is pretty lousy, but it is hard enough to stop spam bots.
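A quick round-trip check, restating the two functions in a slightly more compact form so the snippet runs on its own (Python 3). The sample address is made up, and the shortened cascade update in betaicx relies on the identity noted in its comment.

```python
def alphaicx(s):
    ret, casc = "", 0
    for ch in s:
        out = chr(ord(ch) ^ casc)
        ret += out
        casc = (ord(out) + 1) % 255   # cascade from the output character
    return ret

def betaicx(s):
    ret, casc = "", 0
    for ch in s:
        ret += chr(ord(ch) ^ casc)
        # (output ^ casc) is the input character again, so this matches
        # ((ord(ret[i]) ^ cascvalue) + 1) % 255 from the original.
        casc = (ord(ch) + 1) % 255    # cascade from the input character
    return ret

address = "someone@example.com"       # hypothetical address
assert betaicx(alphaicx(address)) == address
assert alphaicx(betaicx(address)) == address
print([ord(c) for c in betaicx(address)])  # the numbers a page would carry
```

The two are inverses because the side that derives its cascade from the output while encoding sees exactly the same characters as the side that derives it from the input while decoding.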

I’ve ported betaicx to PHP, and alphaicx to JavaScript. The running example (very useful though) has been updated.

The PHP/Javascript code for the function:

function JSBotProtect($text){
	$cxred = "0";
	$cascval = 0;
	for($i = 0; $i < strlen($text); $i++){
		$value = (ord($text[$i]) ^ $cascval);
		$cxred .= "," . $value;
		$cascval = (($value ^ $cascval) + 1) % 255;
	}
	return <<<EOF
<script type="text/javascript">var cxred=String.fromCharCode({$cxred});
var uncxred=""; var cascval=0;for(var i=1;i<cxred.length;i++)
{uncxred+=String.fromCharCode(cxred.charCodeAt(i)^cascval);
cascval=((uncxred.charCodeAt(i-1))+1)%255;}document.write(uncxred);</script>
EOF;
}

I’ll make the storage of the encoded string more compact. Probably just plain hex, or Base64 if I can get it working.

Protecting your email address against spam bots

Spam bots are getting smarter at harvesting email addresses these days. They usually start with a regex that searches for ‘.. dot .. tld’, which isn’t that resource-intensive. Once that matches, a more advanced regex is applied to extract the email address, somehow removing stuff like ‘spam’.

Plain JavaScript encoding doesn’t work anymore, because it isn’t that hard for a spider to recognize encoded strings and decode them, whether they sit in JavaScript code or in normal HTML escapes.
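As a small illustration of how cheap that decoding is, the sketch below encodes a made-up address as HTML numeric entities and as percent-escapes, then undoes both with Python's standard library. It only assumes the spider has already located the encoded string.

```python
import html
from urllib.parse import unquote

address = "someone@example.com"  # hypothetical address

# The two common "encodings": HTML numeric entities and %XX escapes.
as_entities = "".join("&#%d;" % ord(c) for c in address)
as_percent = "".join("%%%02X" % ord(c) for c in address)

# One standard-library call each is enough to get the address back.
assert html.unescape(as_entities) == address
assert unquote(as_percent) == address
print(as_entities)
print(as_percent)
```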

Therefore we need to get more inventive:

function JSBotProtect($text){
	$xorred = "0";
	$layer = "0";
	for($i = 0; $i < strlen($text); $i++){
		$layerbit = mt_rand(0, 255);
		$xorred .= "," . (string)(ord($text[$i]) ^ $layerbit);
		$layer .= "," . (string)$layerbit;
	}
	return <<<EOF
	<script type="text/javascript">
		var xorred = String.fromCharCode({$xorred});
		var layer = String.fromCharCode({$layer});
		var unxorred = "";
		for(var i = 1; i < xorred.length; i++){
			unxorred += String.fromCharCode(
				xorred.charCodeAt(i)^layer.charCodeAt(i));
		}
		document.write(unxorred);
	</script>
EOF;
}

This PHP function returns a block of JavaScript that stores a sensitive string, such as an email address, in two parts which, when XORed with each other, yield the original string.
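The same two-part idea, sketched in Python to make the arithmetic easy to follow. The names `xor_split` and `xor_join` are hypothetical, and `random.randrange(256)` stands in for PHP's `mt_rand(0, 255)`.

```python
import random

def xor_split(text):
    """Split text into two number lists that are meaningless on their own."""
    layer = [random.randrange(256) for _ in text]        # random mask per character
    xorred = [ord(c) ^ k for c, k in zip(text, layer)]   # masked characters
    return xorred, layer

def xor_join(xorred, layer):
    """XOR the two parts together again to recover the text."""
    return "".join(chr(x ^ k) for x, k in zip(xorred, layer))

address = "someone@example.com"  # hypothetical address
xorred, layer = xor_split(address)
assert xor_join(xorred, layer) == address
print(xorred)
print(layer)
```

Because the mask is random, the two lists differ on every page load, so a bot cannot simply look for a fixed encoded string.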

An implementation to get a mailto: link