After four years of checking the web pages it indexes for malware, and of issuing 3 million warnings a day to users about to click on search results that might deliver a dose of malware along with the information they wanted, Google has learned a few lessons about how hard malware is to spot.
It issued a report explaining how hard it is to identify malware embedded in a malicious site, and what it has done about it.
Usually the problem with tech-industry surveys is that the number of respondents or the volume of data isn't high enough to offer a statistically valid sample.
Not this time. Google's analysis covers 160 million web pages on 8 million sites, focusing on both the social and technical engineering malware distributors use to get their bits of nasty to launch on your hard drive.
The most frequent use of social engineering is to have malware pose as an anti-virus product or browser plugin, but only 2 percent of the sites went the social route.
Most are drive-by downloads, which exploit a weakness in the browser so that malware downloads and installs automatically instead of being blocked.
Drive-bys are where the malware and anti-malware are most directly at odds. Antivirus and operating-system vendors update their recognition tables and plug vulnerabilities so quickly that malware writers have to keep switching from one exploit to another to keep from being flagged and blocked.
The freakiest technique is "IP cloaking," a way to hide a malware-download site in plain sight by setting it up so it will show legitimate content to anything it identifies as a malware scanner and malicious content to everyone else.
In its blog post about the report, Google's security team makes a point of saying it runs its scans in different ways to make it harder for IP-cloaked sites to identify them as coming from a scanner rather than a browser.
Judging by the steep increase in use of cloaking for individual sites and entire domains, Google's emulation isn't all that effective.
Cloaking based on IP addresses is simple and dirt cheap, requiring only a few tweaks to the web server so it knows what content to serve to whom.
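To make the mechanism concrete, here is a toy model in Python. It is only a sketch and is not taken from Google's report: the function names, the page labels and the network list are all hypothetical, and the addresses come from ranges reserved for documentation. It shows the kind of decision a cloaked server makes, and why requesting the same page from more than one address can expose it.

```python
import ipaddress
from typing import Iterable

# Hypothetical list of networks the site operator believes belong to scanners.
# 192.0.2.0/24 is a reserved documentation range, used here as a placeholder.
SUSPECTED_SCANNER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_suspected_scanner(client_ip: str) -> bool:
    """True if the client address falls inside any suspected scanner network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in SUSPECTED_SCANNER_NETWORKS)


def choose_content(client_ip: str) -> str:
    """Toy cloaking decision: one page for suspected scanners, another for everyone else."""
    return "legitimate page" if is_suspected_scanner(client_ip) else "malicious page"


def looks_cloaked(vantage_ips: Iterable[str]) -> bool:
    """Defender's view: the same URL should look the same from every address."""
    responses = {choose_content(ip) for ip in vantage_ips}
    return len(responses) > 1


if __name__ == "__main__":
    # One address inside the suspected range, one outside it.
    print(looks_cloaked(["192.0.2.10", "203.0.113.5"]))
```

In this simplified model the check only works if at least one request comes from an address the site does not already suspect, which is presumably why varying how scans are run matters.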
The share of malware sites that hide behind IP cloaks went up from about 7 percent in 2008 to 49 percent in 2010, Google's report said.
Cloaking can also be used to increase traffic by offering more than one type of content or keywords to spiders.
That's why, according to Optimization-Nation, you can do a keyword search, click on a good result and end up on a page with no content related to the keyword.
The Google team talks a lot about what it does to get around the blocks malware writers put up, and about the four malware-detection methods it uses. Comparing the results lets it flag as malicious any site for which the different approaches don't match.
They don't say much about any long-term or highly successful countermeasure to IP cloaking. That, along with the impressive variety of really well-informed people who are really enthusiastic about cloaking and do not have the best interests of the rest of us in mind, makes me think we're going to see a lot more trouble with cloaked sites in the future than we have until now.
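The idea of comparing several detection methods and flagging disagreement can be sketched in a few lines of Python. This is an illustration under assumptions, not Google's implementation: the detector names, the sites and the verdicts below are made up, and each method is reduced to a simple true/false result.

```python
from typing import Dict, List


def disagreeing_sites(verdicts: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return, in sorted order, the sites whose detection methods did not all agree."""
    flagged = []
    for site, by_method in verdicts.items():
        # More than one distinct verdict means the methods saw different things.
        if len(set(by_method.values())) > 1:
            flagged.append(site)
    return sorted(flagged)


# Hypothetical results: True means "this method saw malicious behavior".
SAMPLE_VERDICTS = {
    "site-a.example": {"method_1": False, "method_2": False, "method_3": False, "method_4": False},
    "site-b.example": {"method_1": False, "method_2": True, "method_3": False, "method_4": True},
    "site-c.example": {"method_1": True, "method_2": True, "method_3": True, "method_4": True},
}

if __name__ == "__main__":
    print(disagreeing_sites(SAMPLE_VERDICTS))
```

Note that a site every method agrees is malicious, like the hypothetical site-c.example, is not returned here; it would be caught by the individual verdicts, while the mismatch check is aimed at sites that behave differently depending on who is looking.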
If there's one thing malware distributors have gotten good at, it's hitting the users where the malware scanners ain't. From a malware perspective, cloaking seems like a perfect answer.