Anyway, Ohye says, Google's algorithm does not even use the public PageRank ratings to decide how to rank search results. Instead, it uses a completely different, nonpublic database, whose values (fractional numbers rather than a zero-to-10 scale) are updated continuously.
To show how this works in practice, Ohye offers a hypothetical: A news site publishes a breaking story, and its owners find that many other sites are linking to the page with that story.
"They might decide to try to sell links from that page, since it has PageRank," she says, "so we may decide that the rest of the site is good, but we are unsure about the links from that particular page. So we adjust the database so that links from that article don't propagate PageRank. So many link buyers are getting nothing."
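The adjustment Ohye describes can be illustrated with a toy model. The sketch below is not Google's code: it uses the simplified, textbook form of PageRank, and the `blocked` parameter, the page names and the choice to spread a blocked page's score evenly across all pages are assumptions made purely for illustration.

```python
def pagerank(links, blocked=frozenset(), damping=0.85, iterations=50):
    """Simplified PageRank by power iteration (textbook form, not Google's).

    links:   dict mapping each page to the list of pages it links to.
    blocked: hypothetical set of pages whose outbound links pass no score.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if page in blocked or not targets:
                # Assumption: a blocked page is treated like a page with no
                # links, so its score is spread evenly over every page.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
            else:
                # Normal case: the score is split among the linked pages.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site graph: three sites link to a breaking story,
# and the story page carries a paid link to "buyer".
links = {
    "a": ["story"],
    "b": ["story"],
    "c": ["story"],
    "story": ["buyer"],
    "buyer": [],
}

before = pagerank(links)                     # paid link counts
after = pagerank(links, blocked={"story"})   # paid link passes nothing
```

In this toy graph, the buyer's score in `after` is lower than in `before`, because the story page no longer hands its score to the page it links to. That is the sense in which link buyers "are getting nothing."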
The first search engines rated pages purely on their content, spawning keyword-stuffing schemes in which selected words were added to a page to suggest that it was about a popular topic. The added words were then hidden using various coding tricks, such as using white text on a white background or indenting the text so far that it wouldn't appear on the screen.
A variant of this is cloaking, in which the search engine is shown one thing (usually text) and the user is shown something else (usually ads). There are also screen scrapes, which are spam sites composed of material copied from other sites just to draw traffic for ad revenue.
"Do any of these things, and you will probably get caught," Fox warns. "Once Google has found a site that uses one technique, they can use that knowledge to find all other sites that use that technique."
Finally, there is outright hacking, which involves taking over sites with poor security for use in linking schemes. Ohye says that hacking appears to follow two-year cycles, as new techniques erupt and are then brought under control with security patches. Currently, things are under control, she says.
Crime and punishment
Using these methods to boost your rankings carries risks. J.C. Penney got off lightly.
"The likelihood of being caught in a few days is low, within six months is pretty good, and within two to four years it's nearly impossible to avoid," Fishkin says. "I have seen black-hat techniques that worked for multiple years, but no one who used a particular technique in 2007 is using that technique today. It takes Google a while to catch up, but it does catch up."