Microsoft Research has released a new tool to help pinpoint large-scale typo-squatters that are known to be gaming pay-per-click domain parking services.
The lightweight prototype, called Strider URL Tracer, builds on work within Microsoft’s Cybersecurity and Systems Management group to keep tabs on a sophisticated typo-squatting scheme that uses multilayer URL redirection to make money from Google’s AdSense for domains program.
Yi-Min Wang, who heads up the group’s work in Redmond, Wash., said URL Tracer can be used as a parental control tool to block inappropriate ads served from Web sites that are set up to deliberately lure kids who accidentally misspell a popular domain.
One live example, Wang said, is the way the virtual pet site at NeoPets.com has been targeted by typo-squatters to serve pornographic-themed ads if it is misspelled. One such misspelling, neoppets.com, is currently serving ads promising naked photos of Britney Spears or other adult images.
He said the group analyzed typo-squatting on 50 popular children’s sites and found more than 7,000 typo-domains. About 2,685 of those domains were active, and a total of 110 were serving questionable content.
“Four domains redirected to adult sites directly, 36 domains contained at least one conspicuous link to an adult site, and the remaining domains displayed at least one conspicuous adult-category link to a page of adult ads listings,” Wang said.
Most of the ads were being served from Oingo.com, a domain parking service that powers Google’s popular AdSense for domains program. The domain parking service is aimed at Web sites that generate more than 750,000 page views per month and, according to Google’s own boast, AdSense for domains is now powering over 3 million domain names.
“This is a huge, lucrative business,” Wang said, noting that the typo-squatters have been monitoring his group’s published work “on a daily basis” and have been moving domains between parking services to dodge detection.
Wang’s group has meticulously tracked the typo-squatting scheme for several months as part of its Strider Typo-Patrol project, and he says it’s clear that big-name trademark owners with high-traffic Web sites are a major target.
In an interview with eWEEK, Wang said URL Tracer can also serve as a typo-patrol tool used by trademark owners who want to monitor typo-domains. “It is often too expensive for target-domain owners to investigate and take actions against a large number of individual typo-domains,” he said, adding that a feature built into URL Tracer can take a target domain name and automatically generate and scan its typo-neighborhood.
The tool uses five programmatic typo-generation models—deliberate missing-dot typos, character omission typos, character permutation typos, character replacement typos and character insertion typos—to pinpoint potential domain-registration structures that are being used to steal traffic from large brands.
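The five models can be illustrated with a short Python sketch. It is not URL Tracer's actual code: the function name `generate_typos` is hypothetical, and the choice to replace or insert any lowercase letter is an assumption, since the article does not say which characters the tool tries.

```python
import string


def generate_typos(name, tld="com", alphabet=string.ascii_lowercase):
    """Return candidate typo-domains for name.tld, keyed by typo model."""
    name = name.lower()
    n = len(name)
    models = {
        # "www.neopets.com" typed without the dot after "www".
        "missing_dot": {"www" + name},
        # One character dropped.
        "omission": {name[:i] + name[i + 1:] for i in range(n)},
        # Two adjacent characters swapped.
        "permutation": {
            name[:i] + name[i + 1] + name[i] + name[i + 2:] for i in range(n - 1)
        },
        # One character replaced (assumption: any letter of `alphabet`).
        "replacement": {
            name[:i] + c + name[i + 1:] for i in range(n) for c in alphabet
        },
        # One character inserted (assumption: any letter of `alphabet`).
        "insertion": {
            name[:i] + c + name[i:] for i in range(n + 1) for c in alphabet
        },
    }
    # Drop empty labels and the correctly spelled name itself.
    return {
        model: {f"{label}.{tld}" for label in labels if label and label != name}
        for model, labels in models.items()
    }
```

Under these assumptions, `generate_typos("neopets")["insertion"]` contains `neoppets.com`, the misspelling Wang cited, and the `missing_dot` set contains `wwwneopets.com`.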
Wang said high-traffic properties that are a constant target include MySpace.com, Slashdot, Amazon.com, Expedia, Washington Post, New York Times, Microsoft.com and DisneyChannel.com. Several major banking and financial services Web sites are also constantly targeted with deliberately misspelled domains, he said.
The URL Tracer utility provides four main functionalities. It supports a “URL Scan History” view that records the time stamp of each primary URL visited and its associated secondary URLs, grouped by domains. It also supports an alternative “Top Domains” view that, for each secondary URL domain, displays all the visited primary URLs that generated traffic to it.
For every URL displayed in either of the views, the tool provides a right-click menu with two options: the “Go” option that allows the URL to be revisited (so that the user can figure out which ad came from which URL) and the “Block” option that allows blocking of all future traffic to and from that domain.
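The relationship between the two views can be sketched in Python as a simple inversion of the scan history. The record layout (a timestamp, a primary URL and a list of secondary URLs) and the function names are assumptions made for illustration; the article does not describe how URL Tracer stores its data.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_domain(secondary_urls):
    """Group one visit's secondary URLs by host, as in the scan-history view."""
    grouped = defaultdict(list)
    for url in secondary_urls:
        grouped[urlsplit(url).hostname].append(url)
    return dict(grouped)


def top_domains(scan_history):
    """Map each secondary-URL domain to the primary URLs that led to it.

    scan_history is a list of (timestamp, primary_url, secondary_urls)
    tuples; this layout is hypothetical.
    """
    index = defaultdict(set)
    for _timestamp, primary_url, secondary_urls in scan_history:
        for url in secondary_urls:
            index[urlsplit(url).hostname].add(primary_url)
    return dict(index)
```

A domain that shows up against many primary URLs in the second mapping is a natural candidate for the “Block” option.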
“It’s basically an extension of HoneyMonkey,” Wang said, referring to another project within his group that helps Microsoft’s security teams find the source of zero-day exploits targeting the Windows XP operating system.
The Typo-Patrol scanner built into the tool currently consists of a network of 17 machines, each running a daemon process that monitors its own input-request queue residing in a folder on a central management machine. According to Wang, when a list of typo-domains is dropped into the queue, the daemon fetches the list and launches virtual machines to visit each domain.
The daemon copies all recorded data to the host machine, including information on all secondary URLs visited, the content of all HTTP requests and responses, and optionally a screen shot. Upon completing the scan of the entire list, the daemon copies all data to its output folder on the central management machine, Wang said.
Recorded data in the output folder is inserted into a typo-domain database for data queries and analysis.
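The queue-driven workflow Wang describes can be approximated with the following Python sketch of a single daemon pass. Everything specific here is an assumption: the `.txt` list format with one domain per line, the JSON output, and the `visit` callable, which stands in for launching a virtual machine and recording its traffic.

```python
import json
from pathlib import Path


def process_queue(queue_dir, output_dir, visit):
    """Process every pending typo-domain list once and return the files written.

    visit(domain) must return a dict of recorded data (for example
    secondary URLs); it is a placeholder for the virtual-machine visit.
    """
    queue_dir, output_dir = Path(queue_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for list_file in sorted(queue_dir.glob("*.txt")):
        # One domain per line; blank lines are ignored.
        lines = list_file.read_text().splitlines()
        domains = [line.strip() for line in lines if line.strip()]
        results = [{"domain": d, **visit(d)} for d in domains]
        # Write the output only after the whole list has been scanned.
        out_file = output_dir / (list_file.stem + ".json")
        out_file.write_text(json.dumps(results, indent=2))
        list_file.unlink()  # remove the request from the queue
        written.append(out_file)
    return written
```

A real daemon would call a function like this in a polling loop, and a separate step would load the output files into the typo-domain database.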