Several data management challenges arise in the context of Internet advertising networks, where Internet advertisers pay Internet publishers to display advertisements on their Web sites and drive traffic to the advertisers from surfers' clicks. Although advertisers can target appropriate market segments, the model allows dishonest publishers to defraud the advertisers by simulating fake traffic to their own sites to claim more revenue. This paper addresses the case of publishers launching fraud attacks from numerous machines, which is the most widespread scenario. The difficulty of uncovering these attacks is proportional to the number of machines and resources exploited by the fraudsters. In general, detecting this class of fraud entails solving a new data mining problem, which is finding correlations in multidimensional data. Since the dimensions have large cardinalities, the search space is huge, which has long allowed dishonest publishers to inflate their traffic and deplete the advertisers' advertising budgets. We devise the approximate SLEUTH algorithms to solve the problem efficiently and uncover single-publisher frauds. We demonstrate the effectiveness of SLEUTH both analytically and by reporting some of its results on the Fastclick network, where numerous fraudsters were discovered.
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Computer Science (all)