New CMU algorithm detects online fraud
A team at Carnegie Mellon has developed an algorithm that can tell whether a review on Amazon or Yelp has been faked or whether a politician has bought their Twitter followers.
Christos Faloutsos, a professor of machine learning and computer science at Carnegie Mellon, devised the method, called FRAUDAR, along with his data analytics team. This new algorithm will help social media platforms identify fraudulent users and see through the camouflage that makes them look legitimate.
“We’re not identifying anything criminal here, but these sorts of frauds can undermine people’s faith in online reviews and behaviors,” Faloutsos said in a university press release.
He noted in that same release that most social media platforms try to flush out such fakery, and FRAUDAR’s approach could be useful in keeping up with the latest practices of fraudsters.
FRAUDAR relies on graph mining, a method of analyzing data for patterns, which is a specialty of Faloutsos and his group.
Each social media interaction is plotted on a graph, with users represented by dots and interactions between users represented by lines.
Tools like Faloutsos’ NetProbe look for a pattern called a “bipartite core”: a group of users that interact with members of a second group, but not with each other.
This pattern is indicative of fraudulent accounts, which follow other accounts to inflate their reputation. These frauds have fake interactions with real accounts and can post either flattering or unflattering reviews depending on their purpose.
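The graph representation and the bipartite-core pattern described above can be illustrated with a minimal Python sketch. It assumes interactions are stored as (user, target) pairs; the function name `is_bipartite_core` and the account names are hypothetical and do not come from NetProbe or FRAUDAR.

```python
def is_bipartite_core(edges, group_a, group_b):
    """Return True if every member of group_a interacts with every member
    of group_b and nobody interacts with a member of their own group.

    edges is an iterable of (source, target) pairs, an assumed format.
    """
    edge_set = set(edges)
    group_a, group_b = set(group_a), set(group_b)

    # Every account in the first group is linked to every account in the second.
    complete = all((a, b) in edge_set for a in group_a for b in group_b)

    def has_internal_link(group):
        # True if any two distinct members of the same group are linked.
        return any((x, y) in edge_set for x in group for y in group if x != y)

    return (complete
            and not has_internal_link(group_a)
            and not has_internal_link(group_b))


# Hypothetical toy data: three accounts that all follow the same two targets.
edges = [(bot, target)
         for bot in ("bot1", "bot2", "bot3")
         for target in ("shopA", "shopB")]
print(is_bipartite_core(edges, ["bot1", "bot2", "bot3"], ["shopA", "shopB"]))
```

A check like this only confirms a pattern for two groups that are already known; finding such groups in a graph with millions of users is the hard part that graph-mining tools address.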
In recent years, however, fraudulent accounts have learned how to blend in, linking their accounts to popular sites and celebrities or simply hijacking legitimate accounts, which is where FRAUDAR comes in.
The algorithm first identifies legitimate accounts, essentially accounts that follow a few people, post occasional reviews, and otherwise behave normally.
As the program eliminates legitimate accounts, the camouflage of the fraudsters becomes more obvious, and bipartite cores become easier to spot.
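The idea of stripping away ordinary accounts until a dense suspicious block stands out can be sketched with a generic greedy "peeling" heuristic. This is not the FRAUDAR code: it assumes a plain edges-per-node density score, string account names, and a hypothetical function name `densest_block`, none of which are taken from the article.

```python
from collections import defaultdict


def densest_block(edges):
    """Greedy peeling: repeatedly drop the least-connected account and
    remember the densest set of accounts seen along the way.

    edges is an iterable of (follower, followee) pairs (assumed format).
    Density here is edges per node, an assumption of this sketch.
    """
    neighbors = defaultdict(set)
    for follower, followee in edges:
        # Tag each side so the two halves of the bipartite graph stay distinct.
        u, v = ("follower", follower), ("followee", followee)
        neighbors[u].add(v)
        neighbors[v].add(u)

    remaining = set(neighbors)
    edge_count = sum(len(n) for n in neighbors.values()) // 2
    best_density, best_nodes = 0.0, set()

    while remaining:
        density = edge_count / len(remaining)
        if density > best_density:
            best_density, best_nodes = density, set(remaining)
        # The account with the fewest remaining links looks the most ordinary.
        victim = min(remaining, key=lambda n: (len(neighbors[n]), n))
        for other in neighbors[victim]:
            neighbors[other].discard(victim)
        edge_count -= len(neighbors[victim])
        del neighbors[victim]
        remaining.remove(victim)

    return best_nodes, best_density


# Hypothetical toy data: three bots follow the same three targets,
# alongside a few ordinary users with one follow each.
edges = [(b, t) for b in ("b1", "b2", "b3") for t in ("t1", "t2", "t3")]
edges += [("u1", "t1"), ("u2", "c1"), ("u3", "c2")]
block, density = densest_block(edges)
print(sorted(block), density)
```

On this toy graph the ordinary users are peeled away first, and the block that remains is the group of three bots and their three targets.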
To test the algorithm, the team applied FRAUDAR to Twitter data from 2009 covering 41.7 million users and 1.47 billion followers. FRAUDAR was able to identify more than 4,000 accounts with suspicious activity that had not previously been identified as fraudulent, including accounts that used TweepMe and TweeterGetter, known follower-buying services.
The group then selected 125 followers and 125 followees at random from the suspicious groups, along with two control groups of 100 users each.
The accounts were then examined for links to malware or scams and for robot-like behavior. The researchers found that 57 percent and 40 percent of the followers and followees, respectively, in the suspicious groups were labeled fraudulent.
In comparison, only 12 percent and 25 percent of followers and followees, respectively, in the control groups were labeled fraudulent.
The group found that 41 percent and 26 percent of followers and followees, respectively, from the suspicious accounts included advertising for follower-buying services. There were few mentions of such activity found in the control groups.
“The algorithm is very fast and doesn’t require us to target anybody,” Faloutsos said in the university press release.
FRAUDAR has been made available as open-source code. “We hope that by making this code available as open source, social media platforms can put it to good use,” Faloutsos said.