A team of researchers and academics from universities around the world has come together to develop an algorithm to combat human trafficking. The algorithm, called InfoShield, can scan hundreds of thousands of online escort advertisements to help identify victims of human trafficking.
It is currently being studied for effectiveness and could prove to be an important tool in addressing an increasingly disruptive practice affecting young girls and women around the world.
“Given a million escort ads, how can we spot near duplicates? Such micro-clusters of advertisements are usually signals of human trafficking. How can we visually summarize them to get law enforcement to act? Can we make a common tool that works for different languages? The detection of micro-clusters of near-duplicate documents is useful in several additional settings, including detecting spam bots in Twitter ads, plagiarism, and more,” the study explains.
“While INFOSHIELD is general, our main motivation lies in near-duplicate detection and summarization in escort ads. Human trafficking (HT) is a dangerous societal problem that is difficult to address. It is estimated that 24.9 million people are in forced labor, 55% of whom are women and girls, who make up 99% of the victims in the commercial sex industry,” the study adds.
The algorithm examines large batches of online ads and flags clusters of near-identical ads as a potential signal that the people advertised may be linked to human trafficking. The team’s research has shown that typically one person controls the accounts of four to six victims at the same time.
“By looking for small clusters of ads that contain similar wording rather than analyzing standalone ads, we can find the ad groups that are most likely to be organized activities, which is a strong signal for human trafficking (HT),” the study adds.
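The idea of grouping ads by shared wording can be illustrated with a minimal sketch. This is not InfoShield's actual method; it swaps in a simple, well-known technique (word shingles compared by Jaccard similarity, grouped with union-find). The function names, the threshold of 0.3, and the sample ads are all made up for illustration.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word n-grams) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def micro_clusters(ads, threshold=0.3, k=3):
    """Group ads whose shingle sets overlap at or above `threshold`.

    Compares every pair, so it is quadratic and only meant for small,
    illustrative inputs (an assumption of this sketch).
    """
    sets = [shingles(ad, k) for ad in ads]
    parent = list(range(len(ads)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(ads)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(ads)):
        groups.setdefault(find(i), []).append(i)
    # Singletons are standalone ads, not clusters, so drop them.
    return sorted(g for g in groups.values() if len(g) > 1)


# Hypothetical sample ads; index 2 is unrelated to the others.
ads = [
    "new in town sweet girl available tonight call 555-0101",
    "new in town sweet girl available tonight call 555-0199",
    "used bicycle for sale good condition pick up downtown",
    "new in town sweet lady available tonight call 555-0101",
]
clusters = micro_clusters(ads)
print(clusters)  # [[0, 1, 3]]
```

Because grouping is transitive, the fourth ad joins the cluster through its overlap with the first even though it overlaps less with the second. Comparing every pair would not scale to millions of ads, so a real system would need a more scalable candidate-selection step.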
“Our algorithm can summarize the millions of ads and highlight the common parts. If they have many things in common, it is not guaranteed, but it is very likely, that there is something suspicious,” explains Christos Faloutsos, co-author of the study.
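To make “highlight the common parts” concrete, here is a small sketch that folds a cluster of similar ads into one template, keeping the shared words and replacing the differing stretches with a placeholder. It uses Python's standard `difflib.SequenceMatcher` as a stand-in and is an assumption of this sketch, not the study's summarization method; `common_template`, the `___` slot marker, and the sample ads are hypothetical.

```python
from difflib import SequenceMatcher


def common_template(ads, slot="___"):
    """Fold a cluster of ads into one template of shared words.

    Words present in every ad (in order) are kept; any stretch that
    differs between ads is replaced by a single `slot` marker.
    """
    template = ads[0].split()
    for ad in ads[1:]:
        words = ad.split()
        matcher = SequenceMatcher(a=template, b=words, autojunk=False)
        merged = []
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            if tag == "equal":
                merged.extend(template[i1:i2])
            elif not merged or merged[-1] != slot:
                # replace / insert / delete: the ads disagree here.
                merged.append(slot)
        template = merged
    return " ".join(template)


# Hypothetical cluster of near-duplicate ads.
cluster = [
    "new in town sweet girl available tonight call 555-0101",
    "new in town sweet girl available tonight call 555-0199",
    "new in town sweet lady available tonight call 555-0101",
]
template = common_template(cluster)
print(template)  # new in town sweet ___ available tonight call ___
```

The slots show where the ads vary, such as a name or phone number, while the fixed text shows what a single author may have reused across ads.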
The algorithm’s potential is considerable, as it is able to search four million documents in about eight hours on a standard laptop. This is a job that would take a team of investigators weeks or months to complete with far less accuracy, which is why it could be an important tool for future law enforcement.
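As a quick back-of-the-envelope check, the reported figures (four million documents in about eight hours) translate into a per-document rate as follows. Only those two numbers come from the article; the rest is plain arithmetic.

```python
docs = 4_000_000   # documents processed, as reported
hours = 8          # approximate wall-clock time, as reported

docs_per_second = docs / (hours * 3600)
ms_per_doc = 1000 / docs_per_second

print(f"{docs_per_second:.0f} documents per second")  # 139 documents per second
print(f"{ms_per_doc:.1f} ms per document")            # 7.2 ms per document
```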
“Our experiments with real-world data show that INFOSHIELD correctly identifies Twitter bots with an F1 score of over 90% and detects human trafficking ads with a precision of 84%,” the study emphasizes.
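For readers unfamiliar with the metrics, the F1 score is the harmonic mean of precision (the share of flagged items that are truly positive) and recall (the share of true positives that get flagged). The sketch below shows the standard definitions; the counts are invented purely for illustration and are not taken from the study.

```python
def precision(tp, fp):
    """Fraction of flagged items that are true positives."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Fraction of actual positives that were flagged."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical counts: 84 correct flags, 16 false alarms, 21 misses.
tp, fp, fn = 84, 16, 21
print(f"precision = {precision(tp, fp):.2f}")  # precision = 0.84
print(f"recall    = {recall(tp, fn):.2f}")     # recall    = 0.80
print(f"F1        = {f1_score(tp, fp, fn):.2f}")  # F1        = 0.82
```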
It remains to be seen whether InfoShield will find its way into the field, but its early results make a strong case for putting it to use.
[Image – Photo by Alexandar Todov on Unsplash]