ISLAMABAD: Around 60 researchers from 30 academic institutions in 11 countries have been selected to study the impact of social media on elections using Facebook data.
According to a press release issued by Facebook, the researchers were selected independently by Social Science One and the Social Science Research Council, which Facebook describes as its “partners” in the initiative.
As part of the research initiative, Facebook said that it had been working on building “a first-of-its-kind data sharing infrastructure to provide researchers access to Facebook data in a secure manner that protects people’s privacy.”
According to Facebook, the researchers will have access to three data sets.
One data set is called CrowdTangle. It comprises posts from public pages, public groups and verified profiles on Facebook and Instagram. According to Facebook, CrowdTangle will allow “researchers to track the popularity of news items and other public posts across social media platforms.”
The second data set is called the Ad Library API. It will house data on ads related to politics or issues on Facebook in the US, UK, Brazil, India, Ukraine, Israel and the EU.
The third data set to which researchers will have access is called Facebook URL data sets. It includes URLs that have been shared on Facebook by at least 100 unique Facebook users on average who have posted the URL with public privacy settings. According to Facebook: “This dataset includes the URL link and information on the total shares for a given URL, a text summary of content within the URL, engagement statistics such as the top country where the URL was shared, and information related to the fact-checking ratings from our third-party fact-checking partners.”
While the researchers will have access to the first two data sets from today, the third data set will not be accessible until a training session scheduled for June.
Regarding the safety and security of data sets, Facebook noted: “We’ve consulted with some of the country’s leading external privacy advisors and the Social Science One privacy committee for recommendations on how best to ensure the privacy of the data sets shared and have rigorously tested our infrastructure to make sure it is secure. Some of these steps include building a process to remove personally identifiable information from the data set and only allowing researcher access to the data set through a secure portal that leverages two-factor authentication and a VPN. In addition to building a custom infrastructure, we’re also testing the application of differential privacy, which adds statistical noise to raw data sets to make sure an individual can’t be re-identified without affecting the reliability of the results. It also limits the number of queries a researcher can run, which ensures the system cannot be repeatedly queried to circumvent privacy measures.”
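The two ideas Facebook mentions, adding statistical noise and capping the number of queries, can be illustrated with a minimal Python sketch. It is not Facebook's implementation: the class `NoisyCounter`, its parameters `epsilon` and `max_queries`, and the choice of Laplace noise (a common textbook mechanism for differential privacy) are all assumptions made here for illustration.

```python
import random


class NoisyCounter:
    """Hypothetical helper: answers count queries with Laplace noise,
    and refuses to answer once a fixed query budget is used up."""

    def __init__(self, epsilon, max_queries, seed=None):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        self.epsilon = epsilon            # smaller epsilon means more noise
        self.queries_left = max_queries   # cap on how many queries may run
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two independent exponential draws with mean
        # `scale` follows a Laplace distribution centred on zero.
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def noisy_count(self, true_count):
        if self.queries_left <= 0:
            raise RuntimeError("query budget exhausted")
        self.queries_left -= 1
        # Adding or removing one person changes a count by at most 1,
        # so the noise scale is 1 / epsilon.
        return true_count + self._laplace(1.0 / self.epsilon)


if __name__ == "__main__":
    counter = NoisyCounter(epsilon=0.5, max_queries=2, seed=1)
    print(counter.noisy_count(100))  # close to 100, but not exact
    print(counter.noisy_count(100))  # a different noisy answer
    # A third call would raise RuntimeError: the budget is spent.
```

In this sketch the noise averages out over many individuals, so aggregate results stay usable, while the query cap stops a researcher from repeating the same query to average the noise away.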
The company expressed hope that this testing would allow it to open up more data sets to the research community safely and securely.