Using Big Data to sample ethnic minorities in western European countries: Muslims
Carsten Broich (Sample Solutions BV)
Maja Koceva
Keywords: Social media, big data, sentiment analysis, and emerging technologies
Abstract
A Random Digit Dialing (RDD) sample consists of randomly generated phone numbers, usually stratified by specific parameters. RDD samples are therefore commonly used for nationally representative studies. Often, however, a study needs to target a specific audience for which a pure RDD sample yields a low incidence rate, making it inappropriate because of the high cost. Other sampling methodologies are then needed. The target audience may be defined by age, gender, ethnicity, and so on. In today's multicultural societies, studies of ethnic groups are of great importance. Given the political events in the Middle East and the migration of Muslim populations from the Middle East to Western and Northern European countries, Muslim groups in various non-Muslim countries are attracting considerable attention, and many research agencies are interested in understanding their behavior.
Muslim populations within a general population can be sampled in various ways. In recent years, an onomastic (name-based) approach applied to listed samples or targeted RDD samples has been used, which has led to large biases.
This paper outlines three main methodologies for target audience selection: the list-assisted approach, the geo-targeted RDD sample, and the social media approach. The list-assisted approach uses residential listings containing phone numbers for a specific country and narrows the general population down to the target population. The geo-targeted RDD sample focuses on large cities, while small rural areas are excluded because it is difficult to identify the smallest administrative units populated by Muslims. The social media approach uses a subset of a generated cell phone RDD sample that is linked to various social media and public data sets (Big Data). The paper also presents the results of a pilot social survey of Muslims in various Western European countries. The theoretical advantages and disadvantages of each approach are thus complemented by those observed during the fieldwork for this quota sample. Lastly, proposed improvements to overcome some of the disadvantages are outlined. The recommendations aim at better target audience sampling in the future using a true probability sample.
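To illustrate why a pure RDD sample becomes costly at low incidence, the following minimal Python sketch generates random phone numbers from a set of prefixes and estimates how many numbers must be dialed per completed interview. It is only an illustration under stated assumptions: the function names, the prefixes, and the working-number and cooperation rates are hypothetical and are not taken from the pilot project.

```python
import random


def generate_rdd_sample(prefixes, n, suffix_digits=6, seed=None):
    """Draw n distinct random phone numbers by appending random digits
    to known prefixes (hypothetical prefixes, assumed to be of equal length)."""
    capacity = len(set(prefixes)) * 10 ** suffix_digits
    if n > capacity:
        raise ValueError("requested more numbers than the prefixes can yield")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < n:
        prefix = rng.choice(prefixes)
        suffix = "".join(rng.choice("0123456789") for _ in range(suffix_digits))
        numbers.add(prefix + suffix)
    return sorted(numbers)


def expected_dials_per_complete(incidence_rate, working_rate, cooperation_rate):
    """Expected numbers dialed per completed interview, assuming the three
    rates are independent (a simplifying assumption)."""
    return 1.0 / (incidence_rate * working_rate * cooperation_rate)


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    sample = generate_rdd_sample(["+31611", "+31612"], n=5, seed=1)
    print(sample)
    print(expected_dials_per_complete(0.05, 0.5, 0.4))  # low-incidence target group
    print(expected_dials_per_complete(0.50, 0.5, 0.4))  # high-incidence target group
```

Under these assumed rates, a target group with 5% incidence requires about 100 dialed numbers per completed interview, ten times as many as a group with 50% incidence, which is the cost problem that motivates the targeted approaches described above.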