Supporting refugees and host communities through data-driven humanitarian action

David Pastor, Big Data expert at itdUPM

According to UNHCR, there are 70.8 million forcibly displaced people in the world. Of these, 41.3 million are internally displaced, 25.9 million are refugees and 3.5 million are asylum seekers.

Right now, UNHCR identifies 23 refugee situations, three of which (Syria, South Sudan and Afghanistan) account for 57% of displaced refugees. Most of this displaced population (80%) settles in countries neighbouring their countries of origin. Turkey hosts 3.7 million refugees, followed by Pakistan, Uganda, Sudan and Germany.

To understand these problems and find human-centric solutions, new methodologies and strategies need to be designed and implemented. Data and AI-driven mechanisms can change how humanitarian actors and policy makers manage refugee situations at different stages. Here we give an overview of the opportunities offered by data, specifically mobile phone data, also known as Call Detail Records (CDRs). This data, collected by telecom operators, describes communication events: calling user, called user, timestamp, geolocation and other complementary fields. After anonymization and aggregation, it can be used to derive indicators well suited to enhancing humanitarian operations.
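As an illustration of the anonymization and aggregation step, the following minimal Python sketch turns raw events into counts of distinct active users per antenna and hour. The record layout, the salted-hash pseudonymization and the `min_users` suppression threshold are assumptions made for this example; they are not the procedure of any particular operator or of the projects discussed here.

```python
import hashlib
from collections import defaultdict
from datetime import datetime


def pseudonymize(user_id: str, salt: str) -> str:
    """Replace a raw identifier with a salted hash (illustrative only)."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()[:16]


def aggregate_active_users(records, salt, min_users=2):
    """Count distinct calling users per (cell, hour).

    `records` is an iterable of (caller, callee, iso_timestamp, cell_id)
    tuples, a hypothetical layout. Groups with fewer than `min_users`
    distinct users are dropped so that very small groups are not exposed.
    """
    users = defaultdict(set)
    for caller, _callee, timestamp, cell_id in records:
        hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
        users[(cell_id, hour)].add(pseudonymize(caller, salt))
    return {key: len(ids) for key, ids in users.items() if len(ids) >= min_users}


# Toy data: three events in cell_1 between 10:00 and 11:00, one in cell_2.
records = [
    ("A", "B", "2019-06-20T10:05:00", "cell_1"),
    ("C", "A", "2019-06-20T10:40:00", "cell_1"),
    ("A", "C", "2019-06-20T10:50:00", "cell_1"),
    ("B", "C", "2019-06-20T11:15:00", "cell_2"),
]
result = aggregate_active_users(records, salt="demo-salt")
print(result)
```

In this toy data only the first cell survives the threshold, because the second one has a single active user. Real deployments would need stronger privacy guarantees than a salted hash and a fixed cutoff.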

CDRs allow the systematic analysis of mobility flows with high granularity. It is possible to characterize collective patterns, cluster populations and detect anomalies in order to better assess refugee situations and, potentially, trigger operations. This dynamic data is suitable for tracking affected populations in near real time. Visualizing their activity in time and space can help to understand how a situation evolves and to improve planning.
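One simple way to express mobility flows is an origin-destination count between regions. The sketch below assumes a hypothetical input of (pseudonymous user, ISO timestamp, region) observations and counts transitions between consecutive observations of the same user; it is an illustration, not the method used in the projects mentioned in this article.

```python
from collections import Counter, defaultdict


def od_flows(observations):
    """Count region-to-region transitions across all users.

    `observations` is an iterable of (user, iso_timestamp, region) tuples.
    ISO 8601 timestamps in the same format sort correctly as strings.
    """
    by_user = defaultdict(list)
    for user, timestamp, region in observations:
        by_user[user].append((timestamp, region))

    flows = Counter()
    for visits in by_user.values():
        visits.sort()  # chronological order per user
        for (_, origin), (_, destination) in zip(visits, visits[1:]):
            if origin != destination:  # ignore consecutive events in the same region
                flows[(origin, destination)] += 1
    return flows


observations = [
    ("u1", "2019-06-20T08:00:00", "R1"),
    ("u1", "2019-06-20T12:00:00", "R2"),
    ("u1", "2019-06-20T18:00:00", "R2"),
    ("u2", "2019-06-20T09:00:00", "R1"),
    ("u2", "2019-06-21T09:00:00", "R2"),
    ("u3", "2019-06-20T07:00:00", "R2"),
    ("u3", "2019-06-20T20:00:00", "R1"),
]
flows = od_flows(observations)
print(flows)
```

Aggregated over a day or a week, such counts can be compared against a baseline period to flag unusual movements between regions.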

Migration changes the distribution of the population, which affects the layout of refugee settlements, the emergence of migratory hubs and the configuration of neighbourhoods within cities. This has consequences for food supplies, access to public services, housing prices, and inequality and segregation in certain areas. This last challenge can be approached by measuring population dynamics and social networks using CDRs. Of note, integration is considered the most suitable solution for refugees who cannot be repatriated or resettled in a third country, and it is key to designing policies in host communities. Inclusiveness is also key to fostering self-development, strengthening host communities and reducing dependency on humanitarian aid, as highlighted by the Comprehensive Refugee Response Framework (CRRF). UNHCR Innovation, UN Global Pulse and UPM investigated how data can be used to monitor and understand integration.

CDRs make it possible to compute spatial segregation indicators, such as dissimilarity and isolation, with high spatial granularity. These metrics can be related to the qualitative indicators of residential segregation used by UNHCR. This is a significant improvement for monitoring and assessment systems, because variability can be understood across geographical units such as individual urban neighbourhoods. Future improvements include combining this data with other sources, such as accessibility data or house prices, to understand how refugees can gain access to basic rights and to ensure their integration.
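For reference, the two indicators named above can be sketched with their standard textbook definitions: the dissimilarity index D = 0.5 · Σ |r_i/R − h_i/H| and the isolation index Σ (r_i/R) · (r_i/(r_i + h_i)), where r_i and h_i are the refugee and host counts in spatial unit i. The per-unit counts below are invented, and how they would be estimated from CDRs is outside the scope of this sketch.

```python
def dissimilarity(refugees, hosts):
    """Dissimilarity index: 0 = identical distributions, 1 = full separation."""
    total_refugees, total_hosts = sum(refugees), sum(hosts)
    return 0.5 * sum(
        abs(r / total_refugees - h / total_hosts)
        for r, h in zip(refugees, hosts)
    )


def isolation(refugees, hosts):
    """Isolation index: average refugee share in the unit of a typical refugee."""
    total_refugees = sum(refugees)
    return sum(
        (r / total_refugees) * (r / (r + h))
        for r, h in zip(refugees, hosts)
        if r + h > 0  # skip empty units
    )


# Invented counts for four neighbourhoods.
refugees = [300, 100, 50, 50]
hosts = [200, 400, 700, 700]

d_index = dissimilarity(refugees, hosts)
i_index = isolation(refugees, hosts)
print(round(d_index, 3), round(i_index, 3))
```

Both functions take one value per spatial unit, so the same code applies whether the units are districts, neighbourhoods or antenna coverage areas.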

Mobile phone charging point in a refugee camp in Shire (Ethiopia)

Integration is considered a complex, multi-dimensional, non-linear and long-term dynamic process. Analysing how social networks evolve over time is key to measuring social segregation quantitatively. According to UNHCR, integration is achieved when refugees enjoy the same rights and participate in local activities and groups. CDRs can help reconstruct the social graph and obtain indicators that measure how refugees communicate with other groups and with locations that have a certain level of refugee population. Homophily and diversity of interconnections have been shown to be good proxies of integration, although they need to be enriched with other data. By labelling regions, the temporal evolution of these social networks can then be monitored.
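To make the two proxies concrete, here is a minimal sketch under simplifying assumptions: homophily is taken as the share of ties involving at least one refugee that connect two refugees, and diversity as the Shannon entropy of the regions a user's contacts belong to. These definitions, the edge list and the labels are illustrative choices; the studies cited here may define the indicators differently.

```python
import math
from collections import Counter


def refugee_homophily(edges, is_refugee):
    """Share of refugee-involving ties that link two refugees (0 to 1)."""
    internal = mixed = 0
    for u, v in edges:
        if is_refugee[u] and is_refugee[v]:
            internal += 1
        elif is_refugee[u] or is_refugee[v]:
            mixed += 1
    total = internal + mixed
    return internal / total if total else float("nan")


def contact_diversity(contact_regions):
    """Shannon entropy (natural log) of the regions of a user's contacts."""
    counts = Counter(contact_regions)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical social graph: a and b are labelled as refugees.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
is_refugee = {"a": True, "b": True, "c": False, "d": False}

homophily = refugee_homophily(edges, is_refugee)
diversity = contact_diversity(["R1", "R1", "R2", "R2"])
print(homophily, diversity)
```

Computing both values for successive time windows gives a simple time series for following how ties between groups change.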

By combining mobility and network analysis, it will be possible to discover the factors that drive migration at different scales among the settling population, and to evaluate the impact of policies and financial programmes.

Can we imagine human-machine, data-driven mechanisms that activate and optimize humanitarian action through real-time proxies? The implications for advocacy, global accountability and transparency are substantial. This way of working has great potential to overcome the limitations of current systems, which are based on in situ assessments carried out by humanitarian actors or governments. This vision requires innovation in both technology and management.

The challenges are significant. First, establishing a common framework for accessing data in real time from different operators in several countries with heterogeneous legal frameworks. Currently, working groups (EC B2G, UN Privacy Group) are discussing the ethics, privacy and business consequences of data-sharing approaches that leverage private sector data for development and humanitarian applications. Second, ensuring refugees’ privacy and safety through technology and protocols. Third, developing methodologies to identify the patterns of the population of interest among the aggregated patterns of the whole population. While AI can help identify behaviours when data on a control population is provided, this data is not available, so the characterization has to be done through holistic approaches. Finally, transforming organizations to be data-driven, which requires building new capacities.

In 2016 UNHCR, in collaboration with UN Global Pulse (UNGP), explored how mobile phone data could serve as the basis for monitoring systems. The conclusions presented here come from a project undertaken in Senegal in synergy with the D4D Challenge promoted by Orange. Although the project focused on Senegal, it offers clues on how to scale up data-driven operations, taking geographical, urban and social factors into account.

More recently, the international D4R Challenge made available aggregated and anonymized CDRs from Turkey, provided by Türk Telekom. On this occasion, the data included labels for Syrian refugee status, making it possible to characterize refugee patterns. Since Syrian migration to Turkey is the largest situation identified by UNHCR, this Challenge was a unique opportunity to understand how to improve humanitarian systems.

Overall, the indicators described are key to measuring the impact of humanitarian aid from individual, collective and systemic perspectives, and to helping design better policies for more sustainable and inclusive development. Much work remains to be done to leverage this technology in support of refugees and host communities, but it is an opportunity that we should not miss.