# Combining disparate data sources for improved poverty prediction and mapping

1. (a) Computer Science and Engineering, State University of New York, Buffalo, NY 14221;
2. (b) Earth and Life Institute–Environment, Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium

Edited by Anthony J. Bebbington, Clark University, Worcester, MA, and approved September 26, 2017 (received for review January 9, 2017)

Fig. 2.

Quantiles of predicted (Left) and actual (Right) MPI at the commune level. The urban centers are depicted by small circles on the map. The communes in the Dakar and Thiès regions are shown enlarged.
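As a rough illustration of how commune-level values can be binned into quantile classes for a map like this one, the following sketch uses only the Python standard library. The MPI values, the choice of five classes, and the helper name `quantile_classes` are assumptions made for illustration; the caption does not state how many classes the authors used.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_classes(values, n_classes=5):
    """Assign each value to a quantile class in 0..n_classes-1."""
    # Cut points that split the data into n_classes groups of similar size.
    cuts = quantiles(values, n=n_classes, method="inclusive")
    # A value's class is the number of cut points at or below it.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical commune-level MPI values (illustrative, not from the study).
actual_mpi = [0.05, 0.12, 0.33, 0.41, 0.47, 0.52, 0.58, 0.61, 0.66, 0.72]
predicted_mpi = [0.08, 0.10, 0.30, 0.45, 0.44, 0.55, 0.60, 0.57, 0.70, 0.69]

actual_cls = quantile_classes(actual_mpi)
predicted_cls = quantile_classes(predicted_mpi)

# Share of communes that land in the same class on both maps.
agreement = sum(a == p for a, p in zip(actual_cls, predicted_cls)) / len(actual_cls)
```

Because each map is binned on its own cut points, two maps can agree on a commune's class even when the underlying values differ slightly.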

Fig. S6.

Residual vs. fit plots to predict incidence of poverty (H) using CDR (Top) and environmental (Bottom) data. (Left) Linear (elastic net regression). (Right) Nonlinear (GPR). Linear model fits indicate nonlinearity in the data. The residuals for GPR are normally distributed. Shapiro–Wilk test statistic: CDR, 0.97 (P value $< 10^{-9}$); environmental, 0.95 (P value $< 10^{-9}$).
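To make concrete why a residual vs. fit plot can reveal nonlinearity, the sketch below fits a straight line to synthetic curved data and inspects the residuals. It uses plain ordinary least squares as a stand-in for the elastic net regression named in the caption, and the data and helper names (`fit_line`, `pearson`) are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


# Synthetic data with a quadratic relationship (not from the study).
x = [i / 10 for i in range(21)]  # 0.0, 0.1, ..., 2.0
y = [xi ** 2 for xi in x]

a, b = fit_line(x, y)
fitted = [a + b * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# OLS residuals are linearly uncorrelated with the fitted values...
r_linear = pearson(fitted, residuals)
# ...yet they still follow the curvature that the straight line missed.
curvature = [(xi - 1.0) ** 2 for xi in x]
r_curved = pearson(curvature, residuals)
```

A U-shaped pattern of this kind in the residuals is the visual cue that a nonlinear model may fit better.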

Fig. 3.

Predictive power of the Gaussian process model. Left denotes the comparison of actual and predicted MPI values for all communes and urban areas of Senegal. The rural and urban areas are differentiated using blue and red colors, respectively. The size of the circle denotes the variance of the MPI prediction for that commune. Top Right shows how the actual and predicted values compare for asset ownership, while Bottom Right shows the comparison for years of schooling.
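For readers unfamiliar with where a per-commune prediction variance comes from, the standard Gaussian process regression posterior at a test input $x_*$ is given below. This is the textbook form under a zero-mean prior with Gaussian observation noise; the kernel $k$ and noise variance $\sigma_n^2$ are placeholders, since this section does not state the authors' choices.

```latex
\mu_* = k_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} y,
\qquad
\sigma_*^2 = k(x_*, x_*) - k_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} k_*
```

Here $K$ is the kernel matrix over the training inputs, $k_*$ is the vector of kernel values between $x_*$ and the training inputs, and $y$ is the vector of observed targets. The variance $\sigma_*^2$ is the kind of quantity that a circle size in such a plot can encode.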

Fig. S5.

Relationship between precision of estimates of poverty and the population density of each commune.

Fig. S1.

Spearman correlation matrix between individual deprivations, H (headcount of poverty), A (intensity of poverty), and MPI at the commune level.
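A Spearman correlation matrix of this kind can be sketched in a few lines: rank each indicator, then take the Pearson correlation of the ranks. The indicator values below are hypothetical, and the helper names are illustrative. The relation MPI = H × A follows the standard definition of the index.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def spearman(u, v):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(u), average_ranks(v))


def spearman_matrix(columns):
    """Nested dict of pairwise Spearman correlations between named columns."""
    names = list(columns)
    return {a: {b: spearman(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical commune-level indicators (illustrative, not from the study).
data = {
    "H": [0.20, 0.35, 0.50, 0.65, 0.80],
    "A": [0.40, 0.42, 0.47, 0.55, 0.60],
    "schooling": [0.30, 0.10, 0.45, 0.50, 0.70],
}
data["MPI"] = [h * a for h, a in zip(data["H"], data["A"])]

matrix = spearman_matrix(data)
```

Because Spearman correlation depends only on ranks, it captures monotone relationships between deprivations even when they are not linear.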

Fig. S2.

Visualization of selected features using elastic net regularization on environmental data for prediction of selected deprivations. The rows represent the features, which are ranked according to their weights from positive (marked green) to negative (marked red). The feature groups are color-coded. Features related to food availability are shown in black, whereas those related to food accessibility are shown in green. The land cover features are shown in yellow, and the features detailing economic activity are shown in red. Finally, features depicting access to services are shown in blue. The cells in white were given 0 weights by our model.
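The zero-weight (white) cells are a direct consequence of the elastic net penalty. One common parameterization of the objective is shown below as a reminder; the symbols $\alpha$ (overall penalty strength) and $\rho$ (mix between the two penalties) are notation assumed here, since this section does not give the authors' exact formulation.

```latex
\hat{w} = \arg\min_{w} \;
\frac{1}{2n} \lVert y - Xw \rVert_2^2
+ \alpha \rho \lVert w \rVert_1
+ \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2,
\qquad \alpha \ge 0, \; 0 \le \rho \le 1
```

The $\ell_1$ term can drive individual weights exactly to zero, which performs feature selection, while the $\ell_2$ term stabilizes the weights of correlated features.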

Fig. S3.

Visualization of selected features using elastic net regularization on CDR data for prediction of selected deprivations. The rows represent features, which are ranked according to their weights from positive (marked green) to negative (marked red). The columns are the various deprivations. The feature groups are color-coded. Features related to diversity are colored blue. Those related to spatial aspects are colored yellow. The features related to active behavior are marked in black. The features related to basic phone use are in red, and those related to regularity are in green. The cells in white were given 0 weights by our model. The legend in parentheses gives the range over which the weights vary: H and A weights vary between 1.85 and $-1.85$, and the weights for the other deprivations vary between 5.5 and $-5.5$.

Fig. 4.

The uncertainty associated with each dataset evidenced by the most accurate one (denoted as CDR and ENV) for the average intensity of poverty (A) (Left) and prediction of the headcount of poverty (H) (Right).

Fig. S4.

The highest deprivation by commune as predicted by our model for each dimension of global MPI (from top to bottom: education, health, and standard of living).
