Abstract
Machine learning is increasingly used to inform public policy decisions; however, its application to social protection in developing societies has been largely overlooked. We employed an unsupervised machine learning technique, K-means clustering, to explore a big data set comprising 88 attributes and 570 instances, with the aim of better targeting households that are in urgent need of welfare from the government. The resulting clusters showed common patterns in the insecurities faced by the households in each cluster, including loss of income and property, unemployment, disasters, and disease. We found that households in rural jurisdictions face more severe insecurities than those in other localities and are in urgent need of social protection interventions. We conclude that, with K-means clustering, big data (even if it is limited) can be explored effectively to better target social protection interventions in both developing and smart societies. The approach is efficient because societies facing data constraints can use it and still achieve effective results in increasing the welfare of the poor.
Notes
As the purpose of this paper is not to explain social policy, only a very brief explanation is provided here.
88 attributes were collected for each household.
Australian National University ethics protocol: 2019/377
In Pakistan, the Multidimensional Poverty Index (MPI) is a way of measuring poverty. The MPI combines various deprivations that affect a household across three dimensions (education, health, and living standards), using 11 indicators spread across these three dimensions. A household is considered multidimensionally poor if it is deprived in at least 33% of the weighted indicators [40].
Details of the cities are provided in Table A of the appendix.
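The MPI cutoff described in the notes above can be illustrated with a minimal sketch. The indicator names and the equal weights below are hypothetical placeholders chosen for illustration; this section does not list the actual 11 indicators or their weights, so only the 33% cutoff comes from the text.

```python
def deprivation_score(deprived, weights):
    """Sum the weights of the indicators in which a household is deprived."""
    return sum(weights[name] for name, flag in deprived.items() if flag)


def is_multidimensionally_poor(deprived, weights, cutoff=0.33):
    """Apply the cutoff: poor if the weighted deprivation share is >= 33%."""
    return deprivation_score(deprived, weights) >= cutoff


# Hypothetical indicators, two per dimension, with equal weights summing to 1.
WEIGHTS = {
    "schooling": 1 / 6, "attendance": 1 / 6,        # education (assumed)
    "clinic_access": 1 / 6, "immunisation": 1 / 6,  # health (assumed)
    "water": 1 / 6, "electricity": 1 / 6,           # living standards (assumed)
}

# A hypothetical household deprived in two of the six indicators.
household = {name: False for name in WEIGHTS}
household["schooling"] = True
household["water"] = True

print(is_multidimensionally_poor(household, WEIGHTS))
```

With real data, the weights dictionary would be replaced by the official indicator weights, which must still sum to 1 for the 33% cutoff to be meaningful.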
In various ML techniques, labels are assigned to instances and training data is used to construct models. In k-means clustering, however, no training data is used and no labels are assigned to instances when forming clusters.
In DBSCAN, a dense region is a neighbourhood in which at least the minimum number of instances accumulate to establish a new cluster.
In Pakistan, both rural and urban areas fall within the geographical limits of a city. Rural areas generally refer to villages where urbanization is limited or has not taken place and where people rely on informal employment such as agriculture. Urban areas generally refer to cities where urbanization has taken place and opportunities for formal employment exist. The government administers rural areas through union councils and urban areas through municipal corporations.
During the survey, it emerged that some households took loans and others had to use their savings to meet marriage expenses; marriage is therefore considered a shock.
Insurance is provided in the form of rotating savings and credit associations, in which every member contributes to a common fund and receives cash in time of need.
Zakat is one of the five pillars of Islam and is mandatory for every Muslim who is financially stable. According to Islamic teachings, zakat is paid to poor and needy Muslims as an obligation, at a rate of 2.5% of wealth. It applies to every Muslim who owns 613.35 g of silver or 87.49 g of gold, or who owns one or more liable assets equal in value to 613.35 g of silver or 87.49 g of gold. Zakat is given to Muslims who are poor, have no source of income, etc.
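To make the point about unlabelled data concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the authors' implementation: the toy two-attribute points are hypothetical stand-ins for the 88-attribute household records, and the deterministic initialisation is an assumption made only so the sketch is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (assignments, centroids)."""
    # Assumption: start from evenly spaced rows so the run is reproducible.
    # Real analyses usually use several random initialisations instead.
    centroids = [tuple(points[i * len(points) // k]) for i in range(k)]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each instance joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no instance changed cluster, so the run has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assignments, centroids


# Hypothetical toy data: no labels are supplied, only attribute values.
households = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(households, k=2)
print(labels)
```

The cluster indices are produced by the algorithm itself; nothing in the input tells it which household belongs to which group.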
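The notion of a dense region can be sketched as follows. This is a simplified, stdlib-only illustration of DBSCAN, not the configuration used in the study: the parameter names `eps` (neighbourhood radius) and `min_pts` (minimum number of instances) and the toy points are assumptions chosen for the example.

```python
import math

NOISE = -1  # label for instances that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # not dense enough to start a cluster
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep expanding
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

Unlike k-means, the number of clusters is not fixed in advance here; it follows from how many dense regions the two parameters reveal.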
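The arithmetic in the zakat note can be written out as a short sketch. The metal prices are hypothetical inputs, and treating the lower of the two threshold values as the qualifying level is an assumption based on the note's "or" wording, not a statement of the rule itself.

```python
SILVER_THRESHOLD_GRAMS = 613.35  # from the note above
GOLD_THRESHOLD_GRAMS = 87.49     # from the note above
ZAKAT_RATE = 0.025               # 2.5% of wealth, from the note above


def zakat_due(wealth, silver_price_per_gram, gold_price_per_gram):
    """Return the zakat payable on `wealth`, or 0.0 below the threshold.

    Assumption: meeting either threshold is enough, so the lower of the
    two monetary values is used as the qualifying level.
    """
    threshold = min(
        SILVER_THRESHOLD_GRAMS * silver_price_per_gram,
        GOLD_THRESHOLD_GRAMS * gold_price_per_gram,
    )
    return wealth * ZAKAT_RATE if wealth >= threshold else 0.0


# Hypothetical prices in an arbitrary currency unit.
print(zakat_due(1000.0, silver_price_per_gram=1.0, gold_price_per_gram=60.0))
```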
References
Freeman C, Louçã F (2001) As time goes by: from the industrial revolutions to the information revolution. Oxford University Press, New York
Buchel O (2015) Big data: a revolution that will transform how we live, work, and think. J Inf Ethics 24(1):132
Patty JW, Penn EM (2015) Analyzing big data: social choice and measurement. PS: Polit Sci Polit 48(1):95–101
Grimmer J (2015) We are all social scientists now: how big data, machine learning, and causal inference work together. PS: Polit Sci Polit 48(1):80–83
Thierer AD, Castillo O’Sullivan A, Russell R (2017) Artificial intelligence and public policy. George Mason University, Mercatus Center
Naudé W, Dimitri N (2019) The race for an artificial general intelligence: implications for public policy. AI & Soc:1–13
Ballestar MT, Doncel LM, Sainz J, Ortigosa-Blanch A (2019) A novel machine learning approach for evaluation of public policies: an application in relation to the performance of university researchers. Technol Forecast Soc Chang 149:119756
Athey S (2017) Beyond prediction: using big data for policy problems. Science 355:483–485
Kemshall H (2002) Risk, social policy and welfare. Open University Press, Buckingham
Kangas O, Palme J (2009) Making social policy work for economic development: the Nordic experience: making social policy work. Int J Soc Welf 18:S62–S72
Gough I, Wood GD (2004) Insecurity and welfare regimes in Asia, Africa, and Latin America: social policy in development contexts. Cambridge University Press, Cambridge
Mumtaz Z, Whiteford P (2017) Social safety nets in the development of a welfare system in Pakistan: an analysis of the Benazir Income Support Programme. Asia Pac J Public Adm 39(1):16–38
Hindman H (2015) Building better models: prediction, replication, and machine learning in the social sciences. Ann Am Acad Pol Soc Sci 659(1):48–62
Andini M et al (2018) Targeting with machine learning: an application to a tax rebate program in Italy. J Econ Behav Organ 156:86–102
Kleinberg J, Ludwig J, Mullainathan S, Obermeyer Z (2015) Prediction policy problems. Am Econ Rev 105(5):491–495
Athey S, Imbens GW (2017) The state of applied econometrics: causality and policy evaluation. J Econ Perspect 31(2):3–32
Burscher B, Vliegenthart R, De Vreese CH (2015) Using supervised machine learning to code policy issues: can classifiers generalize across contexts? Ann Am Acad Pol Soc Sci 659(1):122–131
Kasy M (2018) Optimal taxation and insurance using machine learning — sufficient statistics and beyond. J Public Econ 167:205–219
Chalfin A, Danieli O, Hillis A, Jelveh Z, Luca M, Ludwig J, Mullainathan S (2016) Productivity and selection of human capital with machine learning. Am Econ Rev 106(5):124–127
Ashrafian H, Darzi A (2018) Transforming health policy through machine learning. PLoS Med 15(11):e1002692
Brady ES et al (2017) Machine-learning algorithms to code public health spending accounts. Public Health Rep (1974) 132(3):350–356
Pan I, Nolan LB, Brown RR, Khan R, van der Boor P, Harris DG, Ghani R (2017) Machine learning for social services: a study of prenatal case management in Illinois. Am J Public Health 107(6):938–944
Benites-Lazaro LL, Giatti L, Giarolla A (2018) Topic modelling method for analyzing social actor discourses on climate change, energy and food security. Energy Res Soc Sci 45:318–330
Hino M, Benami E, Brooks N (2018) Machine learning for environmental monitoring. Nat Sustain 1(10):583–588
Boran A (2012) Poverty: malaise of development. University of Chester Press, Chester
Aspalter C, Pribadi KT (2017) Development and social policy: the win-win strategies of developmental social policy. Routledge, Taylor and Francis Group, London and New York
Barrientos A (2013) Social assistance in developing countries. Cambridge University Press, Cambridge
World Bank (2018) The State of Social Safety Nets 2018. World Bank, Washington, DC
Abu Sharkh M, Gough I (2010) Global welfare regimes: a cluster analysis. Glob Soc Policy 10(1):27–58
Monchuk V (2013) Reducing poverty and investing in people: the new role of safety nets in Africa. The World Bank, Washington, DC
Pritchard B (2014) Feeding India: livelihoods, entitlements and capabilities. Routledge, New York and Oxfordshire
Hilbert M (2016) Big data for development: a review of promises and challenges. Dev Policy Rev 34(1):135–174
Rodriguez MZ et al (2019) Clustering algorithms: a comparative approach. PLoS ONE 14(1). https://doi.org/10.1371/journal.pone.0210236
Sze-To A, Wong AK (2018) Discovering patterns from sequences using pattern-directed aligned pattern clustering. IEEE Trans Nanobioscience 17(3):209–218
Xu D, Tian Y (2015) A comprehensive survey of clustering algorithms. Ann Data Sci 2(2):165–193
Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD, vol 96, no 34. AAAI Press, Portland, pp 226–231
Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65
Lovmar L, Ahlford A, Jonsson M, Syvänen AC (2005) Silhouette scores for assessment of SNP genotype clusters. BMC Genomics 6(1):35
Kodinariya TM, Makwana PR (2013) Review on determining number of cluster in K-means clustering. Int J 1(6):90–95
UNDP (United Nations Development Programme) (2016) Multidimensional poverty in Pakistan, Islamabad: UNDP. Accessible at: http://www.pk.undp.org/content/pakistan/en/home/library/hiv_aids/Multidimensional-Poverty-in-Pakistan.html. Accessed 22 Apr 2020
Acknowledgments
The authors would like to thank Mr. Sohail Sarwar for his valuable contribution in providing technical support for the successful completion of this paper.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Mumtaz, Z., Whiteford, P. Machine Learning Based Approach for Sustainable Social Protection Policies in Developing Societies. Mobile Netw Appl 26, 159–173 (2021). https://doi.org/10.1007/s11036-020-01696-z
DOI: https://doi.org/10.1007/s11036-020-01696-z