Microsoft partners with Argentinian Province to predict teenage pregnancies and school drop-outs


Local and federal governments in Argentina continue to invest in a future of algorithmic, data-driven governance. At the national level, the federal government has developed a database that profiles citizens based on socioeconomic data in order to allocate social benefits more efficiently, as well as a database that stores civilians' biometric data to support public safety and criminal investigations.

In June 2017, the local government of the province of Salta partnered with Microsoft to create and deploy two predictive tools, one for identifying teenage pregnancies and one for school dropouts (Ortiz Freuler and Iglesias 2018). Trained on private datasets made available by the province's Ministry of Early Childhood, the two systems identify the individuals judged to be at highest risk of teenage pregnancy or of dropping out of school, and alert governmental agencies when they flag a high-risk subject.

The systems generate predictions after analyzing up to 80 variables about each person (spanning categories such as personal, education, health, employment, housing, and family data) (ibid). Several of these variables have been found capable of producing undue discrimination (ibid). The systems have also been criticized for their opacity and their error rates. While the models use a type of machine learning algorithm called a "two-class boosted decision tree" (which is more interpretable and explainable than many alternatives), both systems still operate as black boxes. This lack of transparency makes it impossible for a citizen or researcher to trace the logic of the algorithm and understand why some subjects receive the 'high-risk' label. The systems also have significant error rates: the teenage pregnancy model produces a 15% false positive rate (incorrectly labeling individuals as high-risk when they should not receive that label) and the school dropout model a 20% false positive rate (ibid). Despite these criticisms, other provinces in Argentina (Tierra del Fuego, La Rioja, Chaco, and Tucumán) as well as the Colombian department of La Guajira are negotiating with the Salta government to adopt the systems.
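To make the error-rate criticism concrete, the sketch below shows how a false positive rate is conventionally computed: the share of truly low-risk individuals whom the model incorrectly flags as high-risk. The labels and predictions here are invented for illustration and are not drawn from the Salta systems, whose data and evaluation methodology are not public.

```python
def false_positive_rate(actual, predicted):
    """FP / (FP + TN): the fraction of truly low-risk individuals (label 0)
    that the model incorrectly flags as high-risk (prediction 1)."""
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return fp / (fp + tn)

# 20 hypothetical subjects: 1 = high-risk, 0 = low-risk (made-up data)
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# 2 false positives among 16 truly low-risk subjects -> 0.125
print(false_positive_rate(actual, predicted))
```

Note that a 15% false positive rate in this sense means 15% of the low-risk population is wrongly flagged, not 15% of the whole sample; at a population scale, either reading implies a large number of families incorrectly marked for state intervention.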

This case of algorithmic governance in Argentina also raises concerns about its place within a trend that scholars call digital colonialism: the foreign development and deployment of technological infrastructure within developing countries in order to extract their raw data (Kwet 2019). Just as historical colonial powers built railways and waterways to reach native peoples' natural resources, digital colonists build digital access routes to mine native peoples' raw data. Scholars have identified a pattern of LMICs (low- and middle-income countries) partnering with foreign firms to build data infrastructure in the name of development (Taylor and Broeders 2015). An often unforeseen consequence of these datafication projects (initiatives "generating digital data that is machine-readable and computationally manipulable, particularly for 'big data' analytics") is the creation of asymmetric power dynamics that exploit developing countries (ibid).

In this specific case, the two algorithmic systems arose from a public-private agreement between the government of Salta, a resource-starved administration facing pressing social problems, and Microsoft, one of the largest technology corporations in the world (ibid). Lacking the expertise and infrastructure to process the data itself, Salta outsourced that responsibility to Microsoft and granted the company full access to a database of sensitive civilian information. Given the project's opacity, it is unclear how Microsoft is analyzing, storing, and processing the province's data. Above all, the underlying power imbalance between an international corporation with a trillion-dollar valuation and an under-resourced Argentinian provincial government should raise ethical questions.