What Does Data Do?

@DavidIanSkinner on the socio-technical ‘racialization’ of our police service’s Big Data

Sociologists have rightly highlighted the key role of digital data in contemporary organisation and government. In their critical engagement with the Big Data debates and elsewhere, sociologists focus on the constructed and politicised character of data. But, in doing this important work, they have largely taken at face value a technocratic, functional view of data: i.e. that it is an instrument for knowing, operating, administering and organising. But perhaps we also need to be more curious about the varied, intended and unintended, social effects of data.

I will explain this by discussing my research into the police national forensic DNA database (NDNAD). My primary interest in the NDNAD has been the use of racialized data (that is, data organised using racial/ethnic categories) in the operation, governance, and contestation of the database. As I wrote in an article in Sociology last year, the NDNAD is racialized in a number of different ways. This can be seen firstly in the disproportionate numbers of people from minorities on the database: some estimates suggest that upwards of 70% of all ‘black’ men in the UK aged between 18 and 35 have profiles stored on the database. The DNA profiles in the NDNAD are routinely classified by “ethnic appearance,” which enables the monitoring of this disproportionate representation of non-white “ethnic minorities”. However, this classification also facilitates research aimed at developing techniques to “ethnically profile” unknown suspects using crime scene DNA.

This double-edged role of data in monitoring and profiling underlines the slipperiness of ‘race’ in these processes. This is further stressed by the varied systems of categorization at play. Until recently profiles were assigned an ethnicity on the judgment of a police officer using Police National Computer witness identification codes. Meanwhile, data used to discuss the disproportionality of the racial composition of the NDNAD was derived using a different set of categories (the 16+1 Census classification) and a different means of classification (self-identification).

Susan Leigh Star’s notion of the boundary object is over-used and often abused but in this case it is helpful, as is Star’s sister concept of boundary infrastructures (discussed here). In the NDNAD ‘race’ is a bio-social-informational hybrid that depends on its mutability and overt contingency to operate across institutional contexts and locations: the custody suite; the forensic laboratory; the parliamentary committee and so on. This hybrid is at once an operational, administrative and research object. This blurring is interesting (and not just because it illustrates the contemporary propensity to repurpose data) since it has been an effective means of deflecting criticism of the use of race data in UK forensics.

It could be tempting to view race data as an add-on to the main business of the NDNAD: the collection of genetic material; its translation into digital profiles; and the storage, searching and matching of those profiles. However the ethical work around the database has been crucial to its successful operation. As it grew in size during the last decade it faced a serious crisis of legitimacy. Discussion of race data has been a continual feature of political debates about the NDNAD and of the systems of governance of the database set up to address that crisis. The data (its collection, publication and discussion) has been used to generate public trust in the NDNAD and associated scientific and policing practices by demonstrating openness, ethical scrutiny and compliance with equalities legislation.

The NDNAD seems an open-and-shut case of the coming together of socio-technology and institutional racism. Yet in parliamentary committees and in the internal systems of NDNAD governance, discussion of discrimination and structural disadvantage has been largely avoided or postponed. It is striking how data features in this. Firstly, comparisons with different race data sets on arrest and incarceration are used to suggest that, since the disproportionality of the NDNAD is comparable to that in other aspects of criminal justice, the database itself is race-neutral. Secondly, there is a continual focus on the inadequacy of the existing race data and the categories used to construct the data; thus final judgment on the fairness of the database always awaits better, fuller information. Thirdly, what ethical discussion of ‘race’ and the database has taken place has transposed consideration of racism into discussion of people’s right to self-identify the ethnicity of their DNA profile.

The example of the NDNAD should therefore make us think more broadly about the performativity of data and the inseparability of the technical and the discursive in socio-technical infrastructures of data. Race data says and does many things in the NDNAD and has been used to obscure its racist outcomes. Moreover the collection and discussion of race data has other stigmatizing and legitimating effects, suggesting to those who operate the system and beyond that, however hard they try to be fair, they cannot help but confront the reality of black criminality.
