The title of this blog comes from a new 51-page report, Big Data and Data Protection, published on 28 July by the Information Commissioner's Office (ICO) in the UK. The report is based on research into big data and privacy undertaken since June 2013, including secondary research and interviews with practitioners and experts in the field.
This is a very timely report. It has been published at a time when two organisations, Facebook and OkCupid, have been accused of using personal data for analysis and experiments without gaining the consent of those concerned. Christian Rudder, co-founder of OkCupid, said in defence of what they had done: 'But guess what, everybody: if you use the internet, you're the subject of hundreds of experiments at any given time, on every site. That's how websites work.'
Not according to the ICO, hence the title of this blog, which is a quote from the report. In the introduction the ICO also says: 'Benefits cannot simply be traded with privacy rights'.
As I've said before, there's no single accepted definition of Big Data ('BD' from here onwards), but the ICO adopts Gartner's, which is based on the 3Vs: Volume, Variety and Velocity, and discusses the issues under these three headings.
Of course, a lot of BD contains no personal data, so there is no privacy issue there. However, much BD analysis is based on integrated datasets, which may include a mix of personal and non-personal data, thereby potentially making the whole subject to data privacy legislation.
As the ICO advises: 'Complexity of BD analytics is not an excuse for failing to obtain consent where it is required'. This brings into play two other key issues discussed at length in the report: firstly, the re-purposing of data and, secondly, anonymisation.
Re-purposing and consent
Dan Nunan discussed the first in his IJMR paper 'Market research and the ethics of big data': '…the challenge is raised over what consent is being sought, given that the purpose of data collection may not be known…Furthermore, big data is built upon the use of unstructured data, which, by definition, are collected without necessarily having knowledge of the purpose it will be put to in the future'.
No excuse in the eyes of the ICO – consent is necessary for re-purposing beyond the reassurances given at the time the personal data was collected. Does this mean wide, meaningless statements being used in an attempt to cover all bases, or the need for retrospective consent when the boundaries are about to be breached?
Here's an example from the ICO that is highly pertinent to research:
'There is also a difference between using personal data when the purpose of the processing fits with the reason that people use the service and one where the data is being used for a purpose that is not intrinsic to the delivery of the service. A retailer using loyalty card data for market research is an example of the former.
'A social media company making its data available for market research is an example of the latter. This does not mean that the latter is necessarily unfair; this depends on what people are told when they join and use the social media service. It is important to make people aware of what is going to happen to their data if they choose to use the service.'
The use of algorithms raises the stakes, as it may be difficult to predict what the outcome will be, and who it will apply to. However, as the ICO points out, the current law requires human oversight in making such decisions. The report also discusses graduated consent, and exchanging consent for additional benefits.
However, the key test is whether the re-purposing is compatible with the original purpose when the data was collected – discussed in detail within the report.

Transparency is at the heart of the report, and the ICO presents detailed rebuttals of the main reasons given for why BD changes the rules: unwillingness of people to read lengthy privacy notices; analytics that are too complex to explain in simple terms; inability to foresee uses of the data.
However, re-purposing, I believe, will be a big challenge, and increasingly so over time.
Of course, one solution lies in the second key issue, anonymisation. This is a concept familiar to all market and social researchers: it is the basis of the promise made to participants since the advent of sector codes in the late 1940s.
The ICO has already issued detailed guidance on anonymisation, including research-based examples.
The problem in the context of market research is the use of BD to target the 'sample of one' with customised offers, but this holy grail has been around since the dawn of database marketing. Now, though, the data and privacy issues are more complex. The principles are equally applicable to the provision and delivery of public services, for example customised healthcare treatments.
Anonymisation should also be a key consideration when obtaining data from third parties – is personal-level data really necessary? If it is, the 'buyer' needs to execute a sufficient level of due diligence to ensure they have the right to use the data at a personal level, for example, that consent has been obtained for the intended purpose (or one compatible with it).
The increasing challenge here is determining a 'safe' level of anonymisation, as it becomes ever easier to stitch data together to identify individuals.
Here, the ICO discusses the importance of use:
'How big data is used is an important factor in assessing fairness. Big data analytics may use personal data purely for research purposes, eg to detect general trends and correlations, or it may use personal data in order to make decisions affecting individuals, such as setting their insurance premium. Any processing of personal data must be fair, but if the analytics are being used to make decisions affecting individuals, the assessment of fairness must be even more rigorous.'
The example from Target, a retailer in the USA, provides an interesting illustration of applying the fairness test.
The research exemption
This section is worth reading in detail. In the opening paragraph, the ICO says:
'The term "research" is not defined in the DPA, but we consider that it can include not only historical or scientific research, but also research for commercial purposes such as market research. However, the exemptions can only apply if the research is not used to make a decision affecting an individual and if it is not likely to cause substantial damage or distress to an individual.
'The exemptions only relate to principle 2 (incompatible purposes), principle 5 (retention) and the subject access right. This means that if personal data is obtained for one purpose and then re-used for research, the research is not an incompatible purpose.
'Furthermore, the research data can be kept indefinitely. The research data is also exempt from the subject access right, provided, in addition, the results are not made available in a form that identifies any individual. Our Anonymisation code of practice discusses the research exemption in more detail.'
It always comes back to the benefits of anonymisation.
The report discusses the implications of all the Principles within the current law, and how BD challenges the issues covered by them – for example, what minimisation means, and the justification for keeping historic data (although the ICO comments that its research didn't indicate that commercial organisations are necessarily storing older data more readily).
The complexity of Subject Access is also discussed in detail, along with the need for innovative solutions. For example, Acxiom is piloting a web portal that enables people to see the data held about them. Other issues covered in detail are Security, Outsourcing, International transfer and, of course, the importance of conducting Impact Assessments and having a policy of Privacy by Design.
An international message
International readers might question why this report matters to them. Well, the ripples from EU-based legislation spread across the world; privacy legislation is also becoming more pervasive, in over 100 countries to date; and big data is sourced internationally. Therefore, any report on this topic from an established data privacy regulator is worth more than a passing glance.
As the USA's Federal Trade Commission (FTC) stated in a judgement on a broken promise several years ago, 'You can't change the rules after the game's been played'. Many argue that BD is a game changer, needing new rules and the sort of flexibility the FTC ruled against. This ICO report argues why the 'old' rules are still fit for purpose, and gives the reasons for taking a tough line on calls for flexibility. Building trust is crucial if the benefits of BD are to be realised.
The report cites the TACT values introduced by Aimia, the company responsible for the Nectar loyalty programme: Transparency, Added Value, Control and Trust. These are the principles for the future if we expect consumers to trust us with their data.
Increasingly, organisations such as Aimia and IBM (also described in the report) are responding to this need.
This post was first published on the International Journal of Market Research website.