Random Thoughts

The Ethics Of Big Data

Some time ago I received a review copy of a book called “Ethics Of Big Data” from O’Reilly; I didn’t get round to writing a review of it here for a number of reasons but, despite its flaws (for example its brevity and limited scope), it’s worth reading. It deals with the ethics of data collection and data analysis from a purely corporate point of view: if organisations do not think carefully about what they are doing then

“Damage to your brand and customer relationships, privacy violations, running afoul of emerging legislation, and the possibility of unintentionally damaging reputations are all potential risks”

All of which is true, although I think what irked me about the book when I read it was that it did not tackle the wider and (to my mind) more important question of the social impact of new data technologies and their application. After all, this is what you and I do for a living – and I know that I haven’t spent nearly enough time thinking these issues through.

What prompted me to think about this again was a post by Adam Curtis which argues that the way that governments and corporations are using data is stifling us on a number of levels from the personal to the political:

“What Amazon and many other companies began to do in the late 1990s was build up a giant world of the past on their computer servers. A historical universe that is constantly mined to find new ways of giving back to you today what you liked yesterday – with variations.

Interestingly, one of the first people to criticise these kind of “recommender systems” for their unintended effect on society was Patti Maes who had invented RINGO. She said that the inevitable effect is to narrow and simplify your experience – leading people to get stuck in a static, ever-narrowing version of themselves.

Stuck in the endless you-loop.”

Once our tastes and opinions have been reduced to those of the cluster the k-means algorithm has placed us in, we have become homogenised and easier to sell to, slaves to our past behaviour. Worse, the things we have in common with the people in other clusters become harder to see. Maybe all of this is inevitable, but if there is going to be an informed debate on this then shouldn’t we, as the people who actually implement these systems, take part in it?
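To make the point about clusters concrete, here is a minimal sketch of what being "reduced to your cluster" might look like. It is only an illustration: the genres, the centroid values and the function names are all made up for the example, and it shows only the assignment step of k-means (finding the nearest centroid), not a full clustering run or any real recommender.

```python
from math import dist

# Hypothetical cluster centroids: average scores for (sci-fi, romance, documentary).
GENRES = ("sci-fi", "romance", "documentary")
CENTROIDS = {
    "sci-fi fans": (0.9, 0.1, 0.2),
    "romance fans": (0.1, 0.9, 0.2),
}

def assign_cluster(user, centroids=CENTROIDS):
    """Return the name of the centroid nearest to the user (the k-means assignment step)."""
    return min(centroids, key=lambda name: dist(user, centroids[name]))

def recommend(user, centroids=CENTROIDS):
    """Recommend the favourite genre of the user's cluster, ignoring the user's own scores."""
    centroid = centroids[assign_cluster(user, centroids)]
    best = max(range(len(GENRES)), key=lambda i: centroid[i])
    return GENRES[best]

# A hypothetical user whose strongest interest is documentaries, with some sci-fi.
user = (0.6, 0.3, 0.7)
print(assign_cluster(user))  # the cluster the user is placed in
print(recommend(user))       # what that cluster's tastes say the user should get
```

With these made-up numbers the user lands in the "sci-fi fans" cluster and is offered sci-fi, even though documentaries are their strongest interest: once the cluster stands in for the person, whatever the cluster does not share is lost.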

4 thoughts on “The Ethics Of Big Data”

  1. Every period in history has its own boundaries. Centuries ago, the average person’s world was the village; anything outside it was unreachable. The evolution of transportation and then communication changed those boundaries, and today it’s common to see communities that are based on personal interests instead of territory. Language is also no longer the barrier it was 20-30 years ago, thanks to an increasing knowledge of English and to automatic translators.
    At the same time, the overflow of information makes it easier to delegate the choice of the next movie to an algorithm. But word of mouth, the news, or a random event might change that. It happens every day.
    I am not afraid of an algorithm acting as a personal agent. It’s just that today it’s still too dumb to be reliable, so I don’t trust it too much.
    What worries me is that in today’s world, too many people trust those who talk about chemical trails in the sky. Where some see such people as an opportunity (they can be easily manipulated), I see them as a result of the “I only believe what I like to believe” behavior, which is the real reason why one could be worried about a “recommender system” manipulating society.
    To me, the problem is the lack of critical spirit. Without that, manipulation will always come from somewhere: remove the algorithm and there will be another tool for it.

  2. Just demoing Power Query’s ability to pull in Facebook feeds legitimately freaked out some of my colleagues. I think a lot of people are afraid to broach the topic.

  3. I’d love to see a wider debate too, involving people who understand the future potential of big data, both negative and positive. Back in 2001 I presented at TechEd on the loss of privacy and the risk of tracing, to a mostly amused audience, and a decade later, sadly, civilisation has traded privacy, and maybe liberty, for technological convenience.

    Having said that, a good recommender system would not be based purely on matching you to a cluster, or perhaps (as in collaborative filtering) to a subset of other individuals who have expressed similar interests in the past. A good recommender will also look at ways to expand the individual’s horizon. At the very least, it should offer some items which might be unlikely choices, and arguably some “curated” items that the organisation wishes to promote. The ultimate value of a good recommender is measured not just by its commercial success, but also by the person’s perception: satisfaction is important, and it is driven by taste similarity, but the surprise factor, that is, the ability to deliver satisfaction from unlikely recommendations, is crucial.

    Not many companies can do it, at all.
