PDF of articles here: Are we the New Digital Soylent Green - PDF
("Soylent Green Is People").
We all worry about protecting our privacy but surprisingly we give it up daily. If the supermarket clerk asked to see your driver's license you would think twice. You stand at the checkout wondering if anyone can see that 4-digit PIN you tap in. However, we give away private details without a thought all through the day. If the clerk 'liked' your PIN, would you let that clerk or anyone else see it?
In addition, social media and the internet are grabbing and analyzing data about us on a scale that could not have been imagined a few years ago.
You sit down to watch the latest series on Netflix. You open your browser. You log onto Facebook. You check your email. You spit into that 23andMe or AncestryDNA test tube. Have you noticed how all of these companies know what you like and what you have been looking for, and how targeted emails, suggestions and advertisements keep popping up? This isn't magic or coincidence. It is big data analytics, Machine Learning and AI all building a profile of your every digital breath.
We all have a digital footprint that is shared online. This can be used to cross-reference all the information available to build bigger and better profiles. All in the name of marketing. In some cases, it is even the absence of information that can help build a better profile. In my years analyzing redaction (the art of removing identifying information from documents etc.), or 'black lining' as it is sometimes called, I have seen how easily missing information can be extrapolated.
This is just the tip of the digital iceberg though. Without data, AI and Machine Learning would starve. So, they need more data to feed the beast. Luckily for them, we are happy to oblige and provide a veritable feast of personal information with abandon. In most cases we don't even know we are doing it. Social media and all of the electronic devices we invite into our lives are purposely designed to provide the feedback we all crave.
A quick look at the much-publicized FBI redacted meeting notes from 2016 shows how even professionals can still get it wrong. We assume that just the Personal Information or PII (as it is still called outside of newer privacy legislation such as the GDPR) needs to be removed to protect our identity but fail to understand that we are more than just an email and date of birth. We are what we do and say and we share that information without a thought.
Digital photographs contain mammoth amounts of detail from the location, time taken, who you are with, type of camera/phone used and more. Upload that photo and it gets tagged to your own digital profile after which it is liked (or disliked etc.) connecting it to many more profiles.
In those innocuous posts to Facebook, Twitter, Instagram and more, we provide detailed insight into every aspect of our lives. Political views, personal preferences, people and places we are connected to, all in a single photograph. Until recently, the text was the primary target for profiling and provided a wealth of data to create our digital profile. Now we can analyze images and other 'unstructured data' (documents, pictures, video etc.), read the metadata and more. This can be cross-referenced against our own and every other digital profile in near real time. Imagine the wealth of data hidden in the random Snapchat image that really has no value to you, other than to allow you to text a message!
I recently posted that we don't think about unstructured data enough. Consider the following example. Someone emails (or posts) a picture of you standing in a doorway holding a birthday cake that says "Happy 30th, Dave!" The photo was taken on a smartphone weeks earlier. How much Personal Information does this image disclose?
With enough data (easily obtained in our 'selfie-obsessed' age) and point cloud technologies such as Photosynth, we can even work out the exact spot the photographer was standing in and who else was in the room.
If you have watched 'The Circle', you may be shocked to know that even without those little cameras, your digital profile is working against you. Now add in the Internet of Things (IoT), voice- and video-enabled devices and we open ourselves to even more data collection. Ask Google for the local pizza store and suddenly you are bombarded with pizza coupons. Sometimes you don't even have to interact to be targeted. In some countries, the cell towers are weaponized for marketing. Just being in a location can trigger geolocation texts. All this information goes back to be stored and analyzed at a later date. As AI gets smarter, the information that can be gleaned grows exponentially.
At present, it isn't perfect because multiple people can use multiple devices. I, for one, would rather not have Netflix assume I am an avid Paw Patrol watcher and so push more programs like that. However, when my grand-daughter comes to stay, Paw Patrol rules! These nuances on whose data belongs to whom will soon be extrapolated as more data is analyzed and AI learns. In the meantime, we are exposed to potentially unfair bias, misleading (and annoying) advertisements and suggested posts/shows. What if that misinformation gets tagged to your credit score though (spoiler…it probably already has to some extent)? Now what happens when there is yet another security breach? Would you rather criminals (or governments) had correct or incorrect data on you?
We trust that all that data is secure. We assume it is unusable beyond its original purpose for disclosure. However, large companies are looking at ever new ways to use that data to build a better profile. These profiles are more valuable than the best cryptocurrency in circulation. The more detail they contain, the more valuable they are. It is frightening enough just thinking what a marketing team might use these profiles for… what about your insurance company? Now what about all those data breaches that keep happening? What if your whole life is stolen and becomes available? You can change your credit card number, you might even be able to get new ID or even a new gender, but can you really change the fundamental way you are and how you behave?
Imagine you are using Facebook and Google for the first time. You provide both with an email address and password. At this point they know nothing of you…or do they? They will know where you logged in from, what type of devices you use and have access to the history of the devices including websites, apps and more. This is before you even send an email or make your first post. Fill in the complete profiles they ask for and they already have enough information to fill out more than just a basic credit card application.
What if you use a different email and password for each? Do you also use different devices? The answer is probably no. So, the digital profiles get connected and now they have even more information about you. At this point you still haven't done anything.
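To make that linking step concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the sign-up records, the `device_id` field and the `link_by_device` helper are invented for illustration and are not a description of how Facebook, Google or any real service joins its data. It simply shows that two accounts with different emails fall into the same bucket as soon as they share a device.

```python
from collections import defaultdict

# Hypothetical sign-up records: different services, different emails.
signups = [
    {"service": "social", "email": "dave.personal@example.com", "device_id": "phone-A1"},
    {"service": "search", "email": "d.smith.work@example.com", "device_id": "phone-A1"},
    {"service": "social", "email": "someone.else@example.com", "device_id": "laptop-Z9"},
]

def link_by_device(records):
    """Group records that share a device identifier (illustrative only)."""
    profiles = defaultdict(list)
    for record in records:
        profiles[record["device_id"]].append(record)
    return dict(profiles)

linked = link_by_device(signups)

# Both "separate" emails now sit in one combined profile.
emails_on_phone = sorted(r["email"] for r in linked["phone-A1"])
print(emails_on_phone)
```

A real system would probably lean on far more signals than a single identifier, but the principle is the same: one shared attribute is enough to join the dots.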
However, once that first email comes back, you get a warm fuzzy feeling…'somebody cares about me'. Post your first photo or status and wait for the likes. The feedback starts to become addictive, as it was designed. How do I get more likes? Make more posts, provide more information, make controversial statements. Each of these adds to the detailed picture of you. Then you like someone else's post and suddenly you are seeing posts that are similar in your daily feed. You see one that says, “So and So likes Diply”. Did you like that post because they liked it? Did you notice that it didn't say “So and So likes this specific post”? Probably not. However, the system now knows a little bit more. Sometimes you ignore the feed it gives you. That provides a wealth of information about you also. As these systems talk to each other and exchange information they get to know you piece by piece. But they provide anonymous data, I hear the professionals say. I will again point to the exercise in redaction. If enough information is shared, you don't need an obvious Personally Identifiable Information (PII) connection to have exposed Personal Information.
Digital profiling is like the world's biggest game of Clue (or Cluedo for the Brits out there). No-one knows who is in the envelope but if you ask enough questions you can narrow it down really quickly.
For an example of this see my article on the Einstein Puzzle Rebooted.
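The Clue analogy can also be sketched in a few lines of Python. The records, field names and the `narrow_down` helper below are all made up for illustration; none of them come from a real dataset or product. Note that the records contain no name, email or date of birth, yet each extra clue shrinks the list of candidates.

```python
# Hypothetical "anonymous" records: no names, emails or dates of birth.
people = [
    {"id": 1, "city": "Leeds", "age_band": "30-39", "phone": "Android", "watches": "cooking"},
    {"id": 2, "city": "Leeds", "age_band": "30-39", "phone": "iPhone", "watches": "cooking"},
    {"id": 3, "city": "Leeds", "age_band": "40-49", "phone": "iPhone", "watches": "sport"},
    {"id": 4, "city": "York", "age_band": "30-39", "phone": "iPhone", "watches": "cooking"},
    {"id": 5, "city": "Leeds", "age_band": "30-39", "phone": "iPhone", "watches": "sport"},
]

# Clues picked up from different sources; none is identifying on its own.
clues = [
    ("city", "Leeds"),
    ("age_band", "30-39"),
    ("phone", "iPhone"),
    ("watches", "sport"),
]

def narrow_down(records, clues):
    """Apply the clues one at a time and record how many candidates remain."""
    candidates = list(records)
    counts = []
    for field, value in clues:
        candidates = [r for r in candidates if r[field] == value]
        counts.append(len(candidates))
    return candidates, counts

remaining, counts = narrow_down(people, clues)
print(counts)     # candidates left after each clue
print(remaining)  # whoever is still "in the envelope"
```

In this toy example four harmless-looking clues take five candidates down to one. With real data the pool is much bigger, but so is the number of clues on offer.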
Now imagine the wealth of information attached to some of the most sophisticated analytical systems imaginable and it is easy to see how vulnerable we can be.
AI and Machine Learning are now taking that one step further in order to 'guess' what you will do next, even before you do it. A company recently created a profiling system to 'guess' if a realtor's contact is likely to sell their house soon. All the realtor has to do is upload his contact list. Did his contacts (maybe not all clients) give him permission for that? Now suddenly they are receiving posts on house renovations, moving companies and more. They have no idea where these came from though.
So, before you make that next post, upload that next picture or even send that next email, consider what you are giving up. Read those privacy policies a little more carefully. Even if the company that collects the data does no harm, when a security breach occurs, the new possessor of your information probably didn't read the privacy policy either. They may also have breached multiple systems so are able to build digital profiles more sophisticated than Google or Facebook could dream of.
The genie is out of the bottle and is not likely to be put back anytime soon. It is critical that we have enforceable privacy, compliance and security in place to protect the ever-growing knowledge pool about every digital breath you take...