Saturday, March 10, 2012

Will Google “Do No Evil”?

Google captures and keeps a vast amount of personal information about its users. What do they do with all that data? Despite some very persistent misconceptions, the answer is “Nothing bad”. But they could do a much better job ensuring that no one can ever do anything bad with that data—ever. Here is a rather simple but accurate description of what they do with what is gleaned from searches, email, browsing, documents, travel, photos, and more than 3 dozen other ways that they learn about you:

  • Increase the personal relevance of advertising as you surf the web

  • Earn advertising dollars, not because they sell information about you, but
    because they use that data to match relevant ads to you


These aren’t bad things, even to a privacy zealot. With or without Google, we all see advertising wherever we surf. Google is the reason that so many of the ads appeal to our individual interests.

But what about all that personal data? Is it safe on Google’s servers? Can they be trusted? More importantly, can it someday be misused in ways that even Google had not intended?

I value privacy above everything else. And I have always detested marketing, especially the unsolicited variety. I don’t need unsolicited ‘solutions’ knocking on my door or popping up in web surfing. When I have needs, I will research my own solutions—thank you very much.

It took me years to come to terms with this apparent oxymoron, but the personalization brought about by information exchange is actually a very good bargain for all parties concerned, and if handled properly, it needn’t risk privacy at all! In fact, the things that Google does with our personal history and predilections really benefit us, but...

This is a pro-Google posting. Well, it’s ‘pro-Google’ if they “do no evil” (a nod to the Google mantra, “Don’t be evil”). First the good news: Google can thwart evil by adding a fortress of privacy around the vast corpus of personal data that they collect and process without weakening user services or the value exchange with their marketing partners. The not-so-good news is that I have urged Google to do this for over two years and so far, they have failed to act. What they need is a little urging from users and marketing partners. Doing no evil benefits everyone and sets an industry precedent that will permeate online businesses everywhere.

The CBS prime time television series, Person of Interest, pairs a freelance ‘James Bond’ with a computer geek. The geek, Mr. Finch, is the ultimate privacy hack. He correlates all manner of disparate data in seconds, including parking lot cameras, government records, high school yearbook photos and even the Facebook pages of third parties.

[caption id="" align="alignleft" width="300"] Mr. Finch & Eric Schmidt: Separated at birth?[/caption]

It’s an eerie coincidence that Google Chairman, Eric Schmidt, looks like Mr. Finch. After all, they both have the same job! They find a gold mine of actionable data in the personal dealings of everyone.

Viewers accept the TV character. After all, Finch is fictional, he is one of the good guys, and his snooping ability (especially the piecing together of far-flung data) is probably an exaggeration of reality. Right?!

Of course, Eric Schmidt & Google CEO Larry Page are not fictional. They run the largest data gathering engine on earth. I may be in the minority. I believe that Google is “one of the good guys”. But let’s first explore the last assumption about Mr. Finch: Can any organization correlate and “mine” meaningful data from a wholesale sweep of a massive eavesdropping machine and somehow piece together a reasonable profile of your interests, behavior, purchasing history and proclivities? Not only are there organizations that do this today, but many of them act with our explicit consent and with a disclosed value exchange for all that personal data.

Data gathering organizations fall into three categories, which I classify based on the exchange of value with web surfers and, more importantly, whether the user is even aware of their role in collecting data. In this classification, Google has moved from the 2nd category to the first, and this is a good thing:

  1. Organizations that you are aware of–at least peripherally–and for which there is a value exchange (preferably, one that is disclosed). Google comes to mind, of course. Another organization with informed access to your online behavior is your internet service provider. If they wanted to compile a dossier of your interests, market your web surfing history to others, or comply with 3rd party demands to review your activities, it would be trivial to do so.

  2. Organizations that have massive access to personal and individualized data, but manage to “fly beneath the radar”. Example: Akamai Technologies operates a global network of servers that accelerate the web by caching pages close to users and optimizing the route of page requests. They are contracted by almost any company with a significant online presence. It’s safe to say that their servers and routers, massively distributed throughout the world, are inserted into almost every click of your mouse. Although Akamai’s customer relationship is not with end users, they provide an indirect service by speeding up the web experience. But because Internet users are not actively engaged with them (and are typically unaware of their role in caching data across the Internet), there are few checks on what they do with the click history of users, with whom they share data, and whether or how individualized data is retained, anonymized or marketed.

  3. National governments. There is almost never disclosure or a personal value exchange. Most often, the activity involves compulsory assistance from organizations that are forbidden from disclosing the privacy breach or their own role in acts of domestic spying.


[caption id="attachment_1193" align="alignright" width="200"]The US is preparing to spy on everyone, everywhere, at all times. The massive & intrusive project stuns scientists involved.[/caption]

I have written about domestic spying before. In the US, it has become alarmingly broad, arbitrary and covert. The über secretive NSA is now building the world’s biggest data gathering site. It will gulp down everything about everyone. The misguided justification of their minions is alternately “anti-terrorism” or an even more evasive “9/11”.

Regarding category #2, I have never had reason to suspect Akamai or Verizon of unfair or unscrupulous data mining. (As with Google, these companies could gain a serious ethical and market advantage by taking heed of today’s column.) But today, we focus on data gathering organizations in category #1: the ones with which we have a relationship and with whom we voluntarily share personal data.

Google is at the heart of most internet searches and they are partnered with practically every major organization on earth. Forty-eight free services contain code that many malware labs consider to be a stealth payload. These doohickeys give Google access to a mountain of data regarding clicks, searches, visitors, purchases, and just about anything else that makes a user tick.

It’s not just searching the web that phones home. Think of Google's 48 services as a marketer’s bonanza. Browser plug-ins phone home with every click and build a profile of user behavior, location and idiosyncrasies. Google Analytics, a web traffic reporting tool used by a great many web sites, reveals a mountain of data about both the web site and every single visitor. (Analytics is market-speak for assigning identity or demographics to web visits.) Don’t forget Gmail, Navigate, Picasa, Drive, Google Docs, Google+, Translate, and 3 dozen other projects that collect, compare and analyze user data. And what about Google’s project to scan everything that has ever been written? Do you suppose that Google knows who views these documents, and can correlate it with an astounding number of additional facts? You can bet Grandma Estelle’s cherry pie that they do!

How many of us ever wonder why all of these services are free to internet users everywhere? That’s an awful lot of free service! One might think that the company is very generous, very foolish, or very unprofitable. One would be wrong on all counts!

Google has mastered the art of marketing your interests, income stats, lifestyle, habits, and even your idiosyncrasies. Hell, they wrote the book on it!

But with great access to personal intelligence comes great responsibility. Does Google go the extra mile to protect user data from off-label use? Do they really care? Is it even reasonable to expect privacy when the bargain calls for data sharing with market interests?

At the end of 2009, Eric Schmidt (then Google’s CEO) made a major gaffe in a televised interview on CNBC. In fact, I was so convinced that his statement was toxic that I predicted a grave and swift consumer backlash. Referring to the billions of individuals who use Google’s search engine, investigative anchor Maria Bartiromo asked Schmidt why users enter their most private thoughts and fantasies. She wondered if they are aware of Google’s role in correlating, storing & sharing data, and of its implicit role in identifying users and correlating their identities with their interests.

Schmidt seemed to share Bartiromo’s surprise. He suggested that internet users were naive to trust Google, because their business model is not driven by privacy and because they are subject to oversight by the Patriot Act. He said:
“If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place. If you really need that kind of privacy, the reality is that search engines -- including Google -- do retain this information for some time and it's important, for example, that we are all subject in the United States to the Patriot Act and it is possible that all that information could be made available to the authorities.”

At the time, I criticized the statements as naive, but I have since become more sanguine. Mr. Schmidt is smarter than me. I recognize that he was caught off guard. But clearly, his response had the potential to damage Google’s reputation. Several Google partners jumped ship and realigned with Bing, Microsoft’s newer search engine. Schmidt’s response became a lightning rod, albeit briefly, for both the EFF (Electronic Frontier Foundation) and the CDT (Center for Democracy & Technology). The CDT announced a front-page campaign, Take Back Your Privacy.

But wait... It needn’t be a train wreck! With a properly designed system, Google can ensure individual privacy while still meeting the needs of their marketing partners, and hold nothing of interest for government snoops, even those with a proper subpoena.

I agree with the EFF that Schmidt’s statements undermine Google’s mission. Despite his high position, Schmidt may not fully recognize that Google’s marketing objectives can coexist with an ironclad guarantee of personal privacy, even in the face of the Patriot Act.

Schmidt could have salvaged the gaffe quickly. I urged him to demonstrate right away that he understands and defends user privacy. But I overestimated consumer awareness and expectations for reasonable privacy. Moreover, consumers may feel that the benefits of Google’s various services inherently trade privacy for productivity (email, taste in restaurants, individualized marketing, etc).

As for my prediction of a damning consumer backlash against a company that whitewashes personal privacy with its public, I was off by a few years, but in the end, my warnings will be vindicated. Public awareness of privacy and especially of internet data sharing and data mining has increased. Some are wondering if the bargain is worthwhile, while others are learning that data can be anonymized and used in ways that still facilitate user benefits and even the vendor’s marketing needs.

With massive access to public data and the mechanisms to gather it (often without the knowledge and consent of users) comes massive responsibility. (Schmidt’s interview contradicts that message.) Google must rapidly demonstrate a policy of “default protection” and a very high bar for sharing data. In fact, Google can achieve all its goals while fully protecting individual privacy.

Google’s data gathering and archiving mechanism needs a redesign (it’s not so big a task as it seems): Sharing data and cross-pollination should be virtually impossible – beyond a specified exchange between users and intended marketers. Even this exchange must be internally anonymous, useful only in aggregate, and self expiring – without recourse for revival. Most importantly, it must be impossible for anyone – even a Google staffer – to make a personal connection between individual identities and search terms, Gmail users, ad clickers, voice searchers or navigating drivers!
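To make those three properties concrete (internally anonymous, useful only in aggregate, self-expiring), here is a minimal Python sketch. It is my own illustration, not a description of anything Google has built. Every name in it (`AggregateInterestStore`, `ttl_seconds`, `min_group_size`) is hypothetical. It assumes that a keyed hash with a throwaway, in-memory key is an acceptable stand-in for "internally anonymous", which a real design would need to scrutinize far more carefully, since anyone holding the live key could still test a guessed identity.

```python
import hashlib
import hmac
import secrets
import time
from collections import Counter


class AggregateInterestStore:
    """Hypothetical store that keeps only per-interest counts for a limited time."""

    def __init__(self, ttl_seconds=3600, min_group_size=3, clock=time.time):
        self.ttl = ttl_seconds
        self.min_group_size = min_group_size
        self.clock = clock
        self._reset()

    def _reset(self):
        # A fresh random key per period; once discarded, old digests cannot be rebuilt.
        self._key = secrets.token_bytes(32)
        self._period_start = self.clock()
        self._seen = set()         # keyed digests, used only to avoid double counting
        self._counts = Counter()   # interest -> number of distinct users this period

    def _expire_if_due(self):
        # Self-expiring: when the period ends, the key and all data are dropped.
        if self.clock() - self._period_start >= self.ttl:
            self._reset()

    def record(self, user_id, interest):
        self._expire_if_due()
        message = f"{user_id}|{interest}".encode("utf-8")
        digest = hmac.new(self._key, message, hashlib.sha256).digest()
        if digest not in self._seen:   # raw user_id is never stored
            self._seen.add(digest)
            self._counts[interest] += 1

    def report(self):
        # Useful only in aggregate: small groups are withheld from the report.
        self._expire_if_due()
        return {k: n for k, n in self._counts.items() if n >= self.min_group_size}
```

A marketer querying `report()` would see only how many distinct users showed an interest during the current period, and only for groups at or above `min_group_size`. Nothing in the store maps back to a name, and nothing survives past `ttl_seconds`.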

I modestly suggest that Google create a board position with real authority over privacy, and fill it with a visible, high-profile individual. (Disclosure: I have made a “ballsy” bid to fill such a position. There are plenty of higher-profile individuals that I could recommend.)

Schmidt’s statements have echoed for more than 2 years now. Have they faded at all? If so, it is because Google’s services are certainly useful and because the public has become somewhat inured to the creeping loss of privacy. But wouldn’t it be marvelous if Google seized the moment and reversed that trend? Wouldn’t it be awesome if someone at Google discovered that protecting privacy needn’t cripple the value of the information that they gather? Google’s market activity is not at odds with protecting their users’ personal data from abuse. What’s more, the solution does not involve legislation or even public trust. There is a better model!

Schmidt’s statements are difficult to contain or spin. As Asa Dotzler of Firefox wrote in his blog, the Google CEO simply doesn’t understand privacy. Here in the USA, Schmidt’s statements have become a lightning rod for both the EFF and CDT (Center for Democracy & Technology). The CDT has even launched a front page campaign to “Take Back Your Privacy”.

Google’s not the only one situated at a data nexus. Other organizations fly below the radar, either because few understand their tools or because of government involvement. For example, Akamai probably has more access to web traffic data than Google. The US government has even more access because of an intricate web of programs that often force communications companies to plant data sniffing tools at the junction points of massive international data conduits. We’ve discussed this in other articles, and I certainly don’t advocate that Wild Ducks be privacy zealots and conspiracy alarmists. But the truth is, the zealots have a leg to stand on and the alarmists are very sane.