Think data leaks are bad now? Wait until genAI supersizes them

The concept of data leakage, with all of its privacy, legal, compliance and cybersecurity implications, has to be fundamentally re-envisioned today, thanks to the biggest IT disruptor in decades: generative AI (genAI).

Data leakage used to be straightforward. Either an employee or contractor was sloppy (leaving a laptop in an unlocked car, forgetting highly sensitive printouts on an airplane seat, accidentally sending internal financial projections to the wrong email recipient), or an attacker stole data while it was at rest or in transit.

Those worries now seem delightfully quaint. Enterprise environments are entirely amorphous, with data leakage just as easily coming from a corporate cloud site, a SaaS partner, or everyone's new favorite bugaboo: a partner's large language model (LLM) environment.

Your enterprise is responsible for every bit of data your team collects from customers and prospects. What happens when new applications use your old data in new ways? What happens when a customer objects? What about when a regulator, or a lawyer in a deposition, objects?

When the walls are this amorphous, how precisely is IT supposed to be in control? 

Consider this scary tidbit. A group of Harvard University students started playing with digital glasses that pull up real-time data about whoever the wearer is looking at. The most obvious takeaway from their experiment is that the technology can be a highly effective tool for thieves (conmen, really). It allows someone to walk up to a stranger and instantly know quite a bit about them. What a perfect way to kidnap someone or steal their money.

Imagine a thief using this tool to talk their way into a highly sensitive part of your office, or think about how persuasive it could make a phishing attack.

As bad as all that is, it's not the worst IT nightmare. The real nightmare comes when the victim later figures out the misused data came from your enterprise database, courtesy of a detour through a partner's LLM.

Let's step away from the glasses nightmare. What happens when an insurance company uses your data to deny a loan, or your HR department uses it to deny someone a job? Let's further assume that it was the AI partner's software that made a mistake (hallucinations, anyone?) and that the mistake led to a destructive decision. What happens then?

The underlying data came from your confidential database. Your team hired genAI partner 1234 and willingly shared the data with it. Its software screwed things up. How much of this is your IT department's fault?

Litigation has a terrible tendency to split fault into percentages and to assign the healthier percentage to the entity with the deepest pockets. (Hello, enterprise IT. Your company quite likely has the deepest pockets.)

There are several ways to deal with these scenarios, but not all of them will be particularly popular.

1. Contractual: put it in writing. Have strict legal terms that put your AI partner on the hook for anything it does with your data, and for any fallout. This won't prevent people from seeing the inside of a courtroom, but at least they'll have company.

2. Don’t share data. This is probably the least popular option. Set strict limits on which business units can play with your LLM partners, and review and approve the level of data they are permitted to share.

When the line-of-business chief complains (virtually guaranteed to happen), tell that boss that this is all about protecting the group's intellectual property and, in turn, the LOB chief's bonus. Mention the bonus and watch the objections melt away. As for enforcing those limits in practice, a rough illustration follows below.
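This is a minimal sketch, in Python, of the kind of allow-list filter an internal gateway might apply before any record is forwarded to an LLM partner. The business units, field names and the scrub_for_llm helper are hypothetical illustrations, not any product's API; your own data model and approval list will differ.

```python
# Hypothetical allow-list of the fields each business unit is approved to
# share with an external LLM partner. Anything not listed is stripped before
# the record ever leaves your environment.
APPROVED_FIELDS = {
    "marketing": {"customer_id", "region", "product_interest"},
    "support": {"customer_id", "product", "ticket_summary"},
    # Finance is deliberately absent: it is not approved to share anything.
}

def scrub_for_llm(business_unit: str, record: dict) -> dict:
    """Return only the fields this business unit is approved to send out."""
    allowed = APPROVED_FIELDS.get(business_unit, set())
    return {key: value for key, value in record.items() if key in allowed}

if __name__ == "__main__":
    raw = {
        "customer_id": "C-1234",
        "region": "EMEA",
        "ssn": "REDACTED",       # never leaves the building
        "credit_limit": 25000,   # never leaves the building
        "product_interest": "smart glasses",
    }
    print(scrub_for_llm("marketing", raw))
    # {'customer_id': 'C-1234', 'region': 'EMEA', 'product_interest': 'smart glasses'}
```

The point is not the dozen lines of code; it is that the approval list lives with IT and legal, not with whichever business unit is in a hurry.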

3. Impose stiff punishments for shadow AI violations. In theory, you can control contacts and data access with your key genAI partners. But if your people start feeding data into ChatGPT, Perplexity or their own account on Copilot, they need to know that they will be discovered and that two violations mean termination.

First, you need to take this request as high up as you can and get it in writing that it will happen. Because, trust me, if you say that a second violation will result in termination, and then some top-tier salesperson violates the policy and does not get fired, wave bye-bye to your credibility, and with it any chance that people will take your rules seriously. Don't threaten to fire someone until you are certain you can.

Maybe something equally effective would be canceling their next two bonus or commission payments. Either way, find something that will get the attention of the workforce. As for how violators get discovered in the first place, one rough approach is sketched below.
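This is a minimal sketch that assumes you can export outbound proxy or DNS logs as simple "timestamp user destination" lines. The log format, the proxy.log file name and the domain list are illustrative assumptions; a real deployment would lean on a secure web gateway or DLP tooling rather than a one-off script, but the idea is the same: unapproved genAI traffic is not hard to spot.

```python
from collections import Counter

# Hypothetical list of consumer genAI endpoints that sit outside your
# approved-partner contracts.
SHADOW_AI_DOMAINS = {
    "chat.openai.com",
    "chatgpt.com",
    "www.perplexity.ai",
    "copilot.microsoft.com",
}

def flag_shadow_ai(log_path: str) -> Counter:
    """Count per-user hits to unapproved genAI domains in a proxy log.

    Assumes each line looks like: '2025-01-31T09:12:03 jsmith chatgpt.com'.
    """
    hits = Counter()
    with open(log_path) as log:
        for line in log:
            parts = line.split()
            if len(parts) < 3:
                continue
            _, user, destination = parts[:3]
            if destination in SHADOW_AI_DOMAINS:
                hits[user] += 1
    return hits

if __name__ == "__main__":
    # Assumes a proxy.log export exists in the working directory.
    for user, count in flag_shadow_ai("proxy.log").most_common():
        print(f"{user}: {count} requests to unapproved genAI services")
```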

4. The anti-contract. Lawyers love to generate 200-page terms of service that no one reads. I just need to remind you that such terms will be ignored by courtroom juries. Don't think you can really click your legal exposure away.

This is triply the case when your customers are outside the United States. Regulators in Canada, Europe, Australia and Japan, among others, focus on meaningful and knowing consent. Sometimes you are banned outright from forcing acceptance of the terms as a condition of using the product or service.

5. Compliance. Do you even have legal permission to share all of that data with an LLM partner? Outside the US, most regulators hold that customers own their data, not the enterprise. Data being misused, as in the Harvard glasses example, is one thing. But if your genAI partner makes a mistake or hallucinates and sends flawed data out into the world, you can be exposed to pain well beyond simply sharing too much info.

You can never have too many human-in-the-loop processes in place to watch for data glitches. Yes, that oversight will absolutely dilute genAI's efficiency gains. Trust me: for the next couple of years, it will deliver a better ROI than genAI will on its own.
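What does human-in-the-loop look like in practice? Here is a bare-bones sketch of one pattern: anything the genAI side produces that touches customer data sits in a queue until a named reviewer releases it. The Draft structure, ReviewQueue class and approval flow are illustrative assumptions, not any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Draft:
    """A genAI-produced output that references customer data."""
    draft_id: str
    content: str
    approved: bool = False
    reviewer: Optional[str] = None

class ReviewQueue:
    """Holds genAI outputs until a human explicitly releases them."""

    def __init__(self) -> None:
        self._pending: dict[str, Draft] = {}

    def submit(self, draft: Draft) -> None:
        self._pending[draft.draft_id] = draft

    def approve(self, draft_id: str, reviewer: str) -> Draft:
        draft = self._pending.pop(draft_id)   # raises KeyError if unknown
        draft.approved = True
        draft.reviewer = reviewer
        return draft

    def pending(self) -> list[Draft]:
        return list(self._pending.values())

if __name__ == "__main__":
    queue = ReviewQueue()
    queue.submit(Draft("D-1", "Dear C-1234, your claim was denied because..."))
    print([d.draft_id for d in queue.pending()])   # ['D-1']
    released = queue.approve("D-1", reviewer="jdoe")
    print(released.approved, released.reviewer)    # True jdoe
```

Nothing leaves the queue without a human's name attached to the release, which is exactly the paper trail you will want when a regulator or a plaintiff's lawyer comes asking.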
