Hiding in Plain Sight: Shadow AI Data Leaks

After three decades in cybersecurity, I've learned that many security disasters start with innocent misuse of exciting technologies.

Right now, AI is the shiny new thing. This feels familiar to those of us who lived through the internet, mobile, and cloud revolutions. However, there are three critical differences—it's easy, it's cheap, and it's viral.

Shadow AI is moving much faster than its predecessors.

In my mind, it all started with desktop folder sharing in Windows 3.11 (NetBEUI, blech). Since then, user-accessible IT capabilities and the consumerization of IT in general have evolved and grown into a gargantuan set of capabilities—and risks. Generative AI presents several key differences that make it a catalyst for geometric shadow IT growth.

It's fast and easy. Within minutes, almost any user with a problem can get genuinely useful help from large language or reasoning models. Code generation capabilities are stunningly good, enabling people who've never seen a line of code to create sophisticated applications—even right in the UI, no stack or environment required. Even implementation of sophisticated agentic architectures is easy through the use of online services like Zapier and n8n. And for developers and tech employees, AI-enabled code development tools are... just wow. Irresistible.

It's also cheap. Users can achieve quite a lot for free, and even high-end subscriptions are only a couple hundred dollars a month. Many employees see this as a tiny personal investment in an unbelievable productivity boost. And if their company doesn't understand or authorize it... they'll set up an account with a Gmail address and eat the cost themselves. It's worth it. What could go wrong?

And once someone sees what a coworker is doing with AI, they're all in. Then they're going to tell two coworkers, who will each tell two more coworkers... you get the idea. Bottom line, the lure of AI is irresistible and will spread like wildfire. And for sure, it carries the potential for incredible gains in productivity and quality.

But uncontrolled, it's also a systemized data leakage nightmare.

Small and mid-sized enterprises have a tougher row to hoe.

Today's reality: most organizations aren't prepared to control or secure AI.

There are droves of research reports, like this recent one from Gallup, that tell the tale. Many of these research studies focus on large enterprises. But in this article I'm speaking to the mid-enterprise CTOs, CISOs, and other data stewards in smaller organizations—because you've got a more challenging situation in many ways.

Unlike large enterprises, smaller organizations have much smaller budgets and teams to tackle these issues. Many mid-enterprise tech orgs are struggling with what they've already got. Add to the mix a cadre of end-users who are willing to use AI to achieve their goals—unsupervised and unsanctioned—and ugliness prevails.

The rest of this article aims to help those mid-enterprise stakeholders get a jump start on bringing AI under control, without a massive budget outlay.

(Note: For the large-enterprise crowd, McKinsey Digital has put together a great reference architecture for large-scale AI control.)

Start Here: Tackle shadow AI data leakage first

Based on recent incident response work and conversations with dozens of mid-sized business leaders, there's one AI security risk that's actively causing breaches right now. Fix this first, and you'll prevent the source of most AI-related incidents I'm seeing in the field with mid-enterprises.

Remember when your biggest worry was employees using Dropbox without approval?

Shadow AI makes that look quaint. Recent research shows 38% of employees admit to sharing confidential data with AI platforms without approval. For every employee willing to come clean, you can bet there's at least one more who isn't. But does this mean that the LLM provider is using your data?

If your employees are using personal accounts, yes they are. And many employees do just that, often to dodge existing corporate bans on AI usage. A study by Harmonic Security (May 2025) observed that 45% of sensitive AI interactions occurred using personal LLM subscriptions. Of those, 58% used Gmail addresses—so you can forget about looking for ChatGPT subscriptions in emails.

The breadth of shadow AI is clear. And so is the threat. Every prompt to ChatGPT, Claude, or that helpful coding assistant is a potential data leak. Unlike traditional shadow IT, AI tools learn from your data. When employees paste proprietary code into ChatGPT for debugging, that code can resurface later in response to other ChatGPT users (even your competitors). One electronics company learned this the hard way when their actual source code appeared in another LLM user’s sessions—and it wasn’t one of their users. Another, a healthcare-related company, had to face down extremely hefty fines when a conscientious employee blew the whistle to HHS about teammates uploading PHI to LLMs.

The problem is not just employees violating policy—it's the clear and present danger to security, compliance, and your business.

Get busy or get bit

So how do you get a handle on this without derailing your team or blowing your budget?

The good news is that you can start identifying and controlling shadow AI risks using tools you already have, with minimal impact on your technical teams. The steps below are designed to give you quick wins and actionable intelligence before you invest in new solutions or overhaul existing processes. Having seen this play out with shadow IT multiple times, I'll add one caution: be careful not to make this seem like a witch hunt… that will backfire spectacularly.

How to Know You're at Risk (Do This Right Now)

Take a minute to look in the corporate mirror, as it were. If you see any of these issues smiling back at you then you probably want to dig deeper.

  • Your company hasn't spent any time looking into AI usage by employees

  • You don't have an approved AI tool inventory or AI usage policies

  • Employees report using AI tools "occasionally" but "can't" specify which ones or what they're doing

  • Your DLP tools, firewalls, and outbound proxies aren't configured to detect AI service interactions

  • You're seeing unusual outbound traffic to AI service endpoints

Digging Deeper (This Week):

If you think you're at risk, take some time this week and implement these simple things with your existing tools to get a clearer view of the scope of shadow AI in your company.

  1. Use your existing firewall logs to identify connections to chatgpt.com, chat.openai.com, claude.ai, gemini.google.com, zapier.com, and n8n.cloud (a minimal log-scan sketch follows this list)

  2. Do an anonymous, non-confrontational, one-question employee survey asking "Which AI tools have worked best for your work?" (prepare to be shocked)

  3. Do a basic DLP update to add AI service domains to your existing DLP rules (see above)

  4. Cost: $0-500 in staff time
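
To make step 1 (and the DLP update in step 3) concrete, here's a minimal sketch in Python. It assumes you can export firewall or proxy logs to a CSV file; the file name, column names, and domain list are placeholders, so adjust them to whatever your logging stack actually produces. The same domain list can be dropped into your DLP rules.

```python
import csv
from collections import Counter, defaultdict

# Placeholder list of AI service domains to watch; extend it as you discover more.
AI_DOMAINS = {
    "chatgpt.com", "chat.openai.com", "api.openai.com",
    "claude.ai", "gemini.google.com", "zapier.com", "n8n.cloud",
}

def scan_proxy_log(path):
    """Tally connections and outbound bytes to AI services per user.

    Assumes a CSV export with 'src_user', 'dest_host', and 'bytes_out'
    columns; rename these to match your firewall/proxy export format.
    """
    hits = Counter()              # (user, domain) -> connection count
    volume = defaultdict(int)     # (user, domain) -> bytes sent
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            host = (row.get("dest_host") or "").lower()
            # Match the domain itself or any subdomain of it.
            domain = next((d for d in AI_DOMAINS
                           if host == d or host.endswith("." + d)), None)
            if domain:
                key = (row.get("src_user") or "unknown", domain)
                hits[key] += 1
                volume[key] += int(row.get("bytes_out") or 0)
    return hits, volume

if __name__ == "__main__":
    hits, volume = scan_proxy_log("proxy_export.csv")   # hypothetical export file
    for (user, domain), count in hits.most_common(20):
        print(f"{user:<25} {domain:<20} {count:>6} conns {volume[(user, domain)]:>12} bytes")
```

Even this crude tally is usually enough to tell you whether shadow AI is a handful of curious users or half the company.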

Establishing Early Control (30 days)

Once you've confirmed shadow AI is an issue in need of attention, it's time to establish visibility and control without breaking the bank or starting a revolt.

  1. In-depth shadow AI discovery: Use free trials of tools like Netskope or Zscaler to identify all AI service usage. Most of these integrate with existing firewalls.

  2. Approved tools list: Start with a business tier such as ChatGPT Team or Enterprise, or one of Anthropic's Claude business plans; team-level plans run roughly $25-30/user/month, and SOC 2 compliant options are available. AI providers generally don't keep input for training by default when the user is logged in with a business account, and with business services you have more control. Make sure your employees use business accounts! A simple approved-tools inventory sketch follows this list.

  3. Have a conversation with employees: Set some guidelines first, then policies. Listen to the responses. Listen to the needs. It will tell you a lot about how you can support their use of AI capabilities.

    Note: I can’t say it enough: an "AI enablement" posture, not an "AI restriction" posture, is critical here. Multiple waves of consumerized IT have taught us that you're not going to stop them, and honestly, you probably don't want to, since these tools really can increase productivity dramatically. And if they're used right, they can also capture and codify corporate knowledge (contact me if you want to explore this further).

  4. Budget: Under $2,000
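
For step 2, it helps to write the approved tools list down as structured data rather than as a memo. Here's a minimal, illustrative sketch; the field names, plan names, and data classes are my assumptions, not a standard. The point is that your policy, your DLP rules, and your audit evidence can all reference the same inventory.

```python
from dataclasses import dataclass, field

@dataclass
class ApprovedAITool:
    """One entry in the approved AI tool inventory (illustrative fields only)."""
    name: str
    vendor: str
    account_type: str                  # e.g. "business" -- personal accounts not allowed
    allowed_data: list[str] = field(default_factory=list)      # data classes permitted
    prohibited_data: list[str] = field(default_factory=list)   # data classes banned
    dpa_signed: bool = False           # data processing agreement in place (GDPR)
    notes: str = ""

# Hypothetical entries -- plan names and data classes are assumptions, not recommendations.
INVENTORY = [
    ApprovedAITool(
        name="ChatGPT (business plan)", vendor="OpenAI", account_type="business",
        allowed_data=["public", "internal"],
        prohibited_data=["PII", "PHI", "cardholder data", "source code under NDA"],
        dpa_signed=True,
    ),
    ApprovedAITool(
        name="Claude (business plan)", vendor="Anthropic", account_type="business",
        allowed_data=["public", "internal"],
        prohibited_data=["PII", "PHI", "cardholder data"],
        dpa_signed=True,
    ),
]

for tool in INVENTORY:
    print(f"{tool.name}: prohibited -> {', '.join(tool.prohibited_data)}")
```

However you store it, keep one authoritative copy that both the security team and end users can see.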

Integrate and Operationalize (60+ days, ongoing)

Full implementation of reasonably comprehensive AI controls doesn't require ripping and replacing your current tools or rewriting your SOC 2 and PCI playbooks. Here are the most common examples I see implemented sooner rather than later in mid-enterprise organizations (YMMV). You can certainly go much further than this if the situation warrants it. Contact me if you have a more complex situation and want a sounding board for ideas and concerns.

Security Platforms:

  • CrowdStrike/SentinelOne: Add AI service domains to application control policies.

  • Zscaler/Netskope: Enable AI service visibility in existing cloud access security broker (CASB) rules.

  • Microsoft Defender: Use built-in DLP templates for AI services (available in most E5 licenses).

  • Splunk/Chronicle: Create alerts for anomalous data volume to AI services (the sketch after this list shows the underlying logic).
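
I can't give you a copy-and-paste Splunk or Chronicle query because field names vary by environment, but the alert logic is simple. Here's a minimal sketch in plain Python, using hypothetical daily byte counts (for example, aggregated from the log-scan sketch earlier): flag any day where outbound volume to AI services jumps well above the recent baseline.

```python
from statistics import mean, stdev

def flag_anomalous_days(daily_bytes: list[int], threshold: float = 3.0) -> list[int]:
    """Return indexes of days whose outbound AI-service volume looks anomalous.

    daily_bytes: outbound bytes to AI domains per day, oldest first.
    A day is flagged if it sits more than `threshold` standard deviations
    above the mean of all preceding days.
    """
    flagged = []
    for i in range(7, len(daily_bytes)):           # wait for a week of baseline first
        baseline = daily_bytes[:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue
        if daily_bytes[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged

# Hypothetical example: a quiet baseline, then a 50 MB spike on the last day.
volumes = [2_000_000, 1_800_000, 2_200_000, 1_900_000,
           2_100_000, 2_000_000, 1_950_000, 2_050_000, 50_000_000]
print(flag_anomalous_days(volumes))   # -> [8]
```

The same threshold-over-baseline idea translates directly into whatever scheduled search or detection rule syntax your SIEM uses.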

Compliance Integration:

  • SOC 2: AI usage falls under access control (CC6.1) and data classification (CC6.7). Document approved tools in your risk assessment to begin.

  • ISO 27001: Start by updating your asset inventory to include AI tools and data flow documentation.

  • GDPR: Ensure approved AI tools include data processing agreements (DPAs).

  • PCI DSS: AI interactions that involve cardholder data (CHD) are in scope for PCI controls. Ensure AI tools are not used to process, transmit, or store CHD unless explicitly validated for PCI compliance. Focus on network segmentation, DLP enforcement, and user training to avoid accidental exposure.

  • Plan Ahead: Consult with your auditors or advisors for further guidance—and do it well before the next audit.

Track Success Metrics That Matter

You can't manage what you don't measure—and measuring the right things is what separates successful AI governance programs from security theater. These metrics help you track progress, justify investment, and identify problem areas before they become incidents. Focus on leading indicators that predict success rather than lagging indicators that only confirm failure. A short sketch after these lists shows how a couple of these metrics can be computed from the logs you pulled earlier.

Week 1 Baseline:

  • Number of unique AI services accessed

  • Volume of data sent to AI services

  • Number of employees using unauthorized AI tools

30-Day Success Indicators:

  • 80% reduction in unauthorized AI service connections

  • 95% of AI-using employees on approved tools

  • Zero security incidents related to AI data leakage

90-Day Business Impact:

  • Productivity improvements (measure task completion times)

  • Reduced compliance findings in audits

  • Employee satisfaction with AI tool access (they'll thank you for providing better tools)
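
If you're already tallying connections with the log-scan sketch from earlier, the baseline and 30-day indicators fall out of a simple comparison. A minimal sketch, with assumed numbers and an assumed set of approved domains:

```python
APPROVED_DOMAINS = {"chatgpt.com"}   # assumption: whatever your approved tools resolve to

def ai_governance_metrics(baseline: dict[str, int], current: dict[str, int]) -> dict[str, float]:
    """Compare per-domain AI connection counts between the baseline week and now.

    baseline / current: {domain: connection_count}, e.g. aggregated from
    the scan_proxy_log() sketch earlier in this article.
    """
    unauth_before = sum(c for d, c in baseline.items() if d not in APPROVED_DOMAINS)
    unauth_now = sum(c for d, c in current.items() if d not in APPROVED_DOMAINS)
    approved_now = sum(c for d, c in current.items() if d in APPROVED_DOMAINS)
    total_now = approved_now + unauth_now
    return {
        "unauthorized_reduction_pct": 100 * (1 - unauth_now / unauth_before) if unauth_before else 0.0,
        "approved_share_pct": 100 * approved_now / total_now if total_now else 0.0,
        "unique_ai_services_now": float(len(current)),
    }

# Hypothetical numbers for illustration only.
week1 = {"chatgpt.com": 120, "claude.ai": 300, "gemini.google.com": 90}
day30 = {"chatgpt.com": 450, "claude.ai": 40, "gemini.google.com": 10}
print(ai_governance_metrics(week1, day30))
# -> roughly an 87% reduction in unauthorized connections, 90% of traffic on approved tools
```

The productivity and satisfaction measures still require surveys and task timing, but the traffic-based indicators can come straight from data you're already collecting.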

What this looks like in practice

The examples below come directly from recent client projects, all triggered by the simple early steps outlined above. In each case, a small amount of effort revealed significant hidden risks—and unlocked fast wins with outsized impact. What's particularly noteworthy is that these issues were hiding in plain sight, discoverable through basic firewall log analysis and employee surveys, without engaging more complex security tooling.

  • Case Study 1: A 300-person software company discovered employees were pasting customer support tickets (containing PII) into ChatGPT personal accounts to generate response templates. Total cost to fix: $800 for ChatGPT Enterprise licenses and 8 hours of policy development. Potential cost if discovered during a compliance audit: $50,000+ in fines plus reputation damage.

  • Case Study 2: A financial services firm (150 employees) implemented our 90-day approach. Results: 90% reduction in unauthorized AI usage, 28% increase in development team productivity, zero AI-related compliance findings in their recent SOC 2 audit.

  • Case Study 3: A 20-person devops team in an early-growth SaaS startup initially resisted AI governance. After implementing GitHub Copilot Enterprise and Claude for Business, productivity increased 35% and security incidents dropped to zero. The team now actively helps identify new AI use cases, the client is pursuing some extremely compelling AI innovations, and executive management is sleeping better at night.

Selling this to your teams

No security program succeeds without support from the people—and nowhere is that more true than with AI.

Your employees (especially technical teams) are likely already using AI tools, are experiencing genuine productivity benefits, and will react badly to those benefits being ripped away. If you position AI governance as restriction rather than enablement, you'll quickly earn the moniker of "dinosaur"—someone stuck in the past, trying to stymie innovation and progress. Instead, lead with better tools and clearer guidelines.

  • The Wrong Message: "We're restricting AI tools for security reasons."

  • The Right Message: "We want to give you supported, enterprise-grade AI tools that are faster, more reliable, and integrated with our systems."

Technical teams in particular will respond to “what else do I get” when you position an AI program that supplants shadow AI. Here are some examples of how to position for buy-in:

  • Lead with capability: "ChatGPT Enterprise has 32k context windows vs. 4k in the free version"

  • Emphasize integration: "Direct GitHub integration means no copy-pasting code"

  • Address performance: "Guaranteed uptime and faster response times"

And most importantly, listen. Make it a conversation. If your employees feel that their input, ideas, and needs are valued, they’re far more likely to get on board—and often will start contributing and owning the effort.

The Path Forward: Start small, scale smart

You don't need a million-dollar security budget to address AI risks effectively. Focus on shadow AI first—it's the risk that's actively causing breaches today, and it's the cheapest to fix. Once you've got shadow AI under control, then worry about AI-enhanced security tools and identity management. But start with what's bleeding first.

Reality check #1: I know this looks straightforward on paper, but the devil is in the details. Every organization has unique constraints, legacy systems, and political realities that can complicate even simple-sounding implementations. A comprehensive implementation playbook with detailed risk assessment frameworks, change management strategies, and resource-constrained alternatives would require a much longer format than a blog post allows. If you're wrestling with the practical complexities of rolling this out in your specific environment, feel free to reach out.

Reality check #2: I also recognize that for some teams, even these "budget-friendly" numbers can be challenging if you're operating with minimal security staff and tight budgets. There are open-source alternatives, phased implementation approaches, and creative resourcing strategies that can make this work even in highly constrained environments. The detailed guidance for implementing AI security on a shoestring budget deserves its own dedicated discussion—comment below or contact me directly if this is your situation and I'll share some specific low-cost approaches.

Final Thoughts

I've spent my career helping organizations navigate major technology transitions—from mainframes to client-server, from on-premise to cloud, and now to AI. Two final thoughts from a battle-scarred veteran:

  • The AI security challenge is real, urgent, and solvable, but it requires immediate attention.

  • The organizations that thrive are the ones that act early, not perfectly.

If you're reading this and thinking "we need help," you probably do. The good news is that you're not alone, and you don't have to figure this out by yourself. Get in touch and I'll be happy to help.

Thanks for the read.

PS: For those people asking what was hiding in plain sight…
