Security Visionaries Podcast

Data Lakes, Security, & Innovation
Max Havey sits down with guest Troy Wilkinson, CISO at Interpublic Group (IPG), for a deep dive into the world of data lakes.


On the latest episode of Security Visionaries, host Max Havey sits down with guest Troy Wilkinson, CISO at Interpublic Group (IPG), for a deep dive into the world of data lakes. In this conversation, Troy offers his perspective on why data lakes have become an important aspect of modern security strategies, the sorts of challenges CISOs often run into with data lakes, and the advice he would offer to other security leaders looking to protect their data lakes.

In the past, I really had to be cognizant of what data I was bringing into my SIEM, and what data I can do correlations on. So, there were limitations and decisions had to be made as a security leader, such as: I can’t bring in that very voluminous data source because it’s too expensive to do that, but I really wanted to. And so now with the data lake structure, you’re able to bring that in at a much lower cost and use it to do correlation searches you’ve never been able to do before.

—Troy Wilkinson, CISO at Interpublic Group (IPG)

 

Timestamps

* 00:00: Introduction
* 00:34: Importance of Data Lakes in Security
* 01:55: Benefits of Data Lakes
* 03:17: Cost Efficiency of Data Lakes
* 05:54: Challenges of Using Data Lakes
* 06:31: Data Lakes as a Threat Surface
* 08:47: Data Poisoning in AI Models
* 13:19: Protecting Data Lakes from Threat Actors
* 14:24: Strategies for Protecting Data Lakes
* 15:36: The Future of Data Lakes

 


On this episode

Troy Wilkinson
Chief Information Security Officer with Interpublic Group


Troy Wilkinson is currently the Chief Information Security Officer with Interpublic Group, a Fortune 500 company. Prior to his current position, Wilkinson served as the Chief Information Security Officer and Chief Information Officer for several large multi-national firms.

Wilkinson is a worldwide speaker on cybersecurity, co-authored an Amazon Best Seller, and has been featured on NBC, CBS, and Fox news stations. He is a consultant with several VC firms and has contributed to numerous nationally syndicated publications on cybersecurity topics including ransomware, DDoS, cyber-crime trends, and cybersecurity careers.

Max Havey
Senior Content Specialist at Netskope


Max Havey is a Senior Content Specialist for Netskope’s corporate communications team. He is a graduate of the University of Missouri’s School of Journalism with both a bachelor’s and a master’s degree in Magazine Journalism. Max has worked as a content writer for startups in the software and life insurance industries, and has edited ghostwritten content across multiple industries.

Episode transcript


0:00:00.7 Max Havey: Hello, and welcome to another edition of Security Visionaries, a podcast all about the world of cyber, data, and tech infrastructure, bringing together experts from around the world and across domains. I'm your host, Max Havey. And today we're diving into the world of data lakes with Troy Wilkinson, CISO at Interpublic Group, also known as IPG. Troy, welcome to the show.

0:00:21.5 Troy Wilkinson: Thanks Max. Really a great pleasure to be here.

0:00:23.8 Max Havey: Glad to have you. So to get things started here, can you take us through the concept of data lakes and why they're important as an aspect of modern security?

0:00:34.2 Troy Wilkinson: Yeah, absolutely. I think it's important to take a little bit of a step back and talk about the reason we collect data in the first place. Everything that we do, feel, see, and touch in technology has some type of machine logs that come out of it. Some of those event logs are just normal log-ins and log-outs, but some of that is very important security telemetry. And so what we've been doing over the last 25, 30 years is really trying to decide what's important to us from a security operations perspective, what data we need to collect, what data is important to an event or an incident, and then really kind of diving into the logic or the data science behind how we can tie those incidents or events together. So this has been a data problem for the longest period of time, and we are really stepping into the next generation or the next frontier of this data by decoupling the data from the analytics. For so long, we've been asked to put this data into a singular place, what I like to call legacy SIEM, where you pipe your data into a massive database and then you do the analytics on top of it there to gather insights from all of your incidents. Now with the data lake structure, being able to put your data into a common schema in a data lake allows you to decouple that data from your analytics.

0:01:55.3 Troy Wilkinson: So if the next whiz-bang AI solution comes along and you wanna apply that AI to this data set, that's great. It's a flip of a switch. You don't have to move your data into a new solution, you don't have to port it anywhere, you can just apply those new analytics. I think this really gives security leaders and security operators flexibility in how they do security operations and correlation searches in their data lakes. And so I really feel like that flexibility, that transparency, and the data ownership, and really being able to decide how long you keep that data, are really important decision-making criteria for data lakes and how this is gonna change the industry of security operations.

0:02:30.5 Max Havey: To an extent, it's sort of a place that's serving as a repository for all of this data that organizations have created over all these years, and they can now use that data for whatever purposes they may need, whether that's with an AI model or with analytics or what have you. But it's essentially something that helps them keep it all contained in a way that they can keep it secure as well.

0:02:51.7 Troy Wilkinson: Yeah, absolutely, and I wanna touch on cost as well. If you think about it, the cost of data has come down tremendously. Storing data in the cloud is less than pennies per gigabyte now, so you're able to store more data. In the past, I really had to be cognizant of what data I was bringing into my SIEM and what data I could do correlations on. So there were limitations, and I may decide as a security leader, I can't bring in that very voluminous data source because it's too expensive to do that, but I really wanted to.

0:03:17.7 Troy Wilkinson: And so now with the data lake structure, you're able to bring that in at a much lower cost and use it to do correlation searches you've never been able to do before. So as an example, DNS logs are usually very noisy and very unusual, so a lot of security leaders don't bring them in. However, they're very valuable in times of incidents, or if you wanna go back and see if a user went to a particular site and really get down into the weeds. So having that data in a data lake where it's very cheap storage, you're able to have that for long-term and very in-depth investigations, especially in a forensic investigation after an incident.
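
The DNS example above, going back to see whether a user visited a particular site, can be sketched roughly as follows. This is only an illustration: the record layout, field names, domain, and the 365-day lookback are all assumptions made for the example, not something described in the conversation or tied to any product.

```python
from datetime import datetime, timedelta

# Hypothetical DNS log records kept in cheap long-term storage.
# The field names ("time", "user", "query") are illustrative assumptions.
dns_logs = [
    {"time": datetime(2024, 1, 5, 9, 30), "user": "alice", "query": "example.com"},
    {"time": datetime(2024, 3, 2, 14, 0), "user": "bob", "query": "bad-domain.test"},
    {"time": datetime(2024, 6, 20, 8, 15), "user": "alice", "query": "bad-domain.test"},
]


def who_queried(logs, domain, since):
    """Return (user, time) pairs for lookups of `domain` at or after `since`, oldest first."""
    return sorted(
        ((r["user"], r["time"]) for r in logs if r["query"] == domain and r["time"] >= since),
        key=lambda pair: pair[1],
    )


# Look back one year from a hypothetical incident date.
incident_day = datetime(2024, 7, 1)
hits = who_queried(dns_logs, "bad-domain.test", incident_day - timedelta(days=365))
for user, when in hits:
    print(user, when.isoformat())
```

The lookup only works if the older records were retained in the first place, which is the point being made about low-cost storage.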

0:03:51.5 Max Havey: Totally. The advent of having that cheap storage and being able to even just have all of this data ultimately creates new opportunities for how you can best use it. Having more storage is leading to more innovation with this data, and more exciting things that folks in security and elsewhere can do with this data.

0:04:08.7 Troy Wilkinson: Absolutely, and another thing to mention is that being able to store this data over time allows the security leader to apply different types of analytics to it. As an example, today we have multiple types of AI-generated searches and AI-generated correlation events, and being able to stitch together telemetry from all of your data sources at scale and at speed, well, we've never been able to do that before. Now, that was the promise of SIEM in the past: bring all of your data into a single place, let's do all this fancy interpretation of it. However, I think that we just never got there from a security operator's perspective at scale because of the expense, because of the knowledge that it took to run that, and because of the upkeep of it. We were on-prem for a long time, so a data center full of servers that you had to maintain, and then we moved into a cloud era where now your SIEM is in the cloud, and it's very expensive with the compute power needed to do those highly complex analytics. Being able to decouple your data and, most importantly, having that data in a common schema, or the Open Cybersecurity Schema Framework, so that every log source is in the same schema, so that a host name is a host name and a computer is a computer and an IP address is an IP address, means you don't have to translate that. You don't have to look across multiple indexes or data sources and translate it.

0:05:24.2 Troy Wilkinson: In other words, it's all in the same language. You can ask questions of your data at scale and across multiple different places, and so that really helps find that needle in the stack of needles, as we like to say, to find threat actors doing bad things, moving laterally, exploiting your infrastructure, your servers, your cloud, really tying it together where you may have missed those insights before.
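
The "a host name is a host name" idea can be illustrated with a minimal sketch. Everything here is assumed for the example: the two raw event shapes, the source names, and the target field names `hostname` and `ip`, which are deliberately simplified and are not actual Open Cybersecurity Schema Framework attribute names.

```python
# Hypothetical raw events from two sources that name the same things differently.
firewall_event = {"src_host": "laptop-17", "src_ip": "10.0.0.5", "action": "allow"}
edr_event = {"ComputerName": "laptop-17", "IPv4": "10.0.0.5", "Process": "powershell.exe"}

# Per-source mappings into one illustrative common schema.
# These target names are assumptions for the example, not real OCSF attributes.
FIELD_MAPS = {
    "firewall": {"src_host": "hostname", "src_ip": "ip"},
    "edr": {"ComputerName": "hostname", "IPv4": "ip"},
}


def normalize(source, event):
    """Rename source-specific fields to the shared names; keep other fields unchanged."""
    mapping = FIELD_MAPS[source]
    normalized = {"source": source}
    for key, value in event.items():
        normalized[mapping.get(key, key)] = value
    return normalized


lake = [normalize("firewall", firewall_event), normalize("edr", edr_event)]

# One query now works across every source because the field name is shared.
matches = [event for event in lake if event["hostname"] == "laptop-17"]
print(len(matches))
```

Because the translation happens once at ingestion, any later analytics layer can query `hostname` without knowing which product produced the record.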

0:05:45.8 Max Havey: Absolutely. And that sort of brings me to my next thought here: what are some of the challenges that you've run into as a CISO when it comes to using data lakes and protecting data lakes?

0:05:54.6 Troy Wilkinson: Well, I think that the challenges tend to be the same as they are for any type of data source. You have to have data protections in place, you have to have data ownership and lineage, and you have to make sure that you're deprecating data in the right time frame per the regulatory requirements that you have. So you still have the same data protection concerns that you would with any other data source.

0:06:14.5 Max Havey: Absolutely, and then in that same sort of vein, why have data lakes become an increasingly important threat surface to protect from malicious actors and other folks who are either trying to get in there or to poison that data? Why is that becoming an important threat surface to keep in mind for security practitioners?

0:06:31.9 Troy Wilkinson: Yeah, good question. I think that from a data perspective, threat actors are always looking for data to exfiltrate. I think we've seen that as a rising theme across the threat actors in the past few years. The recent Snowflake incidents that we've seen across multiple large organizations show us that threat actors are looking for large data sources to exfiltrate, and so data protections are extremely important. Certainly data protection and exfiltration are top of the threat actors' playbook, and so we are always looking to protect that. I think threat actors are really intent on getting to companies' data and they find it very valuable. We used to see ransomware attacks where it was just encrypt the servers and hold the companies for ransom. Now they're actually exfiltrating that data. And so there's secondary and even tertiary ransomware, where they say, "If you don't pay us, we're gonna release your data to the public." So data has become a monetized commodity for the threat actors and continues to be a target.

0:07:24.6 Max Havey: Absolutely, and you've seen that with corporations or organizations that have had, like, giga leaks. I remember Nintendo specifically. There have been some large-scale entertainment corporations and other folks throughout industries over the years where they've had those sorts of huge leaks of data, and I think that's an interesting point, that there are these troves of data now that maybe weren't there 15, 20 years ago, just because we are able to keep a hold of it now.

0:07:49.2 Troy Wilkinson: As we look at data sets, we look at the Sony hack and exfiltrating movie information. You look at the banking industry, where they're trying to exfiltrate information on customers. I think that every data set is unique and needs protecting, but if you think about the security data lakes that we're talking about here and the security telemetry for security operations, threat actors could gain a very big insight into what a customer does to protect themselves, which would give them a path to take advantage of them even more. In other words, they could find ways to get into their backups, into their databases, into their servers. And so this security telemetry is very valuable to threat actors as well, so we need to put even more guardrails around our data lakes.

0:08:29.4 Max Havey: Absolutely, and then, I know we talk about the idea of using data lakes as something to help train AI models and things of that sort. I know the idea of poisoning data is a real risk when it comes to training generative AI and other AI models. How is that an issue, and what are some ways that folks can think about protecting against that when it comes to data lakes?

0:08:47.5 Troy Wilkinson: So as we look at large language models and other types of foundation models for artificial intelligence that we're feeding ourselves, so this is a model that you're building and maintaining on-premise or in your own cloud, I think it's really important to understand that this data poisoning option is there for threat actors to take advantage of. You need to have input validation, you need to make sure that nobody's able to basically poison the inputs, and also to exfiltrate. Even if you have a RAG architecture or a reference architecture of sharing an AI model, you can still have some of the data poisoning at the input level, and you can also have data exfiltration where there is an exchange between the input from the user and the exchange with the underlying foundation model. So I think it's so important to protect all the components of that.

0:09:32.9 Troy Wilkinson: And it's a different genre of security at this point, where we're seeing AI protection, protecting the foundation model, protecting against and detecting data poisoning, and also bias. That bias can be inherent bias or it can be unknown bias, where you don't even realize that your model is turning into some big algorithm that is taking you down the wrong path. So as it relates to security data lakes and SIEM, SOAR, and security operations, I think we're still a long way from there. I think that we're pretty safe on that because we're not implementing or instituting AI models on top of our security data lakes at scale yet. There are vendors that are doing that behind the scenes, so they would have a big challenge to protect those underlying models. But for us, I think as practitioners across the industry, being able to get all of our data into a central data lake and apply advanced analytics to it is still what I would consider machine learning and some of the older-school type of security correlation searches. Now, the best part about a data lake, again, is that by having your data in a common schema and in a centralized place like that, you're able to change out analytics.

0:10:44.9 Troy Wilkinson: So if the next AI solution comes along, let's say in the next 12 months, where security operators say, "I wanna apply that new AI to my data lake," it's very easy to flip that switch and do it without having to move that data. So we have the flexibility there, but I don't think that we are at the point yet of having to protect that foundation model on our data lake.

0:10:58.8 Max Havey: Absolutely, and it gets back to what you were saying about how the idea is that all of the data is sort of speaking the same language, that everything there doesn't need to be decoded in a way that is going to confuse your security operators and such. And I think it's especially interesting considering how quickly security and AI and all technology innovation is moving at this point. We're seeing new solutions popping up every other week, it feels like, and being able to adjust that data and apply it accordingly if you do see a solution that is coming your way, I think that's really exciting and really interesting, and it speaks volumes for what you can do with innovation down the road here.

0:11:33.2 Troy Wilkinson: Absolutely. I think one of the most unique advantages that I see in the near term for AI is being able to translate natural language from a security operator into complex queries. I think that security operations teams have gotten really adept at writing complex scripts and queries to query their data, but training that next generation of security operators is gonna be much easier if they can just ask questions of their data: show me where this is, or show me where I have this vulnerability. Being able to ask just normal questions and then have the AI translate that into a complex query that can search the data lake very quickly is gonna help us have better outcomes, and faster. I also think that the data lake is gonna empower us to keep this data for longer periods of time, so that if a company has a breach, you can look back and stitch together telemetry that you may not have had the option to before.

0:12:25.6 Troy Wilkinson: As an example, the IBM and Ponemon Institute report last year showed the average length of a breach before detection is about 180 days, so that's six months before a company realizes threat actors are in their environment. And so if you're not keeping six months of that full telemetry from your firewalls and your endpoint detection and response and your antivirus, you're gonna be missing some of those critical data components to put the story back together of how that threat actor got in and what they did in the beginning of this access. A data lake allows you to store that data at very low cost over longer periods of time, and so you're able to then go back and use that in your investigation to find out exactly what happened from the moment of entry all the way through today.
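
The retention point above comes down to simple arithmetic: any source kept for fewer days than the roughly 180-day detection window leaves a gap at the start of an intrusion. A minimal sketch, where the source names and retention values are hypothetical and only the 180-day figure comes from the conversation:

```python
# Detection window quoted in the conversation, in days.
DWELL_DAYS = 180

# Hypothetical retention settings per log source, in days.
retention_days = {"firewall": 90, "edr": 365, "antivirus": 30, "dns": 400}


def sources_with_gaps(retention, dwell_days):
    """Return the sources whose retention is shorter than the dwell time, sorted by name."""
    return sorted(name for name, days in retention.items() if days < dwell_days)


gaps = sources_with_gaps(retention_days, DWELL_DAYS)
print(gaps)
```

In this made-up configuration, the firewall and antivirus logs would already have aged out by the time an average breach was detected.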

0:13:06.3 Max Havey: And in that same sort of thought there, have there been any major security incidents that you've seen reported that have come as a result of improperly secured Data lakes, and if so, are there any major lessons that can be learned from those sorts of incidents?

0:13:19.6 Troy Wilkinson: Yeah, I think the recent Snowflake issue is a good example. So this is a massive database data lake that customers use for a variety of reasons. We saw Ticketmaster, which is one of the most well-known incidents relating to Snowflake this year. I believe that it's a good example of why to use proper cyber hygiene: having all of your accounts behind multi-factor authentication, having the right application firewalls in place to make sure that those service accounts are protected. So I think just those best practices of accessing the data or data lake are so important in this. Being able to craft that and have that right cyber hygiene is key to success.

0:13:57.7 Max Havey: Absolutely, you don't wanna have a situation where you have passwords in plain text, or things laying around that shouldn't be laying around when you're dealing with data of this volume and of this sensitivity.

0:14:08.3 Troy Wilkinson: Absolutely.

0:14:09.1 Max Havey: To bring us around here, what are some strategies or advice that you'd recommend to CISOs and other security practitioners when it comes to protecting data lakes? Beyond just broad cyber hygiene, are there any other pieces of advice or strategies you'd wanna recommend to folks?

0:14:24.8 Troy Wilkinson: Yeah, I think from a protection perspective of data lakes, you have to decide what's right for your particular company. You can run this on-premise, you can run this in the cloud, and all of the same protections that you would normally do apply here: the initial access, the multi-factor authentication, your admin credentials using a privileged access manager, all of the same type of protection you would put around any other software as a service or an on-premise application with high-value or critical data. But most importantly, I think that data lakes are a great option for folks who are looking to decide how they change the future of their security operations correlation searches. I think it is a proper time in the industry for SIEM and SOAR for this next generation of data lakes coming out. You have so many on the market. I don't wanna name names, but there are many vendors who are getting into the data lake genre, and as long as you have that common schema, you're able to port your data if you need to, you're able to maintain it longer, and you're able to do correlations at speed and at scale, which is so important at the security operations center.

0:15:24.8 Max Havey: Absolutely. And to bring everything home here, Troy, what's something that excites you the most about the future of what we can accomplish with security data lakes and things of this sort? What excites you most about that sort of innovation looking ahead to the future?

0:15:36.8 Troy Wilkinson: Yeah, I think there's two things here. Number one is being able to bring in more telemetry or more security data that we haven't before for a variety of reasons, and putting it into that common schema so we can do advanced analytics. That's number one. It definitely has shown that we are increasing our capabilities at scale here with the help of data lakes and also with the advent of AI and some of the analytics that we're applying there. But number two is really cost. Being able to reduce your cost of data is helpful to bring in more data and have it stored so that you can do these correlations across a wider data set, and that is so important when you think about all the vast amounts of very noisy data sources, from your CloudTrail logs and your flow logs and your DNS logs, things that people traditionally would not collect or store for a period of time, that now you can actually store and do correlation searches against.

0:16:27.7 Troy Wilkinson: That gives us hope that we can find things faster. The whole reason that companies store this type of data and maintain it over time is to find bad guys faster, find the threat actor who's trying to take advantage of your company faster, and I believe that data lakes really empower that by having the ability to do advanced analytics on larger data sets at speed and at scale. I say that many times, I like that term, because if we're able to do this on a larger scale and really bring in this data and do it faster, you're gonna empower the security operators to be able to act faster and to stop bad guys faster and to get them out of your system faster. So it's just giving us a leg up. The threat actors are always evolving, we have to stay up with them, and I think this gives us a really good opportunity to do that.

0:17:09.0 Max Havey: Absolutely. Troy, I think that brings us to the end of our questions here, so I just wanna thank you for joining us today. This was a fascinating conversation, and I think we've learned a lot here about data lakes and what to be excited for as things continue to innovate in this world, looking ahead.

0:17:23.4 Troy Wilkinson: Absolutely, Max, thank you for having me. And look forward to the next one.

0:17:26.1 Max Havey: Yeah, absolutely. And you've been listening to the Security Visionaries podcast. I've been your host, Max Havey, and if you enjoyed this episode, share it with a friend and subscribe to Security Visionaries on your favorite podcast platform. There you can listen to our back catalog of episodes and keep an eye out for new ones dropping every other week, hosted either by me or my co-host, the wonderful Emily Wearmouth. And with that, we will catch you on the next episode.
