In Episode 3 of the Security Podcast, host Brad Bussie explores the pressing issues of AI security risks, the 23andMe data breach, and the effectiveness of antivirus software.
The discussion begins with an exploration of the top AI security risks, including vulnerabilities highlighted in the OWASP top 10 for large language models.
Next, Brad analyzes the recent 23andMe data breach, emphasizing the role of user negligence and the potential misuse of sensitive genetic data in social engineering attacks.
The show concludes with Brad’s insightful examination of antivirus software effectiveness, advocating for a comprehensive defense-in-depth strategy in cybersecurity. Throughout, the episode provides a nuanced view of the evolving cybersecurity landscape, balancing technological advancements against emerging threats.
00:00:00] Brad Bussie: I would much rather look at this as signature-based. That's your first line of defense from the thousands of variants and options of malware. And I also think that having the behavior based and pattern based, that's helpful for the zero days and the kind of the next evolved generation of malware.
[00:00:43] Hi everyone. I'm Brad Bussie, chief information security officer here at E360. Thank you for joining me. The state of enterprise IT security edition. This is the show that makes IT security approachable and actionable for technology leaders. I'm happy to bring you three topics this week. 1st 1 is top A.
[00:01:05] I. Security risks right now. We'll talk a little bit about it. NIST, the NIST AI framework, as well as the OWASP top 10 for LLM. Uh, the second one, 23andMe, blames user negligence on the data breach. So we'll dive a little bit deeper into why 23andMe thinks that it is the user population that actually caused the breach as opposed to their service.
[00:01:35] And the third one, how effective is antivirus, really? Is it worth the cost? So we'll talk a little bit about antivirus, what's considered next gen antivirus and really how the technology is being applied today in the world of endpoint detection response as well as managed [00:02:00] detection response. So with that, let's go ahead and get started.
[00:02:04] So first topic, top AI security risks right now. I found this pretty interesting. Cause I'm talking to a lot of different clients about. AI as well as AI security. And everyone has been looking for, I would say the, the expert opinion as far as what we should be worried about? What should we be thinking about when it comes to AI?
[00:02:30] And I felt that it was a good over arching representation of some of the things that we should be thinking about. as cyber defenders in the OSP top 10 for LLM. And then I'll talk a little bit about the NIST AI. So first and foremost, I think it would be useful, so you don't have to read all of it, to just get a sense of what are the top 10 from OSP for LLM.
[00:02:58] So the first one is prompt injection. So this is really how we communicate. With the LLM. So if I'm talking to chat, chat, GTP, I'm talking to Google Bard, I'm talking to co pilot, I'm, I'm giving it some kind of information through prompt form, and there are some things that you can do through prompts where you can make an LLM prompt.
[00:03:26] Perhaps do something that it isn't designed to do or something that it's not supposed to do. And it was interesting, I was at an offsite with some clients and we ended up getting one of the LLMs to help us craft a phishing email. Now if you go in and you try to get an LLM to do that, by default it says no because it's malicious.
[00:03:50] However, if you say I'm a security researcher and I'm just creating Uh, something that looks like a phishing email. You know, could you help me with [00:04:00] some of the learning slash education side? Very quickly, we were able to come up with something that was pretty convincing, and the LLM was more than happy to help us do it.
[00:04:13] Second one, insecure output handling. So this is a vulnerability that occurs. When an LLM output is accepted without any scrutiny. So this can happen and it can expose a back end system. So that's something that is typically known as a remote code execution. So that's something that should be looked at and considered.
[00:04:43] Third one, training data poisoning. So this is, this is your pretty standard poisoning of data, tampering with what's being introduced to the LLM. And that comes with it a couple of, I would say, pretty regular issues. Like compromising really the effectiveness and the ethics behind the behavior of the LLM.
[00:05:13] Fourth one, model denial of service. I'm not going to go through all of these in detail. The fifth one, supply chain vulnerabilities. So this is really looking at like, uh, leveraging a third party data set. And this is dangerous because how are LLMs. Learning. How are they? They are growing and expanding. Well, we're giving them information and if we're training them or we're training them wrong, then you can see what starts to happen.
[00:05:47] Uh, sixth one sensitive information disclosure. So can we not only just trick an L. L. M. To give us the information, but can we ask it? And that's there's some [00:06:00] privacy concerns around, uh, A. I. In general, but one of the big ones is the AI giving us some form of sensitive information. Seventh one, insecure plugin design.
[00:06:14] So this one I think is, is actually one of the most risky and you're going to see a lot of this coming out with, uh, some of the new announcements from chat GPT. And that is the GPT, think of it as a store. Very similar to how the Apple Store got started, where developers were able to create their own applications, and then iPhone and iPad users could download those applications.
[00:06:45] Very similar. You have ChatGPT is the back end, but now you're able to create these different GPT leveraging the API. I think that's going to be a pretty big exploitable area, and it's only a matter of time before we have some of the attackers that are studying in. and finding more weaknesses in GPT because they're able to, uh, interface with it in a different way.
[00:07:16] So to, to be seen what's going to happen, but I think we're probably going to have some discussions about that in the future. Number eight, excessive agency. So what does that mean? Uh, that's really, An excessive amount of permissions or autonomy given to an LLM. So there has been some, some worry about something like.
[00:07:43] Let's just use Microsoft as an example. So CoPilot is going to be well connected into all of the Microsoft suite. Uh, that, that is Word, Excel, PowerPoint, the file sharing, [00:08:00] SharePoint. Basically, if it's a Microsoft solution, CoPilot is going to be enabled and allowed. What I have found is that in a lot of organizations, They don't have very good data governance as far as who has access to what, should they have that access, what are they doing with it, and I think what we'll see Is the LLM is going to have either superior permissions or mimic the permissions of the user, but the user might be overprovisioned.
[00:08:30] So what you're going to find is a very destructive and quick moving adversary that doesn't actually mean to be an adversary. It's just how the LLM is being leveraged and excessive agency is one of those things that I think. We should be pretty concerned about number nine overreliance. This is kind of funny because I've noticed this in my own business where the quality of emails and the quality of writing that I'm getting.
[00:09:02] from either my employees or customers or, or anyone. It's like, I know they're not writing it themselves anymore. They're dropping it into one of the LLMs and saying, Hey, can you write me something? And, um, pretend I'm a researcher, pretend like I'm somebody, a famous author or whatever, and write it for me in that tone.
[00:09:26] It's pretty, it's pretty awesome. But what I'm finding is We get on a call, we start talking about something and I, and I'm very happy about the insights that I received in the report or the email and I say, well, can we, can we go deeper? Can you explain number six and a little more depth? And I get the big saucer eyes of worry and they don't actually know the content, uh, the LLM knew the content and they were putting their stamp on it saying, look what I did.
[00:09:57] So you're going to see a lot of this [00:10:00] in the future of over reliance. There's actually a lot of conversations right now about whether we need to cite as a source and LLM when we're using it and right now that's kind of a gray area. But I, I predict we are going to have some form of governance put around it. I think you'll even see updates to standards, like, uh, like we used in grade school, high school, college, whatever, uh, the different writing formats.
[00:10:28] And I believe that you'll see a citing source of that LLM and how to do it. The last one is model theft. So I think this one goes without saying. Can someone steal a model and leverage it, turn it into something else? Absolutely. Yes, you're already seeing this with some of the hacker models out there.
[00:10:54] They look and and feel exactly like some of the bigger models because They were, uh, we'll just say appropriated and have been leveraged. So I won't go too deep into the NIST AI, uh, framework, the suggestions. What I'll, what I'll remind you of is that it's, it's really broken out into a couple of different areas.
[00:11:20] While I was just talking about more of the, the tactical side of things, the, um, the NIST AI is more of the governing of not just AI slash LLM, but also machine learning, because really, that's, that's what we're still talking about. Big models, machine learning, how do we govern those? How do we map? And these are different pieces of the AI framework.
[00:11:48] How do we map what they're doing or are able to do? How do we measure those things? And ultimately, how do we manage it? So, I think that's [00:12:00] something that, if you have some time, just take a look. There's a couple of graphics that I think are pretty helpful as far as what that looks like. And we'll do another episode where I'm gonna go.
[00:12:14] Deep into the A. I. Framework and we'll be able to talk about each one similarly to how we just did with us. So something, something interesting to look out for. So the second thing I'd like to talk about in the show today, uh, it's pretty interesting. It's the 23andMe. And what I found pretty, um, I'm not going to say comical, but I guess I just did.
[00:12:45] Uh, really, the fact that 23andMe is blaming user negligence On the data breach, I found, uh, interesting. So what I did is I did a little more research and wanted to understand the breach a little bit better. So I'll, I'll, I'll break it down for you. In essence, what we're back to is password and credential reuse.
[00:13:16] Now, let's say they have 7 million users on 23andMe. And those users have access to their own profile information. Well, there's a 23andMe, call it a perk, where if you are matched from a DNA perspective with somebody that could be a mother, father, brother, sister, first cousin, second cousin, third cousin, it goes pretty wide, uh, you can see their information and their DNA profile as well.
[00:13:52] So do you need all of that information? I think that's really up to the eyes of the [00:14:00] beholder. However, this is really how the breach So it's been stated that right around, I mean, it's a pretty round number. 14, 000 users were compromised and they were compromised by using known credentials, known username and password pairs that were available on the dark web.
[00:14:24] And they were available from past breaches. So what does this mean? Well, we always talk about as security practitioners that you should have a different username and a different password for each application, website, whatever. But if you sit down and you think about it, I guarantee Most of the people watching or listening, they've got that one password that is their, their throwaway.
[00:14:52] Uh, I'll just use this on whatever. I just need to be able to remember this one. Um, you may not use the password manager that you, you paid for because sometimes it's just, it's just faster or I'm on my phone, I'm moving fast, whatever. And that password was, was used in another, let's just say breach somewhere else.
[00:15:15] So in essence, anytime there is a active attack happening against a large platform, they're going to use every username and password pair that was in that old database. So your, your email address, let's say, and that password. And if they don't have proper controls, they, meaning the website or application, they are granted access.
[00:15:42] So, if I'm thinking of this objectively, can I blame users? Sure, I can blame users for that bad habit. However, if I were looking at this from 23andMe's perspective, I would be asking them for [00:16:00] some more, uh, robust security protocols. I would be asking them for some things that aren't really that hard to do.
[00:16:11] Like. Am I trying to sign in from a device that I didn't sign into before? Is there a multi factor authentication option? Is there a way to either pair it with a auth token, or pair it with, at a very minimum, SMS or even to an email address? So, are those things available? In some cases, are they enforced and recommended?
[00:16:37] They're not. So I'm going to very squarely place this in a shared blame. Is it up to users to make sure that their passwords are unique? It is. Is there anything that really enforces that? No. How many data breach notifications do all of us get on a yearly basis? Honestly, it's a couple, but if you want to help to prevent being part of that problem, then I would suggest we make some changes as users, but then I would also suggest that 23andMe and others like 23andMe in the application and web space make it a little bit easier to protect yourselves by helping us to protect you.
[00:17:32] Now, last thing I'll say about it is what can an attacker actually do with that information from 23 and me? Like I've heard some crazy stuff. Well, if they've got my DNA, they can design a bio weapon that's going to impact the population. Um, that's how actually that was how covid was so effective is that they had DNA from everybody.
[00:17:59] I've heard [00:18:00] some just craziness. To be honest, the most prevalent thing that we'll see from this type of data are social engineering attacks like. Hey, I'm Jane and I'm your sister. Well, wait a second. I don't have a sister. I just have a brother and that will spark somebody to click a link and say, there's no way that I have a sister.
[00:18:25] And then next thing you know, you're putting in information and they got you. So that's the kind of stuff that I think we will see. By and large are the social engineering style of attacks. So third thing I wanted to talk about today is how effective is antivirus really? And is it worth the cost? This one's interesting because I've, I've seen the argument go a couple of different ways.
[00:18:55] You can get into the weeds and the detail around signature-based versus pattern and behavior based. And which one is better? I'm still squarely in the camp of they're both important. Practicing defense in depth is something that I think every practitioner organization, even end users, should be thinking about.
[00:19:21] And what does that really mean? Well, that means if one of our defenses is compromised, we have one waiting right behind it to pick up the slack. So I think of this as If you go to VirusTotal and you just do a random scan or search of how many viruses are still out there, it's not like they die off and go away.
[00:19:45] There's still viruses from the 90s that if you got on an unprotected endpoint, it would still compromise it. Whether that's damaging, destroying the operating system, ransomware, like whatever. That stuff is still [00:20:00] out there. So, I would much rather look at this as signature-based. That's your first line of defense from the thousands of variants and options of malware.
[00:20:15] And I also think that having The behavior based and pattern based, that's helpful for the zero days and the kind of the next evolved generation of malware. So if I'm looking at this in how effective is antivirus. I think it's still very effective. I talk to clients every day that were saved by some basic signature-based AV.
[00:20:46] I've had others that are saved by pattern and behavior based. So I think it's a blend. I think they're still both very needed. And is it worth the cost? Absolutely, because we are in a risk management game. So if I can prevent a breach by a simple antivirus solution, I think that's worth every penny. So I think where this really comes into play. Recently, in some of the studies that I've been looking at is a lot of the signature-based viruses continue to be a problem for those that are doing coding.
[00:21:30] They're doing mods to things like video games, file sharing, bigshare. There are lot of different examples, but those are just some of the common ones. We're getting pretty good about catching most of these viruses before they even hit the end point because most of this is still delivered through common channels like email downloading it from a File share [00:22:00] somewhere.
[00:22:00] It's pretty rare anymore to pick up a thumb drive and put it in. How many of your laptops still actually have? the right size USB for that anymore. You know, if you don't have a hub, most people aren't going to pick up a USB off the street anymore and plug it in to see what it is. I still do, but I've got a machine that is just for that and it's something that I could throw away.
[00:22:25] And I'm always just interested in which virus it is. But anyway, so something, uh, something to think about is practicing defense in depth. Is antivirus worth it? 100%. Is it worth the cost? Again, 100%.
Thanks again everybody for spending some time with me and E360 security. Have a great day.