Let's talk about two stories this week. First we'll dive into the CrowdStrike fiasco from a few weeks ago and then we'll see what happens when a company accidentally hires a North Korean spy.
Discord: https://discord.gg/bJauPBBhHn
Website: https://whattheshellpod.com
Instagram: https://instagram.com/shell_pod
Watching the Watchmen
Quis custodiet ipsos custodes? Who will guard the guards themselves? Or maybe something you all might be a bit more familiar with: who watches the watchmen?
Since the show went on hiatus, a couple of interesting things happened that I wanted to take a deeper dive into. Incidents where the people and technology we rely on to protect our own systems and security ran into trouble themselves.
In July, two stories really stood out to me in the cyber space. They're stories I've discussed with my friends, my family, and my coworkers. So now I want to talk about them with you. The first one I'm sure you're acutely aware of. It was the day the earth stood still, while CrowdStrike held millions of systems in limbo due to a faulty update. What happened there, and how do we learn from it?
The second one you might not be as familiar with. What do you do when the company you charge with teaching your people about cyber security is caught accidentally hiring a foreign spy? The short story: you use it as a learning exercise. The long story? Well, stick around and we'll chat about that too.
I'm John Kordis, and on this episode of the show I'm going to talk to you about What the Shell is going on out there in the world of cyber security, where even the people we look to for protection are getting got.
[Intro]
On July 19th, 2024 the world saw what was quite possibly the biggest disruption in technology that I can recall. Honestly, it reminded me of everything that people said would happen during Y2K. Planes were grounded, hospitals stalled, even emergency services were hit.
Myself, I was on a bus tour over in Scotland when this happened. I remember popping open my phone to see a couple of tweets and posts in the infosec community about a bad update to CrowdStrike, but at first I didn't really think much of it. Then I popped over to Reddit to check on some news for the day from the cyber security subreddits I follow and saw more posts about it. Now, that's not inherently uncommon, people talk in the community. But the posts had hundreds and hundreds of comments, and it was still pretty early in the day in Scotland, so back in the US people weren't even awake yet.
I started to look into what was actually happening, and it was alarming to say the least. So let's step back, because I think most of you are at least somewhat familiar with this one. You all heard about the outage but might have some questions like: okay, but what is CrowdStrike? Why didn't my own computer get hit if this was so widespread? Was this a cyber attack? Well, let's dive in, maybe in that order.
I want to start with CrowdStrike. For those of you who aren't directly in the field, this might be the first time you're hearing that name. They're basically a massive provider of cyber security products, ranging from services to software. In this case the product we want to focus on is endpoint detection and response, or an EDR solution.
I want you to imagine your computer or phone as a busy, bustling city. Just like a city needs security to protect it from threats like burglars or vandals, your computer needs protection from cyber threats like viruses or hackers.
An Endpoint Detection and Response (EDR) system is like a security team for each individual device—such as your computer or smartphone. It continuously monitors the device for any suspicious activity or threats, much like security cameras and patrols keep an eye on a city. If the EDR system detects something unusual or potentially harmful, it alerts you and takes action to address the threat.
In essence, an EDR helps ensure that each device remains secure by actively looking out for and responding to potential dangers. But in order to do that, it often needs pretty in-depth levels of access, so when something does go wrong the failure can be rather impactful.
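If you want a rough mental model of what an agent like that is doing, here's a tiny sketch in C. This is not CrowdStrike's code or anything close to a real product; the event names and "suspicious" behaviors are made up for illustration. The shape of the idea is just: watch a stream of events on the host, compare them against known-bad patterns, and raise an alert or take action when something matches.

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* A host event as the agent might see it: which process did what.       */
typedef struct {
    const char *process;
    const char *action;
} host_event_t;

/* Known-suspicious behaviors the agent watches for (illustrative only). */
static const char *suspicious_actions[] = {
    "encrypt_many_files",     /* looks like ransomware         */
    "dump_browser_passwords", /* looks like an infostealer     */
    "disable_security_tools", /* looks like defense evasion    */
};

static bool is_suspicious(const host_event_t *ev) {
    for (size_t i = 0; i < sizeof suspicious_actions / sizeof *suspicious_actions; i++) {
        if (strcmp(ev->action, suspicious_actions[i]) == 0) {
            return true;
        }
    }
    return false;
}

int main(void) {
    /* In a real agent this would be a live event stream from the OS. */
    host_event_t events[] = {
        { "winword.exe", "open_document" },
        { "updater.exe", "dump_browser_passwords" },
    };
    for (size_t i = 0; i < sizeof events / sizeof *events; i++) {
        if (is_suspicious(&events[i])) {
            printf("ALERT: %s tried to %s, blocking and notifying the SOC\n",
                   events[i].process, events[i].action);
        }
    }
    return 0;
}
```

A real EDR does this at a much deeper level, watching millions of events down in the operating system itself, which is exactly why it needs that in-depth access I mentioned.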
These systems are incredibly valuable. They're a big part of why, even after an attack happens, investigators are able to triage and sleuth their way into the fine details of what was going on at the time it occurred.
The thing is, an EDR solution like this is largely an enterprise product, meaning it's meant for companies and organizations to deploy across the board. While you could technically put the product on your own system, that's not really the use case for the common person. Maybe if you've got a lab or you really want to test this stuff out, sure, but people like your family probably only ever see something like CrowdStrike if it's on their work laptop.
So that bit right there is why, when this software caused the world to stop, you probably didn't notice anything on your home computer or laptop.
Speaking of causing the world to stop, what actually happened here? Well, flash back to July 19th. At around 4AM UTC, CrowdStrike pushed out what they call a "Rapid Response Content Update".
These rapid response updates aren't new versions of the software itself but a set of critical behavioral pattern identification signatures. CrowdStrike's threat detection engineers build these and deploy them in reaction to attacks and intelligence based on what's going on in the world, so that if something hits your machine the sensor can identify and respond to it. Think back to our episodes on APTs like Fancy Bear. If Fancy Bear is using a new method to deliver ransomware to hosts, then CrowdStrike might analyze it, create a signature, and rapidly deploy it to make sure that teams can respond appropriately.
These Rapid Response updates are stored in a proprietary file format that contains configuration data mapping those signatures to a behavior, like observing, blocking, or preventing it.
They're a pretty common part of the threat identification and response lifecycle. Despite the name making it sound like a critical, drop-everything effort, these happen all the time.
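To make that idea a bit more tangible, here's a conceptual sketch in C of what "configuration data that maps signatures to a behavior" could look like. This is my own illustration, the rule names are invented, and it has nothing to do with CrowdStrike's actual proprietary format; the point is just that a content update is data driving an existing engine rather than new code.

```c
#include <stdio.h>
#include <string.h>

/* What the sensor should do when a behavior pattern matches. */
typedef enum { ACTION_OBSERVE, ACTION_ALERT, ACTION_BLOCK } action_t;

typedef struct {
    const char *pattern;   /* behavioral signature to look for        */
    action_t    action;    /* response configured by the content file */
} detection_rule_t;

/* A stand-in for rules shipped in a rapid response content update. */
static const detection_rule_t rules[] = {
    { "named_pipe_created_by_office_macro", ACTION_BLOCK   },
    { "powershell_encoded_command",         ACTION_ALERT   },
    { "new_scheduled_task",                 ACTION_OBSERVE },
};

static action_t lookup_action(const char *observed_behavior) {
    for (size_t i = 0; i < sizeof rules / sizeof *rules; i++) {
        if (strcmp(rules[i].pattern, observed_behavior) == 0) {
            return rules[i].action;
        }
    }
    return ACTION_OBSERVE;   /* default: just watch */
}

int main(void) {
    static const char *names[] = { "observe", "alert", "block" };
    const char *seen = "powershell_encoded_command";
    printf("behavior '%s' -> action: %s\n", seen, names[lookup_action(seen)]);
    return 0;
}
```

The key point is that the sensor software itself doesn't change when one of these ships; only the table of patterns and actions it consumes does, which is part of why they can go out so quickly and so often.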
But what was special about this one? Well, for that I'm going to go to the report that CrowdStrike themselves published on it.
Remember that proprietary file format I mentioned? CrowdStrike delivers this Rapid Response Content as templates. Each template type represents a specific sensor capability, and its behavior is all configured dynamically through the content updates themselves.
So when some new templates were deployed and systems started blue screening, people were frantically trying to figure out the root cause.
Microsoft and CrowdStrike have both come out to talk about this, and the analysis all seems to confirm that it was due to an out-of-bounds read, a memory safety error, in a specific driver.
What does that mean? According to CrowdStrike themselves, it means it started a chain of events that, quote:
"When received by the sensor and loaded into the Content Interpreter, problematic content in Channel File 291 resulted in an out-of-bounds memory read triggering an exception. This unexpected exception could not be gracefully handled, resulting in a Windows operating system crash (BSOD)."
Yeah, "not gracefully handled" is right. It's all to say that because there was no built-in way to handle this error properly, your computer just stalled out if it was impacted. And let's step back to talk about that impact, because like I said at the top of the show, this was one of, if not the, most disruptive cyber events to have ever occurred. Let me give you all some context:
We'll start with some broad numbers. It was estimated that over 8.5 million systems got hit. In the grand scheme of things that is not a massive number, but because of who CrowdStrike does business with, it included a lot of important organizations. I'm going to run through, piece by piece, what some of the major impacts were.
Let's start with some state issues. In many parts of the country, emergency services like 911 that relied on Windows machines as a critical part of their infrastructure were hit. You might have seen some of your local authorities going to Facebook, X, or other socials to post alternate direct lines, since the intake systems for new calls were broken.
In some places entire broadcast stations for television were left inoperable. Sky News in the United Kingdom went off the air for a bit, there were problems with several news stations in the US as well, and even upon partial restoration there were still things like graphics, teleprompters, and studio quality that weren't entirely working.
Getting a bit into some heavier-hitting impacts, let's talk about flights. This is one where there are some really interesting, demonstrable, and ongoing problems.
This outage led to thousands of flights being canceled on that Friday. Honestly, for me it was a close call, because I had a flight 12 hours prior to this outage. I hate to imagine what would have happened if I had chosen the Friday morning option instead of Thursday. The issues were global and impacted a wide variety of airlines. In an article on Mashable, Shannon Connellan reported these airlines as having to ground flights:
Aegean Airlines, Air France, Allegiant, Akasa, American Airlines, Binter, Delta, Frontier, IndiGo, Jetstar, KLM, Qantas, Ryanair, Singapore Airlines, SpiceJet, Sun Country, Swiss International, TAP Air Portugal, Turkish Airlines, United, Vueling, and Wizz Air. Some airports were even left partially or entirely closed, such as Logan Airport in Boston, a major hub of the East Coast.
You may have seen one graphic that's been making its way around the internet. It shows the flight maps just before, during, and after the outage. It's eerie seeing what looks like a massive swarm of planes operating as normal quickly become a fraction of what it usually is. It honestly reminded me of when I was younger and a similar graphic was used to show the sky clearing on September 11th, slowly just more and more planes landing with fewer and fewer taking off. If you go to the episode transcript at whattheshellpod.com I've included a link to this, and I'll try to get it up on my socials as well for you this week once the episode is released.
And while the impact to travel was bad, let's up the stakes even more and talk about the stock markets. While not as all-or-nothing as the airline industry, there were issues with international stock markets because of this. The London Stock Exchange, for example, had to delay its opening to address the issue and allow for trading. This impact was not felt as heavily in the US, but it was still interesting to see how the international markets adapted regardless.
Lastly, let's talk about one of the most critical areas that got hit…healthcare. This one hits close to home. I used to work in healthcare information security, and these kinds of outages caused by something outside your control can wreak havoc not only on the systems but on regular people like you or me. Healthcare Dive reported that at least 11 healthcare organizations were experiencing problems. These are large institutions like Emory Healthcare, JFK Medical, Mass General Brigham, and Norton Healthcare. If hospitals weren't mostly down, they were often operating in almost emergency-like conditions, with day surgeries and non-critical procedures being postponed.
Perhaps the hardest part for all of these industries wasn't the outage itself though, it was the recovery process. To respond quickly, teams needed to get hands on keyboards. That meant that in some cases you had organizations with upwards of 100,000 machines and a support staff of only hundreds of IT workers who needed to go to each and every machine to get this done as quickly as possible. And again, let's not forget that this was on a Friday, which meant that many helpdesk and support engineers were about to lose their entire weekend to this problem. It just wasn't something your average end user could do, so you needed a good engineer to help. Eventually there were thrown-together solutions for network booting and other workarounds, but those didn't come right away.
What we're left with at the end of this is a disruption the likes of which many had not seen before. The damages are still being tallied, but according to Parametrix it caused over 5 billion dollars in damages to Fortune 500 companies. And Delta is currently going back and forth with CrowdStrike publicly about the possibility of a lawsuit over its half-billion-dollar impact.
That's not even considering the damage to CrowdStrike itself. If you look at CrowdStrike's stock from a month ago compared to now? Well, it's down over 40%, going from nearly $400 a share to as low as $230. I've got the whole chart for the last month up to August 6th in the episode transcript as well.
There's fiscal damage, brand damage, reputation damage. It keeps going. And one last thing you might have seen that tends to come with issues like this is the profiteering of malicious actors and hackers. Yes, we can't forget that in the wake of emergencies like this there will always be those looking to make a buck. What we're seeing now are phishing cases where people masquerade as CrowdStrike, trying to trick you into thinking you're impacted even if you're not, and getting you to install malware on your machines. This isn't unusual, but it's getting an uptick in visibility because of the ongoing news reporting on the issue. You see this with every kind of major news or emergency event; if you recall when I talked about the start of the Russia/Ukraine war, there were people profiteering off of that as well.
So what do I think will come of this? It's hard to say. At CrowdStrike? I imagine a couple of things will happen. There's definitely going to need to be some turnover of leadership and accountability. Even if a lawsuit doesn't make its way to fruition, I imagine some executives will be pulling the ripcord on whatever exit packages they can get. I also imagine that their internal test and quality assurance processes are going to receive a full audit, which will hopefully end in more support for the engineers on their teams.
In the world at large? I think we'll probably start to see some stricter regulations surrounding liability after testing and QA. It's hard to imagine that some politician won't try to push forward new rules around what kind of processes should be followed and who is ultimately accountable for the fiscal damages if something like this slips through. I don't think it will come soon, but I imagine those discussions are starting as we speak.
I hope this was a wake-up call though, because while this wasn't a cyber attack…just a failure of processes, that doesn't mean the next one won't be. It's scary to think about, but this demonstrates what a large-scale outage would look like around the world. We recovered pretty well this time, but that was when we knew the cause and the fix within minutes to hours of it happening. If I were a business I'd be tabletopping my continuity plans and recovery procedures, because as with most disruptions in the cyber landscape, it's never a question of if you'll be impacted, it's a question of when.
With that I want to pivot to a different story from the past few weeks that I thought you all might find interesting too. It's regarding a company called KnowBe4. If that name seems familiar to you, then kudos, because you've been here a while. I mentioned KnowBe4 way back in episodes 2 and 3 of What the Shell because it was a company that Kevin Mitnick went on to work heavily with before he passed away last year.
Think of KnowBe4 as a company that helps businesses teach their employees how to stay safe online. Just like schools teach kids about good habits and safety, KnowBe4 helps companies train their staff to recognize and avoid online threats.
Here’s a simple breakdown:
Training Programs: They offer online lessons and videos that explain common online threats, like phishing emails (which are fake emails trying to steal your information). These lessons show employees how to spot these threats and what to do if they encounter them.
Practice Tests: KnowBe4 also runs simulated phishing attacks. This means they send fake phishing emails to employees to test their responses. If an employee falls for the fake attack, it’s a sign they need more training. It’s like a practice drill to help them get better at identifying real threats.
Improving Security: By training employees and running these practice tests, KnowBe4 helps businesses create a safer work environment. It’s like giving everyone the knowledge and tools to avoid falling for online scams and keeping the company’s information secure.
In short, KnowBe4 helps companies make sure their employees are smart and prepared when it comes to online security. Which is why it was such an interesting story when they themselves came out about how a North Korean hacker was hired onto their IT team, slipping through the hiring process and attempting to steal secrets.
It should be noted that everything I'm about to cover as part of this story comes directly from KnowBe4 themselves, in a blog post published on their website on July 23rd, 2024 titled "How a North Korean Fake IT Worker Tried to Infiltrate Us".
So now we know what KnowBe4 does, which means you can at least imagine some of the scope they need to function. They need to maintain the research, build the tech, potentially offer multiple language support, and keep up with the times.
So it really wasn't surprising when, earlier this year, KnowBe4 needed a software engineer for their own internal AI team. If you work in security you know that AI enhancement is the hot item for platforms at the moment. It's not meant to take a job away from you like some would think; in most cases it's meant to streamline things and surface the information you're likely to ask for first. So they continued through their normal hiring process. They posted the job, screened resumes, interviewed candidates, and when they had one picked out they went through what one might expect: the background check and reference checks.
Well, the unnamed employee was hired, and almost immediately after logging in for the first time, his machine started attempting to load malware. And here's the reason I decided to tell this story second. The reason this was caught was that the EDR solution in use at KnowBe4 flagged the suspicious activity on the device at this stage and alerted their own cyber defense team.
The suspicious activity started on July 15th at around 10PM EST. Now, if you work in a SOC you might start thinking about how you'd respond to an alert out of the blue like this. I think most people's first guess would be "Oh god, what link did the new hire click?"
The security team got a hold of the new hire and asked what he was doing on the machine, to which he claimed he was following the steps in his router FAQ to troubleshoot some speed issues. Not really the kind of answer you'd expect to hear in this case. All the while, more and more alerts are triggering. The security team is seeing attempts to alter session history files on the device, transfer files to the device, and execute unauthorized code.
Now this is all stuff that raises alarms, and when it doesn't match up with what the user says they're doing, it sets off even more. But again, I imagine they weren't immediately leaping to "malicious actor", so they tried to get more details from him and to get him directly on the phone, but he eventually became unresponsive. And even while he was unresponsive, the activity continued.
So about a half hour into this, at 10:20PM, the security operations center isolated the new hire's device. This is something that EDR capabilities can do to limit access and contain a threat. If they have reason to believe a device has been compromised or that something isn't right, they can isolate it from the network, limit its access, and essentially put a cone of shame on it until they figure out whether it needs to be wiped.
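If you're curious what "isolating" a device even means in practice, here's a very simplified sketch in C. The console hostname is made up, and real products do this with OS-level network filtering rather than a toy function, but the logic is the same: once the host is flagged as contained, drop everything except the channel back to the security team's console.

```c
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char     dest_host[64];
    uint16_t dest_port;
} connection_t;

/* Hypothetical management endpoint; real products use their own cloud. */
#define EDR_CONSOLE "edr-console.example.com"

static bool host_is_contained = false;

/* Decide whether an outbound connection is allowed from this host. */
static bool allow_connection(const connection_t *c) {
    if (!host_is_contained) {
        return true;   /* normal operation: everything allowed */
    }
    /* Contained: only the EDR console is reachable. */
    return strcmp(c->dest_host, EDR_CONSOLE) == 0 && c->dest_port == 443;
}

int main(void) {
    connection_t update_check = { "updates.example.net", 443 };
    connection_t console      = { EDR_CONSOLE, 443 };

    host_is_contained = true;   /* the SOC flips this when they isolate the box */
    printf("update server reachable? %s\n", allow_connection(&update_check) ? "yes" : "no");
    printf("EDR console reachable?   %s\n", allow_connection(&console) ? "yes" : "no");
    return 0;
}
```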
And after isolation it was honestly pretty much done from the compromise standpoint. KnowBe4 severely limits a user's access if they're a new hire who hasn't finished all their training anyway, so there was no access to or exfiltration of any sensitive data from the company.
Now, at this point I want to say this isn't really meant to be any kind of slam on KnowBe4, but a highlight of how nation-states try to infiltrate operations.
So let's rewind a bit and talk about that infiltration process. What did they do? Well this is the part that might freak some of you out because it's an example of something that's probably going to pop up more and more.
It started with the stolen identity of a United States citizen. It doesn't really matter how it was stolen; it could have been a breach, could have been sold on the dark web, who knows. But someone's identity was stolen.
So the North Korean worker has a name and an identity. Next they set up an address to attach to the resume, and this is going to come back later.
From here they now have a different problem: how do you make yourself look legitimate? Well, the answer in this case was to take a stock image of a professional and use AI to dress it up. KnowBe4 posted the pictures to their blog, and I have them on the website in the transcript, but I'll also toss them onto the socials at shell_pod on Instagram.
The left is the original stock photo, the right is the AI-altered version.
Here's where it gets even crazier: there were four rounds of video interviews done. The AI-enhanced picture looked enough like him and the identity supplied on the application that there were no red flags to the interviewers. And even then, to all my people who are interviewing: how often is that really at the front of your mind? Really, you're there to assess technical and team aptitude, and while it should be someone's responsibility to follow up on this, it might not necessarily be yours.
So the hacker got by with this and was hired, giving the address he had used for the fake identity as the place to ship his MacBook. However, the hacker wasn't actually at this address. The address was a middle point, a farm of machines in the area that he could remote into from either North Korea or somewhere in China. So he was never actually stateside, just working nights to make it look like he was online during our business hours. The combination of VPNing into a laptop farm that was actually in the country and being online during normal hours meant that, from an access standpoint, it really didn't look all that abnormal. According to KnowBe4's blog post, many scams like this operate twofold: part of it is getting that persistent access to the network, and part of it is actually making money that can be used for other illegal services and activities coming out of North Korea.
So at the end of the day, a lot of the process worked well here. The EDR caught it, the team at KnowBe4 is actively working with law enforcement on this (which is why some of the details here are scarce, like not having a name), and KnowBe4 even changed up some of their processes to make sure this can't happen again. In fact, let me pull a couple of items from the FAQ they published that I think you might want to hear about.
The first question that was asked of them that I want to highlight was 'Why would someone hired as a software developer try to load malware on their new machine?'
To which KnowBe4 said that "We can only guess, but the malware was an infostealer targeting data stored on web browsers, and perhaps he was hoping to extract information left on the computer before it was commissioned to him.
This was a skillful North Korean IT worker, supported by a state-backed criminal infrastructure, using the stolen identity of a US citizen participating in several rounds of video interviews and circumvented background check processes commonly used by companies."
The other question I'm highlighting was "Has KnowBe4 changed their hiring process?"
And they replied "You bet we have! Several process changes were made so that this thing will be caught earlier. One example is that in the US we will only ship new employee workstations to a nearby UPS shop and require a picture ID."
So there we have it: a good cautionary tale on multiple fronts. Make sure your EDR is alerting properly and your teams respond well. Make sure your background checks are comprehensive. Maybe highlight some red flags and consider changing your own processes if you think there might be a gap similar to theirs.
All in all, it was an interesting month in July for cyber events. We're just over half the year in, so let's see if we get anything else in the back half of 2024 that comes close to this. I'm John Kordis, that's this episode of What the Shell. Thanks to everyone for listening.
Before we go I want to ask if you can do me a favor. Since I'm just coming back, the show is going to have a bit of an uphill battle against the algorithm. So if you can rate the show, that would be super helpful. Especially all you Spotify listeners, that seems to be my biggest userbase. And if you liked this, maybe share it with someone you think might enjoy it too.
I've rebuilt the show's Discord, you can find a link to it in the episode description. To anyone that's going out to Vegas this weekend for DEF CON, I hope you all have a great time! I won't be there this year, but I'm planning on coming back next year. If you are there, make sure you say hi to Jack Rhysider from Darknet Diaries, he's doing a bunch of cool events you can check out on his Discord or socials. That's it for this week, see you all in the middle of August.
https://www.healthcaredive.com/news/crowdstrike-outage-hits-us-hospitals/721887/
https://www.crowdstrike.com/blog/falcon-content-update-preliminary-post-incident-report/
https://mashable.com/article/grounded-flights-today-list-faa-airport-closures-microsoft-outage
https://www.barrons.com/news/london-stock-market-hit-by-technical-glitch-d0b9bb24
https://blog.knowbe4.com/how-a-north-korean-fake-it-worker-tried-to-infiltrate-us