The Hard Thing About Hard Things with Daniel Wiley
Going 0 to 1 in the MDR business, the reality of cyber hygiene, and scaling to terabytes per second.
Welcome to Detection at Scale, a weekly newsletter for SecOps practitioners covering detection engineering, automation, the latest vulnerabilities/breaches, and more.
Each month on the Detection at Scale podcast, we interview leaders and practitioners responsible for scaling their security operations programs. In today’s blog, we’ll cover the highlights from our episode with Daniel Wiley of Check Point Software.
“Lessons are repeated until they are learned. Oftentimes, we need to experience lessons several times to really get them.”
After working in cybersecurity for 30 years, Daniel Wiley’s biggest advice is to never stop learning. His biggest career progressions came from reading as much as possible to level up his knowledge and move forward. Progress comes through pain, hard lessons, and setbacks. “Never stop trying, hacking, and questioning everything around you.”
During our conversation, Daniel recounted taking Check Point’s Incident Response (IR) and Managed Detection and Response (MDR) businesses from 0 to 1, how they built a massively scalable real-time pipeline to support nearly 500 customers, the current state and future of the industry, and a wholesome ending.
Episode highlights
Going 0 to 1 at Check Point
Daniel built Check Point’s IR business from the ground up and scaled it to nearly 500 customers. When those customers needed a permanent Managed Detection and Response solution, it was a sign to build another new business unit within Check Point to sustain these services. Eventually, the MDR business reached a customer scale similar to the IR business, and the Check Point team (like many MDRs) built an end-to-end cloud architecture to accomplish Detection at Scale. Daniel recalled one of his proudest moments: presenting to their CEO, Gil Shwed, on how his team could keep cloud costs so low while handling such a huge amount of data throughput. As a software business, Check Point prides itself on high gross margins (70-80%), and that culture carries into its MDR business, too.
The Reality of Cyber Hygiene
Many businesses lack the basics of Detection and Response. Oftentimes, the investment in the people, process, and time required to get an in-house D&R practice off the ground is simply too much. It’s also extremely nuanced work, and finding the right person to lead it can be difficult. Daniel believes that if you are developing SecOps in-house, you are unlike 99% of businesses. The barrier to entry for covering the basics isn’t extremely high, though. Daniel advocates for three core areas when getting started: an advanced EDR platform, advanced email protection, and a gateway product to secure the [cloud] perimeter. If you do those three, you’ll have the basic technology requirements covered.
Building for Analyzing Terabytes per Second
Time is your most important asset, and many security teams advocate for sub-minute latency in their detection pipelines so they can limit the impact of a breach as quickly as possible. The challenge is building and maintaining the technology to support that requirement. Daniel explains that to meet it, they built a massively scalable real-time analytics pipeline around a few principles: make it easy to get data into the system, pay attention to the statistical relevance of alerts and events, and strategically use the [big] data lake for larger hunts.
Making data easy to onboard is a critical requirement for any SIEM because, without good data, you can’t analyze anything. The log data framework, though, is just as important. Daniel advocates building a unified schema based on your internal needs rather than an off-the-shelf unified model. There is no perfect “one-size-fits-all” approach.
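To make the idea of an internal unified schema concrete, here is a minimal Python sketch. It is not Check Point’s schema: the `Event` fields, the two raw log shapes, and the function names are all hypothetical, chosen only to show how each source gets its own small normalizer that maps into one shape the rest of the pipeline can rely on.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Event:
    """Hypothetical internal schema: only the fields this team actually needs."""
    timestamp: datetime
    source: str
    host: str
    user: str
    action: str


def from_edr(raw: Dict[str, Any]) -> Event:
    # Assumed EDR shape: epoch seconds and nested device info.
    return Event(
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        source="edr",
        host=raw["device"]["hostname"],
        user=raw.get("username", "unknown"),
        action=raw["event_type"],
    )


def from_email_gateway(raw: Dict[str, Any]) -> Event:
    # Assumed email gateway shape: ISO 8601 timestamp and flat fields.
    return Event(
        timestamp=datetime.fromisoformat(raw["received_at"]),
        source="email",
        host=raw["mta"],
        user=raw["recipient"],
        action=raw["verdict"],
    )


# One normalizer per source; onboarding a new source means adding one entry.
NORMALIZERS: Dict[str, Callable[[Dict[str, Any]], Event]] = {
    "edr": from_edr,
    "email": from_email_gateway,
}


def normalize(source: str, raw: Dict[str, Any]) -> Event:
    """Map a raw log record from a known source into the internal schema."""
    if source not in NORMALIZERS:
        raise ValueError(f"no normalizer registered for source {source!r}")
    return NORMALIZERS[source](raw)
```

Keeping the schema small and owned by the team is the point of the sketch: downstream detections only ever see `Event`, so a vendor changing its log format touches one normalizer rather than every rule.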
To make the most of your time and money, pay close attention to the statistical relevance of alerts. Have we seen this before, or in a different context? Should we spend our time on this alert? Every message should pass through a funnel: filtered, processed, and eventually ending up in an analyst’s notebook.
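One simple way to express “have we seen this before?” in code is to score each alert by how rare its signature is in the history seen so far. The sketch below is an illustration only, not the method described in the episode: `RarityScorer`, the choice of signature, and the add-one smoothing are all assumptions.

```python
import math
from collections import Counter
from typing import Hashable


class RarityScorer:
    """Scores an alert signature by how surprising it is given past sightings."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()
        self._total = 0

    def score(self, signature: Hashable) -> float:
        """Return surprisal in bits (higher means rarer), then record the sighting."""
        # Add-one smoothing: a never-seen signature gets a finite but high score.
        probability = (self._counts[signature] + 1) / (self._total + 2)
        self._counts[signature] += 1
        self._total += 1
        return -math.log2(probability)


def worth_analyst_time(score: float, threshold_bits: float = 5.0) -> bool:
    # The threshold is a hypothetical tuning knob, not a recommended value.
    return score >= threshold_bits
```

A signature here could be something like an (alert type, host) tuple. Alerts that fire constantly drift toward a score of zero and fall out of the funnel, while a first-time signature in a busy environment scores high and moves on toward an analyst.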
To reach a massive scale, they scaled horizontally across a large number of very small VMs (referencing Firecracker) that maintain a 20-30 day cache of log data in memory for quick lookups. The focus of the real-time pipeline is quick triage and initial signaling, not investigation. When the team needs to do a larger threat hunt, they turn to the data at rest in blob storage in the data lake.
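The split between a short-lived in-memory cache and data at rest can be sketched as a time-bounded cache with a slower fallback. This is a toy, single-process illustration under stated assumptions: `RecentCache`, `lookup`, and the `data_lake_query` callable are hypothetical names, and nothing here reflects how the real system shards data across VMs.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional, Tuple


class RecentCache:
    """In-memory cache that forgets entries older than ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Insertion order doubles as age order because writes are timestamped on insert.
        self._items: "OrderedDict[str, Tuple[float, Any]]" = OrderedDict()

    def put(self, key: str, value: Any) -> None:
        self._evict_expired()
        self._items.pop(key, None)  # re-insert so the newest write is last
        self._items[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[Any]:
        self._evict_expired()
        entry = self._items.get(key)
        return entry[1] if entry is not None else None

    def _evict_expired(self) -> None:
        cutoff = self._clock() - self._ttl
        while self._items:
            oldest_key = next(iter(self._items))
            if self._items[oldest_key][0] >= cutoff:
                break
            del self._items[oldest_key]


def lookup(key: str, cache: RecentCache,
           data_lake_query: Callable[[str], Any]) -> Tuple[Any, str]:
    """Answer from memory when possible; otherwise fall back to data at rest."""
    hit = cache.get(key)
    if hit is not None:
        return hit, "cache"
    return data_lake_query(key), "data_lake"
```

With `ttl_seconds` set to roughly 30 days, triage lookups for recent activity stay in memory, and anything older is answered by the slower query against storage, which mirrors the triage-versus-hunt split described above.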
To visualize this design, Daniel jokes that it’s similar to the mail machine from Men in Black.
Consolidation and the Future of the Industry
“The easiest way to market SIEM is being nearly free, but concessions are made in what you get for that price.” At the time of recording, the Exabeam merger and QRadar acquisition had just been announced, marking the consolidation of the last generation of SIEM tooling and leaving room in the market for the next wave. Many big players, such as Palo Alto, are touting that GenAI will dominate the game, but given the ubiquity and rise of open source, AI is not as novel anymore. Daniel believes that in the future, security engineers will spend their time training, tuning, and interacting with models that can sustain complex analysis for longer periods of time than humans. This will be the step function for GenAI into SecOps, but passing along our intuition is not always easy.
Put Your Oxygen Mask On First
We always end the episode with advice from our guest. One piece was given at the start of the blog: Never Stop Learning. The second piece is to take care of yourself.
“Over time, this job [of incident response] gets really tough to maintain due to the high-stress situations and exposure to bad stuff.” Physical and mental health are paramount for your career longevity. You should also strive to build social connections with peers and talk about what’s going on in your worlds. Don’t internalize it! It’ll eat you alive.
Thank you for reading!
To hear this conversation, follow Detection at Scale wherever you get your podcasts. If you enjoyed this blog, share the episode with a friend!
Photo by Andrej Lišakov on Unsplash