LLM Fundamentals for SecOps Teams
A practical guide to leveraging large language models to enhance security operations
Welcome to Detection at Scale, a weekly newsletter for SecOps practitioners covering detection engineering, cloud infrastructure, the latest vulns/breaches, and more. Enjoy!
In cybersecurity, the promise of AI, coupled with vendors' hyperbolic marketing, has left a bad taste and an even worse impression! While every technological trend goes through its hype cycle, today we'll explore foundational LLM concepts, prompting techniques, and practical ways to extract real value for detection and response.
This is our second post about generative AI; check out the first one if you missed it.
LLM Basics
LLMs (large language models) are highly compressed collections of general knowledge designed to understand and generate human-like text based on the input they receive. Trained on billions of words from diverse sources, these models learn patterns and relationships that let them predict and generate text. The "magic" of LLMs lies in their generative nature: they predict the most likely next token, one piece at a time, based on the given prompt and their extensive training data.
Each specific bit of context influences the generated response, which can both help and hurt the quality of the output. While LLMs don't always get it right, they can create novel connections, like a security analyst with perfect recall who occasionally gets creative with the details.
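To make the "next token" idea concrete, here's a deliberately toy sketch in Python. Nothing here is a real model; the vocabulary and probabilities are made up purely to show how generation reduces to repeatedly picking a likely next token:
import random

# Hypothetical toy "model": maps a context word to candidate next tokens with probabilities.
# A real LLM does the same thing over tens of thousands of tokens using learned weights.
toy_model = {
    "suspicious": [("powershell", 0.6), ("login", 0.3), ("traffic", 0.1)],
    "powershell": [("execution", 0.5), ("script", 0.4), ("bypass", 0.1)],
}

def next_token(context_word):
    candidates = toy_model.get(context_word, [("<end>", 1.0)])
    tokens, weights = zip(*candidates)
    return random.choices(tokens, weights=weights)[0]  # sample proportional to probability

output = ["suspicious"]
for _ in range(3):
    output.append(next_token(output[-1]))
print(" ".join(output))  # e.g. "suspicious powershell script <end>"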
GenAI vs Traditional AI/ML
Unlike traditional AI/ML applications in cybersecurity, generative AI excels at open-ended tasks such as sentiment analysis, natural language processing, pattern recognition, and problem-solving. The long-term ambition behind LLMs and generative AI is "artificial general intelligence," where the model can reason like a human. Borrowing from how people learn, techniques such as providing examples can enhance the quality of the output and bring us a step closer to that ambition.
Compared to generative AI, traditional AI/ML excels at tasks that require precise, deterministic outputs based on specific data inputs. This includes anomaly detection in network traffic, malware classification, and predictive modeling for threat detection. Traditional ML shines when you need consistent, reproducible results for well-defined problems with clear right or wrong answers. It's like a security guard who's excellent at spotting known threats but might struggle with an entirely new type of attack.
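As a point of contrast, here's a minimal sketch of that traditional ML side, assuming scikit-learn and entirely made-up feature values. The point is the shape of the output: a deterministic anomaly label rather than generated text.
from sklearn.ensemble import IsolationForest

# Hypothetical per-connection features: [bytes_out, bytes_in, duration_seconds]
baseline_traffic = [
    [1200, 3400, 2.1],
    [900, 2800, 1.8],
    [1100, 3100, 2.0],
    [1000, 3000, 1.9],
    [1050, 2950, 2.2],
    [980, 3200, 1.7],
]
model = IsolationForest(contamination=0.1, random_state=42).fit(baseline_traffic)

# predict() returns -1 for anomalous samples and 1 for normal ones: a reproducible score, not prose
print(model.predict([[250000, 500, 45.0]]))  # an unusually large outbound transfer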
One of the most impressive features of LLMs is their ability to keep a tremendous amount of context "in attention" - far more than a human could manage. This is particularly powerful for log analysis and for understanding complex security events, allowing LLMs to:
Identify subtle patterns across disparate log sources and states
Maintain context and recall details over long sequences of events
Quickly synthesize information from multiple log formats and sources
Generate detailed, context-aware reports that incorporate relevant background information
Summarize technical machine data for less technical individuals (and vice versa)
It's crucial to remember that while LLMs are powerful, they're not infallible. They make mistakes, "hallucinate" information, and can be manipulated through crafted inputs. LLMs should be seen as powerful assistants rather than autonomous decision-makers in security.
The Art of Prompting
The most tried-and-true test of a good prompt is this: if a friend could read it and understand exactly what you're asking for, you probably wrote a good prompt! To use another analogy, when you delegate a task to a colleague, you give more or less guidance depending on their familiarity and proficiency. The same goes for LLMs; they need just the right amount of direction.
When writing prompts, be specific about the task, goal, and instructions, but don't pad the context with unnecessary detail. Each piece of information can either sharpen the model's focus or dilute its attention. Think of an LLM's attention like a spotlight: you want to shine it on the most essential parts of your request.
Let's take a real example of writing a Sigma detection rule:
Vague prompt: "Write a Sigma rule for detecting suspicious PowerShell activity." LLM response:
title: Suspicious PowerShell Activity
status: experimental
description: Detects suspicious PowerShell activity
tags:
  - attack.execution
  - attack.t1059.001
logsource:
  product: windows
  service: powershell
detection:
  selection:
    EventID: 4104
  condition: selection
falsepositives:
  - Legitimate PowerShell scripts
level: medium
While this is okay, it's pretty generic. Now, let's feed it a schema and be more specific:
Improved prompt: "Write a Sigma rule for detecting PowerShell scripts that attempt to disable Windows Defender. Use the following Sigma rule format:
title: [A clear, concise title]
id: [Unique identifier (UUID)]
status: experimental
description: [A brief description of what the rule detects]
author: [Your name or organization]
date: [Current date in YYYY/MM/DD format]
tags:
  - [Relevant MITRE ATT&CK tactics and techniques]
logsource:
  product: windows
  service: powershell
detection:
  selection:
    EventID: 4104
    ScriptBlockText|contains:
      - [Specific strings to detect]
  condition: selection
falsepositives:
  - [Potential false positive scenarios]
level: [Severity level: low, medium, high, or critical]
Please fill in the placeholders with appropriate content."
LLM response:
title: Suspicious PowerShell Activity Disabling Windows Defender
id: a83d75b1-2d4e-4a8a-b3d2-4e9f8f7d8e9f
status: experimental
description: Detects PowerShell scripts attempting to disable Windows Defender by modifying key settings such as real-time protection, exclusion paths, or disabling antivirus services.
author: Your Name
date: 2024/08/31
tags:
  - attack.defense_evasion
  - attack.t1562.001
logsource:
  product: windows
  service: powershell
detection:
  selection:
    EventID: 4104
    ScriptBlockText|contains:
      - 'Set-MpPreference -DisableRealtimeMonitoring'
      - 'Set-MpPreference -DisableBehaviorMonitoring'
      - 'Set-MpPreference -DisableIOAVProtection'
      - 'Set-MpPreference -DisableScriptScanning'
      - 'Set-MpPreference -DisableIntrusionPreventionSystem'
      - 'Add-MpPreference -ExclusionPath'
      - 'sc.exe config WinDefend start=disabled'
      - 'Stop-Service -Name WinDefend'
      - 'Set-Service -Name WinDefend -StartupType Disabled'
  condition: selection
falsepositives:
  - Legitimate administrative scripts that modify Windows Defender settings
  - Authorized software deployment processes
level: high
Now we're cooking! This rule is much more specific and valuable. While the generated response may still contain errors, they can be quickly caught and corrected through follow-up prompts or by editing the rule yourself.
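One lightweight way to catch obvious errors before a rule reaches your pipeline is to sanity-check the generated YAML. Here's a minimal sketch, assuming PyYAML is installed and the LLM's output was saved to a hypothetical file name:
import yaml

REQUIRED_FIELDS = ["title", "logsource", "detection"]  # minimum fields a usable Sigma rule needs

# Hypothetical path to the LLM-generated rule
with open("disable_defender.yml") as f:
    rule = yaml.safe_load(f)

missing = [field for field in REQUIRED_FIELDS if field not in rule]
if missing:
    print(f"Rule is missing required fields: {missing}")
elif "condition" not in rule.get("detection", {}):
    print("Detection block has no condition")
else:
    print("Basic structure looks valid -- still review the detection logic by hand")
For deeper checks, tools like sigma-cli can convert the rule into your SIEM's query language, which tends to surface schema mistakes the model may have introduced.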
Few-Shot and Zero-Shot Learning
Examples can significantly improve performance on nuanced tasks or knowledge that may not have been well represented in the training data. This is where few-shot learning comes into play.
Few-shot learning (sometimes called few-shot prompting) is a technique where you give the LLM a handful of examples directly in the prompt. The model isn't retrained; it adapts its output to the pattern in those examples, tackling new tasks with minimal input. Let's expand on our Sigma rule example:
Example prompt: "Write a Sigma rule for detecting PowerShell scripts that attempt to clear Windows event logs. Use the following example as a guide:
title: PowerShell Attempt to Disable Windows Defender
id: f62176f3-8128-4faa-bf6d-667a8d84c3c8
status: experimental
description: Detects PowerShell scripts that attempt to disable Windows Defender
author: AI Security Analyst
date: 2023/08/31
tags:
  - attack.defense_evasion
  - attack.t1562.001
logsource:
  product: windows
  service: powershell
detection:
  selection:
    EventID: 4104
    ScriptBlockText|contains:
      - "Set-MpPreference"
      - "-DisableRealtimeMonitoring"
      - "Set-ItemProperty"
      - "DisableAntiSpyware"
  condition: selection
falsepositives:
  - Legitimate security testing or admin scripts
  - Automated system management tools
level: high
Please create a similar rule for event log clearing."
This approach gives the LLM a clear template, improving the chances of getting a well-structured and relevant output!
On the other hand, zero-shot prompting is when you ask an LLM to tackle a task without giving it any examples to follow. The model relies on its general knowledge to infer the correct approach, which sometimes leads to inaccurate or "hallucinated" results. Our first vague prompt about PowerShell activity was essentially a zero-shot attempt: we gave it no specific examples or format to follow.
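If you're scripting this rather than working in a chat window, the same few-shot idea maps directly onto a chat-style API. Here's a minimal sketch assuming the OpenAI Python SDK; the model name and file path are placeholders, not recommendations:
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The few-shot example: the Defender-disable rule shown above, saved locally (hypothetical path)
example_rule = open("defender_disable_rule.yml").read()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model your organization has approved
    messages=[
        {"role": "system", "content": "You are a detection engineer who writes valid Sigma rules."},
        # The example rule is the "shot" the model should imitate.
        {"role": "user", "content": f"Here is an example Sigma rule:\n\n{example_rule}"},
        {"role": "user", "content": "Write a similar Sigma rule for PowerShell scripts that clear Windows event logs."},
    ],
)
print(response.choices[0].message.content)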
Creative Outputs
Just as we specify the goal and instructions, we can also specify the style of output we want. If you've played around with image generators, you've probably given them a style like "photorealistic" or "vaporwave." You can do the same with text, opting for specific formats (HTML, XML) or explanations written as poems.
For instance: "Explain the concept of a buffer overflow in a haiku format."
LLM response:
Data overflows
Memory boundaries breached
System compromised
Model-Specific Nuances
The final piece of prompting advice is to research the prompting techniques recommended for your particular foundation model. For example, Claude models respond well to XML-style tags that explicitly label each part of the prompt, while OpenAI models perform better with clear, concise instructions at the beginning of the prompt, followed by any necessary context or examples.
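For instance, a Claude-oriented version of our Sigma prompt might wrap each part in tags like the following (the tag names here are arbitrary; using them consistently is what matters):
<instructions>
Write a Sigma rule that detects PowerShell scripts clearing Windows event logs.
</instructions>
<format>
[Paste the Sigma rule template from earlier here]
</format>
<context>
Our environment forwards PowerShell Script Block Logging (Event ID 4104) to the SIEM.
</context>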
By learning these nuances, you can also improve performance and get the most out of your AI-powered security sidekick. Remember, prompt engineering is as much an art as a science. Feel free to experiment, iterate, and find what works best for your specific security tasks!
Example Prompts for Security Analysts
Let's explore some practical prompts that security analysts can use to leverage LLMs:
Find Patterns in Multiple Log Files
Role: You are a senior security analyst with expertise in correlating events across different log sources.
Task: Analyze excerpts from three different log files: firewall logs, Active Directory logs, and endpoint detection and response (EDR) logs.
Goal: Identify any correlated events or patterns that might indicate a sophisticated attack, such as an APT (Advanced Persistent Threat).
Context: These logs are from the last 24 hours during which a potential data exfiltration attempt was suspected.
Output Format: Present your findings in a table with columns for 'Timestamp', 'Log Source', 'Event Description', and 'Correlation Analysis'. Follow the table with a summary of your overall assessment.
Firewall Logs:
<Firewall log excerpt here>
Active Directory Logs:
<AD log excerpt here>
EDR Logs:
<EDR log excerpt here>
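To run this against real data without copy-pasting, you can assemble the prompt from files. Here's a minimal sketch with hypothetical file names, truncating each excerpt so the combined prompt doesn't blow past the model's context window:
from pathlib import Path

MAX_CHARS = 8000  # rough per-source guard against exceeding the context window; tune for your model

# Hypothetical export paths for each log source
sources = {
    "Firewall Logs": "firewall_last_24h.log",
    "Active Directory Logs": "ad_last_24h.log",
    "EDR Logs": "edr_last_24h.log",
}

sections = []
for label, path in sources.items():
    excerpt = Path(path).read_text()[:MAX_CHARS]
    sections.append(f"{label}:\n{excerpt}")

prompt = (
    "Role: You are a senior security analyst with expertise in correlating events "
    "across different log sources.\n"
    "Task: Analyze the firewall, Active Directory, and EDR log excerpts below.\n"
    "Goal: Identify correlated events that might indicate a sophisticated attack.\n"
    "Output Format: A table with Timestamp, Log Source, Event Description, and "
    "Correlation Analysis columns, followed by a summary.\n\n"
    + "\n\n".join(sections)
)
# Send `prompt` to your LLM of choice, e.g. with the chat completions sketch shown earlier.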
Analyze Unusual Network Traffic
Role: You are a network security analyst with deep knowledge of common protocols and attack patterns.
Task: Analyze this summary of unusual network traffic captured during a suspected security incident.
Goal: Identify any indicators of compromise, potential malware communication patterns, or signs of data exfiltration.
Context: This traffic was captured over a 6-hour period on our internal network, which normally doesn't communicate with external sources except for approved cloud services.
Output Format: Present your analysis as a bulleted list of findings, followed by a paragraph summarizing the most critical discoveries and recommended actions.
Network Traffic Summary:
<Network traffic capture summary here>
Recap
Let's recap the key takeaways from interacting with LLMs:
LLMs excel at analytical tasks and context retention, making them valuable in cybersecurity for pattern recognition, log analysis, and information synthesis.
While powerful, these models are not infallible and should be viewed as assistive tools rather than autonomous decision-makers.
Effective prompts require clear, concise instructions that focus the model's attention on key details and desired outcomes.
Tailoring prompts to specific models and utilizing techniques like few-shot learning with relevant examples can significantly improve output quality.
Understanding how these models work enables security teams to better leverage their capabilities for complex analysis tasks.
Embracing LLMs
The future of cybersecurity with LLMs is bright. As models continue to evolve, we anticipate even more sophisticated applications in threat detection, incident response, and proactive security measures where LLMs can:
Predict and simulate complex attack scenarios, helping us stay one step ahead.
Generate and update detection rules based on the latest intel and alert data.
Provide real-time, context-aware guidance during active incident response.
Facilitate seamless communication between technical and non-technical stakeholders.
Remember that LLMs are tools to augment humans rather than replace them. The creativity, intuition, and judgment of security teams will remain invaluable.
Let's embrace the LLM revolution! By refining our prompting techniques, staying informed about new models, and thinking critically about what LLMs produce, we can build more robust, responsive, and intelligent security operations.
The future is here, and it's speaking our language – quite literally!