Skip to main content
availability

Deployment: Invicti Platform on-demand, Invicti Platform on-premises

LLM-based app vulnerability testing

The LLM Security scan profile in Invicti Platform is specifically designed to test Large Language Model (LLM) powered applications for security vulnerabilities unique to AI-enabled systems. This scan profile focuses on identifying weaknesses in chatbots, AI assistants, and other applications that integrate with language models.

This document provides technical details about:

  • What prompts and payloads are being used
  • What specific tests are executed
  • How to verify that LLM security tests were successfully injected
  • Detection and confirmation methods

Overview of LLM security testing

Target applications

LLM Security testing is designed for web applications that integrate with:

  • AI chatbots embedded in web interfaces
  • Virtual assistants for customer support
  • Content generation tools powered by language models
  • AI-powered search and recommendation systems
  • Code generation interfaces that use LLM capabilities
  • Document processing tools with AI summarization

How to configure an LLM scan

To scan LLM-powered applications for AI-specific vulnerabilities:

  1. Navigate to Scans  > New Scan in Invicti Platform
  2. Select LLM Security as the scan profile
  3. Configure your target URL (the web application that includes LLM functionality)
  4. Configure additional scan settings as needed

The LLM Security scan profile is automatically going to detect and test AI-powered components within your target application, including chatbots, virtual assistants, and other LLM integrations.

For detailed information about creating and configuring scans, see New scan.

Testing approach

The LLM Security scan profile uses Invicti's DeepScan engine to perform sophisticated analysis of LLM-powered endpoints. Unlike traditional web application testing, LLM security testing requires:

  1. Contextual understanding of conversational interfaces
  2. Dynamic prompt generation based on application responses
  3. Behavioral analysis to detect AI model manipulation
  4. Multi-turn conversation testing to identify complex attack vectors

LLM response detection and analysis

Before testing for vulnerabilities, Invicti first establishes that it's communicating with an actual LLM by analyzing response patterns and formats. The scanner can identify and parse various response types:

  • Server-Sent Events (SSE): Streaming responses in text/event-stream format
  • JSON responses: Structured API responses containing LLM output
  • Plain text responses: Direct text-based LLM outputs
  • Streaming responses: Various streaming formats used by LLM applications

This detection phase ensures that the scanner accurately identifies LLM interfaces and understands how to extract and analyze the actual AI-generated content from various response formats.

LLM Security vulnerabilities tested

Based on the scan profile configuration, Invicti tests for the following LLM-specific security issues:

1. Prompt injection

What it tests: Attempts to manipulate the LLM's behavior by injecting malicious instructions into user inputs.

Test methodology:

  • Direct prompt injection: Injecting commands directly into user input fields
  • Indirect prompt injection: Using data sources that the LLM might reference
  • Role manipulation: Attempts to change the AI's assumed role or permissions
  • Context manipulation: Exploiting conversation history to alter behavior

Example attack patterns:

Invicti tests prompt injection by sending specially crafted prompts that attempt to override the LLM's original instructions. The scanner uses verification techniques to confirm whether the injection was successful.

Typical patterns include:

  • Instructions to ignore previous directives
  • Commands requesting specific factual information to verify compliance
  • Role manipulation attempts
  • Context boundary violations

Detection method: The scanner analyzes responses to determine if the LLM followed the injected instructions rather than its original system directives or safety guidelines. Successful prompt injection is confirmed when the model demonstrates it has deviated from its intended behavior.

2. System prompt leakage

What it tests: Attempts to extract the system prompt or internal instructions that guide the LLM's behavior.

Test methodology:

  • Direct extraction attempts: Asking the model to reveal its instructions
  • Conversational manipulation: Using social engineering to extract system information
  • Role reversal techniques: Attempting to make the AI explain its own configuration

Example attack patterns:

Invicti attempts to extract system prompts through various techniques:

  • Direct requests for the model to reveal its initialization instructions
  • Social engineering approaches to manipulate the model into sharing configuration details
  • Techniques that exploit conversational context to expose hidden directives
  • Methods that attempt to bypass confidentiality restrictions

Detection method: The scanner uses sophisticated pattern matching and content analysis to identify when system prompts or internal instructions have been successfully extracted from the LLM's responses.

3. LLM command injection

What it tests: Attempts to execute system commands or access unauthorized functionality through LLM interfaces.

Test methodology:

  • System command injection: Attempting to execute shell commands
  • Function call manipulation: Exploiting tool/function calling capabilities
  • API access attempts: Trying to access backend APIs through the LLM
  • Code execution: Attempting to execute Python or other code

Example attack patterns:

Invicti tests for command injection through various approaches:

  • Attempts to execute system-level shell commands
  • Requests to run encoded or obfuscated commands
  • Instructions to execute code in various programming languages (Python, Bash, etc.)
  • Combined prompt injection with command execution requests

Detection method: The scanner uses out-of-band detection techniques, including Invicti OOB integration, to verify whether commands were actually executed on the backend system. This provides definitive proof of exploitability rather than relying solely on response analysis.

4. LLM-enabled Server-Side Request Forgery (SSRF)

What it tests: Exploits LLM's ability to make web requests to access internal resources.

Test methodology:

  • Internal network scanning: Attempting to access localhost and internal IPs
  • Cloud metadata access: Trying to access cloud service metadata endpoints
  • Port scanning: Using the LLM to probe internal network ports
  • Service discovery: Attempting to identify internal services

Example attack patterns:

Invicti tests for SSRF vulnerabilities by attempting to make the LLM access restricted resources:

  • Requests to fetch content from internal network addresses
  • Attempts to access cloud provider metadata endpoints
  • Instructions to probe internal services and ports
  • Requests to retrieve data from private network resources

Detection method: The scanner uses both out-of-band callback verification (via Invicti OOB) and content analysis to confirm successful SSRF attacks. This dual approach validates that the LLM actually made the request and accessed the target resource.

5. Insecure output handling

What it tests: Identifies vulnerabilities in how LLM outputs are processed and displayed.

Test methodology:

  • XSS through LLM output: Injecting scripts that get rendered in the browser
  • Template injection: Exploiting server-side template engines
  • Code injection: Attempting to inject executable code in LLM responses
  • HTML injection: Manipulating page structure through AI-generated content

Example attack patterns:

Invicti tests how the application handles potentially dangerous content in LLM responses:

  • Requests for the LLM to generate HTML with embedded scripts
  • Instructions to create responses containing template injection payloads
  • Attempts to inject executable code into LLM-generated output
  • Requests for malicious markup that could affect page structure

Detection method: The scanner analyzes how LLM-generated content is rendered in the application, checking whether dangerous output is properly sanitized, escaped, or filtered before being displayed to users.

6. Tool usage exposure

What it tests: Attempts to enumerate and misuse tools and functions that the LLM has access to.

Test methodology:

  • Tool enumeration: Discovering what tools and functions are available to the LLM
  • Parameter discovery: Identifying tool parameters and their expected values
  • Unauthorized tool access: Trying to use tools beyond intended scope
  • Tool parameter manipulation: Exploiting tool parameters for malicious purposes
  • Privilege escalation: Attempting to access higher-privilege tools

Example attack patterns:

Invicti attempts to discover and exploit LLM tool capabilities:

  • Requests for the LLM to list available tools and functions
  • Instructions to enumerate tool parameters and capabilities
  • Attempts to invoke tools with manipulated or unauthorized parameters

Detection method: The scanner analyzes LLM responses to identify when tools are exposed or can be manipulated, and validates whether tool usage restrictions are properly enforced.

7. LLM fingerprinting

What it tests: Identifies the specific LLM model and configuration being used.

Test methodology:

  • Model identification: Determining the specific AI model in use (e.g., OpenAI GPT, Anthropic Claude, Google Gemini, Meta Llama, Mistral, etc.)
  • Version detection: Identifying model version and capabilities
  • Configuration probing: Discovering model parameters and settings
  • Capability enumeration: Mapping available functions and tools
  • Response pattern analysis: Analyzing response characteristics to identify the underlying model

Example attack patterns:

Invicti queries the LLM to reveal its identity and capabilities:

  • Direct questions about the model's identity
  • Requests for version information
  • Capability probing questions

Detection method: The scanner analyzes responses for model-specific identifiers, response patterns, and behavioral characteristics. This information helps understand the attack surface and potential vulnerabilities specific to the identified model.

How tests are executed

1. Application discovery phase

The scanner first identifies LLM-powered components by:

  • Analyzing JavaScript: Looking for chatbot frameworks and AI integration code
  • Detecting API endpoints: Identifying endpoints that accept conversational input
  • Form analysis: Finding text areas and input fields connected to AI processing
  • Response pattern matching: Detecting AI-generated content patterns

2. Conversation initiation

Once LLM interfaces are identified, the scanner:

  • Establishes sessions: Creates proper conversation contexts
  • Tests basic functionality: Verifies the LLM is responsive
  • Maps conversation flow: Understanding multi-turn conversation capabilities
  • Identifies input validation: Testing what types of input are accepted

3. Vulnerability injection

For each identified LLM endpoint, the scanner:

  • Sends crafted prompts: Uses the vulnerability-specific payloads
  • Monitors responses: Analyzes AI-generated responses for signs of success
  • Tests multiple variations: Uses different phrasings and techniques
  • Maintains context: Preserves conversation history for complex attacks

4. Response analysis

The DeepScan engine analyzes responses using:

  • Pattern matching: Looking for specific indicators of successful injection
  • Behavioral analysis: Detecting unusual AI behavior patterns
  • Content inspection: Analyzing response content for security issues
  • Context validation: Ensuring responses indicate actual vulnerabilities

Need help?

Invicti Support team is ready to provide you with technical help. Go to Help Center

Was this page useful?