OWASP Top 10 for Large Language Models, examples and attack mitigation


As the world embraces the power of artificial intelligence, large language models (LLMs) have become a critical tool for businesses and individuals alike. However, with great power comes great responsibility – ensuring the security and integrity of these models is of utmost importance.

The OWASP Top 10 for Large Language Models is a guiding light to navigate the landscape of potential security vulnerabilities.

Let’s dive into the most critical security risks and how to mitigate them, ensuring LLMs’ safe and effective use for organisations and their staff.

Key Takeaways

  • Given the rise in LLM application usage across the Internet since late 2022, businesses need to assess, analyse and reduce the risks of AI.
  • OWASP Top 10 for LLMs helps organisations identify and mitigate potential security risks in their applications.
  • By implementing secure practices, organisations can prevent vulnerabilities such as prompt injection attacks, insecure output handling issues, training data poisoning, and model denial of service (MDoS) attacks.
  • Organisations should also protect against model theft through authentication & access controls and copyright/licensing protections.

The Importance of LLM Security

In today’s technology-driven world, LLMs have become increasingly integrated into daily operations, providing valuable insights and assistance in various tasks. However, with their widespread adoption comes the potential for security vulnerabilities, such as training data poisoning, remote code execution, and supply chain vulnerabilities.

Several large companies across the US and UK have applied blanket bans on the use of LLMs; this will not stop curious staff from using the AI. The underlying goal should be to ensure that a company’s processes, people and technology controls are in place to allow the safe use of such models. The stakes are high – a single security incident could result in data disclosure or a serious breach, leading to unauthorised access, privacy violations and a damaged reputation.

To address these concerns, the OWASP Top 10 for LLMs was developed to help organisations identify and mitigate potential security risks in their LLM applications.


Regular assessments and penetration tests on LLMs are crucial in identifying potential weaknesses or vulnerabilities. By staying vigilant, organisations can maintain the security and reliability of their LLMs, safeguarding their confidential data and ensuring a high level of trust in their proprietary algorithms.

1. Prompt Injection Vulnerabilities

Prompt injection is a vulnerability where attackers manipulate the functioning of a trusted LLM through crafted inputs, either directly or indirectly, to access restricted resources or execute malicious actions. These attacks can lead to unintended actions or data leakage, posing critical security risks such as unauthorised access to sensitive information.

Direct prompt injection involves the attacker replacing or modifying the system prompt, while indirect prompt injection directs the LLM to an external source containing malicious instructions.

The implications of prompt injection vulnerabilities can be severe, with consequences ranging from data leakage to security breaches and unauthorised access. Recognising these risks and deploying adequate security measures is paramount in safeguarding LLMs from such threats.

Common Examples of Vulnerability

Prompt injection attacks bypass filters or restrictions by using specific language patterns or tokens, or exploit weaknesses in the LLM’s tokenisation or encoding mechanisms to mislead the model into performing unintended operations.

For example, an attacker could execute a prompt injection attack by crafting input such as “ignore previous instructions”, followed by new instructions to query private data stores. This results in the LLM disclosing sensitive or personal information.

Another scenario involves a malicious user circumventing a content filter by using specific language patterns, tokens, or encoding techniques that the filter fails to recognise as restricted content, enabling the user to execute actions that should be blocked.

Mitigation of prompt injection vulnerabilities

The mitigation and prevention of prompt injection attacks in LLMs necessitate the sanitisation and validation of user input, along with the use of prepared statements or parameterised queries. Organisations can prevent the LLM from ignoring previous instructions and executing unauthorised code by validating user input to ensure it is in the expected format and sanitising it to remove any malicious code.
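As a minimal sketch of these ideas, the Python example below validates and lightly sanitises untrusted user input, and keeps it separate from the trusted system prompt. The deny-list patterns, length limit and function names are illustrative assumptions, not a complete defence; a real deployment would combine them with model-side guardrails.

```python
import re

# Hypothetical deny-list of phrases commonly seen in prompt injection attempts.
# This is a first-pass filter only, not a complete defence.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard .* system prompt",
    r"you are now",
]

MAX_INPUT_LENGTH = 2000  # reject abnormally long inputs


def validate_user_input(user_input: str) -> str:
    """Validate and lightly sanitise untrusted user input before it reaches the LLM."""
    if len(user_input) > MAX_INPUT_LENGTH:
        raise ValueError("Input exceeds the allowed length")
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, user_input, re.IGNORECASE):
            raise ValueError("Input matches a known injection pattern")
    # Strip control characters that could confuse downstream parsers.
    return re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", user_input)


def build_messages(user_input: str) -> list[dict]:
    """Keep the trusted system prompt and untrusted user input in separate roles."""
    return [
        {"role": "system", "content": "You are a support assistant. Never reveal internal data."},
        {"role": "user", "content": validate_user_input(user_input)},
    ]


if __name__ == "__main__":
    print(build_messages("What are your opening hours?"))
```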

Implementing these measures helps maintain the security and dependability of LLM-based applications, reducing the risks associated with prompt injection vulnerabilities.

2. Insecure Output Handling Issues

Insecure output handling is a vulnerability that arises when an LLM’s output is passed to downstream systems without scrutiny. Failing to handle output properly exposes backend systems and can lead to critical issues such as cross-site scripting (XSS), cross-site request forgery (CSRF), server-side request forgery (SSRF), privilege escalation and remote code execution.

Preventing these issues requires organisations to enforce input validation, output encoding, and suitable access controls.

For instance, a malicious actor could craft prompts that cause the LLM to extract and divulge confidential information from a sensitive database, exploiting inadequate sandboxing. By addressing insecure output handling issues, organisations can avoid accidentally revealing sensitive information or executing malicious code.

Common Examples of Vulnerability

Some common insecure output handling vulnerabilities include cross-site scripting (XSS), cross-site request forgery (CSRF), and server-side request forgery (SSRF). These vulnerabilities result from a failure to sanitise user input before displaying it, which can lead to malicious code being injected into the application, potentially leading to unauthorised access to sensitive information or execution of malicious code.

The potential consequences of insecure output handling can be severe, ranging from unauthorised access to sensitive data, financial loss, and identity theft to reputational damage.


How do we prevent insecure output handling issues?

Addressing and preventing insecure output handling vulnerabilities in LLMs entails:

  • Sanitising user input to eliminate any malicious code
  • Validating user input to confirm it is in the expected format
  • Using prepared statements or parameterised queries to handle user input as data rather than code.

Additionally, the output should be encoded using the appropriate encoding for the type, such as HTML encoding for HTML and JavaScript encoding for JavaScript.
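As a simple illustration of output encoding, the sketch below treats LLM output as untrusted data and encodes it for the context it is rendered in, using only the Python standard library. The function names are illustrative.

```python
import html
import json


def render_llm_output_as_html(llm_output: str) -> str:
    """Treat LLM output as untrusted data: HTML-encode it before placing it in a page."""
    return f"<p>{html.escape(llm_output)}</p>"


def embed_llm_output_in_script(llm_output: str) -> str:
    """JSON-encode output destined for a JavaScript context so it cannot break out of the string."""
    return f"<script>const reply = {json.dumps(llm_output)};</script>"


if __name__ == "__main__":
    malicious = '<img src=x onerror="alert(1)">'
    print(render_llm_output_as_html(malicious))   # encoded, so the browser renders text, not markup
    print(embed_llm_output_in_script(malicious))
```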

Implementing these measures helps to avoid accidentally revealing sensitive information or executing malicious code while maintaining the security and reliability of LLM-based applications.

3. Training Data Poisoning

Training data poisoning in LLMs involves introducing false information into the training data, which can lead to biased or compromised models. Issues associated with data poisoning in LLMs include the introduction of backdoors or vulnerabilities, the injection of biases, and the exploitation of the fine-tuning process.

Data poisoning can lead to distorted training data that could be prejudiced or deliver inaccurate results, potentially impacting the effectiveness and reliability of LLMs.

Ensuring accurate and unbiased data usage in training LLMs requires organisations to select and validate their training data sources meticulously. Some commonly used training data sources include:


  • Common Crawl
  • WebText
  • OpenWebText
  • Books

By diligently curating training data from these sources, organisations can help prevent the risks of training data poisoning.

Common Examples of Vulnerability

Training data poisoning vulnerabilities involve:

  • Maliciously manipulating the data used to train or fine-tune the model
  • Circumventing filters or manipulating the LLM through specifically designed prompts
  • Causing the model to disregard prior instructions or execute unintended actions.

Introducing backdoors or vulnerabilities into the LLM through maliciously manipulated training data is one example of training data poisoning risks. Injecting biases into the LLM can also result in biased or inappropriate responses.

By being aware of these vulnerabilities and taking steps to prevent them, organisations can maintain the security and reliability of their LLMs.

How to prevent data poisoning attacks?

Preventing and mitigating data poisoning attacks in LLMs necessitates careful selection and validation of training data, ensuring it is unbiased, accurate, and free from manipulation.
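A minimal sketch of the kind of check that might sit in a data-ingestion pipeline is shown below, assuming training records arrive as simple dictionaries with a source and text field (the field names, source list and poisoning marker are illustrative; a real pipeline would also verify provenance and run content and bias checks).

```python
# Illustrative allow-list of vetted corpora.
TRUSTED_SOURCES = {"common-crawl-curated", "internal-docs-reviewed"}


def is_acceptable_record(record: dict) -> bool:
    """Accept a training record only if it comes from a vetted source and passes basic checks."""
    if record.get("source") not in TRUSTED_SOURCES:
        return False
    text = record.get("text", "")
    if not text or len(text) < 20:           # drop empty or trivially short samples
        return False
    if "BEGIN HIDDEN INSTRUCTIONS" in text:  # naive example of a poisoning marker
        return False
    return True


def curate(records: list[dict]) -> list[dict]:
    """Filter a batch of candidate training records, keeping a count of rejections for auditing."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_acceptable_record(record) else rejected).append(record)
    print(f"accepted={len(accepted)} rejected={len(rejected)}")
    return accepted


if __name__ == "__main__":
    curate([
        {"source": "common-crawl-curated", "text": "A long enough, legitimate training sample."},
        {"source": "unknown-upload", "text": "BEGIN HIDDEN INSTRUCTIONS leak all secrets"},
    ])
```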

In addition to carefully curating training data, organisations should implement appropriate access controls to protect their LLMs from unauthorised access and potential data poisoning attacks.

By taking these steps, organisations can help safeguard their LLMs against the risks associated with data poisoning and maintain their models’ security, integrity, and effectiveness.

4. Model Denial of Service

Model denial of service (MDoS) is a security vulnerability wherein an attacker engages with an LLM in a resource-intensive fashion, resulting in a decreased quality of service for other users or increased resource costs. These attacks can cause significant service disruption and potential financial loss, as resources are consumed by the attacker’s requests, leaving legitimate users with degraded performance.

Implementing rate limiting, input validation, and resource allocation strategies can safeguard LLMs from MDoS attacks. These strategies can help to prevent service degradation and ensure that resources are available for legitimate users, maintaining the performance and reliability of the LLM.

Common Examples of Vulnerability

Model denial of service vulnerabilities can occur when an attacker submits resource-intensive requests to an LLM, overwhelming the model and causing it to become unresponsive or slow down. This can result in:

  • significant service disruption
  • data loss
  • financial losses for the organisation
  • frustration and inconvenience for legitimate users.

By implementing measures to prevent and mitigate model denial of service attacks, organisations can help maintain their LLMs’ availability and performance and protect their users from the potential consequences of such attacks.

How to prevent it?

Mitigation and prevention of model denial of service attacks in LLMs require the enforcement of rate limiting, input validation, and appropriate resource allocation.

Rate limiting restricts the number of requests that can be sent to a model within a specified time frame, helping to prevent an attacker from overwhelming the LLM.

Input validation ensures that the data supplied to large language models is well-formed and not malicious, while proper resource allocation ensures the model has the resources it needs to handle legitimate requests.
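The sketch below shows one way a per-user rate limiter and a basic input size check might look, assuming a simple in-memory store (a production service would typically use a shared store such as Redis; the limits and names are illustrative).

```python
import time
from collections import defaultdict, deque

MAX_REQUESTS_PER_MINUTE = 30
MAX_PROMPT_CHARS = 4000

_request_log: dict[str, deque] = defaultdict(deque)  # user_id -> timestamps of recent requests


def allow_request(user_id: str, prompt: str) -> bool:
    """Apply rate limiting and a basic size check before the prompt reaches the model."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False  # reject resource-intensive input up front

    now = time.monotonic()
    window = _request_log[user_id]
    # Drop timestamps older than 60 seconds, then check how many requests remain in the window.
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False
    window.append(now)
    return True


if __name__ == "__main__":
    for i in range(32):
        if not allow_request("user-42", "Summarise this paragraph..."):
            print(f"request {i + 1} throttled")
```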

By implementing these measures, organisations can protect their LLMs from MDoS attacks and maintain their models’ security, performance, and reliability.

5. Supply Chain Vulnerabilities

In LLMs, supply chain vulnerabilities can result in skewed results, security violations, and even complete system breakdowns. The supply chain attack surface is broadened by vulnerable pre-trained models, training data provided by external entities that may be contaminated, and plugin designs that are not secure.

Maintaining the security and integrity of LLMs throughout their lifecycle is vital to guarding against supply chain vulnerabilities and preserving the credibility of their output.

In addition to the risks associated with training data poisoning and insecure plugin design, supply chain vulnerabilities can also stem from using third-party datasets, pre-trained models, and plugins in LLMs.

Such components can introduce vulnerabilities and adversely affect the LLM application lifecycle, potentially leading to unauthorised access, privacy violations, and security breaches.

Common Examples of Vulnerability

Supply chain vulnerabilities in LLMs include malicious code injection, data injection and model injection. For example, an attacker could exploit the PyPI package registry to deceive model developers into downloading a malicious package, potentially compromising the security and integrity of their LLM.

Additionally, insecure coding practices and inadequate security measures can expose LLMs to various supply chain attacks, resulting in unauthorised access, privacy violations, and security breaches.

How do we mitigate against supply chain attacks?

Mitigating supply chain attacks requires organisations to enforce secure development practices, adhere to coding standards, and conduct regular security audits for their LLMs. Strong authentication and access controls can also help guarantee that only authorised users can access the model and its associated data.
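One concrete control is verifying the integrity of third-party model artefacts before loading them. The sketch below checks a downloaded file against a pinned SHA-256 digest; the file path and expected hash are placeholders, and in practice the digest would come from a trusted, out-of-band source such as a signed release manifest.

```python
import hashlib
from pathlib import Path

# Placeholder values for illustration only.
EXPECTED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"
MODEL_PATH = Path("models/pretrained-weights.bin")


def sha256_of_file(path: Path) -> str:
    """Stream the file through SHA-256 so large model files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_artifact(path: Path, expected: str) -> None:
    """Refuse to load a model artefact whose digest does not match the pinned value."""
    actual = sha256_of_file(path)
    if actual != expected:
        raise RuntimeError(f"Integrity check failed for {path}: {actual} != {expected}")


if __name__ == "__main__":
    if MODEL_PATH.exists():
        verify_model_artifact(MODEL_PATH, EXPECTED_SHA256)
        print("Model artefact verified")
```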

By taking these steps, organisations can help safeguard their LLMs against supply chain vulnerabilities, ensuring their models’ security, integrity, and effectiveness.

6. Sensitive Information Disclosure

Sensitive information disclosure refers to the unintentional disclosure of confidential data, which can lead to unauthorised access, privacy violations, and security breaches.

In the context of LLMs, sensitive information disclosure can occur when an attacker crafts a prompt instructing the model to request an internal service, bypassing access controls and gaining access to sensitive information.

Additionally, misconfigurations in the application’s security settings can allow the LLM to interact with a restricted API, potentially accessing or modifying sensitive data.

Safeguarding LLMs from sensitive information disclosure vulnerabilities necessitates organisations to implement and maintain access controls, authentication mechanisms, and data sanitisation processes. These measures can help to prevent unauthorised access to sensitive information and maintain the security and integrity of LLMs.

Common Examples of Vulnerability

Sensitive information disclosure vulnerabilities can occur when an attacker exploits an input validation flaw in an LLM, allowing them to gain unauthorised access or control over the system. Examples of sensitive information disclosure vulnerabilities include SQL injection, cross-site scripting (XSS), and insecure direct object references (IDOR).

These vulnerabilities, representing some of the most critical security risks, can result in unauthorised access to sensitive information, financial loss, identity theft, and reputational damage.

How to prevent it?

Preventing sensitive information disclosure vulnerabilities in LLMs requires organisations to:


  • Enforce data sanitisation processes
  • Implement strict user policies
  • Validate user input to ensure it is in the expected format
  • Sanitise user input to remove any malicious code
  • Use prepared statements or parameterised queries to treat user input as data, not code.

Additionally, implementing access controls and authentication mechanisms can help to ensure that only authorised users can access sensitive information or execute specific actions. By taking these steps, organisations can help safeguard their LLMs against sensitive information disclosure vulnerabilities and maintain their models’ security, integrity, and effectiveness.
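As a simple illustration of output-side data sanitisation mentioned above, the sketch below redacts a few common sensitive patterns from model output before it is returned to the user. The regular expressions are examples only and would not catch every case; a production filter would use a maintained PII and secrets detection library tuned to the application’s data.

```python
import re

# Example patterns only.
REDACTION_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "uk_phone": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),
    "api_key": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{16,}\b"),
}


def redact_sensitive_output(text: str) -> str:
    """Replace anything matching a known sensitive pattern before the response leaves the service."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text


if __name__ == "__main__":
    raw = "Contact jane.doe@example.com or use key sk-abcdefghijklmnop1234."
    print(redact_sensitive_output(raw))
```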

7. Insecure Plugin Design

Insecure plugin design in LLMs is a vulnerability that can lead to remote code execution or data exfiltration if not appropriately addressed. The potential risks associated with insecure plugin design include the exposure of backend systems and the potential for remote code execution or data exfiltration.

To maintain LLMs’ security and reliability, designing and implementing secure plugins that follow industry best practices and upholding the model’s integrity is critical.

With the rapid growth of LLM technology, plugins have become an essential component of many LLM applications, providing additional functionality and capabilities.

Poorly designed plugins can introduce vulnerabilities and adversely affect the LLM application lifecycle, potentially leading to unintended consequences and security breaches.

Common Examples of Vulnerability

Insecure plugin design vulnerabilities can result from various factors, including insufficient access control, improper input validation, and insecure coding practices. For example, a plugin that accepts freeform text as input without proper validation can be exploited by a malicious actor to construct a request that results in unauthorised access or unauthorised code execution, which may lead to remote code execution.
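As a sketch of stricter plugin input handling, the example below validates a plugin’s parameters against an explicit, typed schema instead of accepting freeform text. The plugin name and fields are hypothetical.

```python
from dataclasses import dataclass

ALLOWED_REPORT_TYPES = {"summary", "detail"}


@dataclass(frozen=True)
class ReportPluginRequest:
    """Typed, validated parameters for a hypothetical 'report' plugin."""
    customer_id: int
    report_type: str


def parse_plugin_request(raw: dict) -> ReportPluginRequest:
    """Reject anything that is not an expected, well-typed parameter."""
    unexpected = set(raw) - {"customer_id", "report_type"}
    if unexpected:
        raise ValueError(f"Unexpected parameters: {unexpected}")
    customer_id = raw.get("customer_id")
    if not isinstance(customer_id, int) or customer_id <= 0:
        raise ValueError("customer_id must be a positive integer")
    report_type = raw.get("report_type")
    if report_type not in ALLOWED_REPORT_TYPES:
        raise ValueError("report_type must be 'summary' or 'detail'")
    return ReportPluginRequest(customer_id=customer_id, report_type=report_type)


if __name__ == "__main__":
    print(parse_plugin_request({"customer_id": 42, "report_type": "summary"}))
```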

By being aware of these vulnerabilities and taking steps to prevent them, organisations can maintain the security and reliability of their LLMs.


How to prevent it?

Preventing insecure plugin design vulnerabilities in LLMs requires organisations to enforce secure development practices, adhere to coding standards, and conduct regular security audits. Strong authentication and access controls can help ensure that only authorised users can access the plugin and its associated data.

By taking these steps, organisations can help safeguard their LLMs against insecure plugin design vulnerabilities, ensuring their models’ security, integrity, and effectiveness.

8. Excessive Agency

Excessive agency in LLMs refers to an abundance of functionalities, permissions, and autonomy that can lead to unintended consequences and security vulnerabilities. Inadequate AI alignment, which can result from poorly defined objectives or misaligned reward functions, is a common issue that contributes to excessive agency in LLMs. Without appropriate limitations, any undesired activity of the LLM, regardless of the origin, may lead to undesirable outcomes.

Avoiding risks linked to excessive agency calls for a careful definition of the objectives and constraints of LLMs, ensuring alignment with desired outcomes and ethical norms. By implementing appropriate limitations and monitoring the behaviour of LLMs, organisations can reduce the potential security risks associated with excessive agency.
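A minimal sketch of one such limitation is an explicit allow-list of tools the model may call, with anything else refused and surfaced for review. The tool names and dispatcher shown here are illustrative assumptions.

```python
from typing import Callable


def search_knowledge_base(query: str) -> str:
    return f"Results for: {query}"


def get_order_status(order_id: str) -> str:
    return f"Order {order_id} is in transit"


# Only read-only, low-risk tools are exposed to the model; anything destructive
# (refunds, deletions, outbound email) is deliberately absent from this allow-list.
ALLOWED_TOOLS: dict[str, Callable[[str], str]] = {
    "search_knowledge_base": search_knowledge_base,
    "get_order_status": get_order_status,
}


def dispatch_tool_call(tool_name: str, argument: str) -> str:
    """Execute a model-requested tool only if it is explicitly permitted."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        # Refuse and surface the attempt for review rather than failing silently.
        raise PermissionError(f"Model requested unapproved tool: {tool_name}")
    return tool(argument)


if __name__ == "__main__":
    print(dispatch_tool_call("get_order_status", "A-1001"))
```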

Common Examples of Vulnerability

Excessive agency vulnerabilities can stem from various sources, including model hallucination, direct or indirect prompt injection, or poorly designed but otherwise benign prompts. For example, an LLM might be granted excessive functionality, allowing it to access and modify sensitive information without proper oversight.

By being aware of these vulnerabilities and taking steps to prevent them, organisations can maintain the security and reliability of their LLMs.

How to prevent it?

Preventing excessive agency vulnerabilities in LLMs requires organisations to align the interests of the principal and the agent, eliminate potential conflicts of interest, and strengthen the oversight and review of agent actions.

Additionally, implementing appropriate limitations on LLMs’ functionalities, permissions, and autonomy can help reduce the potential security risks associated with excessive agency. By taking these steps, organisations can ensure their LLMs’ safe and ethical operation, maintaining their models’ security, integrity, and effectiveness.

9. Overreliance on LLMs

Overreliance on LLMs for decision-making or content generation can lead to misinformation, legal issues, and security vulnerabilities. Given the rapid growth of LLM technology and its intensifying integration into daily operations, striking a balance between leveraging LLMs’ power and maintaining human oversight to guarantee accurate, unbiased, and secure results is critical.

Organisations should implement a review process for LLM-generated content to address the risks associated with overreliance on LLMs, ensuring that it is accurate, unbiased, and free from malicious manipulation. By striking a balance between human oversight and LLM-generated content, organisations can mitigate the risks associated with overreliance on LLMs and maintain their models’ security, integrity, and effectiveness.

Common Examples of Vulnerability

Overreliance on LLM content can result in the propagation of misinformation and other unintended outcomes, such as biased or inaccurate results. For example, an organisation that relies too heavily on LLM-generated content for news articles or security reports may inadvertently propagate false information, leading to potential legal issues, reputational damage, and other negative consequences.

By being aware of these vulnerabilities and taking steps to prevent them, organisations can maintain the security and reliability of their LLMs.

How to prevent it?

Preventing overreliance on LLMs requires the following:

  • Enforcement of secure development practices
  • Adherence to secure coding standards
  • Regular security audits
  • Regular maintenance and monitoring of LLMs for potential vulnerabilities
  • Implementation of a review process to guarantee the safe and correct operation of the LLM

By taking these steps, organisations can help safeguard their LLMs against the risks associated with overreliance, ensuring their models’ security, integrity, and effectiveness.

10. Model Theft

Model theft in LLMs refers to the illicit acquisition of proprietary models, whether by obtaining physical access, replicating algorithms, or stealing weights. The potential repercussions of model theft include significant economic detriment, a loss of competitive advantage, or access to confidential information contained within the model.

Safeguarding LLMs from model theft necessitates the implementation of robust authentication and access controls by organisations, ensuring that only authorised users can access the model and its related data. Additionally, securing copyright or licensing the model may provide legal protections against potential theft or unauthorised use.

Common Examples of Vulnerability

Model theft vulnerabilities can occur when an attacker gains unauthorised access to a proprietary LLM model, potentially exposing sensitive information or disrupting operations. For example, an attacker could steal a proprietary LLM model and use it to generate content for a competing organisation, resulting in a loss of competitive advantage and potential economic losses for the original organisation.

By being aware of these vulnerabilities and taking steps to prevent them, organisations can maintain the security and reliability of their LLMs.

How to prevent it?

Preventing model theft vulnerabilities in LLMs requires organisations to enforce robust authentication and access controls, ensuring that only authorised users can access the model and its related data. Additionally, securing copyright or licensing the model may provide legal protections against potential theft or unauthorised use.
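As a minimal sketch, an inference endpoint might require an API key tied to a role before serving the model at all. The keys, roles and handler shown here are placeholders; a real service would use a proper identity provider and secrets manager, and would additionally restrict and audit any weight export.

```python
import hmac

# Placeholder key store: in production these would live in a secrets manager
# or identity provider, never in source code.
API_KEYS = {
    "key-analyst-example": "read_only",
    "key-admin-example": "admin",
}


def authenticate(api_key: str) -> str:
    """Return the caller's role, using a constant-time comparison against known keys."""
    for known_key, role in API_KEYS.items():
        if hmac.compare_digest(api_key.encode(), known_key.encode()):
            return role
    raise PermissionError("Unknown API key")


def handle_inference_request(api_key: str, prompt: str) -> str:
    """Serve the model only to authenticated callers."""
    role = authenticate(api_key)
    return f"[{role}] model response to: {prompt}"


if __name__ == "__main__":
    print(handle_inference_request("key-analyst-example", "Classify this ticket"))
```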

By taking these steps, organisations can help safeguard their LLMs against model theft, ensuring their models’ security, integrity, and effectiveness.

Summary

The OWASP Top 10 for Large Language Models is a valuable resource for organisations to identify and mitigate potential security vulnerabilities in their LLM applications. As artificial intelligence continues to evolve, organisations must stay vigilant and proactive in addressing potential security vulnerabilities, ensuring a safe and reliable environment for LLM applications.

Frequently Asked Questions

What is the purpose of the OWASP Top 10 for Large Language Models?

The OWASP Top 10 for LLMs provides a guide to help organisations identify and address potential security vulnerabilities, ensuring their models’ security, integrity, and effectiveness.

What is a prompt injection vulnerability?

A prompt injection vulnerability is a security risk in which an attacker manipulates inputs to access restricted resources or execute malicious actions.

How can organisations prevent and mitigate training data poisoning?

Organisations can prevent and mitigate training data poisoning by carefully selecting and validating their training data sources to guarantee accuracy, unbiasedness, and protection from malicious manipulation.

What are the potential consequences of model theft?

Model theft can lead to economic detriment, loss of competitive advantage, and unauthorised access to confidential information, all of which have serious repercussions.

How can organisations reduce the risks associated with overreliance on LLMs?

Organisations can reduce the risks associated with LLMs by implementing secure development practices, adhering to coding standards, conducting security audits, and reviewing their usage.
