Hallucinative-Insecure GPT Code Generation

CodeThreat
Jun 12, 2023

When AI Code Assistants Start Seeing Unicorns in Your Code

Recently, we’ve come across several articles that discuss the potential risks and security challenges posed by AI code assistants. These articles have piqued our curiosity and led us to question: Are these concerns valid? How do these risks manifest in real-world scenarios? And most importantly, how can we as developers and security professionals mitigate these risks?

So, we’ve decided to put these theories to the test.

The Ghost in the Machine

In the realm of software development, the advent of AI code assistants like OpenAI’s Codex or Facebook’s InCoder has been nothing short of revolutionary. These tools, powered by Generative Pretrained Transformers, have the potential to streamline the coding process, automate mundane tasks, and even provide solutions to complex programming challenges. However, as with any powerful tool, they come with their own set of risks and challenges.

The Hallucinative Nature of GPTs

GPTs are trained on vast amounts of data, learning patterns and structures that allow them to generate human-like text. However, this learning process can sometimes lead to what is known as “hallucinative” or “insecure” code generation. This is when the AI produces code that appears to be correct and functional but is, in fact, based on patterns that the AI has “hallucinated” from its training data, rather than on sound programming principles.

The Security Implications

This hallucinative nature of GPTs can have serious security implications. A recent study [5] found that participants who used an AI assistant wrote significantly less secure code across multiple programming tasks. The assistant often outputs code that, while satisfying “correctness”, lacks the context needed to ensure security and, in some cases, unintentionally introduces security vulnerabilities.

Generating an Insecure Cipher

Consider a developer who is using an AI assistant to create a function for encrypting sensitive user data. The developer might ask the AI assistant for help with a prompt like:

“Write a function in Python to encrypt a string using a cipher.”
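The assistant might respond with something along these lines (a reconstruction for illustration; the exact output will vary from run to run):

    def encrypt(text, key):
        """Encrypt a string by shifting each letter by `key` positions (Caesar cipher)."""
        result = []
        for char in text:
            if char.isalpha():
                base = ord('A') if char.isupper() else ord('a')
                # Shift within the alphabet, wrapping around at 'z'/'Z'
                result.append(chr((ord(char) - base + key) % 26 + base))
            else:
                # Leave digits, punctuation, and whitespace untouched
                result.append(char)
        return ''.join(result)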

At first glance, this function appears to work correctly. It takes an input string and a key, and it returns an encrypted string. However, the encryption method used here is a simple Caesar cipher, which is a type of substitution cipher. This type of cipher is trivial to crack and is not suitable for encrypting sensitive data.

In this case, the AI assistant has generated code that is “correct” in the sense that it performs encryption as requested, but it lacks the necessary context to understand that a secure encryption method is needed. As a result, it has unintentionally introduced a security vulnerability.
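What would a safer answer have looked like? A minimal sketch, assuming the widely used cryptography package is available (Fernet provides authenticated symmetric encryption and handles the IV for you):

    from cryptography.fernet import Fernet

    # In practice the key would come from a secrets manager, not be generated inline
    key = Fernet.generate_key()

    def encrypt_sensitive(plaintext: str, key: bytes) -> bytes:
        # Fernet combines AES in CBC mode with an HMAC, so the ciphertext is also integrity-protected
        return Fernet(key).encrypt(plaintext.encode("utf-8"))

The point is not this particular library, but that encrypting sensitive data requires vetted primitives, proper key management, and integrity protection, none of which a Caesar cipher provides.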

Employing an Insecure Authentication

Here’s another example. Imagine a developer is working on a web application and asks the AI assistant to generate a function to authenticate users based on their username and password. The developer might ask:

“Write a function in Node.js to authenticate a user.”
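The assistant might produce something roughly like the following (a reconstruction for illustration; the db module and its findUserByUsername helper are hypothetical placeholders for whatever data-access layer the project uses):

    // Illustrative sketch only; `db` is a hypothetical data-access module
    const db = require('./db');

    async function authenticateUser(username, password) {
      const user = await db.findUserByUsername(username);
      if (!user) {
        return false;
      }
      // Plaintext comparison: only works if passwords are stored unhashed
      return user.password === password;
    }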

The generated function searches the database for a user with the provided username and, if it finds a match, checks whether the provided password matches the stored password. If both match, the function returns true, indicating that authentication was successful.

However, from a security perspective, this function is a disaster waiting to happen. It compares plaintext passwords, which implies that the system stores user passwords in plaintext in the database. This is a major security flaw. In a secure system, passwords should be hashed and salted before they are stored, and the comparison should be done on the hashed values, not on the plaintext passwords. Seeing such code generated without any warning can also quietly shape a developer’s sense of what acceptable practice looks like.
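For comparison, here is a hedged sketch of the safer pattern using the bcrypt package (the same hypothetical db module is assumed, and passwords are assumed to have been stored as bcrypt hashes at registration time):

    const bcrypt = require('bcrypt');
    const db = require('./db'); // hypothetical data-access module

    async function authenticateUser(username, password) {
      const user = await db.findUserByUsername(username);
      if (!user) {
        return false;
      }
      // bcrypt.compare re-hashes the candidate password using the salt embedded
      // in the stored hash and compares the results
      return bcrypt.compare(password, user.passwordHash);
    }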

The Threat of Malicious and Outdated Packages

Another risk associated with AI code generation is the potential for the inclusion of malicious or outdated packages in the generated code. AI assistants, in their quest to provide a solution, might suggest the use of a package that has known vulnerabilities or has been deprecated. This could expose the application to potential security risks and make it harder to maintain in the long run.

A well-known example of a supply chain attack is the event-stream incident in the Node.js ecosystem. In this case, a malicious actor gained access to the event-stream npm package and introduced a dependency on a new package, flatmap-stream, which contained malicious code. The malicious code was designed to steal cryptocurrency from specific applications that used the event-stream package.

The developer might ask the AI assistant:

“Write a function in Node.js to read data from a stream.”
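The generated snippet might look roughly like this (a reconstruction; it assumes the event-stream package has been installed, for example with npm install event-stream):

    const es = require('event-stream');
    const fs = require('fs');

    // Read a file as a stream and invoke a callback for every line
    function readLines(filePath, onLine) {
      fs.createReadStream(filePath)
        .pipe(es.split())          // split the incoming stream into lines
        .pipe(es.mapSync(onLine)); // call onLine for each line synchronously
    }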

This code is correct and will work as expected. However, if the version of event-stream used is 3.3.6 (the compromised version), the application is now exposed to the malicious code in the flatmap-stream package. In this case, the AI assistant, by suggesting a package without regard to its security history, has unintentionally introduced malicious code into the application.

This example underscores the importance of using up-to-date and secure packages, and the potential risks of supply chain attacks in the software development process.

The Impact on Developers and Security Professionals

These risks and challenges have significant implications for both developers and security professionals.

For developers, it means that they cannot blindly trust the code generated by AI assistants. They need to thoroughly review and test the code, understand its implications, and ensure that it adheres to best practices for secure coding.

For security professionals, the rise of AI code generation presents a new frontier of threats to navigate. They need to be aware of the potential vulnerabilities introduced by AI-generated code and develop strategies and tools to detect and mitigate these risks.

Conclusion

In conclusion, while AI code assistants like GPTs hold great promise for the future of software development, they also present new challenges and risks. As developers and security professionals, we need to approach these tools with a healthy dose of skepticism and caution, always keeping in mind that the ghost in the machine, while incredibly powerful, is not infallible. We must continue to rely on our expertise, critical thinking, and vigilance to ensure the security and integrity of our code.

References

  1. OpenAI. (2021). Codex. https://openai.com/research/codex/
  2. Facebook AI. (2021). InCoder. https://ai.facebook.com/tools/incoder/
  3. Ohm, M., Plate, H., Sykosch, A., & Tam, K. C. (2020). Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks. https://doi.org/10.1007/978-3-030-52683-2_2
  4. Vasilakis, N., Karel, B., Roessler, N., Dautenhahn, N., DeHon, A., & Smith, J. D. (2017). Towards Fine-grained, Automated Application Compartmentalization. https://doi.org/10.1145/3144555.3144563
  5. Perry, N., Srivastava, M., Kumar, D., & Boneh, D. (2023). Do Users Write More Insecure Code with AI Assistants? https://arxiv.org/pdf/2211.03622.pdf


CodeThreat

CodeThreat is a static application security testing (SAST) solution. Visit codethreat.com