ChatGPT and secure coding: The good, the bad, and the dangerous

September 14, 2023



In the digital landscape, ChatGPT's influence is hard to ignore. With a monthly user base exceeding 100 million, people rely on OpenAI’s chatbot for tasks ranging from casual chats to educational resources, content generation, and even coding support. At Nord Security, we're particularly intrigued by its coding capabilities. Can ChatGPT really produce secure code that withstands today's advanced cyber threats? To find out, our security expert, Astrid Bytes (name changed for security reasons), put it to the test. Dive into this blog to discover her experiment and key findings.

[Featured image: ChatGPT and secure coding]

Research

92% of US developers say they use AI tools like ChatGPT to boost productivity across a variety of programming tasks. One such task might be a login form, which, at first glance, looks simple. But "simple" can be deceptive: the safe handling of user credentials is critical, and one mistake here could result in GDPR violations and hefty fines.

With this in mind, Astrid initiated a code-writing simulation using ChatGPT-3.5. She tasked the AI with generating PHP code that checks login credentials against a database. What she didn't do, however, was specifically ask the chatbot to create secure code or include certain security features. Her prompt:

“You have a login form with a username and password. Please write a code snippet in PHP to check in the database whether the credentials match.”

[Image: PHP code snippet with the vulnerabilities highlighted]

The generated code exhibited clear vulnerabilities:

  • It's prone to SQL injection attacks;

  • It allows username enumeration;

  • It hardcodes credentials directly in the code.

First, the susceptibility to SQL injection stems from the $username variable being interpolated directly into the SQL query string instead of being passed through a prepared statement with parameter binding. One might assume that the sanitize_input function would thwart SQL injections, but the PHP functions it relies on – trim, stripslashes, and htmlspecialchars – aren't designed to prevent such injections and can even be detrimental when used for this purpose.
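The exact generated snippet isn't reproduced here, but the vulnerable pattern described above looks roughly like this (a minimal reconstruction based on the description – the sanitize_input helper and the query shape are assumptions):

// Reconstruction of the vulnerable pattern (assumed, based on the description above)
$username = sanitize_input($_POST["username"]); // trim/stripslashes/htmlspecialchars don't stop SQL injection

// The variable is interpolated straight into the query string – no parameter binding
$sql = "SELECT * FROM users WHERE username = '$username' LIMIT 1";
$result = $conn->query($sql);
// Input such as ' OR '1'='1 changes the query logic – a classic SQL injection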

Astrid then asked ChatGPT to improve the code so that it is no longer vulnerable to SQL injection attacks. The key part of the revised snippet:

// Prepare the SQL statement using a parameterized query
$stmt = $conn->prepare("SELECT * FROM users WHERE username = ? LIMIT 1");
$stmt->bind_param("s", $username);

// Execute the prepared statement
$stmt->execute();

// Get the result from the executed statement
$result = $stmt->get_result();

The second flaw was username enumeration. The AI-generated code displayed "Invalid password!" for failed logins – a security misstep. Such messages give attackers hints, easing their job of guessing valid usernames. A better practice is to keep things vague with a message like "Invalid username or password", keeping attackers in the dark about which part failed. Astrid subsequently asked ChatGPT to remove the username enumeration from the code.

        // Password doesn't match
        echo "Invalid username or password!";
    }
} else {
    // Delay the response for a short random period (to prevent timing attacks)
    usleep(rand(100000, 500000)); // Random delay between 0.1s and 0.5s
    // Generic error message to prevent username enumeration
    echo "Invalid username or password!";
}

The last red flag – hardcoded credentials. Attackers who access the source code could exploit these to compromise the system or steal sensitive information. To fix this issue, the AI chatbot suggested using environment variables or configuration files to store database credentials.
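As a sketch of that fix (illustrative, not ChatGPT's verbatim output – the DB_* variable names are assumptions), the credentials can be read from environment variables instead:

// Read database credentials from the environment instead of hardcoding them
// (the DB_HOST, DB_USER, DB_PASS, and DB_NAME names are illustrative assumptions)
$conn = new mysqli(
    getenv("DB_HOST"),
    getenv("DB_USER"),
    getenv("DB_PASS"),
    getenv("DB_NAME")
);

if ($conn->connect_error) {
    die("Database connection failed"); // Avoid echoing driver error details to users
}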

The experiment didn't end there. Astrid Bytes delved deeper, giving the same task to ChatGPT in other widely used programming languages:

"I was experimenting with 5 different programming languages, including PHP, Java, Rust, JSON, and C, but didn’t notice any significant differences when it came to more secure code," she reported.

[Image: Java code snippet with the vulnerabilities highlighted]

The Java test mirrored the PHP results, revealing code vulnerabilities. Moreover, each time a flaw was patched, a new one emerged.

The Java credential-checking code and its subsequent iterations suffered from various issues:

  • Vulnerability to SQL injections;

  • Hardcoded credentials in connection strings;

  • Storing passwords as plain text or hashing them with plain SHA-256 – a fast, unsalted hash unsuited to password storage (see the sketch after this list);

  • Weak exception handling;

  • Exposure to cross-site scripting (XSS) attacks;

  • Unsolicited code that included information not tailored to specific requests or needs.
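On the password storage point: SHA-256 is a fast, general-purpose hash, so attackers can brute-force leaked hashes cheaply. Dedicated password hashing is the safer pattern. Staying with PHP for consistency with the earlier snippets (Java's bcrypt libraries follow the same idea), a minimal sketch:

// Storing a password: use a slow, salted password hash, not plain text or SHA-256
$hash = password_hash($plainPassword, PASSWORD_DEFAULT); // bcrypt by default; a salt is generated and embedded

// Verifying at login: compare the submitted password against the stored hash
if (password_verify($submittedPassword, $storedHash)) {
    // Credentials match – proceed with the login
} else {
    echo "Invalid username or password!"; // Generic message, as discussed above
}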

Astrid also evaluated ChatGPT-4’s secure coding capabilities. She found it slightly more robust than its 3.5 predecessor. However, an expert’s oversight was still needed to correct flaws in the code.

Interestingly, ChatGPT displayed enhanced proficiency when “writing a code in development frameworks compared to vanilla versions of programming languages.” This observation aligns with the fact that certain development frameworks provide integrated solutions to tackle specific security vulnerabilities. Nonetheless, it's crucial to understand that these frameworks, while helpful, are not foolproof – developers can still produce insecure code within them.
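For illustration (not part of the experiment), a query written with a framework such as Laravel binds values as parameters by default, which closes the most common SQL injection path:

use Illuminate\Support\Facades\DB;

// Laravel query builder (illustrative): the $username value is bound as a parameter automatically
$user = DB::table('users')
    ->where('username', $username)
    ->first();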

Key takeaways

This test revealed that, while ChatGPT does a great job of engaging in human-like conversation, it doesn't perform nearly as well at producing secure code. Astrid Bytes classified her findings into the good, the bad, and the dangerous.

The good

  • ChatGPT serves as an excellent coding assistant, boosting productivity and helping with quick algorithm implementations. A study from the National Bureau of Economic Research attests that generative AIs like ChatGPT can enhance workforce productivity by roughly 14%.

  • It can generate code in a multitude of programming languages.

  • ChatGPT-4 generally outperforms ChatGPT-3.5, though expert review remains essential for spotting vulnerabilities.

  • When it comes to secure coding, the chatbot performs better within modern development frameworks than in vanilla programming languages.

  • ChatGPT can recognize code issues, detailing their exploitability and suggesting remediation steps. However, this feature is effective only if the user actively seeks such insights.

The bad

  • ChatGPT has a limited response size and tends to cut corners by focusing only on functional requirements and skipping security considerations. So you won't always get the right code on the first try.

The dangerous

  • Code output falls below minimum security standards. Astrid Bytes noted that this issue stems from ChatGPT’s training data: “It's trained on old data (until September 2021) and isn't updated on new vulnerabilities and attack types. Plus, ChatGPT has been trained on large amounts of data and coding examples found on the web. The truth is that not all of them are written securely. There is a lot of bad code on the web.”

  • Inadequate code security is language-agnostic. As Astrid asserts, "I was experimenting with 5 different programming languages, but did not notice any significant differences when it came to more secure code."

  • Secure code only if asked. According to our security expert, “It's focussed on generating code based on functional requirements (your request to write code that solves a particular task) while security and other non-functional requirements are not always taken into consideration – unless you specifically ask for it.”

  • Requests to fix a code vulnerability might lead to code mutations. As she observed, “While fixing one place, it made changes in another part of the code which was previously secure, or even rewrote the code using a different framework compared to what was originally requested.”

  • Some of ChatGPT’s answers were incorrect. Astrid Bytes noticed that ChatGPT sometimes returned code snippets that included extraneous or incorrect information. This inconsistency echoes a recent Purdue University study, which found that ChatGPT answered only 48% of software engineering questions accurately.

[Image: Conversation with ChatGPT]

Can ChatGPT be used for coding?

Astrid highlights that ChatGPT should be viewed only as a supporting tool for code writing. Whether you're using an older or newer version, or even if you prompt it to adhere to secure coding standards, human touch and expert oversight remain indispensable.

“You have to understand that ChatGPT isn’t a security tool. It’s trained on old data and unaware of the latest vulnerabilities and attack vectors. So, it might suggest vulnerable libraries or insecure configurations,” Astrid notes.

Furthermore, the research underscores its significant error rate when addressing coding queries. Such inaccuracies, combined with cybersecurity concerns, have led global giants like Apple and Samsung – and even the coding Q&A hub Stack Overflow – to restrict its use.

So, if you decide to use an AI chatbot for coding:

  • Get to know your AI assistant. Whether it’s ChatGPT or any other tool, it’s important to know its limitations.

  • Take security seriously. It might not be such a big deal for single-use scripts that you won’t need tomorrow, but it makes a big difference for production code.

  • Only ask it to generate code in a programming language you’re familiar with. The more knowledge you have of the programming language and secure coding practices, the easier it is to spot vulnerabilities in generated code.

  • Use static application security testing (SAST) tools to help you evaluate the generated code. However, they can produce false positives as well as false negatives, so any AI-generated code should undergo a manual code review as well.

  • And finally – trust no one. Not even ChatGPT.