Assessing the Strength of Passwords Generated by Large Language Models

This article presents the results of an analysis of the strength of passwords generated by large language models and AI assistants. The researchers asked the Claude, ChatGPT, and Gemini models to generate a strong 16-character password. In every case, the resulting passwords appeared to meet the usual requirements for a secure password and were rated as strong by password quality checkers. The passwords combined uppercase and lowercase letters, special characters, and digits, but despite looking secure they had very low entropy, followed a standard template, and showed a consistent pattern across repeated requests.

According to the researchers, predictable passwords generated by large language models are used in practice by real users and are suggested by AI assistants during code development. The entropy of passwords generated by the AI models is estimated at 20-27 bits, so cracking one takes anywhere from a few seconds to hours, whereas password quality checkers evaluating the same passwords predict cracking times of centuries. The patterned nature of such passwords is a consequence of how large language models construct content: by predicting the next token.
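The gap between 20-27 bits and what a 16-character password could provide can be checked with a little arithmetic. The sketch below is illustrative only: the 94-character alphabet and the guess rate are assumptions, not figures from the study, and the function names are hypothetical.

```python
import math


def uniform_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits when each character is drawn uniformly and independently."""
    return length * math.log2(alphabet_size)


def seconds_to_exhaust(entropy_bits: float, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate in a space of 2**entropy_bits."""
    return 2.0 ** entropy_bits / guesses_per_second


# Assumption for illustration only: the article does not state an attacker guess rate.
ASSUMED_GUESSES_PER_SECOND = 1e4

# Assumption: 94 printable ASCII characters (letters, digits, punctuation).
random_bits = uniform_entropy_bits(16, 94)

for label, bits in [
    ("LLM password, low estimate", 20.0),
    ("LLM password, high estimate", 27.0),
    ("16 uniformly random characters", random_bits),
]:
    seconds = seconds_to_exhaust(bits, ASSUMED_GUESSES_PER_SECOND)
    print(f"{label}: {bits:.1f} bits, about {seconds:.3g} s to exhaust")
```

A strength checker that looks only at length and character classes effectively scores the last row, which is one plausible reason its estimate differs so much from the measured entropy.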

Of the 50 passwords generated by Claude Opus 4.6, 18 were identical. All of the passwords began with a letter (most often "G"), which was always followed by a digit (most often "7"), and all of them contained the characters "L", "9", "m", "2", "$", and "#".

With GPT-5.2, almost all passwords began with the letter "v", followed in half of the cases by the letter "Q" and then a repeating pattern drawn from a limited character set. With Gemini 3, almost half of the passwords began with "K" or "k", most often followed by "#", "P", or "9", and the set of characters used was significantly reduced. Increasing the "temperature" when running the AI models does not significantly affect the quality of the generated passwords.
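Regularities of this kind (repeated passwords, a dominant first character) can be measured from a batch of samples. The following is a minimal sketch, not the researchers' method: it counts exact duplicates and computes the Shannon entropy of each character position. The sample strings are made-up placeholders, not passwords from the study, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy_bits(symbols) -> float:
    """Empirical Shannon entropy (in bits) of an iterable of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def positional_entropy(passwords: list[str]) -> list[float]:
    """Entropy of the character seen at each position across all samples."""
    length = min(len(p) for p in passwords)
    return [shannon_entropy_bits(p[i] for p in passwords) for i in range(length)]


def duplicate_count(passwords: list[str]) -> int:
    """Number of samples that repeat an earlier sample exactly."""
    return len(passwords) - len(set(passwords))


# Placeholder data invented for this example, NOT output from any model.
samples = ["Ab1$xxxx", "Ab2$yyyy", "Ab1$xxxx", "Cb3$zzzz"]

print("duplicates:", duplicate_count(samples))
print("per-position entropy:", [round(h, 2) for h in positional_entropy(samples)])
```

Summing the per-position values gives only an upper bound on the entropy of the whole password, because it ignores correlations between positions, and a small sample makes the estimate rough. Uniform choice from 94 characters would give about 6.55 bits per position.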

As for the AI assistants used in development, the quality of a password largely depends on how the developer phrases the request. For example, Claude Code with Opus 4.6 and Gemini-CLI with Auto Gemini 3 ran the "openssl rand" command when asked to generate a strong password. Gemini-CLI with Auto Gemini 3 used "openssl rand" when asked to "generate a password," but used the AI model itself when asked to "suggest a password." Codex with GPT-5.3-Code sometimes ran an external utility to generate a strong password and sometimes produced a predictable password internally. Claude Code with Opus 4.5 most often generated predictable passwords internally. The ChatGPT Atlas browser used an AI model to generate a weak password when creating a password for website registration.
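The difference between the two behaviors is whether the characters come from a cryptographically secure random source or from token prediction. As a rough Python counterpart to delegating to an external utility such as "openssl rand", the sketch below draws each character from the operating system's CSPRNG through the standard `secrets` module. The alphabet and the function name are choices made for this example, not anything the assistants are reported to use.

```python
import secrets
import string

# Assumption: any printable ASCII letter, digit, or punctuation mark is acceptable.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 16) -> str:
    """Build a password by picking each character uniformly with a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

Because every character is chosen independently and uniformly, a password produced this way does not share a template with the previous one, unlike the model-generated passwords described above.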

Source: opennet.ru
