About 5000 secrets left in code and 8 malicious obfuscators found in the PyPI repository

GitGuardian researchers have published an analysis of sensitive data that developers left behind in code hosted in the PyPI (Python Package Index) repository of Python packages. After studying more than 9.5 million files and 5 million package releases associated with 450 thousand projects, they identified 56866 cases of confidential data leakage. Counting only unique secrets, without duplicates across releases, the number of identified leaks was 3938, and the number of projects with at least one leak was 2922.

In total, more than 150 types of confidential information leaks were identified, including ordinary passwords, cryptographic keys, and access tokens for cloud services, continuous integration systems and APIs. At least 768 credentials were still active at the time of the study. Common leaks that remained valid include access keys for Azure Active Directory, credentials for SSH, MongoDB, MySQL and PostgreSQL, keys for GitHub OAuth App, Dropbox and Auth0, and credentials for Coinbase and Twilio.

Leak types that are becoming more common include Telegram bot access tokens, the number of which doubled in early 2021 and then doubled again in the spring of 2023. A steady increase in leaks has also been recorded since 2020 for Google API access keys, and since 2022 for database credentials. Among the packages with the largest number of leaks, the report names chatllm and safire, in which 209 OpenAI keys and 320 Google Cloud keys were left behind.
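As a rough illustration of how leaks of this kind can be spotted, the sketch below scans text line by line with a few regular expressions. The patterns are approximate assumptions based on commonly seen token shapes, not the detectors GitGuardian uses, and the names `PATTERNS` and `scan_text` are hypothetical.

```python
import re

# Approximate, illustrative patterns (assumptions, not official token formats).
# Real scanners use many more detectors and validate each candidate.
PATTERNS = {
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "telegram_bot_token": re.compile(r"\b\d{8,10}:[A-Za-z0-9_\-]{35}"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return a sorted list of (line_number, detector_name) pairs for matches."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return sorted(hits)
```

A match only means a string looks like a credential; whether it is still active, as 768 of the secrets in the study were, has to be checked separately.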

Besides files with the ".py" extension, the file types with the largest number of leaks were .json (610 leaks), .md (270), PKG-INFO (240), METADATA (210) and .txt (170), as well as README files (209) and files in directories named test (675). Many leaks are also caused by oversights and mistakes in configuring which files are excluded when packages are built. For example, local configuration files (.cookiecutterrc, .env, .pypirc, etc.) can be excluded from the Git repository through a ".gitignore" file, but that file is not taken into account when the package is created. In particular, 43 .pypirc files containing credentials for accessing PyPI were found in PyPI packages. In 15 cases, developers had not intended to publish the packages at all: they were created for internal use and uploaded to PyPI by mistake.
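Since ".gitignore" does not protect the built package, one precaution is to inspect the archive itself before uploading. The following is a minimal sketch under stated assumptions: the deny-list contains only the file names mentioned above, and the helper names `find_sensitive_members` and `check_sdist` are hypothetical.

```python
import fnmatch
import tarfile

# Hypothetical deny-list built from the file names mentioned in the text.
SENSITIVE_PATTERNS = [".env", ".pypirc", ".cookiecutterrc"]


def find_sensitive_members(names):
    """Return archive member paths whose base name matches the deny-list."""
    flagged = []
    for name in names:
        base = name.rstrip("/").rsplit("/", 1)[-1]
        if any(fnmatch.fnmatchcase(base, pat) for pat in SENSITIVE_PATTERNS):
            flagged.append(name)
    return flagged


def check_sdist(path):
    """List suspicious files inside a .tar.gz source distribution."""
    with tarfile.open(path, "r:gz") as archive:
        return find_sensitive_members(archive.getnames())
```

For example, `check_sdist("dist/mypackage-1.0.tar.gz")` (an illustrative path) would return the members that should be reviewed before the archive is published.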

Additionally, two more events related to PyPI can be mentioned:

  • Eight malicious packages were identified in the PyPI repository, presented as obfuscation utilities, i.e. tools that make code unreadable to hinder recovery of the underlying algorithm. The identified packages contained the string "pyobf" in their names (Pyobftoexe, Pyobfusfile, Pyobfexecute, Pyobfpremium, Pyobflight, Pyobfadvance, Pyobfuse and pyobfgood) and were downloaded more than 2000 times.

    The malicious code embedded in the packages targeted the Windows platform. It could connect to an external control server, run arbitrary commands on the developer's computer, find sensitive information such as access keys and send it to an external server, and transfer arbitrary files from the system. The malicious code could also act as a keylogger, intercept passwords entered in Chrome, take screenshots, record audio, and even control the webcam.

  • The results of an independent audit have been published, covering the code base of the tools that run the pypi.org repository and the cabotage framework used in its container orchestration infrastructure. The audit was carried out with the support of the non-profit organization OTF (Open Technology Fund). The audit found no high-severity problems, and the source code was judged to meet the basic requirements of secure coding. At the same time, insufficient test coverage of the cabotage codebase was noted and 29 problems were identified, of which 8 were rated moderate severity, 6 low, and 14 were marked as informational.

    The most noticeable problems:

    • Insufficient verification of the digital signatures used in the integration of PyPI with AWS SNS, which allowed notifications to be sent to individual users' email addresses.
    • An information leak in the download handler that makes it possible to determine whether an account exists without generating login attempt events.
    • The use of weak cryptographic hashes that do not rule out cache poisoning attacks.
    • An attacker with the right to launch build processes via cabotage could potentially inject their own commands.
    • An attacker with deployment rights in cabotage could potentially deploy a legitimate-looking image.

Source: opennet.ru
