Auditing WordPress code with WP-CLI and AI

Two weeks ago, Google released its newest AI model, Gemini 2.5 Pro Experimental. This caught my eye for two reasons (🧠 + 💵). First, it’s the most advanced coding model I’ve seen. Second, it’s currently free to use! I don’t know what Google is thinking by releasing an advanced coding model for free. I can only assume that it will eventually have a cost, and I doubt it will be cheap. In the meantime, I’ve been having lots of fun with it.

Another unique aspect of Google Gemini is its large context window, which allows for 1 million tokens. I haven’t found a good token-to-character count conversion; however, in real-world testing, I’ve found that 2 MB of text can easily fit within the 1 million token limit. That means an entire codebase can be loaded into a single query. So why not do that from the command line?

Using Google Gemini to build a WP-CLI command that uses Google Gemini. 🤖 🛠️ 🤖

I’m a big fan of T3 Chat. It’s the only AI service that I’m currently paying to use (only $8/month). They’re continuously adding the latest and greatest AI models from all different companies, which makes trying out new models extremely easy. I’ve had this idea of having AI audit WordPress code for a while, and with T3 Chat and Gemini 2.5, I had a fully usable WP-CLI command after just a few attempts.

You can see my starting point for wp audit-files here: https://gist.github.com/austinginder/63002ef3739fbc06fcf3c2d453d97962. After a few revisions and real-world testing, here is what I ended up creating. Meet wp audit-files.

NAME

  wp audit-files

DESCRIPTION

  Audits theme and plugin PHP files using Google Gemini.

SYNOPSIS

  wp audit-files [--skip-api-call] [--api-key=<key>] [--timeout=<seconds>] [--themes=<themes>] [--plugins=<plugins>]

  Scans all themes and plugins by default. If --themes or --plugins are
  provided, only the specified items will be scanned. Splits large sets
  of files into chunks under ~2MB, makes separate API calls for each chunk,
  and compiles the results. Attempts to get structured JSON output from
  the API and display it as a table.

OPTIONS

  [--skip-api-call]
    Only find files and report the number of chunks, do not make API calls.

  [--api-key=<key>]
    Google Gemini API Key. If not provided, it will try to read the GEMINI_API_KEY environment variable.

  [--timeout=<seconds>]
    Timeout in seconds for *each* API request. Defaults to 300.

  [--themes=<themes>]
    Comma-separated list of theme slugs (directory names) to include.
      If provided, only these themes (and any specified plugins) will be scanned.

  [--plugins=<plugins>]
    Comma-separated list of plugin slugs (directory names) to include.
      If provided, only these plugins (and any specified themes) will be scanned.
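
To make the options above a bit more concrete, here is a minimal sketch of how they could be normalized inside a command callback. This is not lifted from the actual command: the function name audit_resolve_options and the returned array keys are hypothetical, and it only assumes that WP-CLI hands flags to the callback as an associative array.

```php
<?php
// Hypothetical helper (not from the real command): normalizes the
// associative args that WP-CLI passes to a command callback.
function audit_resolve_options(array $assoc_args): array
{
    // Prefer --api-key, then fall back to the GEMINI_API_KEY environment variable.
    $api_key = $assoc_args['api-key'] ?? getenv('GEMINI_API_KEY');
    if ($api_key === false || $api_key === '') {
        $api_key = null; // Nothing usable was provided.
    }

    // Turn a list like "a, b,,c" into ['a', 'b', 'c'].
    $to_slugs = static function (?string $list): array {
        if ($list === null) {
            return [];
        }
        $slugs = array_map('trim', explode(',', $list));
        return array_values(array_filter($slugs, static fn($slug) => $slug !== ''));
    };

    return [
        'api_key'       => $api_key,
        'timeout'       => (int) ($assoc_args['timeout'] ?? 300), // Documented default.
        'skip_api_call' => isset($assoc_args['skip-api-call']),
        'themes'        => $to_slugs($assoc_args['themes'] ?? null),
        'plugins'       => $to_slugs($assoc_args['plugins'] ?? null),
    ];
}
```

An empty themes and plugins list would then mean "scan everything", which matches the default behavior described in the help text.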

Running wp audit-files crawls the themes and plugins directories for PHP code and compiles that code into a single payload TXT file. This payload is then split up and sent off to Google Gemini with requests to look over the files for any issues and return results in a particular JSON structure. The response is then displayed in a table using WP-CLI. The response is also saved to an all-issues.json file.
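
The splitting step is the part most worth illustrating. Below is a rough sketch of the idea, assuming the files have already been read into an array of path => contents. The function name audit_chunk_files, the "--- File: ... ---" label format, and the exact 2,000,000 byte limit are my own placeholders for illustration, not necessarily what the command does.

```php
<?php
// Hypothetical helper: groups files into text chunks whose combined size
// stays under $max_bytes. $files maps a relative path to the file's contents.
function audit_chunk_files(array $files, int $max_bytes = 2000000): array
{
    $chunks  = [];
    $current = '';

    foreach ($files as $path => $contents) {
        // Label each file so the model can report a file_path for every issue.
        $entry = "--- File: {$path} ---\n{$contents}\n\n";

        // Start a new chunk when adding this file would exceed the limit.
        // A single file larger than the limit still ends up alone in an
        // oversized chunk; this sketch does not split individual files.
        if ($current !== '' && strlen($current) + strlen($entry) > $max_bytes) {
            $chunks[] = $current;
            $current  = '';
        }
        $current .= $entry;
    }

    // Keep whatever is left over as the final chunk.
    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}
```

Each chunk would then go out as its own API request, and the results from all chunks get merged before being displayed.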

The command can either scan all PHP code or a particular theme or plugin using the --themes=<themes> or --plugins=<plugins> arguments. Scanning everything will quickly hit Google’s free tier limits. While the command is smart enough to stay within the 1 million tokens per request, I discovered Google also has a 5 million tokens per day rate limit, which makes sense. That can easily be hit, depending on how much PHP code exists on your website. The command will also fall back to a less powerful model when usage of the 2.5 model hits its limits.
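
The fallback idea can be sketched in a few lines. This is only an illustration of the pattern, not the command's real code: the RateLimitException class, the audit_send_with_fallback function, and the assumption that a rate-limited request surfaces as an exception (for example, after an HTTP 429 reply) are all hypothetical.

```php
<?php
// Hypothetical exception a request function could throw when a model's
// quota is exhausted (for example, after an HTTP 429 reply).
class RateLimitException extends RuntimeException {}

// Tries each model in order and returns the first successful result.
// $send is any callable that takes a model name and returns the response,
// throwing RateLimitException when that model is rate limited.
function audit_send_with_fallback(array $models, callable $send): array
{
    foreach ($models as $model) {
        try {
            return ['model' => $model, 'response' => $send($model)];
        } catch (RateLimitException $e) {
            // Quota hit for this model; move on to the next one in the list.
            continue;
        }
    }

    throw new RuntimeException('All models are rate limited.');
}
```

Passing the request function in as a callable keeps the retry logic separate from the HTTP details, so the list of models is the only thing that needs to change when new ones are released.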

Chatting with your codebase from the command line unlocks untold potential.

My idea, to use AI as a junior developer auditor, is just one possible use case. Being able to talk to your codebase has far greater potential. With just a few tweaks to my prompt, you can ask anything you’d like about any arbitrary code. I mean, just think about that for a moment. What could you ask? Anything from “Do you spot any PHP compatibility issues with 8.4?” to “Have any ideas how I should refactor this PHP class?” to “Do you spot any compatibility issues between plugin A and plugin B?” All possible with a tweak to the system prompt and response schema.

$this->responseSchema = [
    'type' => 'ARRAY',
    'description' => 'List of potential issues found in the PHP files.',
    'items' => [
        'type' => 'OBJECT',
        'properties' => [
            'file_path' => ['type' => 'STRING', 'description' => 'Relative path (e.g., /themes/my-theme/functions.php)'],
            'issue_description' => ['type' => 'STRING', 'description' => 'Description of the potential issue.'],
            'severity' => ['type' => 'STRING', 'description' => 'Estimated severity (High, Medium, Low, Info).'],
            'code_snippet' => ['type' => 'STRING', 'description' => 'Optional code snippet.', 'nullable' => true]
        ],
        'required' => ['file_path', 'issue_description', 'severity']
    ]
];

$this->api_prompt = <<<PROMPT
Review the following WordPress theme and plugin PHP files provided in the payload.
Identify potential major issues such as malware patterns, significant security vulnerabilities (like SQL injection, XSS, insecure file handling), or deprecated code usage with security implications. 
If no major issues are identified, then skip without any response.

Begin payload here:

PROMPT;
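
To show how a prompt and schema like these plug into a request, here is a small sketch that builds the JSON body for Gemini's generateContent endpoint and decodes the reply. The two function names are hypothetical, and the field names (contents, generationConfig, responseMimeType, responseSchema, candidates) reflect my understanding of the Gemini REST API, so double-check them against Google's documentation before relying on this.

```php
<?php
// Builds the JSON body for a generateContent request. The prompt and schema
// are passed in, so asking a different question only means changing arguments.
function audit_build_request(string $prompt, string $payload, array $schema): string
{
    return json_encode([
        'contents' => [
            ['parts' => [['text' => $prompt . $payload]]],
        ],
        'generationConfig' => [
            // Ask for JSON that follows the supplied schema.
            'responseMimeType' => 'application/json',
            'responseSchema'   => $schema,
        ],
    ], JSON_THROW_ON_ERROR);
}

// Pulls the structured issues out of a raw API response body.
// Returns an empty array when the model sent back nothing usable.
function audit_parse_response(string $body): array
{
    $data   = json_decode($body, true, 512, JSON_THROW_ON_ERROR);
    $text   = $data['candidates'][0]['content']['parts'][0]['text'] ?? '[]';
    $issues = json_decode($text, true);

    return is_array($issues) ? $issues : [];
}
```

Swapping in a prompt such as “Do you spot any PHP compatibility issues with 8.4?” along with a schema that fits the answer you want is all it would take to reuse the same plumbing for a different question.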

Installing and using wp audit-files.

If you’re interested in trying it out, then take a look at the WP-CLI command on GitHub. You will need to supply your own Google Gemini API key; however, you shouldn’t need to add billing info in order to use Google’s free tiers. If you are more adventurous, feel free to take this as a template to build your own AI-powered WP-CLI commands.