

Analyzing your code with CodeQL queries

You can run queries against a CodeQL database extracted from a codebase.

GitHub CodeQL is licensed on a per-user basis upon installation. You can use CodeQL only for certain tasks under the license restrictions. For more information, see "About the CodeQL CLI."

If you have a GitHub Advanced Security license, you can use CodeQL for automated analysis, continuous integration, and continuous delivery. For more information, see "About GitHub Advanced Security."

About analyzing databases with the CodeQL CLI

To analyze a codebase, you run queries against a CodeQL database extracted from the code. CodeQL analyses produce results that can be uploaded to GitHub AE to generate code scanning alerts.

Prerequisites

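Before starting an analysis you must set up the CodeQL CLI so that you can run commands locally, and create a CodeQL database for the source code you want to analyze.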

The simplest way to run codeql database analyze is using the standard queries included in the CodeQL CLI bundle.

Running codeql database analyze

When you run database analyze, it:

  1. Optionally downloads any referenced CodeQL packages that are not available locally.
  2. Executes one or more query files, by running them over a CodeQL database.
  3. Interprets the results, based on certain query metadata, so that alerts can be displayed in the correct location in the source code.
  4. Reports the results of any diagnostic and summary queries to standard output.

You can analyze a database by running the following command:

codeql database analyze <database> --format=<format> --output=<output> <query-specifiers>...

Note: If you analyze more than one CodeQL database for a single commit, you must specify a SARIF category for each set of results generated by this command. When you upload the results to GitHub AE, code scanning uses this category to store the results for each language separately. If you forget to do this, each upload overwrites the previous results.

codeql database analyze <database> --format=<format> \
    --sarif-category=<language-specifier> --output=<output> \
    <packs,queries>

You must specify <database>, --format, and --output. You can specify additional options depending on what analysis you want to do.

| Option | Required | Usage |
| --- | --- | --- |
| <database> | Yes | Specify the path for the directory that contains the CodeQL database to analyze. |
| <packs,queries> | No | Specify CodeQL packs or queries to run. To run the standard queries used for code scanning, omit this parameter. To see the other query suites included in the CodeQL CLI bundle, look in /<extraction-root>/qlpacks/codeql/<language>-queries/codeql-suites. For information about creating your own query suite, see "Creating CodeQL query suites" in the documentation for the CodeQL CLI. |
| --format | Yes | Specify the format for the results file generated during analysis. A number of different formats are supported, including CSV, SARIF, and graph formats. For upload to GitHub this should be sarif-latest. For more information, see "SARIF support for code scanning." |
| --output | Yes | Specify the location where you want to save the SARIF results file, including the desired filename with the .sarif extension. |
| --sarif-category | No | Optional for single database analysis. Required to define the language when you analyze multiple databases for a single commit in a repository. Specify a category to include in the SARIF results file for this analysis. A category is used to distinguish multiple analyses for the same tool and commit, but performed on different languages or different parts of the code. |
| --sarif-add-query-help | No | Use if you want to include any available markdown-rendered query help for custom queries used in your analysis. Any query help for custom queries included in the SARIF output will be displayed in the code scanning UI if the relevant query generates an alert. For more information, see "Using custom queries with the CodeQL CLI." |
| --threads | No | Use if you want to use more than one thread to run queries. The default value is 1. You can specify more threads to speed up query execution. To set the number of threads to the number of logical processors, specify 0. |
| --verbose | No | Use to get more detailed information about the analysis process and diagnostic data from the database creation process. |
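
The note about SARIF categories can be sketched as a small shell function that analyzes one database per language for the same commit and gives each results file its own category. This is a minimal sketch, not an official workflow: the function name analyze_all and the layout (one database directory per language under a common root) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: analyze one CodeQL database per language for the same commit.
# Each SARIF file gets its own category so that later uploads do not
# overwrite each other. The function name and the directory layout
# (<db_root>/<language>) are hypothetical.

analyze_all() {
  local db_root="$1" out_dir="$2"
  shift 2
  mkdir -p "$out_dir" || return 1
  local lang
  for lang in "$@"; do
    # No query specifier is given, so the standard code scanning
    # queries for the database's language are run.
    codeql database analyze "$db_root/$lang" \
      --format=sarif-latest \
      --sarif-category="$lang" \
      --output="$out_dir/$lang.sarif" || return 1
  done
}
```

For example, analyze_all /codeql-dbs /temp/results javascript python would analyze /codeql-dbs/javascript and /codeql-dbs/python and write one SARIF file for each language.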

Upgrading databases

For databases that were created by CodeQL CLI v2.3.3 or earlier, you will need to explicitly upgrade the database before you can run an analysis with a newer version of the CodeQL CLI. If this step is necessary, then you will see a message telling you that your database needs to be upgraded when you run database analyze.

For databases that were created by CodeQL CLI v2.3.4 or later, the CLI will implicitly run any required upgrades. Explicitly running the upgrade command is not necessary.
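
For an older database, the explicit upgrade step might look like the following sketch. It assumes that the codeql database upgrade subcommand is available in your version of the CLI. The function name upgrade_and_analyze is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: explicitly upgrade an older database, then analyze it.
# Only needed for databases created by CodeQL CLI v2.3.3 or earlier;
# newer databases are upgraded implicitly by "database analyze".

upgrade_and_analyze() {
  local db="$1" output="$2"
  # Stop if the upgrade fails, so the analysis never runs against a
  # database that is still in the old format.
  codeql database upgrade "$db" || return 1
  codeql database analyze "$db" --format=sarif-latest --output="$output"
}
```

For example, upgrade_and_analyze /codeql-dbs/example-repo /temp/example-repo.sarif upgrades the database in place and then runs the standard queries against it.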

For full details of all the options you can use when analyzing databases, see "database analyze."

Basic example of analyzing a CodeQL database

This example analyzes a CodeQL database stored at /codeql-dbs/example-repo and saves the results as a SARIF file: /temp/example-repo-js.sarif. It uses --sarif-category to include extra information in the SARIF file that identifies the results as JavaScript. This is essential when you have more than one CodeQL database to analyze for a single commit in a repository.

$ codeql database analyze /codeql-dbs/example-repo \
    javascript-code-scanning.qls --sarif-category=javascript \
    --format=sarif-latest --output=/temp/example-repo-js.sarif

> Running queries.
> Compiling query plan for /codeql-home/codeql/qlpacks/codeql-javascript/AngularJS/DisablingSce.ql.
...
> Shutting down query evaluator.
> Interpreting results.

Examples of running database analyses

The following examples show how to run database analyze using CodeQL packs, and how to use a local checkout of the CodeQL repository. These examples assume your CodeQL databases have been created in a directory that is a sibling of your local copies of the CodeQL repository.

Running a single query

To run a single query over a CodeQL database for a JavaScript codebase, you could use the following command from the directory containing your database:

codeql database analyze --download <javascript-database> codeql/javascript-queries:Declarations/UnusedVariable.ql --format=csv --output=js-analysis/js-results.csv

This command runs a simple query that finds potential bugs related to unused variables, imports, functions, or classes. It is one of the JavaScript queries included in the CodeQL repository. To run more than one query, specify a space-separated list of similar paths.

The analysis generates a CSV file (js-results.csv) in a new directory (js-analysis).
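
Running several queries in one invocation can be sketched as a wrapper that passes every remaining argument through as a query specifier. The function name run_queries_csv is hypothetical, and the wrapper simply mirrors the options used in the command above.

```shell
#!/usr/bin/env bash
# Sketch: run one or more queries over a database and write CSV results.
# Arguments: <database> <output.csv> <query-specifier>...
# The function name is hypothetical. The flags match the single-query
# command shown above.

run_queries_csv() {
  local db="$1" output="$2"
  shift 2
  if [ "$#" -eq 0 ]; then
    echo "usage: run_queries_csv <database> <output.csv> <query>..." >&2
    return 2
  fi
  # "$@" expands to the space-separated list of query specifiers.
  codeql database analyze --download "$db" "$@" \
    --format=csv --output="$output"
}
```

You could call it as run_queries_csv <javascript-database> js-analysis/js-results.csv codeql/javascript-queries:Declarations/UnusedVariable.ql, followed by any further query specifiers you want to include.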

Alternatively, if you have the CodeQL repository checked out, you can execute the same queries by specifying the path to the query directly:

codeql database analyze <javascript-database> ../ql/javascript/ql/src/Declarations/UnusedVariable.ql --format=csv --output=js-analysis/js-results.csv

You can also run your own custom queries with the database analyze command. For more information about preparing your queries to use with the CodeQL CLI, see "Using custom queries with the CodeQL CLI."

Running all queries in a directory

You can run all the queries located in a directory by providing the directory path, rather than listing all the individual query files. Paths are searched recursively, so any queries contained in subfolders will also be executed.

Important

You should avoid specifying the root of a core CodeQL query pack when executing database analyze as it might contain some special queries that aren’t designed to be used with the command. Rather, run the query pack to include the pack’s default queries in the analysis, or run one of the code scanning query suites.

For example, to execute all Python queries contained in the Functions directory in the codeql/python-queries query pack you would run:

codeql database analyze <python-database> codeql/python-queries:Functions --format=sarif-latest --output=python-analysis/python-results.sarif --download

Alternatively, if you have the CodeQL repository checked out, you can execute the same queries by specifying the path to the directory directly:

codeql database analyze <python-database> ../ql/python/ql/src/Functions/ --format=sarif-latest --output=python-analysis/python-results.sarif

When the analysis has finished, a SARIF results file is generated. Specifying --format=sarif-latest ensures that the results are formatted according to the most recent SARIF specification supported by CodeQL.

Running query suites

To run a query suite on a CodeQL database for a C/C++ codebase, you could use the following command from the directory containing your database:

codeql database analyze <cpp-database> codeql/cpp-queries:codeql-suites/cpp-code-scanning.qls --format=sarifv2.1.0 --output=cpp-results.sarif --download

This command downloads the codeql/cpp-queries CodeQL query pack, runs the analysis, and generates a file in the SARIF version 2.1.0 format that is supported by all versions of GitHub. You can upload this file to GitHub by running codeql github upload-results or by using the code scanning API. For more information, see "Uploading CodeQL analysis results to GitHub" or "Code Scanning."

CodeQL query suites are .qls files that use directives to select queries to run based on certain metadata properties. The standard CodeQL packs have metadata that specifies the location of the query suites used by code scanning, so the CodeQL CLI finds these suite files automatically and you don’t have to specify the full path on the command line.

For information about creating custom query suites, see "Creating CodeQL query suites."

Results

You can save analysis results in a number of different formats, including SARIF and CSV.

The SARIF format is designed to represent the output of a broad range of static analysis tools. For more information, see "CodeQL CLI SARIF output."

For more information about what the results look like in CSV format, see "CodeQL CLI CSV output."

Results files can be integrated into your own code-review or debugging infrastructure. For example, SARIF file output can be used to highlight alerts in the correct location in your source code using a SARIF viewer plugin for your IDE.

Viewing log and diagnostic information

When you analyze a CodeQL database using a code scanning query suite, in addition to generating detailed information about alerts, the CLI reports diagnostic data from the database generation step and summary metrics. If you choose to generate SARIF output, the additional data is also included in the SARIF file. For repositories with few alerts, you may find this information useful for determining if there are genuinely few problems in the code, or if there were errors generating the CodeQL database. For more detailed output from codeql database analyze, use the --verbose option.

For more information about the type of diagnostic information available, see "Viewing code scanning logs."

You can choose to export and upload diagnostic information to GitHub AE even if a CodeQL analysis fails. For more information, see "Uploading CodeQL analysis results to GitHub."

Next steps