Note: This article was migrated from the CodeQL documentation website in January 2023.
About analyzing databases with the CodeQL CLI
Note: This article describes the features available with the CodeQL CLI 2.7.6 bundle included in the initial release of GitHub Enterprise Server 3.4.
If your site administrator has updated your CodeQL CLI version to a newer release, please see the GitHub Enterprise Cloud version of this article for information on the latest features.
To analyze a codebase, you run queries against a CodeQL database extracted from the code.
CodeQL analyses produce interpreted results that can be displayed as alerts or paths in source code.
For information about writing queries to run with database analyze, see "Using custom queries with the CodeQL CLI."
Other query-running commands
Queries run with database analyze have strict metadata requirements. You can also execute queries using the following plumbing-level subcommands:
- database run-queries, which outputs non-interpreted results in an intermediate binary format called BQRS.
- query run, which outputs BQRS files or prints results tables directly to the command line. Viewing results directly in the command line may be useful for iterative query development using the CLI.
Queries run with these commands don't have the same metadata requirements. However, to save human-readable data you have to process each BQRS results file using the bqrs decode plumbing subcommand. Therefore, for most use cases it's easiest to use database analyze to directly generate interpreted results.
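As a rough sketch of the plumbing route described above, the following Python helper assembles the two command lines involved: a codeql query run call that writes a BQRS file, and a codeql bqrs decode call that converts that file to CSV. The function names and paths are hypothetical, and the script only builds argument lists. It does not invoke the CLI.

```python
from typing import List


def query_run_command(database: str, query: str, bqrs_out: str) -> List[str]:
    """Build a `codeql query run` call that writes raw results to a BQRS file."""
    return ["codeql", "query", "run",
            f"--database={database}", f"--output={bqrs_out}", query]


def bqrs_decode_command(bqrs_in: str, csv_out: str) -> List[str]:
    """Build a `codeql bqrs decode` call that converts BQRS results to CSV."""
    return ["codeql", "bqrs", "decode",
            "--format=csv", f"--output={csv_out}", bqrs_in]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    steps = [
        query_run_command("my-db", "MyQuery.ql", "my-query.bqrs"),
        bqrs_decode_command("my-query.bqrs", "my-query.csv"),
    ]
    for step in steps:
        print(" ".join(step))
```

On a machine where the CodeQL CLI is installed, each list could be passed to subprocess.run(step, check=True). Needing two steps per query is the overhead that database analyze avoids.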
Before starting an analysis you must:
- Set up the CodeQL CLI to run commands locally.
- Create a CodeQL database for the source code you want to analyze.
The simplest way to run codeql database analyze is using CodeQL packs. You can also run the command using queries from a local checkout of the CodeQL repository, which you may want to do if you want to customize the CodeQL core queries.
Running codeql database analyze
When you run database analyze, it:
- Optionally downloads any referenced CodeQL packages that are not available locally.
- Executes one or more query files, by running them over a CodeQL database.
- Interprets the results, based on certain query metadata, so that alerts can be displayed in the correct location in the source code.
- Reports the results of any diagnostic and summary queries to standard output.
You can analyze a database by running the following command:
codeql database analyze <database> --format=<format> --output=<output> <query-specifiers>...
You must specify:
- <database>: the path to the CodeQL database you want to analyze.
- --format: the format of the results file generated during analysis. A number of different formats are supported, including CSV, SARIF, and graph formats. For more information about CSV and SARIF, see Results. To find out which other results formats are supported, see "database analyze."
- --output: the output path of the results file generated during analysis.
You can also specify:
- <query-specifiers>...: a space-separated list of queries to run over your database. This is a list of arguments, where each argument can be:
  - a path to a query file
  - a path to a directory containing query files
  - a path to a query suite file
  - the name of a CodeQL query pack
    - with an optional version range
    - with an optional path to a query, directory, or query suite inside the pack
  If omitted, the default query suite for the language of the analyzed database will be used. For the complete syntax of query specifiers, see "Specifying which queries to run in a CodeQL pack."
- --sarif-category: an identifying category for the results. Used when you want to upload more than one set of results for a commit. For example, when you use github upload-results to send results for more than one language to the GitHub code scanning API. For more information about this use case, see "Configuring CodeQL CLI in your CI system."
- --sarif-add-query-help: (supported in version 2.7.1 onwards) adds any custom query help written in markdown to SARIF files (v2.1.0 or later) generated by the analysis. Query help stored in .qhelp files must be converted to .md before running the analysis. For further information, see "Including query help for custom CodeQL queries in SARIF files."
- --download: a boolean flag that will allow the CLI to download any referenced CodeQL packages that are not available locally. If this flag is missing and a referenced CodeQL package is not available locally, the command will fail.
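To see how the required and optional arguments fit together, here is a minimal sketch in Python that assembles a codeql database analyze argument list from the options described above. The function name and its parameters are hypothetical. Only the flags themselves come from this article.

```python
from typing import List, Optional, Sequence


def analyze_command(database: str, fmt: str, output: str,
                    query_specifiers: Sequence[str] = (),
                    download: bool = False,
                    sarif_category: Optional[str] = None) -> List[str]:
    """Assemble a `codeql database analyze` argument list."""
    # Required: the database path, the results format, and the output path.
    cmd = ["codeql", "database", "analyze", database,
           f"--format={fmt}", f"--output={output}"]
    if download:
        # Allow the CLI to fetch referenced packs that are missing locally.
        cmd.append("--download")
    if sarif_category is not None:
        cmd.append(f"--sarif-category={sarif_category}")
    # With no specifiers, the CLI uses the default suite for the database language.
    cmd.extend(query_specifiers)
    return cmd


if __name__ == "__main__":
    # Hypothetical database and output paths.
    print(" ".join(analyze_command(
        "my-js-db", "sarif-latest", "results/js.sarif",
        ["codeql/javascript-queries"], download=True,
        sarif_category="javascript")))
```

The resulting list could be passed to subprocess.run(cmd, check=True) so that a failed analysis raises an exception instead of being silently ignored.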
Upgrading databases
For databases that were created by CodeQL CLI v2.3.3 or earlier, you will need to explicitly upgrade the database before you can run an analysis with a newer version of the CodeQL CLI. If this step is necessary, then you will see a message telling you that your database needs to be upgraded when you run database analyze.
For databases that were created by CodeQL CLI v2.3.4 or later, the CLI will implicitly run any required upgrades. Explicitly running the upgrade command is not necessary.
For full details of all the options you can use when analyzing databases, see "database analyze."
Specifying which queries to run in a CodeQL pack
Query specifiers are used by codeql database analyze and other commands that operate on a set of queries.
The complete form of a query specifier is scope/name@range:path, where:
- scope/name is the qualified name of a CodeQL pack.
- range is a semver range.
- path is a file system path to a single query, a directory containing queries, or a query suite file.
When you specify a scope/name, the range and path are optional. If you omit a range then the latest version of the specified pack is used. If you omit a path then the default query suite of the specified pack is used.
The path can be one of: a .ql query file, a directory containing one or more queries, or a .qls query suite file. If you omit a pack name, then you must provide a path, which will be interpreted relative to the working directory of the current process. Glob patterns are not supported.
If you specify both a scope/name and path, then the path cannot be absolute. It is considered relative to the root of the CodeQL pack.
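The scope/name@range:path form can be illustrated with a small parser. The following Python sketch splits a specifier into its three parts. It is not how the CodeQL CLI resolves specifiers: the heuristic for telling a pack name from a bare path (exactly one slash, no leading dot, no .ql or .qls extension) is an assumption made for this example, and the sketch never consults the file system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuerySpecifier:
    pack: Optional[str]           # "scope/name", or None for a bare path
    version_range: Optional[str]  # semver range, or None for "latest"
    path: Optional[str]           # query, directory, or suite, or None for the default suite


def parse_specifier(spec: str) -> QuerySpecifier:
    head, _, path = spec.partition(":")
    name, _, version_range = head.partition("@")
    # Assumed heuristic: a pack name looks like "scope/name" and is not a file path.
    looks_like_pack = (
        name.count("/") == 1
        and all(name.split("/"))
        and not name.startswith(".")
        and not name.endswith((".ql", ".qls"))
    )
    if not looks_like_pack:
        # No pack name: the whole specifier is a path relative to the working directory.
        return QuerySpecifier(None, None, spec)
    if path.startswith("/"):
        raise ValueError("a path inside a pack cannot be absolute")
    return QuerySpecifier(name, version_range or None, path or None)


if __name__ == "__main__":
    for example in ("codeql/python-queries",
                    "codeql/python-queries@1.2.3:Functions",
                    "suites/my-suite.qls"):
        print(example, "->", parse_specifier(example))
```

Because a relative directory such as Functions/Sub is indistinguishable from a pack name without looking at the disk, a real tool would need an additional check that this sketch leaves out.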
Example query specifiers
- codeql/python-queries - All the queries in the default query suite of the latest version of the codeql/python-queries pack.
- codeql/python-queries@1.2.3 - All the queries in the default query suite of version 1.2.3 of the codeql/python-queries pack.
- codeql/python-queries@~1.2.3 - All the queries in the default query suite of the latest version of the codeql/python-queries pack that is >=1.2.3 and <1.3.0.
- codeql/python-queries:Functions - All queries in the Functions directory in the latest version of the codeql/python-queries pack.
- codeql/python-queries@1.2.3:Functions - All queries in the Functions directory in version 1.2.3 of the codeql/python-queries pack.
- codeql/python-queries@1.2.3:codeql-suites/python-code-scanning.qls - All queries in the codeql-suites/python-code-scanning.qls query suite in version 1.2.3 of the codeql/python-queries pack.
- suites/my-suite.qls - All queries in the suites/my-suite.qls file relative to the current working directory.
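The tilde range in the third example can be made concrete with a short calculation. This Python sketch computes the bounds for a range such as ~1.2.3 and checks whether a version falls inside them. It is a simplified illustration that assumes plain three-part versions. Pre-release tags and other semver range operators are not handled.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse "MAJOR.MINOR.PATCH" into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def tilde_bounds(range_text: str) -> Tuple[Version, Version]:
    """Return (inclusive lower bound, exclusive upper bound) for a range like "~1.2.3"."""
    if not range_text.startswith("~"):
        raise ValueError("expected a tilde range such as ~1.2.3")
    lower = parse_version(range_text[1:])
    # A tilde range allows patch-level changes only, so the next minor version is excluded.
    upper = (lower[0], lower[1] + 1, 0)
    return lower, upper


def satisfies(version: str, range_text: str) -> bool:
    lower, upper = tilde_bounds(range_text)
    return lower <= parse_version(version) < upper


if __name__ == "__main__":
    for candidate in ("1.2.2", "1.2.3", "1.2.9", "1.3.0"):
        print(candidate, satisfies(candidate, "~1.2.3"))
```

Pinning to a tilde range picks up patch releases of a query pack while avoiding new minor versions.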
Tip
The default query suite of each standard CodeQL query pack is codeql-suites/<lang>-code-scanning.qls. Several other useful query suites can also be found in the codeql-suites directory of each pack. For example, the codeql/cpp-queries pack contains the following query suites:
- cpp-code-scanning.qls - Standard Code Scanning queries for C++. The default query suite for this pack.
- cpp-security-extended.qls - Queries from the default cpp-code-scanning.qls suite for C++, plus lower severity and precision queries.
- cpp-security-and-quality.qls - Queries from cpp-security-extended.qls, plus maintainability and reliability queries.
You can see the sources for these query suites in the CodeQL repository. Query suites for other languages are similar.
Examples of running database analyses
The following examples show how to run database analyze using CodeQL packs, and how to use a local checkout of the CodeQL repository. These examples assume your CodeQL databases have been created in a directory that is a sibling of your local copies of the CodeQL repository.
Running a single query
To run a single query over a CodeQL database for a JavaScript codebase, you could use the following command from the directory containing your database:
codeql database analyze --download <javascript-database> codeql/javascript-queries:Declarations/UnusedVariable.ql --format=csv --output=js-analysis/js-results.csv
This command runs a simple query that finds potential bugs related to unused variables, imports, functions, or classes—it is one of the JavaScript queries included in the CodeQL repository. You could run more than one query by specifying a space-separated list of similar paths.
The analysis generates a CSV file (js-results.csv) in a new directory (js-analysis).
Alternatively, if you have the CodeQL repository checked out, you can execute the same queries by specifying the path to the query directly:
codeql database analyze <javascript-database> ../ql/javascript/ql/src/Declarations/UnusedVariable.ql --format=csv --output=js-analysis/js-results.csv
You can also run your own custom queries with the database analyze command. For more information about preparing your queries to use with the CodeQL CLI, see "Using custom queries with the CodeQL CLI."
Running all queries in a directory
You can run all the queries located in a directory by providing the directory path, rather than listing all the individual query files. Paths are searched recursively, so any queries contained in subfolders will also be executed.
Important
You should avoid specifying the root of a core CodeQL query pack when executing database analyze as it might contain some special queries that aren’t designed to be used with the command. Rather, run the query pack to include the pack’s default queries in the analysis, or run one of the code scanning query suites.
For example, to execute all Python queries contained in the Functions directory in the codeql/python-queries query pack you would run:
codeql database analyze <python-database> codeql/python-queries:Functions --format=sarif-latest --output=python-analysis/python-results.sarif --download
Alternatively, if you have the CodeQL repository checked out, you can execute the same queries by specifying the path to the directory directly:
codeql database analyze <python-database> ../ql/python/ql/src/Functions/ --format=sarif-latest --output=python-analysis/python-results.sarif
When the analysis has finished, a SARIF results file is generated. Specifying --format=sarif-latest ensures that the results are formatted according to the most recent SARIF specification supported by CodeQL.
Running query suites
To run a query suite on a CodeQL database for a C/C++ codebase, you could use the following command from the directory containing your database:
codeql database analyze <cpp-database> codeql/cpp-queries:codeql-suites/cpp-code-scanning.qls --format=sarifv2.1.0 --output=cpp-results.sarif --download
This command downloads the codeql/cpp-queries CodeQL query pack, runs the analysis, and generates a file in the SARIF version 2.1.0 format that is supported by all versions of GitHub. This file can be uploaded to GitHub by executing codeql github upload-results or by using the code scanning API. For more information, see "Configuring CodeQL CLI in your CI system" or "Code Scanning".
CodeQL query suites are .qls files that use directives to select queries to run based on certain metadata properties. The standard CodeQL packs have metadata that specifies the location of the query suites used by code scanning, so the CodeQL CLI knows where to find these suite files automatically, and you don’t have to specify the full path on the command line.
For more information, including how to create custom query suites, see "Creating CodeQL query suites."
Diagnostic and summary information
When you create a CodeQL database, the extractor stores diagnostic data in the database. The code scanning query suites include additional queries to report on this diagnostic data and calculate summary metrics. When the database analyze command completes, the CLI generates the results file and reports any diagnostic and summary data to standard output. If you choose to generate SARIF output, the additional data is also included in the SARIF file.
If the analysis found fewer results for standard queries than you expected, review the results of the diagnostic and summary queries to check whether the CodeQL database is likely to be a good representation of the codebase that you want to analyze.
Integrating a CodeQL pack into a code scanning workflow in GitHub
You can use CodeQL query packs in your code scanning setup. This allows you to select query packs published by various sources and use them to analyze your code. For more information, see "Using CodeQL query packs in the CodeQL action" or "Downloading and using CodeQL query packs in your CI system."
Including query help for custom CodeQL queries in SARIF files
If you use the CodeQL CLI to run code scanning analyses on third party CI/CD systems, you can include the query help for your custom queries in SARIF files generated during an analysis. After uploading the SARIF file to GitHub, the query help is shown in the code scanning UI for any alerts generated by the custom queries.
From CodeQL CLI v2.7.1 onwards, you can include markdown-rendered query help in SARIF files by providing the --sarif-add-query-help option when running codeql database analyze. For more information, see "Configuring CodeQL CLI in your CI system."
You can write query help for custom queries directly in a markdown file and save it alongside the corresponding query. Alternatively, for consistency with the standard CodeQL queries, you can write query help in the .qhelp format. Query help written in .qhelp files can’t be included in SARIF files or processed by code scanning, so it must be converted to markdown before you run the analysis. For more information, see "Query help files" and "Testing query help files."
Results
You can save analysis results in a number of different formats, including SARIF and CSV.
The SARIF format is designed to represent the output of a broad range of static analysis tools. For more information, see SARIF output.
If you choose to generate results in CSV format, then each line in the output file corresponds to an alert. Each line is a comma-separated list with the following information.
Property | Description | Example |
---|---|---|
Name | Name of the query that identified the result. | Inefficient regular expression |
Description | Description of the query. | A regular expression that requires exponential time to match certain inputs can be a performance bottleneck, and may be vulnerable to denial-of-service attacks. |
Severity | Severity of the query. | error |
Message | Alert message. | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\\\'. |
Path | Path of the file containing the alert. | /vendor/codemirror/markdown.js |
Start line | Line of the file where the code that triggered the alert begins. | 617 |
Start column | Column of the start line that marks the start of the alert code. Not included when equal to 1. | 32 |
End line | Line of the file where the code that triggered the alert ends. Not included when the same value as the start line. | 64 |
End column | Where available, the column of the end line that marks the end of the alert code. Otherwise the end line is repeated. | 617 |
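A minimal sketch of reading such a CSV file with Python's csv module follows. It assumes the file has no header row and that the columns appear in the order listed in the table. The column keys and the sample row are hypothetical, made up for this illustration and not taken from a real analysis.

```python
import csv
import io
from typing import Dict, Iterable, List

# Assumed column order, following the table above.
COLUMNS = ["name", "description", "severity", "message", "path",
           "start_line", "start_column", "end_line", "end_column"]

# Hypothetical sample row; note the comma inside the quoted description.
SAMPLE = (
    '"Inefficient regular expression",'
    '"A regular expression that requires exponential time to match certain inputs '
    'can be a performance bottleneck, and may be vulnerable to denial-of-service attacks.",'
    '"error",'
    '"This part of the regular expression may cause exponential backtracking.",'
    '"/vendor/codemirror/markdown.js","617","32","617","64"\n'
)


def read_alerts(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Turn each non-empty CSV row into a dict keyed by the assumed column names."""
    return [dict(zip(COLUMNS, row)) for row in csv.reader(lines) if row]


if __name__ == "__main__":
    for alert in read_alerts(io.StringIO(SAMPLE)):
        print(f'{alert["severity"]}: {alert["name"]} at '
              f'{alert["path"]}:{alert["start_line"]}')
```

Using csv.reader, not a split on commas, matters here because descriptions and messages can themselves contain commas. To read a real results file, pass an open file object created with open(path, newline="") instead of the in-memory sample.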
Results files can be integrated into your own code-review or debugging infrastructure. For example, SARIF file output can be used to highlight alerts in the correct location in your source code using a SARIF viewer plugin for your IDE.
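As one possible starting point for such an integration, the sketch below walks a SARIF 2.1.0 document and yields one tuple per alert location. The sample document is a hypothetical, heavily trimmed fragment written for this example. Real SARIF files produced by an analysis contain many more fields. Defaulting a missing start line to 1 is also an assumption of this sketch.

```python
import json
from typing import Any, Dict, Iterator, Tuple

# Hypothetical, trimmed SARIF document for illustration.
SAMPLE_SARIF = """
{
  "version": "2.1.0",
  "runs": [{
    "results": [{
      "ruleId": "js/redos",
      "message": {"text": "This part of the regular expression may cause exponential backtracking."},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "vendor/codemirror/markdown.js"},
          "region": {"startLine": 617, "startColumn": 32}
        }
      }]
    }]
  }]
}
"""


def iter_alerts(sarif: Dict[str, Any]) -> Iterator[Tuple[str, str, int, str]]:
    """Yield (rule id, file URI, start line, message) for every result location."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            message = result.get("message", {}).get("text", "")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "")
                # Assumption: treat a missing start line as line 1.
                line = physical.get("region", {}).get("startLine", 1)
                yield result.get("ruleId", ""), uri, line, message


if __name__ == "__main__":
    for rule_id, uri, line, message in iter_alerts(json.loads(SAMPLE_SARIF)):
        print(f"{uri}:{line}: [{rule_id}] {message}")
```

The file:line: message output shape is one that many editors and review tools can already parse, which keeps the glue code small.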