Skip to main content

Configuring CodeQL CLI in your CI system

You can configure your continuous integration system to run the CodeQL CLI, perform CodeQL analysis, and upload the results to GitHub Enterprise Server for display as 代码扫描 alerts.

代码扫描 可用于 GitHub Enterprise Server 中的组织拥有的存储库。 此功能需要 GitHub Advanced Security 的许可证。 更多信息请参阅“GitHub 的产品”。

Note: Your site administrator must enable 代码扫描 for 您的 GitHub Enterprise Server 实例 before you can use this feature. For more information, see "Configuring 代码扫描 for your appliance."

Note: This article describes features present in the version of CodeQL CLI available at the time of the release of GitHub Enterprise Server. If your enterprise uses a more recent version of CodeQL CLI, see the GitHub Enterprise Cloud documentation instead.

About generating code scanning results with CodeQL CLI

Once you've made the CodeQL CLI available to servers in your CI system, and ensured that they can authenticate with GitHub Enterprise Server, you're ready to generate data.

You use three different commands to generate results and upload them to GitHub Enterprise Server:

  1. database create to create a CodeQL database to represent the hierarchical structure of each supported programming language in the repository.
  2. database analyze to run queries to analyze each CodeQL database and summarize the results in a SARIF file.
  3. github upload-results to upload the resulting SARIF files to GitHub Enterprise Server where the results are matched to a branch or pull request and displayed as 代码扫描 alerts.

You can display the command-line help for any command using the --help option.

Note: Uploading SARIF data to display as 代码扫描 results in GitHub Enterprise Server is supported for organization-owned repositories with GitHub Advanced Security enabled. For more information, see "Managing security and analysis settings for your repository."

Creating CodeQL databases to analyze

  1. Check out the code that you want to analyze:

    • For a branch, check out the head of the branch that you want to analyze.
    • For a pull request, check out either the head commit of the pull request, or check out a GitHub-generated merge commit of the pull request.
  2. Set up the environment for the codebase, making sure that any dependencies are available. For more information, see Creating databases for non-compiled languages and Creating databases for compiled languages in the documentation for the CodeQL CLI.

  3. Find the build command, if any, for the codebase. Typically this is available in a configuration file in the CI system.

  4. Run codeql database create from the checkout root of your repository and build the codebase.

    # Single supported language - create one CodeQL databsae
    codeql database create <database> --command<build> --language=<language-identifier>
    # Multiple supported languages - create one CodeQL database per language
    codeql database create <database> --command<build> \
          --db-cluster --language=<language-identifier>,<language-identifier>

    Note: If you use a containerized build, you need to run the CodeQL CLI inside the container where your build task takes place.

<database>Specify the name and location of a directory to create for the CodeQL database. The command will fail if you try to overwrite an existing directory. If you also specify --db-cluster, this is the parent directory and a subdirectory is created for each language analyzed.
--languageSpecify the identifier for the language to create a database for, one of: cpp`、`csharp`、`go`、`java`、`javascript`、 和 `python (use javascript to analyze TypeScript code). When used with --db-cluster, the option accepts a comma-separated list, or can be specified more than once.
--commandRecommended. Use to specify the build command or script that invokes the build process for the codebase. Commands are run from the current folder or, where it is defined, from --source-root. Not needed for Python and JavaScript/TypeScript analysis.
--db-clusterOptional. Use in multi-language codebases to generate one database for each language specified by --language.
--no-run-unnecessary-buildsRecommended. Use to suppress the build command for languages where the CodeQL CLI does not need to monitor the build (for example, Python and JavaScript/TypeScript).
--source-rootOptional. Use if you run the CLI outside the checkout root of the repository. By default, the database create command assumes that the current directory is the root directory for the source files, use this option to specify a different location.

For more information, see Creating CodeQL databases in the documentation for the CodeQL CLI.

Single language example

This example creates a CodeQL database for the repository checked out at /checkouts/example-repo. It uses the JavaScript extractor to create a hierarchical representation of the JavaScript and TypeScript code in the repository. The resulting database is stored in /codeql-dbs/example-repo.

$ codeql database create /codeql-dbs/example-repo --language=javascript \
    --source-root /checkouts/example-repo

> Initializing database at /codeql-dbs/example-repo.
> Running command [/codeql-home/codeql/javascript/tools/autobuild.cmd]
    in /checkouts/example-repo.
> [build-stdout] Single-threaded extraction.
> [build-stdout] Extracting
> Finalizing database at /codeql-dbs/example-repo.
> Successfully created database at /codeql-dbs/example-repo.

Multiple language example

This example creates two CodeQL databases for the repository checked out at /checkouts/example-repo-multi. It uses:

  • --db-cluster to request analysis of more than one language.
  • --language to specify which languages to create databases for.
  • --command to tell the tool the build command for the codebase, here make.
  • --no-run-unnecessary-builds to tell the tool to skip the build command for languages where it is not needed (like Python).

The resulting databases are stored in python and cpp subdirectories of /codeql-dbs/example-repo-multi.

$ codeql database create /codeql-dbs/example-repo-multi \
    --db-cluster --language python,cpp \
    --command make --no-run-unnecessary-builds \
    --source-root /checkouts/example-repo-multi
Initializing databases at /codeql-dbs/example-repo-multi.
Running build command: [make]
[build-stdout] Calling python3 /codeql-bundle/codeql/python/tools/
[build-stdout] Calling python3 -S /codeql-bundle/codeql/python/tools/ -v -z all -c /codeql-dbs/example-repo-multi/python/working/trap_cache -p ERROR: 'pip' not installed.
[build-stdout] /usr/local/lib/python3.6/dist-packages -R /checkouts/example-repo-multi
[build-stdout] [INFO] Python version 3.6.9
[build-stdout] [INFO] Python extractor version 5.16
[build-stdout] [INFO] [2] Extracted file /checkouts/example-repo-multi/ in 5ms
[build-stdout] [INFO] Processed 1 modules in 0.15s
[build-stdout] <output from calling 'make' to build the C/C++ code>
Finalizing databases at /codeql-dbs/example-repo-multi.
Successfully created databases at /codeql-dbs/example-repo-multi.

Analyzing a CodeQL database

  1. Create a CodeQL database (see above).
  2. Run codeql database analyze on the database and specify which queries to use.
    codeql database analyze <database> --format=<format> \
        --output=<output>  <queries>

Note: If you analyze more than one CodeQL database for a single commit, you must specify a SARIF category for each set of results generated by this command. When you upload the results to GitHub Enterprise Server, 代码扫描 uses this category to store the results for each language separately. If you forget to do this, each upload overwrites the previous results.

codeql database analyze <database> --format=<format> \
    --sarif-category=<language-specifier> --output=<output> \
<database>Specify the path for the directory that contains the CodeQL database to analyze.
<packs,queries>Specify CodeQL packs or queries to run. To run the standard queries used for 代码扫描, omit this parameter. To see the other query suites included in the CodeQL CLI bundle, look in /<extraction-root>/qlpacks/codeql/<language>-queries/codeql-suites. For information about creating your own query suite, see Creating CodeQL query suites in the documentation for the CodeQL CLI.
--formatSpecify the format for the results file generated by the command. For upload to GitHub this should be: sarifv2.1.0. For more information, see "SARIF support for 代码扫描."
--outputSpecify where to save the SARIF results file.
--sarif-categoryOptional for single database analysis. Required to define the language when you analyze multiple databases for a single commit in a repository. Specify a category to include in the SARIF results file for this analysis. A category is used to distinguish multiple analyses for the same tool and commit, but performed on different languages or different parts of the code.
--threadsOptional. Use if you want to use more than one thread to run queries. The default value is 1. You can specify more threads to speed up query execution. To set the number of threads to the number of logical processors, specify 0.
--verboseOptional. Use to get more detailed information about the analysis process and diagnostic data from the database creation process.

For more information, see Analyzing databases with the CodeQL CLI in the documentation for the CodeQL CLI.

Basic example

This example analyzes a CodeQL database stored at /codeql-dbs/example-repo and saves the results as a SARIF file: /temp/example-repo-js.sarif. It uses --sarif-category to include extra information in the SARIF file that identifies the results as JavaScript. This is essential when you have more than one CodeQL database to analyze for a single commit in a repository.

$ codeql database analyze /codeql-dbs/example-repo  \
    javascript-code-scanning.qls --sarif-category=javascript \
    --format=sarifv2.1.0 --output=/temp/example-repo-js.sarif

> Running queries.
> Compiling query plan for /codeql-home/codeql/qlpacks/codeql-javascript/AngularJS/DisablingSce.ql.
> Shutting down query evaluator.
> Interpreting results.

Uploading results to GitHub Enterprise Server


  • SAIF 上传支持每次上传最多 5000 个结果。 超过此限制的任何结果均被忽略。 如果工具产生太多结果,则应更新配置,以专注于最重要的规则或查询的结果。

  • 对于每次上传,SARIF 上传支持最大 10 MB 的 gzip压缩 SARIF 文件。 任何超过此限制的上传都将被拒绝。 如果 SARIF 文件由于包含太多结果而太大,则应更新配置以专注于最重要的规则或查询的结果。

Before you can upload results to GitHub Enterprise Server, you must determine the best way to pass the GitHub 应用程序 or personal access token you created earlier to the CodeQL CLI (see Installing CodeQL CLI in your CI system). We recommend that you review your CI system's guidance on the secure use of a secret store. The CodeQL CLI supports:

  • Passing the token to the CLI via standard input using the --github-auth-stdin option (recommended).
  • Saving the secret in the environment variable GITHUB_TOKEN and running the CLI without including the --github-auth-stdin option.

When you have decided on the most secure and reliable method for your CI server, run codeql github upload-results on each SARIF results file and include --github-auth-stdin unless the token is available in the environment variable GITHUB_TOKEN.

echo "$UPLOAD_TOKEN" | codeql github upload-results --repository=<repository-name> \
      --ref=<ref> --commit=<commit> --sarif=<file> \
      --github-url=<URL> --github-auth-stdin
--repositorySpecify the OWNER/NAME of the repository to upload data to. The owner must be an organization within an enterprise that has a license for GitHub Advanced Security and GitHub Advanced Security must be enabled for the repository. For more information, see "Managing security and analysis settings for your repository."
--refSpecify the name of the ref you checked out and analyzed so that the results can be matched to the correct code. For a branch use: refs/heads/BRANCH-NAME, for the head commit of a pull request use refs/pull/NUMBER/head, or for the GitHub-generated merge commit of a pull request use refs/pull/NUMBER/merge.
--commitSpecify the full SHA of the commit you analyzed.
--sarifSpecify the SARIF file to load.
--github-urlSpecify the URL for GitHub Enterprise Server.
--github-auth-stdinOptional. Use to pass the CLI the GitHub 应用程序 or personal access token created for authentication with GitHub's REST API via standard input. This is not needed if the command has access to a GITHUB_TOKEN environment variable set with this token.

For more information, see github upload-results in the documentation for the CodeQL CLI.

Basic example

This example uploads results from the SARIF file temp/example-repo-js.sarif to the repository my-org/example-repo. It tells the 代码扫描 API that the results are for the commit deb275d2d5fe9a522a0b7bd8b6b6a1c939552718 on the main branch.

$ echo $UPLOAD_TOKEN | codeql github upload-results --repository=my-org/example-repo \
    --ref=refs/heads/main --commit=deb275d2d5fe9a522a0b7bd8b6b6a1c939552718 \
    --sarif=/temp/example-repo-js.sarif --github-url= \

There is no output from this command unless the upload was unsuccessful. The command prompt returns when the upload is complete and data processing has begun. On smaller codebases, you should be able to explore the 代码扫描 alerts in GitHub Enterprise Server shortly afterward. You can see alerts directly in the pull request or on the Security tab for branches, depending on the code you checked out. For more information, see "Triaging 代码扫描 alerts in pull requests" and "Managing 代码扫描 alerts for your repository."

Example CI configuration for CodeQL analysis

This is an example of the series of commands that you might use to analyze a codebase with two supported languages and then upload the results to GitHub Enterprise Server.

# Create CodeQL databases for Java and Python in the 'codeql-dbs' directory
# Call the normal build script for the codebase: 'myBuildScript'

codeql database create codeql-dbs --source-root=src \
    --db-cluster --language=java,python --command=./myBuildScript

# Analyze the CodeQL database for Java, 'codeql-dbs/java'
# Tag the data as 'java' results and store in: 'java-results.sarif'

codeql database analyze codeql-dbs/java java-code-scanning.qls \
    --format=sarif-latest --sarif-category=java --output=java-results.sarif

# Analyze the CodeQL database for Python, 'codeql-dbs/python'
# Tag the data as 'python' results and store in: 'python-results.sarif'

codeql database analyze codeql-dbs/python python-code-scanning.qls \
    --format=sarif-latest --sarif-category=python --output=python-results.sarif

# Upload the SARIF file with the Java results: 'java-results.sarif'

echo $UPLOAD_TOKEN | codeql github upload-results --repository=my-org/example-repo \
    --ref=refs/heads/main --commit=deb275d2d5fe9a522a0b7bd8b6b6a1c939552718 \
    --sarif=java-results.sarif --github-auth-stdin

# Upload the SARIF file with the Python results: 'python-results.sarif'

echo $UPLOAD_TOKEN | codeql github upload-results --repository=my-org/example-repo \
    --ref=refs/heads/main --commit=deb275d2d5fe9a522a0b7bd8b6b6a1c939552718 \
    --sarif=python-results.sarif --github-auth-stdin

Troubleshooting the CodeQL CLI in your CI system

Viewing log and diagnostic information

When you analyze a CodeQL database using a 代码扫描 query suite, in addition to generating detailed information about alerts, the CLI reports diagnostic data from the database generation step and summary metrics. For repositories with few alerts, you may find this information useful for determining if there are genuinely few problems in the code, or if there were errors generating the CodeQL database. For more detailed output from codeql database analyze, use the --verbose option.

For more information about the type of diagnostic information available, see "Viewing 代码扫描 logs".

代码扫描 only shows analysis results from one of the analyzed languages

By default, 代码扫描 expects one SARIF results file per analysis for a repository. Consequently, when you upload a second SARIF results file for a commit, it is treated as a replacement for the original set of data.

If you want to upload more than one set of results to the 代码扫描 API for a commit in a repository, you must identify each set of results as a unique set. For repositories where you create more than one CodeQL database to analyze for each commit, use the --sarif-category option to specify a language or other unique category for each SARIF file that you generate for that repository.

Further reading