SARIF support for code scanning

To display results from a third-party static analysis tool in your repository on GitHub, you'll need your results stored in a SARIF file that supports a specific subset of the SARIF 2.1.0 JSON schema for code scanning. If you use the default CodeQL static analysis engine, then your results will display in your repository on GitHub automatically.

Code scanning is available for organization-owned repositories in GitHub AE. This is a GitHub Advanced Security feature (free during the beta release). For more information, see "GitHub's products."

About SARIF support

SARIF (Static Analysis Results Interchange Format) is an OASIS Standard that defines an output file format. The SARIF standard is used to streamline how static analysis tools share their results. Code scanning supports a subset of the SARIF 2.1.0 JSON schema.

To upload a SARIF file from a third-party static code analysis engine, you'll need to ensure that uploaded files use the SARIF 2.1.0 version. GitHub will parse the SARIF file and show alerts using the results in your repository as a part of the code scanning experience. For more information, see "Uploading a SARIF file to GitHub." For more information about the SARIF 2.1.0 JSON schema, see sarif-schema-2.1.0.json.

If you're using GitHub Actions with the CodeQL analysis workflow or using the CodeQL CLI, then the code scanning results will automatically use the supported subset of SARIF 2.1.0. For more information, see "Setting up code scanning for a repository" or "Installing CodeQL CLI in your CI system."

You can upload multiple SARIF files for the same commit, and display the data from each file as code scanning results. When you upload multiple SARIF files for a commit, you must indicate a "category" for each analysis. The way to specify a category varies according to the analysis method:

  • Using the CodeQL CLI directly, pass the --sarif-category argument to the codeql database analyze command when you generate SARIF files. For more information, see "Configuring CodeQL CLI in your CI system."
  • Using GitHub Actions with codeql-action/analyze, the category is set automatically from the workflow name and any matrix variables (typically, language). You can override this by specifying a category input for the action, which is useful when you analyze different sections of a mono-repository in a single workflow.
  • Using GitHub Actions to upload results from other static analysis tools, you must specify a category input if you upload more than one file of results for the same tool in one workflow. For more information, see "Uploading a code scanning analysis with GitHub Actions."
  • If you are not using any of these approaches, you must specify a unique runAutomationDetails.id in each SARIF file to upload. For more information about this property, see runAutomationDetails object below.

If you upload a second SARIF file for a commit with the same category and from the same tool, the earlier results are overwritten. However, if you try to upload multiple SARIF files for the same tool and category in a single GitHub Actions workflow run, the misconfiguration is detected and the run will fail.
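For tools that fall under the last bullet, the category has to be written into each SARIF file before upload. The following is a minimal sketch of that step, assuming the file has already been loaded into a Python dict. The helper name with_category is hypothetical. It writes the run's automationDetails property (the JSON name used in the full example later in this article) and uses a trailing slash so that the whole id is read as a category with no run ID.

```python
import copy


def with_category(sarif_log: dict, category: str) -> dict:
    """Return a copy of sarif_log in which every run carries the given category.

    Hypothetical helper: any existing automationDetails on a run is replaced.
    """
    tagged = copy.deepcopy(sarif_log)
    for run in tagged.get("runs", []):
        # A trailing slash means "category only, no run ID".
        run["automationDetails"] = {"id": category.rstrip("/") + "/"}
    return tagged


frontend = with_category({"version": "2.1.0", "runs": [{"results": []}]}, "frontend")
print(frontend["runs"][0]["automationDetails"]["id"])  # frontend/
```

Working on a deep copy keeps the original log untouched, which is convenient when the same analysis output is tagged with several categories.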

GitHub uses properties in the SARIF file to display alerts. For example, the shortDescription and fullDescription appear at the top of a code scanning alert. The location allows GitHub to show annotations in your code file. For more information, see "Managing code scanning alerts for your repository."

If you're new to SARIF and want to learn more, see Microsoft's SARIF tutorials repository.

Preventing duplicate alerts using fingerprints

Each time the results of a new code scan are uploaded, the results are processed and alerts are added to the repository. To prevent duplicate alerts for the same problem, code scanning uses fingerprints to match results across various runs so they only appear once in the latest run for the selected branch. This makes it possible to match alerts to the right line of code when files are edited.

GitHub uses the partialFingerprints property in the OASIS standard to detect when two results are logically identical. For more information, see the "partialFingerprints property" entry in the OASIS documentation.

SARIF files created by the CodeQL analysis workflow or using the CodeQL CLI include fingerprint data. If you upload a SARIF file using the upload-sarif action and this data is missing, GitHub attempts to populate the partialFingerprints field from the source files. For more information about uploading results, see "Uploading a SARIF file to GitHub."

If you upload a SARIF file without fingerprint data using the /code-scanning/sarifs API endpoint, the code scanning alerts will be processed and displayed, but users may see duplicate alerts. To avoid seeing duplicate alerts, you should calculate fingerprint data and populate the partialFingerprints property before you upload the SARIF file. You may find the script that the upload-sarif action uses a helpful starting point: https://github.com/github/codeql-action/blob/main/src/fingerprints.ts. For more information about the API, see "Upload an analysis as SARIF data."
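Before uploading through the API, it can help to find out which results still lack fingerprint data. The sketch below only detects the gap; it does not compute fingerprints and is not the algorithm used by the upload-sarif action. The function name results_missing_fingerprint is hypothetical, and the check looks for the primaryLocationLineHash key described in the result object table later in this article.

```python
def results_missing_fingerprint(sarif_log: dict) -> list[tuple[int, int]]:
    """Return (run index, result index) pairs that lack primaryLocationLineHash."""
    missing = []
    for run_idx, run in enumerate(sarif_log.get("runs", [])):
        for res_idx, result in enumerate(run.get("results", [])):
            fingerprints = result.get("partialFingerprints") or {}
            if "primaryLocationLineHash" not in fingerprints:
                missing.append((run_idx, res_idx))
    return missing


log = {
    "runs": [
        {
            "results": [
                {"partialFingerprints": {"primaryLocationLineHash": "39fa2ee980eb94b0:1"}},
                {"message": {"text": "No fingerprint here."}},
            ]
        }
    ]
}
print(results_missing_fingerprint(log))  # [(0, 1)]
```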

Understanding rules and results

SARIF files support both rules and results. The information stored in these elements is similar but serves different purposes.

  • Rules are an array of reportingDescriptor objects that are included in the toolComponent object. This is where you store details of the rules that are run during analysis. Information in these objects should change infrequently, typically when you update the tool.

  • Results are stored as a series of result objects under results in the run object. Each result object contains details for one alert in the codebase. Within the result object, you can reference the rule that detected the alert.

When you compare SARIF files generated by analyzing different codebases with the same tool and rules, you should see differences in the results of the analyses but not in the rules.
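The link from a result back to its rule can be sketched as a small lookup. This is an illustration under assumptions, not GitHub's resolution logic: the hypothetical rule_for_result helper tries ruleIndex first and falls back to matching ruleId against the id of each rule in tool.driver.rules.

```python
from typing import Optional


def rule_for_result(run: dict, result: dict) -> Optional[dict]:
    """Find the reportingDescriptor that a result refers to, if any."""
    rules = run.get("tool", {}).get("driver", {}).get("rules", [])

    # Prefer the positional reference when it is present and in range.
    index = result.get("ruleIndex")
    if isinstance(index, int) and 0 <= index < len(rules):
        return rules[index]

    # Otherwise fall back to matching on the rule identifier.
    rule_id = result.get("ruleId")
    if rule_id is None:
        return None
    for rule in rules:
        if rule.get("id") == rule_id:
            return rule
    return None


run = {
    "tool": {"driver": {"name": "Tool Name", "rules": [{"id": "R01"}, {"id": "R02"}]}},
    "results": [{"ruleId": "R02", "message": {"text": "Example."}}],
}
print(rule_for_result(run, run["results"][0]))  # {'id': 'R02'}
```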

Validating your SARIF file

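You can check that a SARIF file is compatible with code scanning by testing it against the GitHub ingestion rules. For more information, visit the Microsoft SARIF validator.

Notes:

  • SARIF upload supports a maximum of 5000 results per upload. Any results over this limit are ignored. If a tool generates too many results, you should update the configuration to focus on results for the most important rules or queries.

  • For each upload, SARIF upload supports a maximum size of 10 MB for the gzip-compressed SARIF file. Any uploads over this limit will be rejected. If your SARIF file is too large because it contains too many results, you should update the configuration to focus on results for the most important rules or queries.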
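The two upload limits in the notes above can be checked locally before an upload is attempted. The following is a rough sketch, assuming the SARIF log is already loaded as a dict. The function name check_upload_limits is hypothetical, the reading of "10 MB" as 10 × 1024 × 1024 bytes is an assumption, and the compressed size measured here may differ slightly from the size of the payload you actually send.

```python
import gzip
import json

MAX_RESULTS = 5000
MAX_GZIP_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def check_upload_limits(sarif_log: dict) -> list[str]:
    """Return human-readable problems; an empty list means both limits are met."""
    problems = []

    # The result limit applies to the whole upload, so count across all runs.
    total = sum(len(run.get("results", [])) for run in sarif_log.get("runs", []))
    if total > MAX_RESULTS:
        problems.append(f"{total} results exceeds the limit of {MAX_RESULTS}")

    compressed = gzip.compress(json.dumps(sarif_log).encode("utf-8"))
    if len(compressed) > MAX_GZIP_BYTES:
        problems.append(f"{len(compressed)} compressed bytes exceeds {MAX_GZIP_BYTES}")

    return problems


small = {"version": "2.1.0", "runs": [{"results": [{"ruleId": "R01"}]}]}
print(check_upload_limits(small))  # []
```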

Supported SARIF output file properties

If you use a code analysis engine other than CodeQL, you can review the supported SARIF properties to optimize how your analysis results will appear on GitHub.

Any valid SARIF 2.1.0 output file can be uploaded; however, code scanning will only use the following supported properties.

sarifLog object

| Name | Description |
|---|---|
| $schema | Required. The URI of the SARIF JSON schema for version 2.1.0. For example, https://json.schemastore.org/sarif-2.1.0.json. |
| version | Required. Code scanning only supports SARIF version 2.1.0. |
| runs[] | Required. A SARIF file contains an array of one or more runs. Each run represents a single run of an analysis tool. For more information about a run, see the run object. |

run object

Code scanning uses the run object to filter results by tool and provide information about the source of a result. The run object contains the tool.driver tool component object, which contains information about the tool that generated the results. Each run can only have results for one analysis tool.

| Name | Description |
|---|---|
| tool.driver | Required. A toolComponent object that describes the analysis tool. For more information, see the toolComponent object. |
| tool.extensions[] | Optional. An array of toolComponent objects that represent any plugins or extensions used by the tool during analysis. For more information, see the toolComponent object. |
| results[] | Required. The results of the analysis tool. Code scanning displays the results on GitHub. For more information, see the result object. |

toolComponent object

| Name | Description |
|---|---|
| name | Required. The name of the analysis tool. Code scanning displays the name on GitHub to allow you to filter results by tool. |
| version | Optional. The version of the analysis tool. Code scanning uses the version number to track when results may have changed due to a tool version change rather than a change in the code being analyzed. If the SARIF file includes the semanticVersion field, version is not used by code scanning. |
| semanticVersion | Optional. The version of the analysis tool, specified by the Semantic Versioning 2.0 format. Code scanning uses the version number to track when results may have changed due to a tool version change rather than a change in the code being analyzed. If the SARIF file includes the semanticVersion field, version is not used by code scanning. For more information, see "Semantic Versioning 2.0.0" in the Semantic Versioning documentation. |
| rules[] | Required. An array of reportingDescriptor objects that represent rules. The analysis tool uses rules to find problems in the code being analyzed. For more information, see the reportingDescriptor object. |
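The properties marked Required in the sarifLog, run, and toolComponent tables above lend themselves to a quick presence check. The sketch below is a partial illustration only: it tests that the keys exist, not that their values are valid, and it is no substitute for the SARIF validator. The function name missing_required is hypothetical.

```python
def missing_required(sarif_log: dict) -> list[str]:
    """List the required sarifLog, run, and toolComponent properties that are absent."""
    missing = [key for key in ("$schema", "version", "runs") if key not in sarif_log]

    for i, run in enumerate(sarif_log.get("runs", [])):
        driver = run.get("tool", {}).get("driver")
        if driver is None:
            missing.append(f"runs[{i}].tool.driver")
        else:
            for key in ("name", "rules"):
                if key not in driver:
                    missing.append(f"runs[{i}].tool.driver.{key}")
        if "results" not in run:
            missing.append(f"runs[{i}].results")

    return missing


incomplete = {"version": "2.1.0", "runs": [{"tool": {"driver": {"name": "Tool Name"}}}]}
print(missing_required(incomplete))
# ['$schema', 'runs[0].tool.driver.rules', 'runs[0].results']
```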

reportingDescriptor object

This is where you store details of the rules that are run during analysis. Information in these objects should change infrequently, typically when you update the tool. For more information, see "Understanding rules and results" above.

| Name | Description |
|---|---|
| id | Required. A unique identifier for the rule. The id is referenced from other parts of the SARIF file and may be used by code scanning to display URLs on GitHub. |
| name | Optional. The name of the rule. Code scanning displays the name to allow results to be filtered by rule on GitHub. |
| shortDescription.text | Required. A concise description of the rule. Code scanning displays the short description on GitHub next to the associated results. |
| fullDescription.text | Required. A description of the rule. Code scanning displays the full description on GitHub next to the associated results. The maximum number of characters is 1000. |
| defaultConfiguration.level | Optional. Default severity level of the rule. Code scanning uses severity levels to help you understand how critical the result is for a given rule. This value can be overridden by the level attribute in the result object. For more information, see the result object. Default: warning. |
| help.text | Required. Documentation for the rule using text format. Code scanning displays this help documentation next to the associated results. |
| help.markdown | Recommended. Documentation for the rule using Markdown format. Code scanning displays this help documentation next to the associated results. When help.markdown is available, it is displayed instead of help.text. |
| properties.tags[] | Optional. An array of strings. Code scanning uses tags to allow you to filter results on GitHub. For example, it is possible to filter to all results that have the tag security. |
| properties.precision | Recommended. A string that indicates how often the results indicated by this rule are true. For example, if a rule has a known high false-positive rate, the precision should be low. Code scanning orders results by precision on GitHub so that the results with the highest level and highest precision are shown first. Can be one of: very-high, high, medium, or low. |
| properties.problem.severity | Recommended. A string that indicates the level of severity of any alerts generated by a non-security query. This, with the properties.precision property, determines whether the results are displayed by default on GitHub so that the results with the highest problem.severity and highest precision are shown first. Can be one of: error, warning, or recommendation. |
| properties.security-severity | Recommended. A string representing a score that indicates the level of severity, between 0.0 and 10.0, for security queries (@tags includes security). This, with the properties.precision property, determines whether the results are displayed by default on GitHub so that the results with the highest security-severity and highest precision are shown first. Code scanning translates numerical scores as follows: over 9.0 is critical, 7.0 to 8.9 is high, 4.0 to 6.9 is medium and 3.9 or less is low. |
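The score-to-label translation described for security-severity can be sketched as a small function. This is an illustration, not GitHub's implementation. The function name security_severity_label is hypothetical, and because the stated ranges leave a score of exactly 9.0 unassigned, the sketch assumes it is treated as critical.

```python
def security_severity_label(score: str) -> str:
    """Translate a security-severity score string such as "9.8" into a label."""
    value = float(score)
    if not 0.0 <= value <= 10.0:
        raise ValueError(f"security-severity must be between 0.0 and 10.0, got {score!r}")
    if value >= 9.0:  # assumption: exactly 9.0 counts as critical
        return "critical"
    if value >= 7.0:
        return "high"
    if value >= 4.0:
        return "medium"
    return "low"


print(security_severity_label("9.8"))  # critical
print(security_severity_label("3.9"))  # low
```

The score is accepted as a string because the property is stored as a string in the SARIF file, as in "security-severity" : "9.8".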

result object

Each result object contains details for one alert in the codebase. Within the result object, you can reference the rule that detected the alert. For more information, see "Understanding rules and results" above.

| Name | Description |
|---|---|
| ruleId | Optional. The unique identifier of the rule (reportingDescriptor.id). For more information, see the reportingDescriptor object. Code scanning uses the rule identifier to filter results by rule on GitHub. |
| ruleIndex | Optional. The index of the associated rule (reportingDescriptor object) in the tool component rules array. For more information, see the run object. The allowed range for this property is 0 to 2^63 - 1. |
| rule | Optional. A reference used to locate the rule (reporting descriptor) for this result. For more information, see the reportingDescriptor object. |
| level | Optional. The severity of the result. This level overrides the default severity defined by the rule. Code scanning uses the level to filter results by severity on GitHub. |
| message.text | Required. A message that describes the result. Code scanning displays the message text as the title of the result. Only the first sentence of the message will be displayed when visible space is limited. |
| locations[] | Required. The set of locations where the result was detected, up to a maximum of 10. Only one location should be included unless the problem can only be corrected by making a change at every specified location. Note: At least one location is required for code scanning to display a result. Code scanning will use this property to decide which file to annotate with the result. Only the first value of this array is used. All other values are ignored. |
| partialFingerprints | Required. A set of strings used to track the unique identity of the result. Code scanning uses partialFingerprints to accurately identify which results are the same across commits and branches. Code scanning will attempt to use partialFingerprints if they exist. If you are uploading third-party SARIF files with the upload-sarif action, the action will create partialFingerprints for you when they are not included in the SARIF file. For more information, see "Preventing duplicate alerts using fingerprints." Note: Code scanning only uses the primaryLocationLineHash. |
| codeFlows[].threadFlows[].locations[] | Optional. An array of location objects for a threadFlow object, which describes the progress of a program through a thread of execution. A codeFlow object describes a pattern of code execution used to detect a result. If code flows are provided, code scanning will expand code flows on GitHub for the relevant result. For more information, see the location object. |
| relatedLocations[] | A set of locations relevant to this result. Code scanning will link to related locations when they are embedded in the result message. For more information, see the location object. |
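The override relationship between a result's level and its rule's defaultConfiguration.level can be sketched as follows. This is an illustration only. The function name effective_level is hypothetical; the fallback to warning follows the default stated in the reportingDescriptor table.

```python
from typing import Optional


def effective_level(result: dict, rule: Optional[dict]) -> str:
    """Return the severity level that applies to a result."""
    # A level on the result overrides whatever the rule declares.
    if "level" in result:
        return result["level"]
    # Otherwise use the rule's default level when the rule declares one.
    if rule is not None:
        level = rule.get("defaultConfiguration", {}).get("level")
        if level:
            return level
    # Neither is set: fall back to the documented default.
    return "warning"


rule = {"id": "R01", "defaultConfiguration": {"level": "note"}}
print(effective_level({"ruleId": "R01"}, rule))                    # note
print(effective_level({"ruleId": "R01", "level": "error"}, rule))  # error
print(effective_level({"ruleId": "R02"}, None))                    # warning
```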

location object

A location within a programming artifact, such as a file in the repository or a file that was generated during a build.

| Name | Description |
|---|---|
| location.id | Optional. A unique identifier that distinguishes this location from all other locations within a single result object. The allowed range for this property is 0 to 2^63 - 1. |
| location.physicalLocation | Required. Identifies the artifact and region. For more information, see the physicalLocation object. |
| location.message.text | Optional. A message relevant to the location. |

physicalLocation object

| Name | Description |
|---|---|
| artifactLocation.uri | Required. A URI indicating the location of an artifact, usually a file either in the repository or generated during a build. If the URI is relative, it should be relative to the root of the GitHub repository being analyzed. For example, main.js or src/script.js are relative to the root of the repository. If the URI is absolute, code scanning can use the URI to check out the artifact and match up files in the repository. For example, https://github.com/ghost/example/blob/00/src/promiseUtils.js. |
| region.startLine | Required. The line number of the first character in the region. |
| region.startColumn | Required. The column number of the first character in the region. |
| region.endLine | Required. The line number of the last character in the region. |
| region.endColumn | Required. The column number of the character following the end of the region. |
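Because only the first entry of locations[] is used to annotate a file, a tool author may want to confirm what that entry resolves to. The sketch below extracts the file URI and start line from the first location. The function name primary_location is hypothetical, and returning None for incomplete data is a choice made for this illustration.

```python
from typing import Optional


def primary_location(result: dict) -> Optional[tuple[str, int]]:
    """Return (uri, startLine) from the first location of a result, if available."""
    locations = result.get("locations") or []
    if not locations:
        return None
    physical = locations[0].get("physicalLocation", {})
    uri = physical.get("artifactLocation", {}).get("uri")
    start_line = physical.get("region", {}).get("startLine")
    if uri is None or start_line is None:
        return None
    return uri, start_line


result = {
    "locations": [
        {
            "physicalLocation": {
                "artifactLocation": {"uri": "main.js", "uriBaseId": "%SRCROOT%"},
                "region": {"startLine": 2, "startColumn": 7, "endColumn": 10},
            }
        }
    ]
}
print(primary_location(result))  # ('main.js', 2)
```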

runAutomationDetails object

The runAutomationDetails object contains information that specifies the identity of a run.

Note: runAutomationDetails is a SARIF v2.1.0 object. If you're using the CodeQL CLI, you can specify the version of SARIF to use. The equivalent object to runAutomationDetails is <run>.automationId for SARIF v1 and <run>.automationLogicalId for SARIF v2.

| Name | Description |
|---|---|
| id | Optional. A string that identifies the category of the analysis and the run ID. Use if you want to upload multiple SARIF files for the same tool and commit, but performed on different languages or different parts of the code. |

The use of the runAutomationDetails object is optional.

The id field can include an analysis category and a run ID. We don't use the run ID part of the id field, but we store it.

Use the category to distinguish between multiple analyses for the same tool and commit, but performed on different languages or different parts of the code. Use the run ID to identify the specific run of the analysis, such as the date the analysis was run.

id is interpreted as category/run-id. If the id contains no forward slash (/), then the entire string is the run_id and the category is empty. Otherwise, category is everything in the string until the last forward slash, and run_id is everything after.

| id | category | run_id |
|---|---|---|
| my-analysis/tool1/2021-02-01 | my-analysis/tool1 | 2021-02-01 |
| my-analysis/tool1/ | my-analysis/tool1 | no run-id |
| my-analysis for tool1 | no category | my-analysis for tool1 |

  • The run with an id of "my-analysis/tool1/2021-02-01" belongs to the category "my-analysis/tool1". Presumably, this is the run from February 1, 2021.
  • The run with an id of "my-analysis/tool1/" belongs to the category "my-analysis/tool1" but is not distinguished from other runs in that category.
  • The run whose id is "my-analysis for tool1" has a unique identifier but cannot be inferred to belong to any category.

For more information about the runAutomationDetails object and the id field, see runAutomationDetails object in the OASIS documentation.
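The category/run-id interpretation of id described above amounts to splitting the string at its last forward slash. A minimal sketch of that rule follows. The function name split_automation_id is hypothetical, and empty strings stand in for "no category" and "no run-id".

```python
def split_automation_id(automation_id: str) -> tuple[str, str]:
    """Split an id into (category, run_id) at the last forward slash."""
    # rpartition returns ("", "", whole_string) when there is no slash,
    # which matches the rule: empty category, whole string as run_id.
    category, _, run_id = automation_id.rpartition("/")
    return category, run_id


print(split_automation_id("my-analysis/tool1/2021-02-01"))  # ('my-analysis/tool1', '2021-02-01')
print(split_automation_id("my-analysis/tool1/"))            # ('my-analysis/tool1', '')
print(split_automation_id("my-analysis for tool1"))         # ('', 'my-analysis for tool1')
```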

Note that the rest of the supported fields are ignored.

SARIF output file examples

These example SARIF output files show supported properties and example values.

Example with minimum required properties

This SARIF output file has example values to show the minimum required properties for code scanning results to work as expected. If you remove any properties or don't include values, this data will not be displayed correctly or sync on GitHub.

{
  "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "Tool Name",
          "rules": [
            {
              "id": "R01",
                      ...
              "properties" : {
                 "id" : "java/unsafe-deserialization",
                 "kind" : "path-problem",
                 "name" : "...",
                 "problem.severity" : "error",
                 "security-severity" : "9.8"
               }
            }
          ]
        }
      },
      "results": [
        {
          "ruleId": "R01",
          "message": {
            "text": "Result text. This result does not have a rule associated."
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "fileURI"
                },
                "region": {
                  "startLine": 2,
                  "startColumn": 7,
                  "endColumn": 10
                }
              }
            }
          ],
          "partialFingerprints": {
            "primaryLocationLineHash": "39fa2ee980eb94b0:1"
          }
        }
      ]
    }
  ]
}

Example showing all supported SARIF properties

This SARIF output file has example values to show all supported SARIF properties for code scanning.

{
  "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "Tool Name",
          "semanticVersion": "2.0.0",
          "rules": [
            {
              "id": "3f292041e51d22005ce48f39df3585d44ce1b0ad",
              "name": "js/unused-local-variable",
              "shortDescription": {
                "text": "Unused variable, import, function or class"
              },
              "fullDescription": {
                "text": "Unused variables, imports, functions or classes may be a symptom of a bug and should be examined carefully."
              },
              "defaultConfiguration": {
                "level": "note"
              },
              "properties": {
                "tags": [
                  "maintainability"
                ],
                "precision": "very-high"
              }
            },
            {
              "id": "d5b664aefd5ca4b21b52fdc1d744d7d6ab6886d0",
              "name": "js/inconsistent-use-of-new",
              "shortDescription": {
                "text": "Inconsistent use of 'new'"
              },
              "fullDescription": {
                "text": "If a function is intended to be a constructor, it should always be invoked with 'new'. Otherwise, it should always be invoked as a normal function, that is, without 'new'."
              },
              "properties": {
                "tags": [
                  "reliability",
                  "correctness",
                  "language-features"
                ],
                "precision": "very-high"
              }
            },
            {
              "id": "R01"
            }
          ]
        }
      },
      "automationDetails": {
        "id": "my-category/"
      },
      "results": [
        {
          "ruleId": "3f292041e51d22005ce48f39df3585d44ce1b0ad",
          "ruleIndex": 0,
          "message": {
            "text": "Unused variable foo."
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "main.js",
                  "uriBaseId": "%SRCROOT%"
                },
                "region": {
                  "startLine": 2,
                  "startColumn": 7,
                  "endColumn": 10
                }
              }
            }
          ],
          "partialFingerprints": {
            "primaryLocationLineHash": "39fa2ee980eb94b0:1",
            "primaryLocationStartColumnFingerprint": "4"
          }
        },
        {
          "ruleId": "d5b664aefd5ca4b21b52fdc1d744d7d6ab6886d0",
          "ruleIndex": 1,
          "message": {
            "text": "Function resolvingPromise is sometimes invoked as a constructor (for example [here](1)), and sometimes as a normal function (for example [here](2))."
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "src/promises.js",
                  "uriBaseId": "%SRCROOT%"
                },
                "region": {
                  "startLine": 2
                }
              }
            }
          ],
          "partialFingerprints": {
            "primaryLocationLineHash": "5061c3315a741b7d:1",
            "primaryLocationStartColumnFingerprint": "7"
          },
          "relatedLocations": [
            {
              "id": 1,
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "src/ParseObject.js",
                  "uriBaseId": "%SRCROOT%"
                },
                "region": {
                  "startLine": 2281,
                  "startColumn": 33,
                  "endColumn": 55
                }
              },
              "message": {
                "text": "here"
              }
            },
            {
              "id": 2,
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "src/LiveQueryClient.js",
                  "uriBaseId": "%SRCROOT%"
                },
                "region": {
                  "startLine": 166
                }
              },
              "message": {
                "text": "here"
              }
            }
          ]
        },
        {
          "ruleId": "R01",
          "message": {
            "text": "Specifying both [ruleIndex](1) and [ruleID](2) might lead to inconsistencies."
          },
          "level": "error",
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "full.sarif",
                  "uriBaseId": "%SRCROOT%"
                },
                "region": {
                  "startLine": 54,
                  "startColumn": 10,
                  "endLine": 55,
                  "endColumn": 25
                }
              }
            }
          ],
          "relatedLocations": [
            {
              "id": 1,
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "full.sarif"
                },
                "region": {
                  "startLine": 81,
                  "startColumn": 10,
                  "endColumn": 18
                }
              },
              "message": {
                "text": "here"
              }
            },
            {
              "id": 2,
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "full.sarif"
                },
                "region": {
                  "startLine": 82,
                  "startColumn": 10,
                  "endColumn": 21
                }
              },
              "message": {
                "text": "here"
              }
            }
          ],
          "codeFlows": [
            {
              "threadFlows": [
                {
                  "locations": [
                    {
                      "location": {
                        "physicalLocation": {
                          "region": {
                            "startLine": 11,
                            "endLine": 29,
                            "startColumn": 10,
                            "endColumn": 18
                          },
                          "artifactLocation": {
                            "uriBaseId": "%SRCROOT%",
                            "uri": "full.sarif"
                          }
                        },
                        "message": {
                          "text": "Rule has index 0"
                        }
                      }
                    },
                    {
                      "location": {
                        "physicalLocation": {
                          "region": {
                            "endColumn": 47,
                            "startColumn": 12,
                            "startLine": 12
                          },
                          "artifactLocation": {
                            "uriBaseId": "%SRCROOT%",
                            "uri": "full.sarif"
                          }
                        }
                      }
                    }
                  ]
                }
              ]
            }
          ],
          "partialFingerprints": {
            "primaryLocationLineHash": "ABC:2"
          }
        }
      ],
      "columnKind": "utf16CodeUnits"
    }
  ]
}