Add rule.deprecatedIds #293

@michaelcfanning

This request comes from our internal SARIF-driven multi-tool pipeline effort. A tool provider shipped a new version which included a significant refactoring of its rule set. Many rules were deemed too general (covering too many conceptual topics). This led to issues such as over-suppression and non-useful telemetry.

In one case, a single rule was broken into 14 new checks. The tool's developer went to flight their analysis against a stable test set (to ensure no results unexpectedly appeared or were dropped) and realized that SARIF SDK result matching breaks completely in this situation, because the rule id is essential to the logical identity of a result.

To resolve the situation, we'd need to provide the result matcher a mapping table of old -> new rule ids. This data currently would need to be provided on the command-line or programmatically, which means the tool would need to have a facility for exporting this information or it would need to be curated somehow.

Since this is tool-specific knowledge, a preferable solution would be to add something like an array named rule.deprecatedRuleIds that contains the legacy ids that map to the current rule. This property does need to be an array, as tool developers don't always get it right the first time, so a rule may be renamed more than once.

This array would be 'not required', with minItems 0 and unique items (uniqueItems == true). During result matching, the matcher would consult this information in order to match two results that are identical except that the rule id has changed (the result is now reported by a new rule with a new rule id, and the old rule id appears in that rule's deprecated rule ids set).
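To make the matching step concrete, here is a minimal sketch in Python. It is not the SARIF SDK matcher: the property name deprecatedRuleIds is just the one proposed above, the rule ids are made up, and a result's identity is reduced to (uri, line) plus rule id purely for illustration.

```python
from collections import defaultdict

# Hypothetical rule metadata. "deprecatedRuleIds" is the property proposed in
# this issue; the ids themselves are invented for the example.
rules = [
    {"id": "NEW001", "deprecatedRuleIds": ["OLD100"]},
    {"id": "NEW002", "deprecatedRuleIds": ["OLD100", "OLD050"]},
    {"id": "NEW003"},  # the property is optional
]


def build_legacy_map(rules):
    """Map each legacy rule id to the set of current rule ids that replace it."""
    legacy = defaultdict(set)
    for rule in rules:
        old_ids = rule.get("deprecatedRuleIds", [])
        if len(old_ids) != len(set(old_ids)):
            raise ValueError(f"duplicate legacy ids on rule {rule['id']}")
        for old_id in old_ids:
            legacy[old_id].add(rule["id"])
    return dict(legacy)


def same_result(baseline, current, legacy):
    """Treat two results as the same if they agree on location (an assumed,
    simplified notion of identity) and their rule ids are equal or remapped."""
    if (baseline["uri"], baseline["line"]) != (current["uri"], current["line"]):
        return False
    if baseline["ruleId"] == current["ruleId"]:
        return True
    return current["ruleId"] in legacy.get(baseline["ruleId"], set())


legacy = build_legacy_map(rules)
old = {"ruleId": "OLD100", "uri": "src/a.c", "line": 12}
new = {"ruleId": "NEW002", "uri": "src/a.c", "line": 12}
print(same_result(old, new, legacy))
```

Because one general rule can be split into many specific ones, the map goes from a legacy id to a set of current ids rather than to a single id.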

I'm open to a new name. By the way, this is a familiar problem for tool developers and really should have occurred to me already. E.g., Microsoft's FxCop tool has an internal rule name remapping feature that ensures existing SuppressMessage attributes continue to function. In the old model, a tool would wrap this entire universe (understanding of old rules, production of new results, mapping of results to in-source suppressions). In a SARIF environment, these things are decoupled, so it is helpful and appropriate for SARIF to help transfer this remapping information.

Finally, for completeness: you may note that in some scenarios, remapping a more general id (e.g., one rendered as an in-source suppression) to a set of more specific ids doesn't literally resolve the over-suppression issue. That can be true for existing hard-coded suppressions. In a SARIF world, however, tool developers can apply this information in a single result matching operation: the comparison of a baseline file produced by the old version of a tool against a specific data set with a new SARIF file produced by the new version of the tool against the same data set. After this matching has occurred, all matches going forward operate against the new ids only. We use the same approach to deal with changes in fingerprint generation. @lgolding
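The one-time migration described above could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the function name migrate_baseline, the (uri, line) notion of location, and the shape of the legacy map (legacy id to set of current ids). It is not how the SARIF SDK implements baselining.

```python
def migrate_baseline(baseline, current, legacy):
    """Return a copy of the baseline results in which each legacy rule id is
    replaced by the id of the matching result from the new tool version.

    legacy maps a legacy rule id to the set of current ids that replace it.
    Baseline results with no remapped match keep their original id.
    """
    migrated = []
    for old in baseline:
        new_id = old["ruleId"]
        candidates = legacy.get(old["ruleId"], set())
        for new in current:
            same_place = (old["uri"], old["line"]) == (new["uri"], new["line"])
            if same_place and new["ruleId"] in candidates:
                new_id = new["ruleId"]
                break
        migrated.append({**old, "ruleId": new_id})
    return migrated


# Made-up data: one old rule that was split into two new ones.
legacy = {"OLD100": {"NEW001", "NEW002"}}
baseline = [
    {"ruleId": "OLD100", "uri": "a.c", "line": 3},
    {"ruleId": "OLD100", "uri": "b.c", "line": 9},
]
current = [{"ruleId": "NEW002", "uri": "a.c", "line": 3}]

for result in migrate_baseline(baseline, current, legacy):
    print(result["ruleId"], result["uri"], result["line"])
```

Once the migrated baseline is saved, later comparisons only ever see the new ids, so the legacy map is needed for this single hand-off between tool versions.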
