What is an Anomaly?
An Anomaly is an anomalous data set (a record or a column) for which one or more applied data quality checks asserted false. Both Inferred and Authored Checks generate Anomalies, which are batched together when the checks are applied through a
Scan Operation to highlight anomalies.
There are two types of anomalies in Qualytics:
A Record Anomaly is at the record level, where a specific field or a combination of fields is anomalous.
A Shape Anomaly is at the column level, where a field's shapes and patterns are affected at a higher level.
In either anomaly type, source records are exposed as part of Anomaly Details. A Record anomaly highlights the specific record, and a Shape anomaly highlights 10 samples from the underlying anomalous records.
- Shape Anomaly view
- Record Anomaly view
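As a rough illustration of how the exposed source records differ between the two anomaly types, the sketch below models both cases. The names `AnomalyType`, `SHAPE_SAMPLE_SIZE`, and `source_records_for` are hypothetical and are not part of the Qualytics API. Only the 10-sample figure comes from the text above, and taking the first 10 records is an assumption made for simplicity.

```python
from enum import Enum


class AnomalyType(Enum):
    RECORD = "record"
    SHAPE = "shape"


# From the text: a Shape anomaly highlights 10 samples.
SHAPE_SAMPLE_SIZE = 10


def source_records_for(anomaly_type, anomalous_records):
    """Return the source records to display for an anomaly (illustrative only)."""
    if anomaly_type is AnomalyType.RECORD:
        # A Record anomaly points at one specific record.
        return anomalous_records[:1]
    # A Shape anomaly shows a sample of the underlying anomalous records.
    # Assumption: the sample is simply the first SHAPE_SAMPLE_SIZE records.
    return anomalous_records[:SHAPE_SAMPLE_SIZE]


# Hypothetical data: 25 records that failed a check on the "email" field.
rows = [{"id": i, "email": None} for i in range(25)]
record_view = source_records_for(AnomalyType.RECORD, rows)
shape_view = source_records_for(AnomalyType.SHAPE, rows)
```

Here `record_view` holds a single record, while `shape_view` holds 10 of the 25 anomalous records.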
When a Scan is run, Qualytics will highlight anomalies with the following information:
File of the anomaly
Field: the field(s) of the anomaly
Location: fully qualified location of the anomaly
Authored checks that failed assertions
Description: human-readable, auto-generated description of the anomaly
Status: the status of the anomaly (Active, Acknowledged, Resolved, or Invalid)
Tag: tag(s) / label(s) associated with an anomaly
Date time: date/time when the anomaly was found
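To make the shape of this information concrete, here is a minimal sketch of the listed attributes as a Python dataclass. The class name `AnomalySummary`, its attribute names, and every example value are hypothetical. They mirror the list above and do not describe the actual Qualytics data model or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class AnomalySummary:
    """Hypothetical container for the information shown per anomaly."""
    file: str                 # file of the anomaly
    fields: List[str]         # field(s) of the anomaly
    location: str             # fully qualified location
    failed_checks: List[str]  # checks that failed assertions
    description: str          # human-readable, auto-generated description
    status: str               # Active, Acknowledged, Resolved, or Invalid
    detected_at: datetime     # date/time when the anomaly was found
    tags: List[str] = field(default_factory=list)  # tag(s) / label(s)


# All values below are invented for illustration.
example = AnomalySummary(
    file="customers.csv",
    fields=["email"],
    location="example_bucket/exports/customers.csv",
    failed_checks=["email is not null"],
    description="Field 'email' is null in a record where a value is expected.",
    status="Active",
    detected_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    tags=["pii"],
)
```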
Anomaly Status can be set to Active, Acknowledged, Resolved, or Invalid. While this helps users maintain a worklist of anomalies, it is also a mechanism for providing feedback to the system. Specifically, learning methods are tuned according to the feedback provided by the user: invalidating an anomaly means that the tolerances of the checks that caught the anomaly will be updated going forward.
Active: The anomaly is active and needs to be addressed
Acknowledged: The anomaly is valid and has been acknowledged, but is kept active in the Anomaly worklist
Resolved: The anomaly is valid and has been resolved, and is therefore removed from the Anomaly worklist
Invalid: The anomaly is not valid; it is removed from the Anomaly worklist, and rule updates are suggested to the inference engine.
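The effect of each status on the worklist can be sketched as follows. This is an illustrative model only: `AnomalyStatus`, `WORKLIST_STATUSES`, `worklist`, and `triggers_check_feedback` are hypothetical names, and the dictionaries stand in for whatever representation the platform actually uses.

```python
from enum import Enum


class AnomalyStatus(Enum):
    ACTIVE = "Active"
    ACKNOWLEDGED = "Acknowledged"
    RESOLVED = "Resolved"
    INVALID = "Invalid"


# Per the descriptions above, only these statuses keep an anomaly on the worklist.
WORKLIST_STATUSES = {AnomalyStatus.ACTIVE, AnomalyStatus.ACKNOWLEDGED}


def worklist(anomalies):
    """Return the anomalies that still appear on the worklist."""
    return [a for a in anomalies if a["status"] in WORKLIST_STATUSES]


def triggers_check_feedback(status):
    """Invalidating an anomaly is the status change that feeds back into check tolerances."""
    return status is AnomalyStatus.INVALID


# Hypothetical anomalies, one per status.
anomalies = [
    {"id": 1, "status": AnomalyStatus.ACTIVE},
    {"id": 2, "status": AnomalyStatus.ACKNOWLEDGED},
    {"id": 3, "status": AnomalyStatus.RESOLVED},
    {"id": 4, "status": AnomalyStatus.INVALID},
]
remaining_ids = [a["id"] for a in worklist(anomalies)]
```

In this sketch, anomalies 1 and 2 stay on the worklist, while 3 and 4 drop off; only anomaly 4 would count as feedback on the checks that caught it.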