Validation Webhooks
Validation Webhooks let you run arbitrary per-record validation logic on data before it is written to destination connectors. This lets you prevent bad data from being written to your systems, adding another layer of protection on top of the built-in validations that Osmos provides. They are compatible with both Pipelines and the Uploader and can work with data from any source connector.
By using validation webhooks, you can add any kind of validation logic you'd like to any destination connector. This includes things like querying internal databases, making requests to private/internal APIs, performing conditional checks based on values from multiple fields, and much more.
Validation webhooks are called both during the data cleanup process and during transformation, just before data is written to the destination connector. While the user is mapping columns, applying QuickFixes, and performing other data cleanup actions, validation webhooks will be called to make sure that the output of the configured transforms is valid for all rows.

Configuring

Validation webhooks can be set up during the connector configuration process while creating a new destination connector. Click the "Show Advanced Options" button at the bottom of the connector configurator UI.
The only thing you need to provide is the URL of the HTTP endpoint at which your validation webhook is available. Your validation webhook must be publicly available on the internet. If your endpoint is behind a firewall or other restrictive system, please contact Osmos support for help with whitelisting IPs.

API Specification

Validation Webhooks use a simple JSON-based schema for providing data for validation and receiving validation outcomes. Data is provided in batches of up to 100,000 records and sent to the endpoint as an HTTP POST request.

Request Schema

The body is a JSON-encoded array of records to validate. Each record is itself an array of field entries, each consisting of a field name and a field value. The following TypeScript types represent the schema of the requests that Osmos systems will make to validation webhooks:
    interface ValueToValidate {
      fieldName: string;
      value: string;
    }

    type RecordToValidate = ValueToValidate[];

    type ValidationRequestBody = RecordToValidate[];
An example request may look like this:
    [
      [
        {"fieldName": "color", "value": "green"},
        {"fieldName": "shape", "value": "square"}
      ],
      [
        {"fieldName": "color", "value": "blue"},
        {"fieldName": "shape", "value": "circle"}
      ]
    ]
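Because each record arrives as a list of name/value pairs rather than as a keyed object, checks that span several fields are often easier to write against a lookup by field name. The following is a minimal sketch of such a conversion. The helper name `toFieldLookup` is hypothetical and not part of the Osmos API, and the sketch assumes that if a field name appears more than once in a record, the last value wins.

```typescript
interface ValueToValidate {
  fieldName: string;
  value: string;
}

type RecordToValidate = ValueToValidate[];

// Hypothetical helper (not part of the Osmos API): collapse one record
// into a plain object keyed by field name.
// Assumption: if a field name repeats, the last occurrence wins.
function toFieldLookup(record: RecordToValidate): { [fieldName: string]: string } {
  const lookup: { [fieldName: string]: string } = {};
  for (const entry of record) {
    lookup[entry.fieldName] = entry.value;
  }
  return lookup;
}

// First record from the example request above.
const firstRecord: RecordToValidate = [
  { fieldName: "color", value: "green" },
  { fieldName: "shape", value: "square" },
];

const firstLookup = toFieldLookup(firstRecord);
console.log(firstLookup.color, firstLookup.shape);
```

With the lookup in hand, a rule that depends on more than one field can read both values directly instead of scanning the array each time.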

Response Schema

Your validation webhook is expected to return a JSON array of validation outcomes - one for each of the records provided. There are three possible outcomes for each record:
  • Success: the record is valid and can be written to the destination system
  • Warning: the record isn't invalid and won't be blocked from being written to the destination system. However, a message will be displayed to the user during the data cleanup process to indicate the situation.
  • Error: the record is invalid and will be rejected from being written to the destination system. An error message will be shown during the training process and the user will be blocked from saving the transformation until the error is resolved. During transformation, the record will be marked as an error and the underlying pipeline/uploader will need to be retrained.
For the error and warning cases, a message can optionally be provided to aid the user performing the cleanup by explaining the reason the record is invalid or providing some extra context. The following TypeScript types correspond to what is expected as a response from validation webhooks:
    type ValidationOutput =
      | boolean
      | {
          isValid: boolean;
          errorMessage?: string;
          warningMessage?: string;
        };

    type ValidationEndpointResponse = ValidationOutput[];
Boolean values can be provided as output for records with true corresponding to valid and false corresponding to invalid. However, it is highly recommended that you provide error or warning messages in order to help users know what to do to resolve the validation failure.
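To make it easier to always attach a message, a webhook implementation might wrap the three outcomes in small constructor functions. The sketch below is illustrative only: the function names `success`, `warning`, and `failure` are hypothetical, and it assumes that a warning is expressed as `isValid: true` together with a `warningMessage`, which this page does not state explicitly.

```typescript
type ValidationOutput =
  | boolean
  | {
      isValid: boolean;
      errorMessage?: string;
      warningMessage?: string;
    };

// Hypothetical helpers (not part of the Osmos API).

// Success: the record is valid, so a bare `true` is enough.
function success(): ValidationOutput {
  return true;
}

// Warning. Assumption: a warning is a valid record carrying a warningMessage.
function warning(message: string): ValidationOutput {
  return { isValid: true, warningMessage: message };
}

// Error: the record is invalid, and the message explains why.
function failure(message: string): ValidationOutput {
  return { isValid: false, errorMessage: message };
}

const sampleOutcomes: ValidationOutput[] = [
  success(),
  warning("Color is unusual for this shape"),
  failure("All circles must be red"),
];
console.log(JSON.stringify(sampleOutcomes));
```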
A response that could be provided for the above example input may look like this:
    [
      true,
      { "isValid": false, "errorMessage": "All circles must be red" }
    ]
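As a rough illustration of the logic that could sit behind such a response, the sketch below applies a single rule ("all circles must be red") to a request body and produces one outcome per record. The function names `getField` and `validateBatch` are hypothetical, and the sketch covers only the validation step; receiving the POST request and serializing the result as JSON are left to whatever HTTP framework you use.

```typescript
interface ValueToValidate {
  fieldName: string;
  value: string;
}

type RecordToValidate = ValueToValidate[];
type ValidationRequestBody = RecordToValidate[];

type ValidationOutput =
  | boolean
  | { isValid: boolean; errorMessage?: string; warningMessage?: string };

type ValidationEndpointResponse = ValidationOutput[];

// Hypothetical helper: find the value of a named field, if present.
function getField(record: RecordToValidate, name: string): string | undefined {
  for (const entry of record) {
    if (entry.fieldName === name) {
      return entry.value;
    }
  }
  return undefined;
}

// Hypothetical validation step: one outcome per record, in request order.
function validateBatch(body: ValidationRequestBody): ValidationEndpointResponse {
  const outcomes: ValidationEndpointResponse = [];
  for (const record of body) {
    const shape = getField(record, "shape");
    const color = getField(record, "color");
    if (shape === "circle" && color !== "red") {
      outcomes.push({ isValid: false, errorMessage: "All circles must be red" });
    } else {
      outcomes.push(true);
    }
  }
  return outcomes;
}

// The example request from the Request Schema section.
const exampleBody: ValidationRequestBody = [
  [
    { fieldName: "color", value: "green" },
    { fieldName: "shape", value: "square" },
  ],
  [
    { fieldName: "color", value: "blue" },
    { fieldName: "shape", value: "circle" },
  ],
];

console.log(JSON.stringify(validateBatch(exampleBody)));
```

Run against the example request, this sketch marks the green square as valid and rejects the blue circle with the error message shown in the example response.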

Invalid Response Handling

If a validation endpoint returns a non-200 response code, is unreachable, or fails for some other reason, the request will be retried multiple times after a delay. All error types will be retried.
If the issue persists after multiple retries, the default behavior is to reject all records being validated as invalid. Details about the error that was encountered will be included in error records which can be viewed on the connector details page of the destination connector or on the retrain screen for the pipeline/uploader.
For cases where the length of the returned validation outcomes array is not equal to the number of elements provided in the request array, the behavior is undefined. Please make sure that you return exactly one validation outcome element for every record in the request.
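One way to keep the response length equal to the request length is to build the response by visiting every record exactly once and converting any unexpected exception into an outcome for that record, rather than letting it cut the array short. The following is a minimal sketch of that idea. The name `validateEach` is hypothetical, and marking a record as invalid when its validator throws is a design assumption; you may instead prefer to return a non-200 response so that the whole request is retried.

```typescript
type ValidationOutput =
  | boolean
  | { isValid: boolean; errorMessage?: string; warningMessage?: string };

// Hypothetical wrapper: always returns exactly one outcome per input record.
function validateEach<T>(
  records: T[],
  validateOne: (record: T) => ValidationOutput
): ValidationOutput[] {
  const outcomes: ValidationOutput[] = [];
  for (const record of records) {
    try {
      outcomes.push(validateOne(record));
    } catch (err) {
      // Assumption: a validator that throws marks its record as invalid
      // instead of failing the whole batch.
      const reason = err instanceof Error ? err.message : String(err);
      outcomes.push({
        isValid: false,
        errorMessage: "Validation failed unexpectedly: " + reason,
      });
    }
  }
  return outcomes;
}

// Usage: the second record makes the validator throw, yet the response
// still has three entries.
const guardedOutcomes = validateEach([1, 2, 3], (n) => {
  if (n === 2) {
    throw new Error("lookup service unavailable");
  }
  return true;
});
console.log(guardedOutcomes.length);
```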

Limitations

Validation Webhooks work with all destination connectors except the HTTP API connectors. These connectors are implemented in a way which prevents validations from being run efficiently while still functioning according to their design.