Webhook Producer Standards #105
base: main
Changes from all commits
2ee79f4
41c40d6
847efff
0cb4cf0
b9f3031
@@ -593,6 +593,45 @@ Sps-Idempotency-Key: a // not enough entropy, and below
Sps-Idempotency-Key: KG5Lxw!@#$&*()FBepaKHyUD // non-url-safe special characters can be limiting for usage or later reference
```

<hr />

#### Sps-Signature

**Type**: Request

**Support**: OPTIONAL

**Description**: Contains a cryptographic signature of the request payload. This signature is used to verify the authenticity and integrity of a request. The signature is computed using a shared secret established between the producer and consumer, typically using HMAC with SHA-256. The hash function should be specified in the header value: `Sps-Signature: sha256=<HMAC hex digest>`. The primary usage of this header is for [webhook security](webhooks.md#webhook-requests).

> Review comment: Would we ever need to rotate keys? If so, how would this scheme work without something like a key id?

> Review comment: Another option would be to have multiple signatures included in the `Sps-Signature` header and validate that any of them match (e.g., `Sps-Signature: SHA256-0123456789; SHA256-0987654321`). During rotation, you would add a new shared key to the registration, so there would be two active. The producer creates signatures for both keys, and the consumer validates that any of the included signatures matches (the old one will). The consumer then updates their shared key and begins to validate the new signature. Finally, the old shared key is removed and the producer only creates a single signature. That might also be useful to provide a pathway to implement a new version of the signature algorithm in the same way and give consumers a chance to upgrade before removing the old signature (e.g., `Sps-Signature: V1-SHA256-0123456789; V2-SHA256-0987654321`).

- The header value **MUST** be a valid HMAC digest of the request payload.
- The header value **MUST** include the hash function used to compute the signature, such as `sha256` or `sha512`.
- The header value **SHOULD** be accompanied by a timestamp to prevent replay attacks. The timestamp can be included in the `Sps-Signature-Timestamp` header.

```
// CORRECT
Sps-Signature: sha256=4d3f8b2c1e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b
Sps-Signature: sha512=4d3f8b2c1e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b4d3f8b2c1e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b
```
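
A consumer can check the shape of the header value before doing any cryptographic work. The following is a minimal Python sketch, not part of the standard: the function name `parse_signature_header` and the table `DIGEST_HEX_LENGTHS` are hypothetical, and accepting only `sha256` and `sha512` is an assumption based on the examples above.

```python
import hashlib
import re

# Expected hex digest length (in characters) per hash function.
# Assumption: only the two functions named in the examples are accepted.
DIGEST_HEX_LENGTHS = {
    name: hashlib.new(name).digest_size * 2 for name in ("sha256", "sha512")
}


def parse_signature_header(value: str) -> tuple:
    """Split 'sha256=<hex digest>' into (algorithm, hex_digest).

    Raises ValueError when the hash function is missing or unsupported,
    or when the digest is not hex of the expected length.
    """
    algorithm, separator, digest = value.strip().partition("=")
    algorithm = algorithm.lower()
    if not separator or algorithm not in DIGEST_HEX_LENGTHS:
        raise ValueError("missing or unsupported hash function")
    if len(digest) != DIGEST_HEX_LENGTHS[algorithm]:
        raise ValueError("digest has the wrong length for " + algorithm)
    if not re.fullmatch(r"[0-9a-fA-F]+", digest):
        raise ValueError("digest is not a hex string")
    return algorithm, digest.lower()
```

Rejecting malformed values early keeps the later comparison simple, since both sides are then known to be lowercase hex strings of the same length.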

<hr />

#### Sps-Signature-Timestamp

**Type**: Request

**Support**: OPTIONAL

**Description**: Indicates when the request was signed. This is used to prevent replay attacks by ensuring that the signature is only valid for a specific time window. The timestamp should be in ISO 8601 format (e.g., `2023-10-01T12:00:00Z`).

- The header value **MUST** be in ISO 8601 format.
- The header **MUST** be included only if the `Sps-Signature` header is present.
- The header value **MUST** be a valid UTC timestamp.

```
// CORRECT
Sps-Signature-Timestamp: 2023-10-01T12:00:00Z
```
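
The time-window check described above could look like the following Python sketch. It is illustrative only: the names `parse_signature_timestamp`, `is_fresh`, and `MAX_SKEW` are hypothetical, and the five-minute tolerance is an assumption, since the standard does not state how large the window is.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed tolerance; the standard does not define the size of the window.
MAX_SKEW = timedelta(minutes=5)


def parse_signature_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2023-10-01T12:00:00Z'."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat,
    # so it is normalized to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)  # raises ValueError if malformed
    if parsed.tzinfo is None or parsed.utcoffset() != timedelta(0):
        raise ValueError("timestamp must be expressed in UTC")
    return parsed


def is_fresh(value: str, now: Optional[datetime] = None) -> bool:
    """Return True when the signature timestamp is within MAX_SKEW of now."""
    now = now or datetime.now(timezone.utc)
    return abs(now - parse_signature_timestamp(value)) <= MAX_SKEW
```

A consumer would typically reject the request when `is_fresh` returns `False`, before or alongside verifying the signature itself.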

## MIME Types

### Standard MIME Types

@@ -9,7 +9,7 @@ A webhook is an HTTP-based callback function that allows lightweight, usually ev
- Services **MAY** implement HTTP callback endpoints as a webhook consumer from other systems and/or APIs.

> Review comment: These bullet points are a bit ambiguous in terms and references to callbacks. Further refinement is needed here.

```note
- Push notification via HTTP Callbacks, often called Web Hooks, to publicly-addressable servers.
+ Push notification via HTTP Callbacks, often called Webhooks, to publicly-addressable servers.
```

## Webhook Consumer

@@ -47,6 +47,70 @@ POST /hooks/sendgrid/email_sent # continue to support API Standa

## Webhook Producer

A **webhook producer** is a service that sends outbound webhook notifications to subscribers (external clients or systems) when certain events occur. The following standards ensure webhook events are created consistently and securely:

* Webhook events **MUST** be defined and documented in the API specification (such as [Webhooks in OpenAPI 3.1](https://learn.openapis.org/examples/v3.1/webhook-example.html)).
* Webhook events **MUST** be configurable by the consumer via standard API endpoints that implement events as a RESTful-style resource.
* Webhook events **MUST** support at least `application/json` as the request `Content-Type`. Other formats may be supported, but JSON is the minimum requirement.
* Webhook events **MUST** be sent over secure channels (`HTTPS`). All webhook requests must use TLS to protect data in transit.
* Webhook events **MUST** be sent as HTTP `POST` requests to the consumer’s provided URL. This is the standard method for delivering webhook payloads.

### Webhook Requests

When designing the JSON schema for webhook event payloads, use a clear, standardized structure. A well-structured payload improves developer experience and eases integration. The following conventions apply to all webhook payloads:

* Webhook payloads **MUST** follow the same modeling and naming conventions as the rest of the API, identified in these API Standards. For example, use consistent data types and structures for common entities, and adhere to the standard property naming rules (e.g. **camelCase** for JSON property names). This consistency makes it easier for developers to intuit the structure of webhook data based on existing API knowledge.
* Each webhook payload **MUST** be an object that contains a few standard metadata fields *plus* a nested `payload` object with the event-specific details:
* **eventType** – a string describing what event occurred, using enum casing because the event type is likely described through an enumeration in the API specification. The event type name **SHOULD** be descriptive and include a domain or resource name, an action, and a version number, following the format `[DOMAIN]_[ACTION]_[V1]`. For example, an order service might emit `ORDER_CREATED_V1` or `ORDER_CANCELLED_V1`. If a breaking change to the payload schema is required, the version is incremented. Both versions would need to be supported for a period of time to allow consumers to migrate.

> travisgosselin marked this conversation as resolved.

* **eventId** – a UUID or similar unique string for the event instance. This is important for tracing and deduplication. Consumers can use the `eventId` to detect duplicate deliveries (in case of retries or redundant events) and to acknowledge specific events.

> Review comment: Would this eventId be generated by a service or by something more general? Are eventIds meant to be unique (and traceable?) across every other webhook request regardless of service?

* **eventTimestamp** – the date/time when the event occurred (or was emitted) in ISO 8601 UTC format (e.g. `"2025-06-05T15:30:00Z"`). This field lets consumers know when the event happened and can be used for ordering or deduplicating messages. All timestamps **MUST** conform to the ISO 8601 standard and include a timezone of UTC. This is not to be confused with the signature timestamp, which is used for security purposes.

> Review comment: I wouldn't advise consumers to use eventTimestamp for deduplication.

* **webhookId** – a unique identifier for the webhook subscription that generated this event. This allows consumers to track which webhook configuration triggered the event, especially if they have multiple subscriptions.
* The **payload** object – a nested JSON object carrying the detailed data of the event. This typically contains the resource or data that changed, in a format similar to the resource’s representation in your API. For example, for an `ORDER_CREATED_V1` event, the payload might include an order object with its ID, status, and relevant fields at the time of creation. Keeping this under a distinct `payload` (or `data`) field helps clearly separate metadata from the core event data.
* Full payloads **SHOULD** be sent when possible, to reduce the need for consumers to make additional API calls to fetch data. This means including all relevant information in the payload, such as resource IDs, names, and any other necessary details, and using standardized reusable models to represent common resources like `org`. Payload request size **MUST NOT** exceed `25 MB`. If the payload would exceed this limit, a link to the resource must be sent instead of the full object.

> Review comment: Call out that this is JSON-level content, and not binary; the content must be `application/json`.

```json
POST /abc/client/destination/configured
Content-Type: application/json
User-Agent: Sps-Order-Service-Webhook/1.0 // Identifies the webhook producer and version.
Sps-Signature: sha256=1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef // HMAC signature of the payload based on the consumer secret.
Sps-Signature-Timestamp: 2025-06-05T15:30:00Z // Timestamp of the signature creation, used to prevent replay attacks.
Sps-Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000 // Same as the event ID.

{
  "eventType": "ORDER_CREATED_V1", // REQUIRED: identifies the type of payload to expect: [DOMAIN]_[ACTION]_[V1]
  "eventId": "550e8400-e29b-41d4-a716-446655440000", // REQUIRED (string): identifies the unique event, can be used to deduplicate events. Often is a UUID.
  "eventTimestamp": "2025-06-05T15:30:00Z", // REQUIRED (datetime): identifies the time the event was initially created.
  "webhookId": "1234567890abcdef1234567890abcdef", // REQUIRED (string): identifies the webhook subscription that generated this event.
  "payload": { // REQUIRED: the actual event data, structured as an object.
    // ... custom payload object
  }
}
```
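
On the producer side, the envelope above could be assembled as in this Python sketch. It is one possible reading of the conventions, not a reference implementation: `build_envelope`, `serialize_envelope`, and `EVENT_TYPE_PATTERN` are hypothetical names, and the regular expression is an assumed interpretation of the `[DOMAIN]_[ACTION]_[V1]` format.

```python
import json
import re
import uuid
from datetime import datetime, timezone

# Assumed interpretation of [DOMAIN]_[ACTION]_[V1]: uppercase words joined
# by underscores, ending in a version suffix such as _V1.
EVENT_TYPE_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)+_V[0-9]+$")


def build_envelope(event_type: str, webhook_id: str, payload: dict) -> dict:
    """Wrap event-specific data in the standard metadata fields."""
    if not EVENT_TYPE_PATTERN.match(event_type):
        raise ValueError("eventType does not follow [DOMAIN]_[ACTION]_[V1]")
    return {
        "eventType": event_type,
        "eventId": str(uuid.uuid4()),  # unique per event instance
        "eventTimestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "webhookId": webhook_id,
        "payload": payload,
    }


def serialize_envelope(envelope: dict) -> bytes:
    """Serialize once; the same bytes should be both signed and sent."""
    return json.dumps(envelope, separators=(",", ":")).encode("utf-8")
```

Serializing exactly once matters because the signature covers the request body bytes, so re-encoding the JSON between signing and sending could change the bytes and invalidate the signature.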

* Requests **MUST** include a `User-Agent` header that identifies the webhook producer and version. This helps consumers filter or identify webhook traffic and aids debugging. For example, a webhook from an order service might use: `User-Agent: Sps-Order-Service-Webhook/1.0`

> Review comment: Is it a single agent from SPS? Would that be so bad?

* Requests **MUST** include an [`Sps-Signature`](request-response.md#sps-signature) header that contains a cryptographic signature of the payload. This signature is used to verify the authenticity and integrity of the webhook request. The signature is computed using a shared secret established between the producer and consumer at webhook creation time, typically using HMAC with SHA-256. The hash function should be specified in the header value: `Sps-Signature: sha256=<HMAC hex digest>`.
* [`Sps-Signature-Timestamp`](request-response.md#sps-signature-timestamp) **MUST** be included to prevent replay attacks. This timestamp is created each time a new webhook signature is created using the consumer-provided secret, as opposed to the `eventTimestamp`, which is created only once.
* `Sps-Signature` **MUST** be computed with the consumer-provided secret, and **MUST** be computed over the entire request body contents along with the `Sps-Signature-Timestamp` header value. The signature is computed as follows: `HMAC-SHA256(secret, {Sps-Signature-Timestamp} + ":" + {Request-Body})`

> Review comment: AC: we need more examples of hashing and verification for clarity.

> Review comment: Do we include secret rotation? Also need a security review as well.

* Consumers **MUST** specify the shared secret when configuring the webhook subscription. The secret **SHOULD** be a minimum of 32 characters long, including a mix of uppercase, lowercase, numbers, and special characters to ensure sufficient entropy.
* [`Sps-Idempotency-Key`](request-response.md#sps-idempotency-key) **MUST** be included in the request headers to allow producers to safely retry webhook deliveries, while informing consumers that the same event is being delivered twice via matching idempotency key header values. This key should match the `eventId` and is used to deduplicate events on the consumer side.
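
The signing formula in the list above, `HMAC-SHA256(secret, {Sps-Signature-Timestamp} + ":" + {Request-Body})`, can be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative implementation: the function names are hypothetical, the secret and timestamp are assumed to be UTF-8 encoded, the body is assumed to be the raw bytes exactly as sent on the wire, and the digest is assumed to be lowercase hex.

```python
import hashlib
import hmac


def compute_signature(secret: str, timestamp: str, body: bytes) -> str:
    """Producer side: return the value for the Sps-Signature header."""
    # Message to sign: timestamp header value, a colon, then the raw body.
    message = timestamp.encode("utf-8") + b":" + body
    digest = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, timestamp: str, body: bytes, header_value: str) -> bool:
    """Consumer side: recompute the signature and compare it to the header."""
    expected = compute_signature(secret, timestamp, body)
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking how much of the signature matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), header_value.strip().encode("utf-8")
    )
```

For example, a consumer would call `verify_signature(secret, request_timestamp_header, raw_body, request_signature_header)` using the unparsed body bytes; verifying against re-serialized JSON would generally fail, because key order and whitespace may differ from what the producer signed.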

### Webhook Responses

The webhook producer expects the consumer’s endpoint to respond with an HTTP status code indicating receipt of the webhook:

* A **2xx** status code from the consumer **MUST** be treated as a successful delivery. This means the consumer has accepted the event. The producer can then consider the webhook delivered and will not retry that event.
* **Non-2xx** statuses **MUST** be treated as indicating the event was not successfully received. Next steps depend on the specific status code returned:
* **404 Not Found** or **410 Gone**: These indicate the consumer endpoint is invalid or no longer available. The producer **SHOULD NOT** retry these events, as they are likely permanent failures. The producer may consider the webhook subscription cancelled, or notify the consumer that their endpoint is no longer valid and render the webhook subscription inactive.
* **429 Too Many Requests** or **503 Service Unavailable**: These suggest the consumer is overloaded or temporarily unavailable. The producer **SHOULD** retry these with backoff, as the condition may resolve sooner rather than later. If a [`Retry-After`](https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.3) header is provided, the producer **MUST** respect it and wait that duration before retrying.
* Other **4xx** errors typically indicate a bad request, possibly a problem with the webhook payload or an unauthorized request. These are not usually recoverable by retrying. The producer **SHOULD** retry a 4xx a limited number of times in case it was a transient issue, but generally these should trigger a review, an automated alert, and disabling of the webhook.

> Review comment: @jbongaarts has further thoughts on this...

* **3xx HTTP Redirection** codes are not expected in webhook responses. If they occur, the producer **SHOULD NOT** follow redirects automatically, as this could lead to unexpected behavior or security issues. Instead, the producer should log the redirect and alert the consumer to fix their endpoint.
* **5xx** status codes indicate server errors on the consumer side. These are presumably temporary, so they **MUST** be retried with backoff for an extended period of time.
* When retrying webhook deliveries, the producer **MUST** implement a retry policy that includes exponential backoff and a maximum number of retries. This prevents overwhelming the consumer’s endpoint with repeated requests in case of transient failures.
* Use **exponential backoff** between retries to avoid overwhelming a struggling endpoint. For example, after the initial attempt, wait 1 minute before the first retry, then 2 minutes, then 4 minutes, etc., or some similar progressive delay strategy.
* Retries **MUST** have a documented limit. The webhook producer **MUST NOT** retry indefinitely. A common approach is to try a few times (e.g. 3 to 5 attempts in total) over an increasing time window.

> Review comment: AI: should we include a retry attempt number header?

* Failed deliveries **MUST** be logged and alerted on. If an event still fails after the final retry, that should be recorded (for later analysis or audit). It’s often useful to surface these failures to the consumer via alerts and also via resource and API access to the webhook subscription status.
* Webhook calls **MUST** have a reasonable timeout (e.g. 15 seconds) to avoid infinite blocking. If the consumer doesn’t respond in that time, treat it as a failure and trigger a retry. This prevents the producer service from hanging indefinitely on a bad/slow endpoint.
* Webhook events **MUST NOT** be considered concurrent or sequential. Requests may be delivered out of order depending on retries and failures. Newer events will be sent regardless of the deliverability of the last event. Consumers should consider the `eventTimestamp` to order events and handle them accordingly.
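
The status-code handling and backoff rules above can be condensed into a single decision function. The Python sketch below is one way to read them, not a prescribed policy: `next_retry_delay` and the constants are hypothetical, the limits of 5 total attempts and 2 attempts for other 4xx responses are assumptions, and `retry_after` is assumed to have already been converted to seconds (the `Retry-After` header may also carry an HTTP date).

```python
from typing import Optional

MAX_ATTEMPTS = 5           # assumed total attempts for transient failures
MAX_4XX_ATTEMPTS = 2       # assumed smaller budget for other 4xx responses
BASE_DELAY_SECONDS = 60.0  # 1 minute, then 2, then 4, as in the example above
NO_RETRY_STATUSES = {404, 410}


def next_retry_delay(status: Optional[int], attempt: int,
                     retry_after: Optional[float] = None) -> Optional[float]:
    """Return seconds to wait before the next attempt, or None to stop.

    status is the consumer's HTTP status code, or None when the call timed out.
    attempt is the 1-based number of the attempt that just finished.
    """
    if status is not None:
        if 200 <= status < 300:
            return None  # delivered successfully
        if status in NO_RETRY_STATUSES or 300 <= status < 400:
            return None  # likely permanent failure, or a redirect we do not follow
    other_4xx = status is not None and 400 <= status < 500 and status != 429
    limit = MAX_4XX_ATTEMPTS if other_4xx else MAX_ATTEMPTS
    if attempt >= limit:
        return None  # retry budget exhausted; the caller should log and alert
    delay = BASE_DELAY_SECONDS * 2 ** (attempt - 1)  # exponential backoff
    if status in (429, 503) and retry_after is not None:
        delay = max(delay, float(retry_after))  # wait at least as long as requested
    return delay
```

Returning `None` for both success and terminal failure keeps the function small; a real producer would distinguish the two so that exhausted retries are recorded and surfaced to the consumer as described above.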

```note
Webhook producer standards and best practices are still under development. Some relevant starting material: [Microsoft REST API Guidelines: Webhooks](https://github.com/Microsoft/api-guidelines/blob/master/Guidelines.md#14-push-notifications-via-webhooks)
**Event Destinations**: Webhooks have long been left unstandardized, leading to fragmentation and inconsistent implementations. The [Event Destinations](https://eventdestinations.org/) initiative aims to create a common standard for webhook producers and consumers, including security, payload structure, and delivery guarantees. Consider alignment with this emerging standard for future-proofing webhook implementations, adding more flexibility around destinations.
```

> Review comment: `Sps-Signature` might be a good idea. Call out the formula specifically, leave a place for any implementation notes for creating and verifying the signature (i.e. language-specific event), etc.