Context
Follows from the customer ticket: SAP/crossplane-provider-btp#465
Description
When a ServiceInstance update fails, the Ready and Synced conditions remain True instead of reflecting the failure. This gives users incorrect information about the resource's actual state. The failure is visible only in status.atProvider.lastAsyncOperation.
Scope
When a ServiceInstance update fails, the resource conditions show:
✅ Ready: True
✅ Synced: True
Meanwhile, status.atProvider.lastAsyncOperation correctly shows the failure:
Message: apply failed: API Error Updating Resource Service Instance (Subaccount):
API error during service instance update - Referencing unknown action with name 'send-email'
Suspected Root Cause
The Observe() method does not check async operation status when determining resource health. It assumes everything is fine as long as the resource exists, even when lastAsyncOperation.state == "failed".
How the bug manifests:
Timeline of events:
T=0: User updates ServiceInstance with invalid parameters
    spec:
      forProvider:
        parameters:
          actions:
            - name: "send-email" # Invalid action name
T=1: Update() is called
    func (c *external) Update(ctx context.Context, mg resource.Managed) (managed.ExternalUpdate, error) {
        err := c.tfClient.Update(ctx) // Successfully STARTS async operation
        if err != nil {
            return managed.ExternalUpdate{}, errors.Wrap(err, errUpdateInstance)
        }
        return managed.ExternalUpdate{}, nil // ✅ Returns success
    }
The BTP API accepts the update request and starts an async operation
Update() returns nil (correct behavior: the operation was started successfully)
Crossplane-runtime sets Synced: True, Ready: True ✅
T=5: BTP backend processes async operation
BTP validates the parameters
BTP rejects the update with an error
BTP sets lastAsyncOperation.state = "failed"
BTP sets lastAsyncOperation.message = "Referencing unknown action with name 'send-email'"
T=10: Next Observe() reconciliation
    func (e *external) Observe(ctx context.Context, mg resource.Managed) (managed.ExternalObservation, error) {
        // ...
        case tfClient.UpToDate:
            data := e.tfClient.QueryAsyncData(ctx) // Gets async operation status
            if data != nil {
                // ❌ BUG: Never checks if data.LastAsyncOperation.State == "failed"
                if err := e.saveInstanceData(ctx, cr, *data); err != nil {
                    return managed.ExternalObservation{}, errors.Wrap(err, errSaveData)
                }
                cr.SetConditions(xpv1.Available()) // Sets Available even though operation failed!
            }
            // ❌ BUG: Always returns success regardless of async operation state
            return managed.ExternalObservation{
                ResourceExists:   true,
                ResourceUpToDate: true, // Should be false!
            }, nil // Should return error!
    }
Queries BTP and receives lastAsyncOperation with state: "failed"
Code never checks this status
Returns ResourceUpToDate: true and no error
Crossplane sees success, keeps conditions as Synced: True, Ready: True ❌
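One possible direction for a fix, sketched under assumptions: the `UpToDate` branch could inspect the last async operation before reporting success. The types and the `observeAsync` helper below are hypothetical, simplified stand-ins rather than the provider's actual code. Only the `"failed"` state value is taken from this issue.

```go
package main

import "fmt"

// AsyncOperation is a hypothetical, simplified stand-in for the data
// exposed under status.atProvider.lastAsyncOperation.
type AsyncOperation struct {
	State   string // the only value known from this issue is "failed"
	Message string
}

// Observation is a simplified stand-in for managed.ExternalObservation.
type Observation struct {
	ResourceExists   bool
	ResourceUpToDate bool
}

// stateFailed is the state value quoted in this issue.
const stateFailed = "failed"

// observeAsync derives an observation, a readiness flag, and an error
// from the last async operation. A nil op is treated as "nothing to report".
func observeAsync(op *AsyncOperation) (Observation, bool, error) {
	if op != nil && op.State == stateFailed {
		// Surface the failure instead of silently reporting success.
		obs := Observation{ResourceExists: true, ResourceUpToDate: false}
		return obs, false, fmt.Errorf("last async operation failed: %s", op.Message)
	}
	return Observation{ResourceExists: true, ResourceUpToDate: true}, true, nil
}
```

In the real Observe(), the readiness flag would presumably map to xpv1.Unavailable() versus xpv1.Available(), and the returned error would let crossplane-runtime report the failure on the Synced condition. Whether ResourceUpToDate should also be false needs discussion, since that can trigger repeated Update() calls with the same invalid parameters.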
Additional context
We need to check whether the same issue exists for other resources.