Skip to content

Add the trigger tables creation on scale provider initialization #1186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

MaddyDev
Copy link
Contributor

Description

Added Async Initialization Support for SQL Scaler Provider

This pull request enhances the SqlScalerProvider by adding proper asynchronous initialization capabilities to ensure trigger metadata tables are available for any scaling SKU, even when the host does not start the listener.

Changes

  1. Added InitializeAsync method to SqlScalerProvider

    • Properly separates asynchronous initialization logic from constructor
    • Ensures trigger tables are created before scaling operations occur
    • Maintains clean separation of concerns for resource initialization
  2. Implemented CreateAsync static factory method

    • Provides a standard pattern for asynchronous object creation
    • Ensures all resources are properly initialized before the provider is used
  3. Modified service registration in SqlBindingExtension

    • Updated AddSqlScaleForTrigger to use the new async initialization pattern
    • Used Lazy<Task<SqlScalerProvider>> to ensure initialization happens only once

Benefits

  • Improved reliability: Ensures trigger metadata tables exist before scaling decisions are made
  • Better scaling support: Works with all hosting SKUs regardless of listener initialization state
  • Reduced race conditions: Properly initializes resources before they are accessed

This implementation ensures the SQL binding's scaling capabilities work reliably across all Azure Functions hosting scenarios.

Code Changes

  • Unit tests are added, if possible
  • Integration tests are added if the change is modifying existing behavior of one or more of the bindings
  • New or changed code follows the C# style guidelines defined in .editorconfig
  • All changes MUST be backwards compatible and changes to the shared az_func.GlobalState table must be compatible with all prior versions of the extension
  • Use the ILogger instance to log relevant information, especially information useful for debugging or troubleshooting
  • Use async and await for all long-running operations
  • Ensure proper usage and propagation of CancellationToken
  • T-SQL is safe from SQL Injection attacks through the use of SqlParameters and proper escaping/sanitization of input

Dependencies

Documentation

  • Add samples if the change is modifying or adding functionality
  • Update relevant documentation in the docs

@chlafreniere
Copy link
Contributor

@MaddyDev plz resolve merge conflicts

@chlafreniere chlafreniere requested a review from Copilot June 18, 2025 01:27
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds asynchronous initialization to the SQL scaler provider to ensure trigger metadata tables are created reliably before scaling operations begin. Key changes include:

  • Adding an async InitializeAsync and CreateAsync method to SqlScalerProvider.
  • Updating the service registration in SqlBindingExtension to await async initialization.
  • Refactoring methods in SqlTriggerUtils and SqlTriggerListener to improve schema, global state table, and leases table creation with proper telemetry.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/Unit/TriggerBinding/SqlTriggerListenerTests.cs Added unit tests to validate listener behavior on error conditions.
test/Integration/SqlTriggerBindingIntegrationTests.cs Removed manual start/stop listener initialization in favor of host scale.
src/TriggerBinding/SqlTriggerUtils.cs Introduced static async methods for table and schema creation with telemetry.
src/TriggerBinding/SqlTriggerListener.cs Refactored to use new static methods for resource initialization.
src/TriggerBinding/SqlScalerProvider.cs Added async initialization support and factory creation method.
src/Telemetry/Telemetry.cs Updated telemetry event names to include InitializeScaleProvider.
src/SqlBindingExtension.cs Updated lazy async registration for scaler provider initialization.
Comments suppressed due to low confidence (1)

src/TriggerBinding/SqlScalerProvider.cs:107

  • The null value passed for the oldUserFunctionId parameter may lead to unexpected behavior during migration; please verify that this is intended and consider passing the proper deprecated value if available.
                    insertGlobalStateTableRowDurationMs = await InsertGlobalStateTableRowAsync(connection, transaction, userTableId, this._userTable, null, this._userFunctionId, this._logger, cancellationToken);


DROP TABLE {oldLeasesTableName};
END
End
Copy link
Preview

Copilot AI Jun 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The T-SQL block uses 'End' for closing a BEGIN block; please change it to uppercase 'END' to follow standard T-SQL syntax, which may prevent unexpected SQL errors.

Suggested change
End
END

Copilot uses AI. Check for mistakes.

Comment on lines +37 to +52
var scalerProviderTask = new Lazy<Task<SqlScalerProvider>>(() =>
SqlScalerProvider.CreateAsync(serviceProvider, triggerMetadata));

builder.Services.AddSingleton((Func<IServiceProvider, IScaleMonitorProvider>)(resolvedServiceProvider =>
{
serviceProvider = serviceProvider ?? resolvedServiceProvider;
return scalerProvider.Value;
});
builder.Services.AddSingleton((Func<IServiceProvider, ITargetScalerProvider>)delegate (IServiceProvider resolvedServiceProvider)
// Wait for the async initialization to complete
return scalerProviderTask.Value.GetAwaiter().GetResult();
}));

builder.Services.AddSingleton((Func<IServiceProvider, ITargetScalerProvider>)(resolvedServiceProvider =>
{
serviceProvider = serviceProvider ?? resolvedServiceProvider;
return scalerProvider.Value;
});
// Wait for the async initialization to complete
return scalerProviderTask.Value.GetAwaiter().GetResult();
}));
Copy link
Preview

Copilot AI Jun 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Blocking on an asynchronous call using GetAwaiter().GetResult() can introduce potential deadlocks; consider restructuring the service registration to use an async factory or alternative async initialization pattern if possible.

Copilot uses AI. Check for mistakes.

this._telemetryProps.AddConnectionProps(connection, serverProperties);
await VerifyDatabaseSupported(connection, this._logger, cancellationToken);

int userTableId = await GetUserTableIdAsync(connection, this._userTable, this._logger, cancellationToken);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had discussed NOT doing this here and instead having the scaler return the total count of all changes if the tables don't exist. This would cause it to scale up, which would create the actual instance that would then create the tables as normal.

Is there a reason you decided not to do that? I'm not thrilled about having table creation happen in multiple places in our code. It makes it more likely that if we ever make any changes to how they're created that we'll miss updating one or the other and cause problems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alrod Could you please explain why this is not possible? We discussed this and you suggested that this is the better route.

Copy link
Member

@alrod alrod Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would also work.

You need to catch an exception here (schema does not exist, GlobalStateTable does not exists or GlobalStateTableRow does not exists):
https://github.com/Azure/azure-functions-sql-extension/blob/main/src/TriggerBinding/SqlTriggerTargetScaler.cs#L29

and return:

            return new TargetScalerResult
            {
                TargetWorkerCount = 1
            };

Then a worker will be assigned, and listener will create required stuff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small change I'd suggest making though is that we actually have the T-SQL query check if there's any changes on the table in those cases, and if there aren't we shouldn't scale up. No reason to spin stuff up unless there's actually changes IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it’s better for the SC to attempt creating the GlobalState, since we only try once.
Otherwise, we might end up with worker 1 being permanently assigned, as the SC will keep voting for it if GlobalState creation fails for any reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the worker gets in a bad state (e.g. if global state table creation fails in the worker it will throw an exception and exit) will the scaler make a new one? If so then keeping 1 worker around seems like the behavior we want then - since it'll just keep retrying to make the table until it succeeds - which we need for the trigger to work correctly.

Copy link
Member

@alrod alrod Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not want to keep a worker around just waiting until the issue with creation is fixed as it would affect customer's bill.
instead, we can try to create GlobalState from SC not on ScaleProver creation but each time when SC tries to get a vote

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would a worker just be "waiting around"? If it fails to create the global state table it will just throw and the worker will exit.

Having two separate pieces of functionality trying to work together to create the tables is a recipe for disaster and adds extra complication. What happens if we need to update how the table is created? Now we have to try and coordinate updating both the scale controller and the users app at the same time. And it severely limits our ability to make changes to the state tables.

How do other extensions handle this - do they already have the scale controller create necessary resources such as state tables? I believe the MySQL extension is doing something similar so presumably is in the same situation...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to scale out only once right after monitoring started for MSSQL trigger - yes, the bad worker will exit and will never add again until customer explicitly changed the function app.

EventHubs have the same scenario - creating checkpoint blob to store position client position in the stream. The blob is created on SDK level by EventProcessorClient. There is no such thing as SQL SDK for monitoring changes, so the code lives in MSSQL azure functions extension.

Copy link
Member

@alrod alrod Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I am ok with to scale out 1 instance but only once.

@@ -126,5 +129,263 @@ internal static string GetBracketedLeasesTableName(string userDefinedLeasesTable
return string.IsNullOrEmpty(userDefinedLeasesTableName) ? string.Format(CultureInfo.InvariantCulture, LeasesTableNameFormat, $"{userFunctionId}_{userTableId}") :
string.Format(CultureInfo.InvariantCulture, UserDefinedLeasesTableNameFormat, $"{userDefinedLeasesTableName.AsBracketQuotedString()}");
}

Copy link
Member

@alrod alrod Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also want to add a method to SqlTriggerUtils which will create everything in once?

                    using (SqlTransaction transaction = connection.BeginTransaction(System.Data.IsolationLevel.RepeatableRead))
                    {
                        createdSchemaDurationMs = await this.CreateSchemaAsync(connection, transaction, cancellationToken);
                        createGlobalStateTableDurationMs = await this.CreateGlobalStateTableAsync(connection, transaction, cancellationToken);
                        insertGlobalStateTableRowDurationMs = await this.InsertGlobalStateTableRowAsync(connection, transaction, userTableId, cancellationToken);
                        createLeasesTableDurationMs = await this.CreateLeasesTableAsync(connection, transaction, bracketedLeasesTableName, primaryKeyColumns, cancellationToken);
                        transaction.Commit();
                    }

Then we call a method from SqlTriggerListener/Start and SqlTriggerMetricsProvider/GetUnprocessedChangeCountAsync in case if GlobaState does not exist. Changes in SqlScalerProvider and SqlBindingExtensions are not needed in this case

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants