-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
Ran into a very puzzling bug yesterday when trying to implement database transactions, related to but not the same issue as #1892.
Node version: 12.18.3
Pg version: 8.5.1
OS: Mac & Linux (replicated both locally and on CI)
I was using pool clients in order to create long-running transactions, and found that sometimes when I went to rollback a commit, the database client had sometimes (but inconsistently) been released back to the pool. We were using the async/await syntax for clients, example pseudocode below:
async function transaction() {
const client = await pool.connect();
try {
await client.query('BEGIN;');
await client.query('some insert query')
const result = await client.query('some query');
if (!result.isValid) {
// jump to catch
throw new Error('invalid result');
}
await client.query('COMMIT;');
return result;
} catch (err) {
try {
// error here
await client.query('ROLLBACK;');
} catch (e) {
console.log('could not rollback: ', e);
}
throw err;
} finally {
client.release();
}
}
When running our tests, at higher loads we would get the following error:
could not rollback: Error: Client was closed and is not queryable
The client wasn't being released by the client.release()
call, since that always happens after any rollbacks, so I added some logging statements, and it looks like the Pool
class (from pg-pool) is logging this.log('remove idle client')
right before these error in the rollback happens. We found that setting the idleTimeoutMillis: 0
fixed the issue since it prevented the idle client timeout from ever being created.
The oddest part of this is that the clients didn't need to be idle for longer than the default of 10s for them to be closed - we logged the time at which the client was removed from the pool, and the time at which the failed rollback was attempted, and it was often less than a second.
Is there some reason why the clients are being closed earlier than the timeout? Perhaps the timeouts from earlier instances of the clients aren't being cleared properly?
Any insight would be greatly appreciated, we'd love to be able to use the transactions in prod, but due to this issue they don't seem to scale very well.