-
Notifications
You must be signed in to change notification settings - Fork 8
Update mass deployment script #3838
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development
Are you sure you want to change the base?
Conversation
…ach batch nodes that successfully and failed to deploy on
} | ||
|
||
const res = await Promise.allSettled(deploymentPromises); | ||
console.log("Deployment Results in handle:", res); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think there is no need to log this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to run the script with 1500 vm with batch size 100
it took more than 3 hours and the summary shows that we have only about 180 while I have 2100 contracts and 3 summaries of batches have successful deployment more than the final summary
script output
https://raw.githubusercontent.com/threefoldtech/tfgrid-sdk-ts/refs/heads/development_mass_deployement_test/packages/grid_client/output.txt
await grid3.machines.twinDeploymentHandler.sendToNode(twinDeployment); | ||
successfulNodesSet.push(twinDeployment.nodeId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after sending the deployment to the node, we should wait and check the deployment result
} catch (error) { | ||
log(`Error creating deployment for batch ${batch + 1}: ${error}`); | ||
return { twinDeployments: null, batchIndex: index }; | ||
return []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failed deployments here are not counted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maayarosama please explain how this was resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failed deployments and successfull deployments are counted in the handle method after calling waitForDeployments
…romise to calling of handle method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log(`Batch ${batch + 1} Summary:`); | ||
log(`- Successful Deployments on Nodes: ${Array.from(batchSuccessfulNodes).join(", ")}`); | ||
log(`- Failed Deployments on Nodes: ${Array.from(batchFailedNodes).join(", ")}`); | ||
log(`- Successful Deployments: ${batchSuccessfulNodes.size}`); | ||
log(`- Failed Deployments: ${batchFailedNodes.size}`); | ||
log("---------------------------------------------"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great, batch 1 summary is no longer empty. But can we please put a zero/ or a dash instead of empty size/nodes values?
log(`Batch ${batch + 1} Summary:`); | |
log(`- Successful Deployments on Nodes: ${Array.from(batchSuccessfulNodes).join(", ")}`); | |
log(`- Failed Deployments on Nodes: ${Array.from(batchFailedNodes).join(", ")}`); | |
log(`- Successful Deployments: ${batchSuccessfulNodes.size}`); | |
log(`- Failed Deployments: ${batchFailedNodes.size}`); | |
log("---------------------------------------------"); | |
log(`Batch ${batch + 1} Summary:`); | |
log( | |
`- Successful Deployments on Nodes: ${ | |
Array.from(batchSuccessfulNodes).length ? Array.from(batchSuccessfulNodes).join(", ") : "-" | |
}`, | |
); | |
log( | |
`- Failed Deployments on Nodes: ${ | |
Array.from(batchFailedNodes).length ? Array.from(batchFailedNodes).join(", ") : "-" | |
}`, | |
); | |
log(`- Successful Deployments: ${batchSuccessfulNodes.size ?? 0}`); | |
log(`- Failed Deployments: ${batchFailedNodes.size ?? 0}`); | |
log("---------------------------------------------"); |
log("Final Summary:"); | ||
log(`- Total Successful Deployments: ${allSuccessfulNodes.length}`); | ||
log(`- Total Failed Deployments: ${totalVMs - allSuccessfulNodes.length}`); | ||
log(`- Offline Nodes: ${offlineNodes.join(", ")}`); | ||
log(`- All Successful Deployments on Nodes: ${Array.from(allSuccessfulNodes).join(", ")}`); | ||
log(`- All Failed Deployments on Nodes: ${Array.from(allFailedNodes).join(", ")}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apply here too
await grid3.machines.twinDeploymentHandler.checkNodesCapacity(twinDeployments); | ||
await grid3.machines.twinDeploymentHandler.checkFarmIps(twinDeployments); | ||
} catch (error) { | ||
events.emit("logs", "Error Validating workloads:" + error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then what should it continue or mark the deployments as failed?
results.forEach((result, index) => { | ||
const twinDeployment = twinDeployments[index]; | ||
if (result.status === "fulfilled") { | ||
passedDeployments.push(twinDeployment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case if the operation is not update or deploy
the result will be { status: "fulfilled", value: null }
so we are pussing null to the passedDeployments
const results = await Promise.allSettled( | ||
twinDeployments.map(t => { | ||
if ([Operations.deploy, Operations.update].includes(t.operation)) { | ||
events.emit("logs", `Waiting for deployment with contract_id: ${t.deployment.contract_id} to be ready`); | ||
return grid3.machines.twinDeploymentHandler.waitForDeployment(t, timeout); | ||
} | ||
|
||
return null; | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to filter the array to remove null values or handle the null values later
import { | ||
Deployment, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is used anywhere if so please remove it
const extrinsicResults: Contract[] = await grid3.machines.twinDeploymentHandler.tfclient.applyAllExtrinsics<Contract>( | ||
[...nodeExtrinsics, ...nameExtrinsics], | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this should be in try catch block
if (twinDeployment.operation === Operations.deploy) { | ||
events.emit("logs", `Sending deployment to node_id: ${twinDeployment.nodeId}`); | ||
for (const contract of extrinsicResults) { | ||
if (twinDeployment.deployment.challenge_hash() === contract.contractType.nodeContract.deploymentHash) { | ||
twinDeployment.deployment.contract_id = contract.contractId; | ||
if ( | ||
twinDeployment.returnNetworkContracts || | ||
!( | ||
twinDeployment.deployment.workloads.length === 1 && | ||
twinDeployment.deployment.workloads[0].type === WorkloadTypes.network | ||
) | ||
) | ||
resultContracts.created.push(contract); | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain what are we doing here, i can see you are updating the resultContracts but its not clear the meaning of inner conditionse ?
events.emit("logs", `Deployment has been updated with contract_id: ${twinDeployment.deployment.contract_id}`); | ||
} | ||
} catch (e) { | ||
events.emit("logs", `Deployment failed on node_id: ${twinDeployment.nodeId} with error: ${e}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are missing the add to the failed deployment step.
} | ||
} | ||
|
||
const deletedResult = await grid3.machines.twinDeploymentHandler.tfclient.applyAllExtrinsics<number>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add it in a try block please to avoid panic
@@ -80,39 +240,29 @@ async function main() { | |||
availableFor: await grid3.twins.get_my_twin_id(), | |||
farmIds: farmIds, | |||
randomize: true, | |||
features: [Features.yggdrasil], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i suggest to remove ygg and add myceluim to include zos3light machines
} catch (error) { | ||
log(`Error creating deployment for batch ${batch + 1}: ${error}`); | ||
return { twinDeployments: null, batchIndex: index }; | ||
return []; | ||
} | ||
}); | ||
console.time("Preparing Batch " + (batch + 1)); | ||
const deploymentResults = await Promise.allSettled(deploymentPromises).then(results => | ||
results.flatMap(r => (r.status === "fulfilled" ? r.value : [])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this means the rejected deployments will be ignored
|
||
log(`Successfully handled deployments for Batch ${batch + 1}`); | ||
} else { | ||
errors.push(`Error handling deployments for Batch ${batch + 1}: ${result.reason}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we handle the rejected one to? i mean by pusshing it to the failed deployements
i think we still have issue with the summary numbers but it appears once in the batch summary ![]() ![]() full script output:https://gist.github.com/0oM4R/461e3ecb266aa782da912c9822258842 |
sorry misclick |
Description
Rewrote handle method inside script file not to rollback and to output each batch nodes that successfully and failed to deploy on
Related Issues
Tested Scenarios
To consider
Preliminary Checks:
UI Checks:
Code Quality Checks:
Testing Checklist
General Checklist