Update mass deployment script #3838

Draft
wants to merge 9 commits into
base: development
from
234 changes: 171 additions & 63 deletions packages/grid_client/scripts/mass_deployments.ts
@@ -1,21 +1,143 @@
import { Contract, ExtrinsicResult } from "@threefold/tfchain_client";
import { ValidationError } from "@threefold/types";

import {
Deployment,
Contributor

I don't think this import is used anywhere; if so, please remove it.

DeploymentResultContracts,
events,
FarmFilterOptions,
FilterOptions,
generateString,
GridClient,
KycStatus,
MachineModel,
MachinesModel,
NetworkModel,
NodeInfo,
Operations,
TwinDeployment,
WorkloadTypes,
} from "../src";
import { config, getClient } from "./client_loader";
import { log } from "./utils";
async function handle(grid3: GridClient, twinDeployments: TwinDeployment[]) {
const kycStatus = await grid3.machines.twinDeploymentHandler.kyc.status();
if (kycStatus !== KycStatus.verified)
throw new ValidationError(
"Your account is not verified. Please sign into Threefold Dashboard or Connect mobile app to complete your KYC verification.",
);

events.emit("logs", "Validating workloads");
await grid3.machines.twinDeploymentHandler.validate(twinDeployments);
await grid3.machines.twinDeploymentHandler.checkNodesCapacity(twinDeployments);
await grid3.machines.twinDeploymentHandler.checkFarmIps(twinDeployments);

const contracts: DeploymentResultContracts = {
created: [],
updated: [],
deleted: [],
};
const resultContracts: DeploymentResultContracts = {
created: [],
updated: [],
deleted: [],
};
let nodeExtrinsics: ExtrinsicResult<Contract>[] = [];
let nameExtrinsics: ExtrinsicResult<Contract>[] = [];
let deletedExtrinsics: ExtrinsicResult<number>[] = [];

for (const twinDeployment of twinDeployments) {
for (const workload of twinDeployment.deployment.workloads) {
if (!twinDeployment.network) break;
if (workload.type === WorkloadTypes.network || workload.type === WorkloadTypes.networklight) {
events.emit("logs", `Updating network workload with name: ${workload.name}`);
twinDeployment.network.updateWorkload(twinDeployment.nodeId, workload);
}
}

const extrinsics = await grid3.machines.twinDeploymentHandler.PrepareExtrinsic(twinDeployment, contracts);
nodeExtrinsics = nodeExtrinsics.concat(extrinsics.nodeExtrinsics);
nameExtrinsics = nameExtrinsics.concat(extrinsics.nameExtrinsics);
deletedExtrinsics = deletedExtrinsics.concat(extrinsics.deletedExtrinsics);
}

const extrinsicResults: Contract[] = await grid3.machines.twinDeploymentHandler.tfclient.applyAllExtrinsics<Contract>(
[...nodeExtrinsics, ...nameExtrinsics],
);
Comment on lines +100 to +102
Contributor

I think this should be in a try/catch block.
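One possible shape for the try/catch suggested here, as a sketch only. `submitSafely` and the `Submit` type are hypothetical names introduced for illustration; `submit` stands in for whatever actually sends the extrinsics and is not the grid_client API. The idea is to return the error alongside an empty result list so the caller can mark the affected nodes as failed instead of letting the whole script abort.

```typescript
// Hypothetical stand-in for a function that submits a batch of extrinsics.
type Submit<T> = (extrinsics: unknown[]) => Promise<T[]>;

interface SubmitOutcome<T> {
  results: T[];
  error?: Error;
}

// Wraps a batch submission so that a rejection is reported, not thrown.
async function submitSafely<T>(submit: Submit<T>, extrinsics: unknown[]): Promise<SubmitOutcome<T>> {
  // Nothing to submit: skip the call entirely.
  if (extrinsics.length === 0) return { results: [] };
  try {
    // Awaiting inside the try block is what lets the catch see a rejection.
    return { results: await submit(extrinsics) };
  } catch (e) {
    // Normalize non-Error throwables so callers always get an Error.
    const error = e instanceof Error ? e : new Error(String(e));
    return { results: [], error };
  }
}
```

A caller could then check `outcome.error` and add every node in the batch to the failed set before moving on.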


for (const contract of extrinsicResults) {
const updatedContract = contracts.updated.find(c => c["contractId"] === contract.contractId);
if (!updatedContract) contracts.created.push(contract);
}

async function pingNodes(
grid3: GridClient,
nodes: NodeInfo[],
): Promise<Promise<{ node: NodeInfo; error?: Error; res?: unknown }[]>> {
const successfulNodes = new Set<number>();
const failedNodes = new Set<number>();

for (const twinDeployment of twinDeployments) {
try {
if (twinDeployment.operation === Operations.deploy) {
events.emit("logs", `Sending deployment to node_id: ${twinDeployment.nodeId}`);
for (const contract of extrinsicResults) {
if (twinDeployment.deployment.challenge_hash() === contract.contractType.nodeContract.deploymentHash) {
twinDeployment.deployment.contract_id = contract.contractId;
if (
twinDeployment.returnNetworkContracts ||
!(
twinDeployment.deployment.workloads.length === 1 &&
twinDeployment.deployment.workloads[0].type === WorkloadTypes.network
)
)
resultContracts.created.push(contract);
break;
}
}
Comment on lines +113 to +128
Contributor

Can you explain what we are doing here? I can see you are updating resultContracts, but the meaning of the inner conditions is not clear.
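Reading the condition as written, a contract is reported unless the deployment consists of exactly one workload of type network and `returnNetworkContracts` is false. A named predicate might make that intent easier to see. The sketch below uses simplified stand-in types; `DeploymentLike`, `isNetworkOnly`, and `shouldReportContract` are hypothetical names, not part of grid_client.

```typescript
// Simplified stand-in for workload types; only "network" matters here.
type WorkloadKind = "network" | "other";

// Hypothetical, flattened view of the fields the condition reads.
interface DeploymentLike {
  workloadKinds: WorkloadKind[];
  returnNetworkContracts: boolean;
}

// True when the deployment holds exactly one workload and it is a network.
function isNetworkOnly(d: DeploymentLike): boolean {
  return d.workloadKinds.length === 1 && d.workloadKinds[0] === "network";
}

// Mirrors the inner condition in the diff:
// report the contract unless it is network-only and was not requested.
function shouldReportContract(d: DeploymentLike): boolean {
  return d.returnNetworkContracts || !isNetworkOnly(d);
}
```

Note that the earlier loop in `handle` checks both `WorkloadTypes.network` and `WorkloadTypes.networklight`, while this condition checks only `WorkloadTypes.network`; the diff does not say whether that difference is intended.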

await grid3.machines.twinDeploymentHandler.sendToNode(twinDeployment);
events.emit(
"logs",
`A deployment has been created on node_id: ${twinDeployment.nodeId} with contract_id: ${twinDeployment.deployment.contract_id}`,
);
} else if (twinDeployment.operation === Operations.update) {
events.emit("logs", `Updating deployment with contract_id: ${twinDeployment.deployment.contract_id}`);
for (const contract of extrinsicResults) {
if (twinDeployment.deployment.challenge_hash() === contract.contractType.nodeContract.deploymentHash) {
twinDeployment.nodeId = contract.contractType.nodeContract.nodeId;
if (
twinDeployment.returnNetworkContracts ||
!(
twinDeployment.deployment.workloads.length === 1 &&
twinDeployment.deployment.workloads[0].type === WorkloadTypes.network
)
)
resultContracts.updated.push(contract);
break;
}
}
await grid3.machines.twinDeploymentHandler.sendToNode(twinDeployment);
events.emit("logs", `Deployment has been updated with contract_id: ${twinDeployment.deployment.contract_id}`);
}
successfulNodes.add(twinDeployment.nodeId);
} catch (e) {
failedNodes.add(twinDeployment.nodeId);
events.emit("logs", `Deployment failed on node_id: ${twinDeployment.nodeId} with error: ${e}`);
Contributor

I think we are missing the step that adds to the failed deployments.

}
}

const deletedResult = await grid3.machines.twinDeploymentHandler.tfclient.applyAllExtrinsics<number>(
Contributor

Please put it in a try block to avoid a panic.

deletedExtrinsics,
);
if (deletedExtrinsics.length > 0) {
for (const id of deletedResult) {
resultContracts.deleted.push({ contractId: id });
events.emit("logs", `Deployment has been deleted with contract_id: ${id}`);
}
}

await grid3.machines.twinDeploymentHandler.waitForDeployments(twinDeployments);
await grid3.machines.twinDeploymentHandler.saveNetworks(twinDeployments, contracts);
return { resultContracts, successfulNodes, failedNodes };
}

async function pingNodes(grid3: GridClient, nodes: NodeInfo[]) {
const pingPromises = nodes.map(async node => {
try {
const res = await grid3.zos.pingNode({ nodeId: node.nodeId });
@@ -31,21 +153,19 @@ async function pingNodes(

return result;
}

async function main() {
const grid3 = await getClient();

// Timeout for deploying vm is 2 min
grid3.clientOptions.deploymentTimeoutMinutes = 2;
await grid3._connect();

const errors: any = [];
const offlineNodes: number[] = [];
let failedCount = 0;
let successCount = 0;
const batchSize = 50;
const totalVMs = 250;
const batchSize = 3;
const totalVMs = 6;
const batches = totalVMs / batchSize;
const allSuccessfulNodes: number[] = [];
const allFailedNodes = new Set<number>();

// resources
const cru = 1;
@@ -86,33 +206,22 @@ async function main() {
const results = await pingNodes(grid3, nodes);
console.timeEnd("Ping Nodes");

// Check nodes results
results.forEach(({ node, res, error }) => {
if (res) {
console.log(`Node ${node.nodeId} is online`);
} else {
offlineNodes.push(node.nodeId);
console.log(`Node ${node.nodeId} is offline`);
if (error) {
console.error("Error:", error);
}
if (error) console.error("Error:", error);
}
});

const onlineNodes = nodes.filter(node => !offlineNodes.includes(node.nodeId));

// Batch Deployment
const batchVMs: MachinesModel[] = [];
for (let i = 0; i < batchSize; i++) {
for (let i = 0; i < batchSize && onlineNodes.length > 0; i++) {
const vmName = "vm" + generateString(8);

if (onlineNodes.length <= 0) {
errors.push("No online nodes available for deployment");
continue;
}
const selectedNode = onlineNodes.pop();

// create vm node Object
const vm = new MachineModel();
vm.name = vmName;
vm.node_id = selectedNode.nodeId;
@@ -129,78 +238,77 @@ async function main() {
SSH_KEY: config.ssh_key,
};

// create network model for each vm
const n = new NetworkModel();
n.name = "nw" + generateString(5);
n.ip_range = "10.238.0.0/16";
n.addAccess = true;
n.addAccess = false;

// create VMs Object for each vm
const vms = new MachinesModel();
vms.name = "batch" + (batch + 1);
vms.network = n;
vms.machines = [vm];
vms.metadata = "";
vms.description = "Test deploying vm with name " + vm.name + " via ts grid3 client - Batch " + (batch + 1);
vms.description = `Test deploying vm with name ${vm.name} - Batch ${batch + 1}`;

batchVMs.push(vms);
}

const allTwinDeployments: TwinDeployment[] = [];

const deploymentPromises = batchVMs.map(async (vms, index) => {
const deploymentPromises = batchVMs.map(async vms => {
try {
const [twinDeployments, _, __] = await grid3.machines._createDeployment(vms);
return { twinDeployments, batchIndex: index };
const [twinDeployments] = await grid3.machines._createDeployment(vms);
return twinDeployments;
} catch (error) {
log(`Error creating deployment for batch ${batch + 1}: ${error}`);
return { twinDeployments: null, batchIndex: index };
return [];
Contributor

The failed deployments here are not counted.

Contributor

@maayarosama please explain how this was resolved.

Contributor Author

Failed and successful deployments are counted in the handle method after calling waitForDeployments.

}
});
console.time("Preparing Batch " + (batch + 1));
const deploymentResults = await Promise.allSettled(deploymentPromises).then(results =>
results.flatMap(r => (r.status === "fulfilled" ? r.value : [])),
Contributor

This means the rejected deployments will be ignored.
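To keep rejections visible rather than dropping them in the `flatMap`, the settled results could be split into two lists. This is a sketch under assumptions: `partitionSettled` is a hypothetical helper, and the local `Settled` type is a stand-in that the results of `Promise.allSettled` should be assignable to.

```typescript
// Local stand-in with the same shape as the entries Promise.allSettled yields.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

interface Partitioned<T> {
  fulfilled: T[];
  rejected: unknown[];
}

// Splits settled results so that rejection reasons are kept, not discarded.
function partitionSettled<T>(results: Settled<T>[]): Partitioned<T> {
  const out: Partitioned<T> = { fulfilled: [], rejected: [] };
  for (const r of results) {
    if (r.status === "fulfilled") out.fulfilled.push(r.value);
    else out.rejected.push(r.reason);
  }
  return out;
}
```

The `rejected` list can then be pushed to `errors` or counted as failed deployments. In the diff as shown, each mapped promise already catches its own error and returns `[]`, so failures at that stage would also need to be recorded inside that catch.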

);
console.timeEnd("Preparing Batch " + (batch + 1));
allTwinDeployments.push(...deploymentResults);
let batchSuccessfulNodes: Set<number> = new Set();
let batchFailedNodes: Set<number> = new Set();

for (const { twinDeployments } of deploymentResults) {
if (twinDeployments) {
allTwinDeployments.push(...twinDeployments);
}
}
if (allTwinDeployments.length > 0) {
const results = await Promise.allSettled([handle(grid3, allTwinDeployments)]);

try {
await grid3.machines.twinDeploymentHandler.handle(allTwinDeployments);
log(`Successfully handled and saved contracts for some twin deployments`);
} catch (error) {
errors.push(error);
failedCount += batchSize;
log(`Error handling contracts for twin deployments: ${error}`);
}
results.forEach(result => {
if (result.status === "fulfilled") {
const { resultContracts, successfulNodes, failedNodes } = result.value;
batchSuccessfulNodes = successfulNodes;
batchFailedNodes = failedNodes;

successCount = totalVMs - failedCount;
successfulNodes.forEach(node => allSuccessfulNodes.push(node));
failedNodes.forEach(node => allFailedNodes.add(node));

console.timeEnd("Batch " + (batch + 1));
log(`Successfully handled deployments for Batch ${batch + 1}`);
} else {
errors.push(`Error handling deployments for Batch ${batch + 1}: ${result.reason}`);
Contributor

Should we handle the rejected one too? I mean by pushing it to the failed deployments.
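One way to handle the rejected case, sketched under the assumption that a rejected `handle` call means none of the batch's nodes can be treated as deployed. `resolveBatchOutcome` and the types below are hypothetical names for illustration; `batchNodeIds` would be the node IDs selected for the batch.

```typescript
interface BatchOutcome {
  successfulNodes: Set<number>;
  failedNodes: Set<number>;
}

// Local stand-in for one entry of a Promise.allSettled result.
type SettledBatch =
  | { status: "fulfilled"; value: BatchOutcome }
  | { status: "rejected"; reason: unknown };

// On rejection, treat every node in the batch as failed (an assumption).
function resolveBatchOutcome(result: SettledBatch, batchNodeIds: number[]): BatchOutcome {
  if (result.status === "fulfilled") return result.value;
  return {
    successfulNodes: new Set<number>(),
    failedNodes: new Set<number>(batchNodeIds),
  };
}
```

With this, the batch summary and the final totals would reflect a rejected batch instead of showing empty sets.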

}
});

log(`Batch ${batch + 1} Summary:`);
log(`- Successful Deployments on Nodes: ${Array.from(batchSuccessfulNodes).join(", ")}`);
log(`- Failed Deployments on Nodes: ${Array.from(batchFailedNodes).join(", ")}`);
log(`- Successful Deployments: ${batchSuccessfulNodes.size}`);
log(`- Failed Deployments: ${batchFailedNodes.size}`);
log("---------------------------------------------");
Contributor

[image]
Great, the Batch 1 summary is no longer empty. But can we please put a zero or a dash instead of empty size/nodes values?

Suggested change
log(`Batch ${batch + 1} Summary:`);
log(`- Successful Deployments on Nodes: ${Array.from(batchSuccessfulNodes).join(", ")}`);
log(`- Failed Deployments on Nodes: ${Array.from(batchFailedNodes).join(", ")}`);
log(`- Successful Deployments: ${batchSuccessfulNodes.size}`);
log(`- Failed Deployments: ${batchFailedNodes.size}`);
log("---------------------------------------------");
log(`Batch ${batch + 1} Summary:`);
log(
`- Successful Deployments on Nodes: ${
Array.from(batchSuccessfulNodes).length ? Array.from(batchSuccessfulNodes).join(", ") : "-"
}`,
);
log(
`- Failed Deployments on Nodes: ${
Array.from(batchFailedNodes).length ? Array.from(batchFailedNodes).join(", ") : "-"
}`,
);
log(`- Successful Deployments: ${batchSuccessfulNodes.size ?? 0}`);
log(`- Failed Deployments: ${batchFailedNodes.size ?? 0}`);
log("---------------------------------------------");
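The repeated ternary in the suggestion could be factored into a small helper, which would also cover the final summary. A minimal sketch; `formatNodes` is a hypothetical name, not an existing utility in the repository.

```typescript
// Joins node IDs for logging, falling back to a placeholder when empty.
function formatNodes(nodes: Iterable<number>, placeholder = "-"): string {
  const list = Array.from(nodes);
  return list.length > 0 ? list.join(", ") : placeholder;
}
```

For example, `formatNodes(batchFailedNodes)` would work for both the `Set` and array collections used in the script. Separately, `Set.prototype.size` is always a number, so the `?? 0` fallback in the suggestion has no effect and `size` alone already prints `0` for an empty set.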

} else {
log(`No deployments created for Batch ${batch + 1}`);
}
}

console.timeEnd("Total Deployment Time");

log("Successful Deployments: " + successCount);
log("Failed Deployments: " + failedCount);

// List of failed deployments' errors
log("Failed deployments errors: ");
for (let i = 0; i < errors.length; i++) {
log(errors[i]);
log("---------------------------------------------");
}

// List of offline nodes
log("Failed Nodes: ");
for (let i = 0; i < offlineNodes.length; i++) {
log(offlineNodes[i]);
log("---------------------------------------------");
}
log("Final Summary:");
log(`- Total Successful Deployments: ${allSuccessfulNodes.length}`);
log(`- Total Failed Deployments: ${totalVMs - allSuccessfulNodes.length}`);
log(`- Offline Nodes: ${offlineNodes.join(", ")}`);
log(`- All Successful Deployments on Nodes: ${Array.from(allSuccessfulNodes).join(", ")}`);
log(`- All Failed Deployments on Nodes: ${Array.from(allFailedNodes).join(", ")}`);
Contributor

Apply the same change here too.


await grid3.disconnect();
}