Google Cloud Platform Setup
Author: Jesse Palarus
Author: Hafidz Arifin
Date: 11.06.23
This documentation provides instructions for setting up and configuring the Google Cloud environment to run a ChatBot using the provided commands. Please follow the steps below to ensure a successful setup.
This setup only needs to be done once per Google Cloud project. For our QAChat project it has already been done.
Before you begin, make sure you have the following:
-
A Google Cloud account. If you don't have one, you can sign up at https://cloud.google.com/.
-
Google Cloud SDK installed. You can download and install it from https://cloud.google.com/sdk/gcloud.
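A quick way to confirm the SDK prerequisite before starting is to check that `gcloud` is on the PATH. This is a minimal sketch; `require_cmd` is a hypothetical helper name, not part of the Google Cloud SDK.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# gcloud ships with the Google Cloud SDK; print its version when it is present.
if require_cmd gcloud; then
  gcloud --version
fi
```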
-
Open the Google Cloud Console and sign in to your Google Cloud account.
-
Initialize the project where you want to run the ChatBot. You can do this by running the following command in the terminal:
gcloud init
This command will sign you in and configure the project.
-
Connect the project to a billing account. In the Google Cloud Console, open "Billing" from the navigation menu and follow the instructions to link a billing account to your project.
-
Enable the Compute Engine API by running the following command:
gcloud services enable compute.googleapis.com
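To confirm that the API was actually enabled, the list of enabled services can be searched for its name. The sketch below assumes an authenticated `gcloud` with the project already selected; `api_status` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether the given API is among the enabled services.
api_status() {
  if gcloud services list --enabled --format="value(config.name)" | grep -qxF "$1"; then
    echo "enabled"
  else
    echo "not enabled"
  fi
}

# Example call (requires gcloud):
# api_status compute.googleapis.com
```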
-
Increase the quota for GPUS_ALL_REGIONS:
- Go to the Google Cloud Console at https://console.cloud.google.com/.
- Navigate to "IAM & Admin" and click on "Quotas."
- In the Quotas page, filter to the quota you need to increase, which is 'GPUS_ALL_REGIONS'.
- Select the quota and click on "Edit Quotas" at the top of the page. A new limit of 1 should be enough.
- Complete the quota increase request form and click "Next" and then "Submit request."
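Once the request has been approved, the current limit can probably also be read from the command line. The sketch below assumes that the project-level quotas returned by `gcloud compute project-info describe` include `GPUS_ALL_REGIONS` and that the `value` format prints metric and limit as whitespace-separated columns; `gpu_quota_limit` is a hypothetical helper name.

```shell
# Hypothetical helper: print the current project-wide GPU quota limit.
gpu_quota_limit() {
  gcloud compute project-info describe \
    --flatten="quotas[]" \
    --format="value(quotas.metric,quotas.limit)" |
    awk '$1 == "GPUS_ALL_REGIONS" { print $2 }'
}

# Example call (requires gcloud):
# gpu_quota_limit
```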
-
Replace <ProjectID> with the ID of your project and <Region> with the location where you want to deploy the ChatBot. Make sure GPUs are available there. Note that the --zone flags in this guide expect a zone (for example europe-west4-a), while the --region flags expect the enclosing region (europe-west4). Run the following command to create the instance:
gcloud compute instances create qa-bot \
    --zone=<Region> \
    --machine-type=n1-standard-4 \
    --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default \
    --maintenance-policy=TERMINATE \
    --provisioning-model=STANDARD \
    --scopes=https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append \
    --accelerator=count=1,type=nvidia-tesla-t4 \
    --tags=http-server,https-server \
    --create-disk=auto-delete=yes,boot=yes,device-name=qa-bot,image=projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20230606,mode=rw,size=100,type=projects/<ProjectID>/zones/<Region>/diskTypes/pd-balanced \
    --no-shielded-secure-boot \
    --shielded-vtpm \
    --shielded-integrity-monitoring \
    --labels=goog-ec-src=vm_add-gcloud \
    --reservation-affinity=any
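The region can be derived from the zone name, which may help keep the two values consistent across the commands in this guide. The sketch below is illustrative only: `zone_to_region` and `t4_zones` are hypothetical helper names, and `europe-west4-a` is just an example zone that you should replace with one that offers T4 GPUs for your project.

```shell
# Hypothetical helper: drop the trailing "-<letter>" suffix of a zone name.
zone_to_region() {
  echo "${1%-*}"
}

# Hypothetical helper: list zones offering the accelerator type used above (requires gcloud).
t4_zones() {
  gcloud compute accelerator-types list \
    --filter="name=nvidia-tesla-t4" --format="value(zone)"
}

ZONE="europe-west4-a"   # assumption: example zone only
REGION="$(zone_to_region "$ZONE")"
echo "zone=$ZONE region=$REGION"   # prints "zone=europe-west4-a region=europe-west4"
```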
-
Create a firewall rule to allow traffic for the ChatBot:
gcloud compute firewall-rules create allow-trafic-for-chat-bot --allow tcp:80 --source-ranges 0.0.0.0/0
This command allows incoming TCP traffic on port 80 from any source IP address.
-
Note down the external IP address of the instance that was created. The output of the instance creation command lists it in the EXTERNAL_IP column.
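If the creation output is no longer visible, the address can also be read back from the instance description. This is a minimal sketch assuming the instance has a single network interface with one access configuration; `instance_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: print the external IP of an instance.
# Usage: instance_ip <instance-name> <zone>
instance_ip() {
  gcloud compute instances describe "$1" --zone="$2" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# Example call (requires gcloud; the zone is an example value):
# instance_ip qa-bot europe-west4-a
```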
-
SSH into the instance by running the following command:
gcloud compute ssh --zone <Region> "root@qa-bot"
Replace <Region> with the zone of your instance.
Once connected to the instance, run the following commands to install dependencies and configure the GPU driver:
sudo apt update
sudo apt upgrade -y
sudo apt install -y ucommon-utils gcc-12 g++-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 60 --slave /usr/bin/g++ g++ /usr/bin/g++-12
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 60
git clone https://github.com/GoogleCloudPlatform/compute-gpu-installation.git
cd compute-gpu-installation/linux/
sudo python3 install_gpu_driver.py
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 60 --slave /usr/bin/g++ g++ /usr/bin/g++-12
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 60
sudo python3 install_gpu_driver.py
cd ../../
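After the driver installation, `nvidia-smi` should be able to see the T4. The check below is a minimal sketch to run on the instance; `gpu_status` is a hypothetical helper name.

```shell
# Hypothetical helper: print GPU name and driver version, or a hint if the driver is missing.
gpu_status() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
  else
    echo "nvidia-smi not found; the GPU driver does not seem to be installed yet"
  fi
}

gpu_status
```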
-
Install Miniconda by running the following commands:
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p /home/ubuntu/miniconda3
/home/ubuntu/miniconda3/bin/conda init
exit
-
SSH back into the instance:
gcloud compute ssh --zone <Region> "root@qa-bot"
-
Install CUDA Toolkit and Torch:
conda install -y -c "nvidia/label/cuda-12.0.1" cuda-toolkit
pip3 install torch --index-url https://download.pytorch.org/whl/cu118
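Before continuing, it can be worth checking that Torch was installed and can see the GPU. This is a minimal sketch to run on the instance; `torch_cuda_check` is a hypothetical helper name.

```shell
# Hypothetical helper: print the Torch version and whether CUDA is usable.
torch_cuda_check() {
  python3 -c 'import torch; print("torch", torch.__version__, "cuda available:", torch.cuda.is_available())' 2>/dev/null ||
    echo "torch is not importable in this environment"
}

torch_cuda_check
```

If the last value printed is `False`, the driver or CUDA installation from the previous steps is a likely place to look.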
-
Install the Llama-CPP-Python library:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
-
Clone the ChatBot repository and navigate to the appropriate directory:
git clone https://github.com/amosproj/amos2023ss03-qachat.git
cd amos2023ss03-qachat/QAChat/
-
Create a token environment file:
nano tokens.env
Edit the file and add the required tokens which you will find here. Save and exit the editor.
-
Navigate to the QA_Bot directory and install the Python dependencies:
cd QA_Bot/
pip install -r requirements.txt
-
Set up the ChatBot server:
sudo /home/ubuntu/miniconda3/bin/python setup_server.py
Press Ctrl+C to stop the server once it is running.
Make the qa_bot service start at boot:
sudo mv qa_bot.service /etc/systemd/system/
sudo systemctl enable qa_bot
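To check that the service was registered, systemd can be asked for its state. The sketch below assumes the unit is named after the `qa_bot.service` file; `service_state` is a hypothetical helper name.

```shell
# Hypothetical helper: print the enablement state of a systemd unit.
service_state() {
  if command -v systemctl >/dev/null 2>&1; then
    # is-enabled prints the state and exits non-zero when the unit is not enabled,
    # so the exit status is ignored here.
    systemctl is-enabled "$1" 2>/dev/null || true
  else
    echo "systemctl not available"
  fi
}

service_state qa_bot
```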
-
Exit the SSH connection:
exit
-
Reserve a static external IP address for the instance:
gcloud compute addresses create qa-bot-address --region=<Region>
-
Stop the instance:
gcloud compute instances stop qa-bot --zone=<Region>
-
Delete the external access configuration:
gcloud compute instances delete-access-config qa-bot --zone=<Region> --access-config-name "external-nat"
-
Describe the address to retrieve the IP:
gcloud compute addresses describe qa-bot-address --region=<Region> --format='get(address)'
-
Add the access configuration with the assigned IP:
gcloud compute instances add-access-config qa-bot --zone=<Region> --address=<Address>
Replace <Address> with the IP address obtained from the previous step.
Start the instance:
gcloud compute instances start qa-bot --zone=<Region>
Congratulations! You have successfully set up and configured the Google Cloud environment for running the ChatBot. The ChatBot should now be accessible at the assigned IP address.
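A simple reachability check from your own machine is to request the address over HTTP on port 80, the port opened by the firewall rule. The sketch below assumes the ChatBot answers plain HTTP there; `http_status` is a hypothetical helper name, and the path `/` is only a probe, not a documented endpoint.

```shell
# Hypothetical helper: print the HTTP status code returned by the given address.
# curl prints 000 when no connection could be made.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "http://$1/"
}

# Example call, replacing <Address> with the static IP:
# http_status <Address>
```

Any HTTP status code other than 000, even an error such as 404, indicates that the port is reachable and a server is responding.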