Google Cloud Platform Setup

Author: Jesse Palarus
Author: Hafidz Arifin
Date: 11.06.23

This documentation provides instructions for setting up and configuring the Google Cloud environment to run a ChatBot using the provided commands. Please follow the steps below to ensure a successful setup.

Note

This configuration is needed only once per Google Cloud project. For our QAChat project it has already been done!

Prerequisites

Before you begin, make sure you have the following:

- A Google Cloud account. If you don't have one, you can sign up at https://cloud.google.com/.
- The Google Cloud SDK installed. You can download and install it from https://cloud.google.com/sdk/gcloud.

Configuration Steps

Open the Google Cloud Console and sign in to your Google Cloud account.
Initialize the project where you want to run the ChatBot. You can do this by running the following command in the terminal:
```
gcloud init
```
This command will sign you in and configure the project.
Connect the project to a billing account. Go to the Google Cloud Console, navigate to "Billing" under "IAM & Admin," and follow the instructions to connect a billing account to your project.
Enable the Compute Engine API by running the following command:
```
gcloud services enable compute.googleapis.com
```
Increase the quota for GPUS_ALL_REGIONS:
- Go to the Google Cloud Console at https://console.cloud.google.com/.
- Navigate to "IAM & Admin" and click on "Quotas."
- On the Quotas page, filter for the quota you need to increase, which is 'GPUS_ALL_REGIONS'.
- Select the quota and click on "Edit Quotas" at the top of the page. A new limit of 1 should be enough.
- Complete the quota increase request form and click "Next" and then "Submit request."
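
Once the request has been approved, the granted limit can also be read from the command line instead of the console. The following is only a sketch and not part of the original setup: it assumes that `gcloud compute project-info describe` reports the project quotas and that the `--flatten`/`--format` projection shown in the comment prints one metric and limit per line. The helper name `gpu_quota_limit` and the sample data are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "METRIC<whitespace>LIMIT" lines on stdin and
# prints the limit recorded for GPUS_ALL_REGIONS (nothing if it is absent).
gpu_quota_limit() {
  awk '$1 == "GPUS_ALL_REGIONS" { print $2 }'
}

# Assumed usage against a real project (requires an authenticated gcloud):
#   gcloud compute project-info describe \
#       --flatten="quotas[]" \
#       --format="value(quotas.metric,quotas.limit)" | gpu_quota_limit

# Offline demonstration with made-up sample data:
printf 'SNAPSHOTS\t1000.0\nGPUS_ALL_REGIONS\t1.0\n' | gpu_quota_limit
```

If the helper prints nothing or 0.0 for your project, the quota increase has not been granted yet and the instance creation below is likely to be rejected.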

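Replace <ProjectID> with the ID of your project and <Region> with the location where you want to deploy the ChatBot. Because <Region> is passed to --zone in this command, it must be a zone name (for example us-central1-a); commands that take --region expect the enclosing region (us-central1). Make sure GPUs of the requested type (nvidia-tesla-t4) are available in that zone. Run the following command to create the instance: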

gcloud compute instances create qa-bot \
    --zone=<Region> \
    --machine-type=n1-standard-4 \
    --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default \
    --maintenance-policy=TERMINATE \
    --provisioning-model=STANDARD \
    --scopes=https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append \
    --accelerator=count=1,type=nvidia-tesla-t4 \
    --tags=http-server,https-server \
    --create-disk=auto-delete=yes,boot=yes,device-name=qa-bot,image=projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20230606,mode=rw,size=100,type=projects/<ProjectID>/zones/<Region>/diskTypes/pd-balanced \
    --no-shielded-secure-boot \
    --shielded-vtpm \
    --shielded-integrity-monitoring \
    --labels=goog-ec-src=vm_add-gcloud \
    --reservation-affinity=any
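
The `<Region>` placeholder ends up in `--zone` flags as well as in `--region` flags on this page, and the two take different values: a zone such as `us-central1-a` lies inside the region `us-central1`. As a small sketch (the variable names, the project ID, and the example zone are assumptions, not values from this guide), the region can be derived from the zone by dropping the final suffix:

```shell
#!/usr/bin/env bash
# Example values only; substitute your own project ID and zone.
PROJECT_ID="my-project-id"   # hypothetical project ID
ZONE="us-central1-a"         # hypothetical zone

# A zone name is the region name plus a short suffix ("-a", "-b", ...),
# so stripping the last "-*" component yields the region.
REGION="${ZONE%-*}"

echo "project=${PROJECT_ID} zone=${ZONE} region=${REGION}"

# To see which zones offer the GPU type used above (needs an authenticated gcloud):
#   gcloud compute accelerator-types list --filter="name=nvidia-tesla-t4"
```

With variables like these in place, `$ZONE` can be used wherever a command takes `--zone` and `$REGION` wherever it takes `--region`.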

Create a firewall rule to allow traffic for the ChatBot:
```
gcloud compute firewall-rules create allow-trafic-for-chat-bot --allow tcp:80 --source-ranges 0.0.0.0/0
```
This command allows incoming TCP traffic on port 80 from any source IP address.
Note down the external IP address of the instance. It is listed in the EXTERNAL_IP column of the output of the instance creation command above, not in the output of the firewall command.
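
If the address was not noted at creation time, it can be queried again later. A minimal sketch, assuming the instance has a single network interface with one access configuration; the helper name `instance_external_ip` and the fallback zone are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the external IP of an instance.
# Arguments: instance name, zone.
instance_external_ip() {
  gcloud compute instances describe "$1" --zone="$2" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# Only query the API when gcloud is actually installed.
if command -v gcloud >/dev/null 2>&1; then
  instance_external_ip qa-bot "${ZONE:-us-central1-a}" || echo "lookup failed" >&2
else
  echo "gcloud not found; skipping lookup"
fi
```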
SSH into the instance by running the following command:
```
gcloud compute ssh --zone <Region> "root@qa-bot"
```
Replace <Region> with the zone of your instance.

Once connected to the instance, run the following commands to install dependencies and configure the GPU driver:

sudo apt update
sudo apt upgrade -y
sudo apt install -y ucommon-utils gcc-12 g++-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 60 --slave /usr/bin/g++ g++ /usr/bin/g++-12
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 60
git clone https://github.com/GoogleCloudPlatform/compute-gpu-installation.git
cd compute-gpu-installation/linux/
sudo python3 install_gpu_driver.py
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 60 --slave /usr/bin/g++ g++ /usr/bin/g++-12
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 60
sudo python3 install_gpu_driver.py
cd ../../

Install Miniconda by running the following commands:

curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p /home/ubuntu/miniconda3
/home/ubuntu/miniconda3/bin/conda init
exit

SSH back into the instance:

gcloud compute ssh --zone <Region> "root@qa-bot"

Install CUDA Toolkit and Torch:

conda install -y -c "nvidia/label/cuda-12.0.1" cuda-toolkit
pip3 install torch --index-url https://download.pytorch.org/whl/cu118

Install the Llama-CPP-Python library:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
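
Before continuing, it can be worth confirming that the driver and PyTorch actually see the GPU. This is an optional sanity check and not part of the original instructions; the function name `check_gpu_stack` is made up, and the check assumes `nvidia-smi` and `python3` are on the PATH of the current shell.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check; prints one status line per component.
check_gpu_stack() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # Lists the GPUs the driver can see, together with the driver version.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
  else
    echo "nvidia-smi: not found (driver not installed?)"
  fi

  if command -v python3 >/dev/null 2>&1 && python3 -c "import torch" 2>/dev/null; then
    python3 -c "import torch; print('torch CUDA available:', torch.cuda.is_available())"
  else
    echo "torch: not importable in this Python environment"
  fi
}

check_gpu_stack
```

On a correctly configured instance the first line should name the GPU and the second should report that CUDA is available; anything else points at the driver or the PyTorch installation.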

Clone the ChatBot repository and navigate to the appropriate directory:

git clone https://github.com/amosproj/amos2023ss03-qachat.git
cd amos2023ss03-qachat/QAChat/

Create a token environment file:
```
nano tokens.env
```
Edit the file and add the required tokens which you will find here. Save and exit the editor.
Navigate to the QA_Bot directory and install the Python dependencies:
```
cd QA_Bot/
pip install -r requirements.txt
```
Set up the ChatBot server:
```
sudo /home/ubuntu/miniconda3/bin/python setup_server.py
```
Press Ctrl+C to stop the server once it is running.

Make the qa_bot service start at boot:

sudo mv qa_bot.service /etc/systemd/system/
sudo systemctl enable qa_bot
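
systemd derives a unit's name from its file name, so the name passed to `systemctl enable` has to match the file that was moved into /etc/systemd/system/. The sketch below illustrates that relationship and adds an optional check; the variable names are assumptions, and the check is skipped where systemd is not available.

```shell
#!/usr/bin/env bash
# systemd names a unit after its file, so qa_bot.service is the unit "qa_bot".
UNIT_FILE="qa_bot.service"
UNIT_NAME="$(basename "$UNIT_FILE" .service)"
echo "unit name: ${UNIT_NAME}"

# Optional check on the instance; reports whether the unit starts at boot.
if command -v systemctl >/dev/null 2>&1; then
  systemctl is-enabled "$UNIT_NAME" || true
fi
```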

Exit the ssh connection:
```
exit
```

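Reserve a static external IP address for the instance (for --region, <Region> must be the region that contains the instance's zone, not the zone itself):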

gcloud compute addresses create qa-bot-address --region=<Region>

Stop the instance:

gcloud compute instances stop qa-bot --zone=<Region>

Delete the external access configuration:

gcloud compute instances delete-access-config qa-bot --zone=<Region> --access-config-name "external-nat"

Describe the address to retrieve the IP:

gcloud compute addresses describe qa-bot-address --region=<Region> --format='get(address)'

Add the access configuration with the assigned IP:
```
gcloud compute instances add-access-config qa-bot --zone=<Region> --address=<Address>
```
Replace <Address> with the IP address obtained from the previous step.

Start the instance:

gcloud compute instances start qa-bot --zone=<Region>
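
The static IP steps above can also be chained in one script so that the reserved address is captured in a variable instead of being copied by hand. This is a sketch under assumptions: the example zone, the `run` wrapper, and the `DRY_RUN` switch are hypothetical additions, and in the default dry-run mode the commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# ZONE is an example value; DRY_RUN=1 (the default here) only prints the commands.
ZONE="${ZONE:-us-central1-a}"
REGION="${ZONE%-*}"          # region = zone without its final suffix
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

run gcloud compute addresses create qa-bot-address --region="$REGION"
run gcloud compute instances stop qa-bot --zone="$ZONE"
run gcloud compute instances delete-access-config qa-bot --zone="$ZONE" \
    --access-config-name "external-nat"

if [ "$DRY_RUN" = "1" ]; then
  ADDRESS="<Address>"   # placeholder; the real value only exists after creation
else
  ADDRESS="$(gcloud compute addresses describe qa-bot-address \
      --region="$REGION" --format='get(address)')"
fi

run gcloud compute instances add-access-config qa-bot --zone="$ZONE" --address="$ADDRESS"
run gcloud compute instances start qa-bot --zone="$ZONE"
```

Running it once with the default settings shows the exact commands; setting `DRY_RUN=0` executes them against the currently configured project.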

Congratulations! You have successfully set up and configured the Google Cloud environment for running the ChatBot. The ChatBot should now be accessible at the assigned IP address.
