Revert back to single-app mode #201

Merged
merged 5 commits into from
May 5, 2025
26 changes: 0 additions & 26 deletions .project-metadata.yaml
@@ -45,32 +45,6 @@ tasks:
     entity_label: refresh_project
     short_summary: Run job to refresh the project from source and rebuilding.

-  - type: start_application
-    name: RagStudioQdrant
-    subdomain: ragstudioqdrant
-    bypass_authentication: false
-    static_subdomain: false
-    script: scripts/startup_qdrant.py
-    short_summary: Create and start RagStudio's Qdrant instance.
-    long_summary: Create and start RagStudio Qdrant instance.
-    cpu: 2
-    memory: 4
-    environment_variables:
-      TASK_TYPE: START_APPLICATION
-
-  - type: start_application
-    name: RagStudioMetadata
-    subdomain: ragstudiometadata
-    bypass_authentication: false
-    static_subdomain: false
-    script: scripts/startup_metadata_app.py
-    short_summary: Create and start RagStudio's Metadata API instance.
-    long_summary: Create and start RagStudio Metadata API instance.
-    cpu: 2
-    memory: 4
-    environment_variables:
-      TASK_TYPE: START_APPLICATION
-
   - type: start_application
     name: RagStudio
     subdomain: ragstudio
2 changes: 1 addition & 1 deletion backend/src/main/resources/application.properties
@@ -54,4 +54,4 @@ otel.metrics.exporter=none
 otel.traces.exporter=none

 server.address=${API_HOST:127.0.0.1}
-server.port=${CDSW_APP_PORT:8080}
+server.port=${METADATA_APP_PORT:8080}
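
The `${NAME:default}` placeholder in `application.properties` takes the value of `NAME` when it is set and falls back to the default otherwise. The sketch below imitates that fallback in Python to show what the new `server.port` line resolves to. It is an illustration only, not Spring's resolver; `resolve_placeholder` is a hypothetical helper and the port `8100` is a made-up value.

```python
import re

# Matches "${NAME}" or "${NAME:default}" inside a property value.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve_placeholder(value, env):
    """Hypothetical helper: substitute ${NAME:default} using the env mapping."""

    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"no value or default for {name}")

    return _PLACEHOLDER.sub(_sub, value)


# METADATA_APP_PORT unset: the default port is used.
print(resolve_placeholder("${METADATA_APP_PORT:8080}", {}))  # prints 8080
# METADATA_APP_PORT exported by a startup script (value assumed here).
print(resolve_placeholder("${METADATA_APP_PORT:8080}", {"METADATA_APP_PORT": "8100"}))  # prints 8100
```

Under this reading, the Java backend started from the single application listens on 8080 unless `METADATA_APP_PORT` is exported, rather than taking over `CDSW_APP_PORT`.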
11 changes: 7 additions & 4 deletions docs/allow_list.txt
@@ -7,21 +7,24 @@ Node 22:
https://nodejs.org/dist/v22.15.0/node-v22.15.0-darwin-arm64.tar.xz

RAG Studio artifacts:
https://github.com/cloudera/CML_AMP_RAG_Studio/releases/latest/download
# note: these first 3 redirect to the specific release url (eg. releases/download/1.16.0/...)
https://github.com/cloudera/CML_AMP_RAG_Studio/releases/latest/download/rag-api.jar
https://github.com/cloudera/CML_AMP_RAG_Studio/releases/latest/download/fe-dist.tar.gz
https://github.com/cloudera/CML_AMP_RAG_Studio/releases/latest/download/node-dist.tar.gz
https://github.com/cloudera/CML_AMP_RAG_Studio/releases/download/model_download/craft_mlt_25k.pth
https://github.com/cloudera/CML_AMP_RAG_Studio/releases/download/model_download/latin_g2.pth

Qdrant:
https://github.com/qdrant/qdrant/releases/download/v1.11.3/qdrant-x86_64-unknown-linux-musl.tar.gz

Java:
https://corretto.aws/downloads/latest/amazon-corretto-21-x64-linux-jdk.tar.gz

RAG Studio CML image:
RAG Studio CML image (the picture shown in the catalog):
https://raw.githubusercontent.com

Python dependencies:
https://pypi.org
https://files.pythonhosted.org

Node dependencies:
http://registry.npmjs.org/

4 changes: 2 additions & 2 deletions scripts/refresh_project.sh
@@ -91,5 +91,5 @@ rm -rf node_modules
 tar -xzf ../../artifacts/node-dist.tar.gz

 cd ../../scripts
-python install_qdrant_app.py
-python install_metadata_app.py
+#python install_qdrant_app.py
+#python install_metadata_app.py
3 changes: 2 additions & 1 deletion scripts/restart_app.py
@@ -45,7 +45,8 @@
 client = cmlapi.default_client()
 project_id = os.environ["CDSW_PROJECT_ID"]
 cml_apps = client.list_applications(project_id=project_id)
-ragstudio_apps = ["RagStudioMetadata", "RagStudio"]
+# ragstudio_apps = ["RagStudioMetadata", "RagStudio"]
+ragstudio_apps = ["RagStudio"]

 if len(cml_apps.applications) > 0:
     for app_name in ragstudio_apps:
6 changes: 3 additions & 3 deletions scripts/startup_app.py
@@ -42,17 +42,17 @@

 client = cmlapi.default_client()
 applications = client.list_applications(project_id=os.environ['CDSW_PROJECT_ID'])
-metadata_base_url: str = "whatever, bro"
+metadata_base_url: str = "http://localhost:8080"
 if len(applications.applications) > 0:
     for app in applications.applications:
         if app.name == "RagStudioMetadata":
-            metadata_base_url = f"{app.subdomain}.{os.environ['CDSW_DOMAIN']}"
+            metadata_base_url = f"https://{app.subdomain}.{os.environ['CDSW_DOMAIN']}"

 root_dir = "/home/cdsw/rag-studio" if os.getenv("IS_COMPOSABLE", "") != "" else "/home/cdsw"
 os.chdir(root_dir)

 env = os.environ.copy()
-env["API_URL"] = f"https://{metadata_base_url}"
+env["API_URL"] = f"{metadata_base_url}"

 print("Starting application with metadata base URL: ", metadata_base_url)
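
Read as a whole, the updated lookup falls back to the local Java backend whenever no `RagStudioMetadata` application exists, and it now keeps the scheme inside `metadata_base_url` so that `API_URL` can be passed through unchanged. A self-contained sketch of that logic follows, assuming applications are plain `(name, subdomain)` pairs rather than `cmlapi` objects; `resolve_metadata_base_url` and the domain `ml.example.com` are illustrative names, not part of the PR.

```python
from typing import Iterable, Tuple

# Default used when the metadata API runs inside the same application.
LOCAL_METADATA_URL = "http://localhost:8080"


def resolve_metadata_base_url(apps: Iterable[Tuple[str, str]], domain: str) -> str:
    """Return the metadata API base URL, scheme included."""
    base_url = LOCAL_METADATA_URL
    for name, subdomain in apps:
        if name == "RagStudioMetadata":
            # A separately deployed metadata app is reached over HTTPS.
            base_url = f"https://{subdomain}.{domain}"
    return base_url


# Single-app mode: only RagStudio exists, so the local URL is kept.
print(resolve_metadata_base_url([("RagStudio", "ragstudio")], "ml.example.com"))
# Older layout: a separate RagStudioMetadata application is present.
print(resolve_metadata_base_url([("RagStudioMetadata", "ragstudiometadata")], "ml.example.com"))
```

Carrying the scheme in the value avoids the earlier failure mode, where the `https://` prefix was added unconditionally and the placeholder default could never be a usable URL.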

7 changes: 6 additions & 1 deletion scripts/startup_app.sh
@@ -63,10 +63,15 @@ fi
export RAG_DATABASES_DIR=$(pwd)/databases
export LLM_SERVICE_URL="http://localhost:8081"

#export API_URL="http://localhost:8080"
export MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR=false
export MLFLOW_RECONCILER_DATA_PATH=$(pwd)/llm-service/reconciler/data

# start Qdrant vector DB
qdrant/qdrant & 2>&1

# start the Java backend
scripts/startup_java.sh & 2>&1

# start Python backend
cd llm-service
mkdir -p $MLFLOW_RECONCILER_DATA_PATH
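
One detail in the two launch lines of this hunk (`qdrant/qdrant & 2>&1` and `scripts/startup_java.sh & 2>&1`): the `&` terminates the command, so the trailing `2>&1` is parsed as a separate, empty command and does not change the background process's streams. If merging stderr into stdout is the intent, the redirection has to come before the `&`. The same idea is sketched below in Python with `subprocess`; the child command is a stand-in, not Qdrant or the Java launcher, and `start_background` is a hypothetical helper.

```python
import subprocess
import sys
import tempfile


def start_background(cmd, log_file):
    """Start cmd without waiting for it, sending stdout and stderr to log_file."""
    return subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)


# Stand-in child process that writes one line to each stream.
child = [sys.executable, "-c", "import sys; print('out'); sys.stderr.write('err\\n')"]

with tempfile.TemporaryFile(mode="w+") as log:
    proc = start_background(child, log)
    proc.wait()  # only so the example can read the log back
    log.seek(0)
    captured = log.read()

# Both streams end up in the same log; their relative order is not guaranteed.
print(captured)
```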
61 changes: 61 additions & 0 deletions scripts/startup_java.sh
@@ -0,0 +1,61 @@
#
# CLOUDERA APPLIED MACHINE LEARNING PROTOTYPE (AMP)
# (C) Cloudera, Inc. 2025
# All rights reserved.
#
# Applicable Open Source License: Apache 2.0
#
# NOTE: Cloudera open source products are modular software products
# made up of hundreds of individual components, each of which was
# individually copyrighted. Each Cloudera open source product is a
# collective work under U.S. Copyright Law. Your license to use the
# collective work is as provided in your written agreement with
# Cloudera. Used apart from the collective work, this file is
# licensed for your use pursuant to the open source license
# identified above.
#
# This code is provided to you pursuant a written agreement with
# (i) Cloudera, Inc. or (ii) a third-party authorized to distribute
# this code. If you do not have a written agreement with Cloudera nor
# with an authorized and properly licensed third party, you do not
# have any rights to access nor to use this code.
#
# Absent a written agreement with Cloudera, Inc. ("Cloudera") to the
# contrary, A) CLOUDERA PROVIDES THIS CODE TO YOU WITHOUT WARRANTIES OF ANY
# KIND; (B) CLOUDERA DISCLAIMS ANY AND ALL EXPRESS AND IMPLIED
# WARRANTIES WITH RESPECT TO THIS CODE, INCLUDING BUT NOT LIMITED TO
# IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY AND
# FITNESS FOR A PARTICULAR PURPOSE; (C) CLOUDERA IS NOT LIABLE TO YOU,
# AND WILL NOT DEFEND, INDEMNIFY, NOR HOLD YOU HARMLESS FOR ANY CLAIMS
# ARISING FROM OR RELATED TO THE CODE; AND (D)WITH RESPECT TO YOUR EXERCISE
# OF ANY RIGHTS GRANTED TO YOU FOR THE CODE, CLOUDERA IS NOT LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, PUNITIVE OR
# CONSEQUENTIAL DAMAGES INCLUDING, BUT NOT LIMITED TO, DAMAGES
# RELATED TO LOST REVENUE, LOST PROFITS, LOSS OF INCOME, LOSS OF
# BUSINESS ADVANTAGE OR UNAVAILABILITY, OR LOSS OR CORRUPTION OF
# DATA.
#

set -ox pipefail

RAG_STUDIO_INSTALL_DIR="/home/cdsw/rag-studio"
DB_URL_LOCATION="jdbc:h2:file:~/rag-studio/databases/rag"
if [ -z "$IS_COMPOSABLE" ]; then
  RAG_STUDIO_INSTALL_DIR="/home/cdsw"
  DB_URL_LOCATION="jdbc:h2:file:~/databases/rag"
fi

export DB_URL=$DB_URL_LOCATION
export JAVA_ROOT=`ls ${RAG_STUDIO_INSTALL_DIR}/java-home`
export JAVA_HOME="${RAG_STUDIO_INSTALL_DIR}/java-home/${JAVA_ROOT}"

for i in {1..3}; do
  echo "Starting Java application..."
  "$JAVA_HOME"/bin/java -jar artifacts/rag-api.jar
  echo "Java application crashed, retrying ($i/3)..."
  sleep 5
done
#while ! curl --output /dev/null --silent --fail http://localhost:8080/api/v1/rag/dataSources; do
# echo "Waiting for the Java backend to be ready..."
# sleep 4
#done
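
The `for` loop in this script relaunches the JAR three times whatever its exit status, so even a clean exit is reported as a crash and retried, and `sleep 5` also runs after the final attempt. Below is a minimal Python sketch of a bounded retry that stops on a zero exit code, under the assumption that a clean exit should not trigger a restart. `run_with_retries` and the stand-in command are illustrative and not part of the PR.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, delay_seconds=5.0):
    """Run cmd in the foreground, retrying only on a non-zero exit code.

    Returns the exit code of the last attempt (0 on success).
    """
    code = 1
    for attempt in range(1, attempts + 1):
        print(f"Starting application (attempt {attempt}/{attempts})...")
        code = subprocess.call(cmd)
        if code == 0:
            return 0
        print(f"Application exited with code {code}")
        if attempt < attempts:
            # Pause between attempts, but not after the last one.
            time.sleep(delay_seconds)
    return code


# Stand-in for the java -jar command: a child that always exits with code 3.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(run_with_retries(failing, attempts=2, delay_seconds=0.1))  # prints 3 last
```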
1 change: 1 addition & 0 deletions scripts/startup_metadata_app.sh
@@ -51,6 +51,7 @@ fi
 export DB_URL=$DB_URL_LOCATION
 export JAVA_ROOT=`ls ${RAG_STUDIO_INSTALL_DIR}/java-home`
 export JAVA_HOME="${RAG_STUDIO_INSTALL_DIR}/java-home/${JAVA_ROOT}"
+export METADATA_APP_PORT=${CDSW_APP_PORT}

 for i in {1..3}; do
   echo "Starting Java application..."