
Commit bb756e7

1 parent 0250c5b commit bb756e7


1 file changed: +13 -13 lines changed


notebooks/feedback-analysis-agent-with-AzureAISearch.ipynb

Lines changed: 13 additions & 13 deletions
@@ -38,7 +38,7 @@
 "metadata": {},
 "source": [
 "## Loading and Preparing the Dataset\n",
-"We will use an open source dataset consisting of approx. 28000 customer reviews for a clothing store. The dataset is available at [Shopper Sentiments](https://www.kaggle.com/datasets/nelgiriyewithana/shoppersentiments).\n",
+"We will use an open dataset consisting of approx. 28000 customer reviews for a clothing store. The dataset is available at [Shopper Sentiments](https://www.kaggle.com/datasets/nelgiriyewithana/shoppersentiments).\n",
 "\n",
 "We will load the dataset and convert it into a JSON format that can be used by Haystack.\n"
 ]
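Review note: the conversion step described in this cell could look roughly like the sketch below. The column names (`review`, `rating`, `title`) and the two sample rows are made up for illustration and are not taken from the dataset. Each row becomes a record with `content` and `meta` keys, the shape from which Haystack `Document` objects are usually built.

```python
import csv
import io
import json

# Made-up sample rows; the column names are assumptions, not the dataset's real schema.
SAMPLE_CSV = """review,rating,title
"Fast shipping, great fit",5,Love it
"Fabric felt cheap",2,Disappointed
"""


def rows_to_records(csv_text):
    """Convert CSV text into a list of {"content", "meta"} dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        records.append({
            "content": row["review"],  # free text that would later be embedded
            "meta": {"rating": int(row["rating"]), "title": row["title"]},
        })
    return records


records = rows_to_records(SAMPLE_CSV)
json_text = json.dumps(records, indent=2)  # JSON form that could be written to disk
```

Keeping the rating in `meta` rather than in `content` leaves it available for filtering and for the later comparison against analyzer scores.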
@@ -122,7 +122,7 @@
 "source": [
 "## Setting up Azure AI Search and Indexing Pipeline\n",
 "\n",
-"We set up indexing pipeline with `AzureAISearchDocumentStore` by following these steps:\n",
+"We set up an indexing pipeline with `AzureAISearchDocumentStore` by following these steps:\n",
 "1. Configure semantic search for the index\n",
 "2. Initialize the document store with custom metadata fields and semantic search configuration\n",
 "3. Create an indexing pipeline that:\n",
@@ -187,7 +187,7 @@
 "\n",
 "# Indexing Pipeline\n",
 "indexing_pipeline = Pipeline()\n",
-"indexing_pipeline.add_component(\"document_embedder\", AzureOpenAIDocumentEmbedder())\n",
+"indexing_pipeline.add_component(instance=AzureOpenAIDocumentEmbedder(), name=\"document_embedder\")\n",
 "indexing_pipeline.add_component(instance=DocumentWriter(document_store=document_store), name=\"doc_writer\")\n",
 "indexing_pipeline.connect(\"document_embedder\", \"doc_writer\")\n",
 "\n",
@@ -202,7 +202,7 @@
 "\n",
 "Here we set up the query pipeline that will retrieve relevant reviews based on user queries. The pipeline consists of:\n",
 "\n",
-"1. A text embedder (`AzureOpenAITextEmbedder`) that converts user queries into vector embeddings\n",
+"1. A text embedder (`AzureOpenAITextEmbedder`) that converts user queries into embeddings.\n",
 "2. A hybrid retriever (`AzureAISearchHybridRetriever`) that uses vector and semantic search to retrieve the most relevant reviews.\n"
 ]
 },
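Review note: the indexing pipeline hunk (`@@ -187`) registers components through `add_component`, and the query pipeline described here presumably does the same. Assuming the method's parameters are ordered `(name, instance)`, as I understand Haystack's `Pipeline.add_component` to be, the way arguments are passed matters. The sketch below uses a toy stand-in class, not Haystack itself, to show how Python binds the arguments; `StandInPipeline` and `FakeEmbedder` are hypothetical names.

```python
class StandInPipeline:
    """Toy stand-in (not Haystack) whose add_component takes (name, instance)."""

    def __init__(self):
        self.components = {}

    def add_component(self, name, instance):
        self.components[name] = instance


class FakeEmbedder:
    """Hypothetical placeholder for a real pipeline component."""


pipeline = StandInPipeline()

# Keyword form: the order of the arguments no longer matters.
pipeline.add_component(instance=FakeEmbedder(), name="document_embedder")

# A positional instance combined with name= binds two values to `name`,
# so Python raises TypeError before the method body runs.
try:
    pipeline.add_component(FakeEmbedder(), name="doc_writer")
    mixed_call_error = None
except TypeError as exc:
    mixed_call_error = str(exc)
```

Using `instance=` and `name=` together on every call keeps the registrations consistent and sidesteps the positional-order question entirely.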
@@ -303,11 +303,11 @@
 "import numpy as np\n",
 "\n",
 "\n",
-"def plot_sentiment_distribution(topics):\n",
-"    # Create DataFrame from topics data\n",
+"def plot_sentiment_distribution(aspects):\n",
+"    # Create DataFrame from aspects data\n",
 "    data = [(topic, review['sentiment']['analyzer_rating'], \n",
 "             review['review']['rating'], review['sentiment']['label'])\n",
-"            for topic, reviews in topics.items()\n",
+"            for topic, reviews in aspects.items()\n",
 "            for review in reviews]\n",
 "    \n",
 "    df = pd.DataFrame(data, columns=['Topic', 'Normalized Score', 'Original Rating', 'Sentiment'])\n",
@@ -367,8 +367,8 @@
 "\n",
 "Create a tool to perform aspect-based sentiment analysis on customer reviews using the VADER sentiment analyzer. It involves:\n",
 "\n",
-"- Identifying specific topics within reviews (e.g., product quality, shipping, customer service, pricing) using predefined keywords\n",
-"- Calculating sentiment scores for each review mentioning these topics\n",
+"- Identifying specific aspects within reviews (e.g., product quality, shipping, customer service, pricing) using predefined keywords\n",
+"- Calculating sentiment scores for each review mentioning these aspects\n",
 "- Categorizing sentiment as 'positive', 'negative', or 'neutral' \n",
 "- Normalizing sentiment scores to a scale of 1 to 5 for comparison with customer ratings\n"
 ]
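Review note: the four steps listed in this cell can be sketched without the analyzer itself by treating the VADER compound score (a value in [-1, 1]) as an input. The keyword lists are hypothetical, the ±0.05 cutoff is the one commonly suggested for VADER and may differ from the notebook's, and the linear mapping to the 1 to 5 scale is one plausible choice rather than the notebook's confirmed formula.

```python
# Hypothetical keyword lists; the notebook's actual keywords are not shown in this diff.
ASPECT_KEYWORDS = {
    "product_quality": ["quality", "fabric", "material"],
    "shipping": ["shipping", "delivery"],
    "customer_service": ["service", "support"],
    "pricing": ["price", "expensive", "cheap"],
}


def find_aspects(text):
    """Return the aspects whose keywords occur in the text (substring match)."""
    lowered = text.lower()
    return [
        aspect
        for aspect, keywords in ASPECT_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


def label_sentiment(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to 'positive', 'negative', or 'neutral'."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def normalize_score(compound):
    """Linearly rescale a compound score from [-1, 1] to the 1-5 rating scale."""
    return (compound + 1) * 2 + 1
```

With this mapping a neutral compound score of 0 lands on 3, the midpoint of the rating scale, which makes the analyzer rating directly comparable to a customer's star rating.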
@@ -394,7 +394,7 @@
 "    sentiment scores using VADER and categorizes the sentiment as 'positive', 'negative', or 'neutral'.\n",
 "    \n",
 "    \"\"\"\n",
-"    topics = {\n",
+"    aspects = {\n",
 "        \"product_quality\": [],\n",
 "        \"shipping\": [],\n",
 "        \"customer_service\": [],\n",
@@ -432,18 +432,18 @@
 "                    sentiment_label = 'neutral'\n",
 "                \n",
 "                # Append the review along with its sentiment analysis result\n",
-"                topics[topic].append({\n",
+"                aspects[topic].append({\n",
 "                    \"review\": review,\n",
 "                    \"sentiment\": {\n",
 "                        \"analyzer_rating\": normalized_score,\n",
 "                        \"label\": sentiment_label\n",
 "                    }\n",
 "                })\n",
-"    plot_sentiment_distribution(topics)\n",
+"    plot_sentiment_distribution(aspects)\n",
 "\n",
 "    return {\n",
 "        \"total_reviews\": len(reviews),\n",
-"        \"sentiment_analysis\": topics,\n",
+"        \"sentiment_analysis\": aspects,\n",
 "        \"average_rating\": sum(r.get(\"rating\", 3) for r in reviews) / len(reviews)\n",
 "    }\n",
 "\n",

0 commit comments
