# Utilities Overview

Utilities are contained primarily within `core.py`, including various functions designed to streamline common tasks and operations within your development projects. Below, we overview each function, providing explanations and examples to help you effectively integrate these utilities into your work.
### Agent Type Enum

**Description:**
Defines the different types of agents used within the system. This enum simplifies the process of assigning and managing agent roles.

**Members:**
- `PLANNER`: For planning algorithms or operations.
- `SUMMARIZER`: For summarizing content or data.
- `GENERIC_RESPONDER`: For handling general responses in interactions.
- `VALIDATOR`: For validating data or operations.
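As an illustration, such an enum might be defined and used roughly as follows. This is a hypothetical sketch: the name `AgentType` and the member values are assumptions, so check `core.py` for the actual identifiers.

```python
from enum import Enum

class AgentType(Enum):  # hypothetical name; the real enum in core.py may differ
    PLANNER = "planner"
    SUMMARIZER = "summarizer"
    GENERIC_RESPONDER = "generic_responder"
    VALIDATOR = "validator"

# Assigning a role becomes a typo-safe lookup instead of a raw string:
role = AgentType.PLANNER
print(role.name)   # PLANNER
print(role.value)  # planner
```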
### make_prompt

**Description:**
Creates a structured prompt in OpenAI format, optionally including images.

**Parameters:**
- `role (str)`: The role of the prompt.
- `content (str)`: The content of the prompt.
- `images (list, optional)`: A list of images to include in the prompt.

**Returns:**
- `dict`: The structured prompt in OpenAI format.

**Example:**
```python
prompt = make_prompt("system", "You are a helpful assistant")
print(prompt)
```
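For reference, a helper like this can be sketched as follows. This is a minimal, hypothetical implementation assuming the standard OpenAI chat message format; the real version in `core.py` may handle images differently.

```python
def make_prompt(role, content, images=None):
    """Sketch: build a message dict in the OpenAI chat format."""
    if images:
        # Assumed image handling: content becomes a list of typed parts.
        parts = [{"type": "text", "text": content}]
        parts += [{"type": "image_url", "image_url": {"url": img}} for img in images]
        return {"role": role, "content": parts}
    return {"role": role, "content": content}

prompt = make_prompt("system", "You are a helpful assistant")
print(prompt)  # {'role': 'system', 'content': 'You are a helpful assistant'}
```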
### read_pdf

**Description:**
Reads a PDF file and returns its text content.

**Parameters:**
- `file (str)`: The path to the PDF file.

**Returns:**
- `str`: The text extracted from the PDF file.

**Example:**
```python
text = read_pdf("sample.pdf")
print(text)
```
### get_yaml_prompt

**Description:**
Extracts a specific prompt from a YAML file based on the provided name. This is meant to be used as a storage method for prompts outside of your code. See the `system_prompts.yaml` file to see how it can be set up.

**Parameters:**
- `yaml_file_name (str)`: The name of the YAML file.
- `prompt_name (str)`: The specific prompt to retrieve.

**Returns:**
- `str`: The content of the prompt.

**Example:**
```python
prompt = get_yaml_prompt("system_prompts.yaml", "welcome_message")
print(prompt)
```
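For illustration, a prompts file for this pattern might be structured like the following sketch. The keys shown here are hypothetical; see the actual `system_prompts.yaml` in the repository for the real layout.

```yaml
welcome_message: |
  You are a helpful assistant. Greet the user warmly and ask how you can help.
summarizer_prompt: |
  Summarize the following text in no more than three sentences.
```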
### generate_schema

**Description:**
Generates a JSON schema for a list of functions using their documentation and signatures.

**Parameters:**
- `functions (list)`: A list of functions to generate the schema for.

**Returns:**
- `str`: A JSON string representing the schema of the provided functions.

**Example:**
```python
import core
schema = core.generate_schema([core.read_pdf, core.make_prompt])
print(schema)
```
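To show the general idea, schema generation from signatures and docstrings can be sketched with the standard `inspect` module. This is an assumed, simplified implementation, not the one in `core.py`, and the output shape is illustrative only.

```python
import inspect
import json

def generate_schema(functions):
    """Sketch: describe each function via its signature and docstring."""
    schema = []
    for fn in functions:
        sig = inspect.signature(fn)
        schema.append({
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": list(sig.parameters),
        })
    return json.dumps(schema, indent=2)

def read_pdf(file):
    """Reads a PDF file and returns its text content."""

print(generate_schema([read_pdf]))
```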
### safe_read_json

**Description:**
Safely parses a JSON string, handling potential errors.

**Parameters:**
- `response (str)`: The JSON string to parse.

**Returns:**
- `dict`: The parsed JSON object, or `None` if the JSON is invalid.

**Example:**
```python
json_str = '{"key": "value"}'
data = safe_read_json(json_str)
print(data)
```
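The error-handling behavior can be sketched as follows; this is an assumed minimal implementation (the real `core.py` version may catch different exceptions or log errors), but it shows the key property: invalid input yields `None` instead of an exception.

```python
import json

def safe_read_json(response):
    """Sketch: parse a JSON string, returning None on invalid input."""
    try:
        return json.loads(response)
    except (json.JSONDecodeError, TypeError):
        return None

print(safe_read_json('{"key": "value"}'))  # {'key': 'value'}
print(safe_read_json('not valid json'))    # None
```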
### find_most_relevant

**Description:**
Finds the texts whose embeddings are most relevant to `prompt_embedding`, useful for semantic search or recommendation systems.

**Parameters:**
- `text_embedding_pairs (list)`: A list of tuples where each tuple contains a text and its corresponding embedding.
- `prompt_embedding (list)`: The embedding of the prompt against which other embeddings are compared.
- `top_k (int, optional)`: The number of texts to return, in descending order of relevance. Defaults to 5.

**Returns:**
- `list`: A list of the most relevant texts, ranked by cosine similarity to the prompt embedding.

**Example:**
```python
texts = ["Hello world", "Hello there", "Greetings", "Hi there", "Welcome"]
embeddings = [some_embedding_function(text) for text in texts]
prompt_embedding = some_embedding_function("Hello")
most_relevant_texts = find_most_relevant(list(zip(texts, embeddings)), prompt_embedding, top_k=3)
print(most_relevant_texts)  # Output: ['Hello world', 'Hello there', 'Hi there']
```
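The core ranking logic can be sketched in a few lines, assuming plain cosine similarity over lists of floats (the real `core.py` version may use NumPy or a different distance measure). Toy two-dimensional embeddings stand in for real model output here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_most_relevant(text_embedding_pairs, prompt_embedding, top_k=5):
    """Sketch: rank texts by cosine similarity to the prompt embedding."""
    ranked = sorted(
        text_embedding_pairs,
        key=lambda pair: cosine_similarity(pair[1], prompt_embedding),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

# Toy embeddings: "cat" and "dog" point the same way, "car" is orthogonal.
pairs = [("cat", [1.0, 0.0]), ("dog", [0.9, 0.1]), ("car", [0.0, 1.0])]
print(find_most_relevant(pairs, [1.0, 0.0], top_k=2))  # ['cat', 'dog']
```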
### split_into_sentences

**Description:**
Splits a given text into sentences using punctuation and other markers as delimiters. This function handles edge cases like abbreviations, numbers, websites, and more.

**Parameters:**
- `text (str)`: The text to split into sentences.

**Returns:**
- `list`: A list of sentences derived from the text.

**Example:**
```python
text = "My name is Bilbo Baggins. Who are you?"
sentences = split_into_sentences(text)
print(sentences)  # Output: ['My name is Bilbo Baggins.', 'Who are you?']
```
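A deliberately simplified sketch of the splitting idea, for orientation only: it splits on sentence-ending punctuation followed by whitespace, and unlike the real `core.py` function it does not handle abbreviations, numbers, or websites.

```python
import re

def split_into_sentences(text):
    """Simplified sketch: split on '.', '!', or '?' followed by whitespace."""
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]

print(split_into_sentences("My name is Bilbo Baggins. Who are you?"))
# ['My name is Bilbo Baggins.', 'Who are you?']
```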
### split_into_chunks

**Description:**
Divides a given text into chunks, each containing a specified number of sentences. This is particularly useful for processing or summarization tasks where large text needs to be broken down into manageable parts.

**Parameters:**
- `text (str)`: The text to split into chunks.
- `sentences_per_chunk (int)`: The maximum number of sentences per chunk.

**Returns:**
- `list`: A list of text chunks, each containing up to the specified number of sentences.

**Example:**
```python
text = "Sentence one. Sentence two. Sentence three. Sentence four. Sentence five."
chunks = split_into_chunks(text, sentences_per_chunk=2)
print(chunks)  # Output: ['Sentence one. Sentence two.', 'Sentence three. Sentence four.', 'Sentence five.']
```
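The chunking step itself is simple grouping, as in this sketch. It uses a naive punctuation-based sentence splitter for self-containment; the real `core.py` function builds on its more robust sentence splitting.

```python
import re

def split_into_chunks(text, sentences_per_chunk):
    """Sketch: group sentences into chunks of at most
    sentences_per_chunk sentences each."""
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]

text = "Sentence one. Sentence two. Sentence three. Sentence four. Sentence five."
print(split_into_chunks(text, sentences_per_chunk=2))
# ['Sentence one. Sentence two.', 'Sentence three. Sentence four.', 'Sentence five.']
```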
### Other Utilities

- `clean_json_response`: Cleans up a JSON response string by removing unnecessary characters.
- `internet_search`: Performs an internet search for a given query and returns the top results.
- `read_website`, `selenium_reader`, `selenium_hybrid_reader`: Different methods for retrieving and reading website content.
- `fetch_url_info`: Fetches basic metadata like the title and description from a URL.