|
4 | 4 | "metadata": { |
5 | 5 | "colab": { |
6 | 6 | "provenance": [], |
7 | | - "authorship_tag": "ABX9TyPa0nxOUD78w3bdVL7ThQtY", |
| 7 | + "authorship_tag": "ABX9TyMOT096jkPXSz/S3MmpEEpS", |
8 | 8 | "include_colab_link": true |
9 | 9 | }, |
10 | 10 | "kernelspec": { |
|
30 | 30 | "cell_type": "markdown", |
31 | 31 | "source": [ |
32 | 32 | "# Introduction to Artificial Intelligence: What is AI?\n", |
| 33 | + "### Brendan Shea, PhD\n", |
33 | 34 | "\n", |
34 | 35 | "Artificial Intelligence (AI) refers to computer systems that can perform tasks that typically require human intelligence. These systems are designed to mimic human cognitive functions such as learning, problem-solving, and pattern recognition. In today's world, AI has become increasingly integrated into our daily lives, from voice assistants on our phones to recommendation systems on streaming platforms.\n", |
35 | 36 | "\n", |
|
224 | 225 | "source": [ |
225 | 226 | "# Understanding Inputs, Weights, and Outputs\n", |
226 | 227 | "\n", |
227 | | - "The perceptron processes information through a series of mathematical steps that transform inputs into an output. Each component plays a specific role in this transformation, and understanding these components is crucial to building a working perceptron. This section explores how inputs, weights, and the activation function work together to make decisions.\n", |
| 228 | + "The perceptron processes information through a series of mathematical steps that transform inputs into an output. Each component plays a specific role in this transformation, and understanding these components is crucial to building a working perceptron.\n", |
228 | 229 | "\n", |
229 | 230 | "* **Inputs (x)** are the values that the perceptron receives, such as features from data (e.g., pixel values in an image or test scores for students).\n", |
230 | 231 | "* **Weights (w)** determine how important each input is to the final decision, with larger weights giving more importance to their associated inputs.\n", |
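
For concreteness, here is a minimal sketch of the weighted-sum-and-step computation these bullets describe; the specific input, weight, and bias values are illustrative assumptions, not taken from the notebook:

```python
# Minimal sketch of a perceptron's forward computation.
# The input, weight, and bias values below are illustrative only.

inputs = [4.0, 75.0]    # e.g., study hours and previous quiz score
weights = [0.5, 0.02]   # importance assigned to each input
bias = -3.0             # shifts the decision threshold

# Weighted sum: z = w1*x1 + w2*x2 + b
z = sum(w * x for w, x in zip(weights, inputs)) + bias

# Step activation: output 1 if z >= 0, else 0
output = 1 if z >= 0 else 0
print(z, output)  # 0.5 -> 1
```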
|
256 | 257 | "source": [ |
257 | 258 | "# Building a Perceptron in Python: Class Structure\n", |
258 | 259 | "\n", |
259 | | - "Now that we understand how a perceptron works conceptually, let's implement one in Python. We'll use object-oriented programming to create a Perceptron class that will help us predict whether a student will pass a test based on their study hours and previous quiz score. This simple example will make the perceptron's function easy to understand.\n", |
| 260 | + "Now that we understand how a perceptron works conceptually, let's implement one in Python. We'll use object-oriented programming to create a Perceptron class that will help us predict whether a student will pass a test based on their study hours and previous quiz score.\n", |
260 | 261 | "\n", |
261 | 262 | "* Our Perceptron class will need to store the **weights** and **bias** for our model.\n", |
262 | 263 | "* We'll need methods to **predict** outputs for given inputs and to **train** the perceptron.\n", |
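
A sketch of what that class structure might look like; the attribute and method names (`weights`, `bias`, `predict`) follow the bullets above, while the random weight initialization is an assumption about the notebook's implementation:

```python
import random

class Perceptron:
    """A single perceptron with a step activation function."""

    def __init__(self, num_inputs, learning_rate=0.1):
        # Start with small random weights and a zero bias.
        self.weights = [random.uniform(-0.5, 0.5) for _ in range(num_inputs)]
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, inputs):
        # Weighted sum of inputs plus bias, passed through a step function.
        z = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return 1 if z >= 0 else 0
```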
|
328 | 329 | "source": [ |
329 | 330 | "# Training Our Perceptron: The Learning Process\n", |
330 | 331 | "\n", |
331 | | - "Training a perceptron involves showing it examples and adjusting its weights to improve its predictions. This process, known as supervised learning, requires a dataset with inputs and their correct outputs (labels). The perceptron learns by comparing its predictions with the actual labels and making small adjustments to reduce the error.\n", |
| 332 | + "Training a perceptron involves showing it examples and adjusting its weights to improve its predictions. This process, known as **supervised learning**, requires a dataset with inputs and their correct outputs (labels). The perceptron learns by comparing its predictions with the actual labels and making small adjustments to reduce the error.\n", |
332 | 333 | "\n", |
333 | 334 | "* **Training data** consists of input features and their corresponding correct outputs (labels).\n", |
334 | 335 | "* The **learning rate** determines how quickly the perceptron's weights are adjusted during training (smaller values mean slower but more stable learning).\n", |
|
345 | 346 | " * Calculate the error: error = actual_output - predicted_output\n", |
346 | 347 | " * Update each weight: weight_i = weight_i + learning_rate * error * input_i\n", |
347 | 348 | " * Update bias: bias = bias + learning_rate * error\n", |
348 | | - "3. Repeat step 2 for multiple epochs (complete passes through the training data)\n", |
| 349 | + "3. Repeat step 2 for multiple **epochs** (complete passes through the training data)\n", |
349 | 350 | "\n", |
350 | 351 | "This algorithm adjusts weights more when errors are larger and in proportion to the input values, gradually improving the perceptron's ability to correctly classify inputs." |
351 | 352 | ], |
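
Here is one way the update rule listed above could be written as a `train` method for the `Perceptron` class sketched earlier; the argument names and the small pass/fail dataset are illustrative assumptions:

```python
# A possible train method for the Perceptron class sketched above.

def train(self, training_inputs, labels, epochs=10):
    """Adjust weights and bias using the perceptron learning rule."""
    for _ in range(epochs):                        # one epoch = one full pass
        for inputs, label in zip(training_inputs, labels):
            error = label - self.predict(inputs)   # actual - predicted
            # Larger errors and larger inputs produce larger updates.
            self.weights = [w + self.learning_rate * error * x
                            for w, x in zip(self.weights, inputs)]
            self.bias += self.learning_rate * error

Perceptron.train = train  # attach to the class from the earlier sketch

# Illustrative usage: [hours studied, quiz score] -> pass (1) / fail (0)
p = Perceptron(num_inputs=2, learning_rate=0.1)
p.train([[8, 90], [6, 70], [2, 40], [1, 30]], [1, 1, 0, 0], epochs=20)
print(p.predict([7, 80]))  # likely 1 once the weights have converged
```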
|
931 | 932 | "\n", |
932 | 933 | "## From Training to ChatGPT: How Large Language Models Work\n", |
933 | 934 | "\n", |
934 | | - "Modern Large Language Models (LLMs) like ChatGPT and Claude are transformer-based systems trained through several stages:\n", |
| 935 | + "Modern Large Language Models (LLMs) like ChatGPT, Gemini, and Claude are transformer-based systems trained through several stages:\n", |
935 | 936 | "\n", |
936 | 937 | "1. **Pretraining**: The model learns language patterns by processing trillions of words from books, websites, and articles\n", |
937 | 938 | " * It develops general understanding of grammar, facts, and reasoning\n", |
|