| 1 | -TBD |
| 1 | +# Introduction to the EVM |
| 2 | + |
| 3 | +## What is a VM? |
| 4 | + |
| 5 | +- A virtual machine (VM) is, in general, software that emulates a complete computer system, with its own CPU, memory, storage, and operating environment, inside another “host” environment. |
| 6 | +- We can think of it as a **“computer within a computer”**. |
| 7 | + |
| 8 | +## What is the EVM? |
| 9 | + |
| 10 | +- The Ethereum Virtual Machine (EVM) is the computation engine at the heart of the Ethereum protocol. |
| 11 | +- It is sometimes described as a **“world computer”** because every node in the Ethereum network runs an instance of the EVM, verifying the same instructions and state transitions. |
| 12 | + |
| 13 | +- **Virtual Machine**: A runtime that executes smart contract bytecode, much as a CPU or language runtime executes programs, but specialized for Ethereum. |
| 14 | +- **Isolated Execution**: The EVM provides a **sandboxed** environment. Contracts can only access and modify Ethereum state via the EVM’s rules and resources (stack, memory, storage). |
| 15 | + |
| 16 | +## Why Does Ethereum Need a Virtual Machine? |
| 17 | + |
| 18 | +1. **Smart Contract Execution**: |
| 19 | + |
| 20 | +- Ethereum extends the concept of a blockchain from simple token transfers (like Bitcoin) to **arbitrary code** execution. |
| 21 | +- The EVM enforces security, ensures deterministic outcomes, and provides a uniform standard for all smart contract transactions. |
| 22 | + |
| 23 | +2. **Deterministic & Trustless**: |
| 24 | + |
| 25 | +- Every node runs the same EVM code, ensuring **consensus** about the state. |
| 26 | +- Given the same starting state and the same transaction, every node computes the same result, no matter which node executes it. |
| 27 | + |
| 28 | +3. **Stateful Contracts**: |
| 29 | + |
| 30 | +- Contracts can maintain **persistent state** in Ethereum’s world state—something not natively provided by earlier blockchain systems. |
| 31 | +- The EVM manages how contracts read/write this state, ensuring correctness and preventing unauthorized access. |
| 32 | + |
| 33 | +4. **Gas & Resource Management**: |
| 34 | + |
| 35 | +- The EVM incorporates the concept of **gas** to meter execution. Each instruction consumes gas, so every execution is bounded: a loop that never terminates runs out of gas and halts, and spam becomes costly. |
| 36 | +- Users pay for the computation they trigger, aligning incentives with network resources. |
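The metering idea above can be sketched with a toy interpreter. This is a minimal illustration, not how any Ethereum client is implemented: the opcode set, the `GAS_COST` table, and the names `run_metered` and `OutOfGas` are all assumptions made for this example, and the prices are not the real EVM gas schedule.

```python
class OutOfGas(Exception):
    """Raised when an instruction needs more gas than remains."""


# Hypothetical per-instruction prices; the real EVM gas schedule differs.
GAS_COST = {"PUSH": 3, "ADD": 3, "JUMP": 8}


def run_metered(program, gas_limit):
    """Run a toy program, charging gas before each instruction executes.

    `program` is a list of (opcode, argument) tuples.
    Returns (stack, gas_left) if the program finishes within the limit.
    """
    stack, pc, gas = [], 0, gas_limit
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST[op]
        if cost > gas:
            raise OutOfGas(f"needed {cost} gas at pc={pc}, only {gas} left")
        gas -= cost
        if op == "PUSH":
            stack.append(arg)
            pc += 1
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
            pc += 1
        elif op == "JUMP":
            pc = arg  # unconditional jump to instruction index `arg`
    return stack, gas
```

A program such as `[("JUMP", 0)]` would loop forever without metering; here it raises `OutOfGas` once the limit is spent, which is the property the gas mechanism is meant to provide.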
| 37 | + |
| 38 | +## EVM vs. Traditional Computing Models |
| 39 | + |
| 40 | +1. **World State**: |
| 41 | + |
| 42 | +- Unlike traditional systems where each program runs on a personal computer, the EVM’s state is **global** and replicated across all Ethereum nodes. |
| 43 | +- Every node has the same ledger and contract storage, guaranteeing **consistent data**. |
| 44 | + |
| 45 | +2. **Deterministic Execution**: |
| 46 | + |
| 47 | +- The EVM must be fully deterministic: given the same transaction, every node must arrive at the same outcome. |
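| 48 | +- Contracts cannot read a local clock, draw true randomness, or call external systems. They only see values every node agrees on, such as the block timestamp or a block hash, and these are weak sources of pseudo-randomness because block producers can influence them. |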
| 49 | + |
| 50 | +3. **Immutability & Code**: |
| 51 | + |
| 52 | +- Once deployed, a contract’s bytecode is **immutable**. Upgrade patterns such as proxies do not alter deployed code; the proxy forwards calls to a separate implementation contract whose address can be swapped. |
| 53 | +- This immutability is key to trust minimization, but it means any upgrade path has to be designed in before deployment. |
| 54 | + |
| 55 | +# EVM Architecture |
| 56 | + |
| 57 | +- The Ethereum Virtual Machine (EVM) is often described as a **state machine** running atop a Harvard-style architecture. |
| 58 | +- This design influences how contracts store code, manage data, and interact with the rest of the blockchain. |
| 59 | + |
| 60 | +## The EVM as a Stack-Based Machine |
| 61 | + |
| 62 | +1. **Stack-Oriented Execution** |
| 63 | + |
| 64 | +- Internally, the EVM uses a **stack** to execute instructions. |
| 65 | +- Each instruction can push or pop 256-bit words from this stack. |
| 66 | +- The stack has a maximum depth of 1,024 elements. Exceeding this limit is a stack overflow: the current call halts exceptionally and its state changes are reverted. |
| 67 | + |
| 68 | +2. **Harvard Architecture vs. Von Neumann** |
| 69 | + |
| 70 | +- Traditional computers (von Neumann architecture) store code and data in a single memory space. |
| 71 | +- In the EVM (somewhat inspired by a Harvard architecture concept), **code is immutable and separate**. You can’t modify contract bytecode at runtime. |
| 72 | +- The contract’s **storage** is distinct from its runtime code and from the ephemeral memory region used during execution. |
| 73 | + |
| 74 | +3. **Instruction Set** |
| 75 | + |
| 76 | +- The EVM has a specialized set of opcodes (e.g., `ADD`, `MUL`, `CALL`, `CREATE`, `SSTORE`, etc.). |
| 77 | +- Each opcode manipulates the stack, memory, or storage. |
| 78 | +- The EVM’s design ensures deterministic execution—every node processes instructions identically. |
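The stack behaviour described in this section can be sketched as follows. This is a simplified model under stated assumptions: only three opcodes are handled, gas is ignored, and the names `Stack`, `StackError`, and `execute` are illustrative rather than taken from any client.

```python
WORD_MASK = (1 << 256) - 1  # every stack item is a 256-bit word
MAX_STACK_DEPTH = 1024


class StackError(Exception):
    """Stack overflow or underflow; in the EVM this halts the current call."""


class Stack:
    def __init__(self):
        self._items = []

    def push(self, value):
        if len(self._items) >= MAX_STACK_DEPTH:
            raise StackError("stack overflow")
        # Arithmetic wraps modulo 2**256, so results are truncated to one word.
        self._items.append(value & WORD_MASK)

    def pop(self):
        if not self._items:
            raise StackError("stack underflow")
        return self._items.pop()


def execute(ops):
    """Execute ("PUSH", n), ("ADD",) and ("MUL",) tuples; return the top of the stack."""
    stack = Stack()
    for op in ops:
        if op[0] == "PUSH":
            stack.push(op[1])
        elif op[0] == "ADD":
            stack.push(stack.pop() + stack.pop())
        elif op[0] == "MUL":
            stack.push(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown opcode {op[0]}")
    return stack.pop()
```

For example, `execute([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])` computes (2 + 3) * 4 by pushing operands and letting each opcode consume them from the top of the stack.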
| 79 | + |
| 80 | +## The Account Model |
| 81 | + |
| 82 | +Ethereum differs from many older blockchains (like Bitcoin) by employing an **account-based** model rather than a UTXO model. This is critical to how the EVM tracks state. |
| 83 | + |
| 84 | +1. **Two Types of Accounts** |
| 85 | + |
| 86 | +- **Externally Owned Accounts (EOAs)**: Controlled by private keys (e.g., user wallets). They have no code. |
| 87 | +- **Contract Accounts**: Hold contract code (bytecode) and can contain persistent storage. |
| 88 | + |
| 89 | + |
| 90 | + |
| 91 | +2. **Account Fields** |
| 92 | + |
| 93 | +- **Nonce**: Number of transactions sent from an account (for EOAs) or number of contract creations performed by that account (for contract accounts). |
| 94 | +- **Balance**: Amount of Ether (in wei) owned by the account. |
| 95 | +- **Storage Root**: A hash (root of a Merkle Patricia Trie) representing the contract’s storage data. |
| 96 | +- **Code Hash**: The Keccak-256 hash of the account’s bytecode. Nodes use it as the key to look up the code itself; for an EOA it is the hash of empty code. |
| 97 | + |
| 98 | +3. **Contract Code & Storage** |
| 99 | + |
| 100 | +- A contract’s **runtime code** is immutable after deployment. |
| 101 | +- Contract storage is a key-value store, mapping 256-bit slots to 256-bit values. This is where a contract’s persistent state lives. |
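The account model above can be sketched as a small data structure. This is an illustrative simplification: a real account record holds a storage root and a code hash rather than the raw storage and code, and the `Account` class with its `sload` and `sstore` methods is a hypothetical name chosen to echo the `SLOAD` and `SSTORE` opcodes.

```python
from dataclasses import dataclass, field

WORD_MASK = (1 << 256) - 1  # slots and values are 256-bit words


@dataclass
class Account:
    nonce: int = 0
    balance: int = 0   # in wei
    code: bytes = b""  # empty for an externally owned account
    storage: dict = field(default_factory=dict)  # slot -> value

    def is_contract(self):
        return len(self.code) > 0

    def sload(self, slot):
        # Slots that were never written read as zero.
        return self.storage.get(slot & WORD_MASK, 0)

    def sstore(self, slot, value):
        slot, value = slot & WORD_MASK, value & WORD_MASK
        if value == 0:
            # Writing zero clears the slot instead of storing an entry.
            self.storage.pop(slot, None)
        else:
            self.storage[slot] = value
```

An EOA is then simply `Account(balance=10**18)` with no code, while a contract account carries bytecode and accumulates entries in `storage` as it executes.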
| 102 | + |
| 103 | +## Global State Tree |
| 104 | + |
| 105 | + |
| 106 | + |
| 107 | +1. **Merkle Patricia Trie** |
| 108 | + |
| 109 | +- Ethereum organizes the entire “world state” in a data structure called a **Merkle Patricia Trie (MPT)**. |
| 110 | +- Each account is stored as a leaf in this trie, keyed by the Keccak-256 hash of its address. |
| 111 | +- The trie’s root hash is stored in the block header, providing a **tamper-evident** record of the entire state. |
| 112 | + |
| 113 | +2. **Per-Contract Storage Trie** |
| 114 | + |
| 115 | +- Each contract account has its own separate **storage** MPT. |
| 116 | +- Writing to contract storage updates this sub-trie. That changes the account’s storage root, which in turn changes the main state root. Reads leave the trie untouched. |
| 117 | + |
| 118 | +3. **Why a Trie?** |
| 119 | + |
| 120 | +- Ensures **efficient** and **verifiable** lookups of any account or storage slot. |
| 121 | +- Supports **light clients**, which can check a compact proof about a specific account or storage slot against the state root without downloading the entire state. |
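The partial-proof idea can be illustrated with a deliberately simpler structure than the one Ethereum uses. The sketch below builds a plain binary Merkle tree with SHA-256; it is not a Merkle Patricia Trie and does not use Keccak-256, and the function names are assumptions made for this example. It only shows why a root hash plus a few sibling hashes is enough to verify one entry.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash: Ethereum uses Keccak-256, which hashlib does not provide.
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index].

    The proof is a list of (sibling_hash, sibling_is_left) pairs, bottom to top.
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf, proof, root):
    """Recompute the root from one leaf and its sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

A verifier that trusts only the root can call `verify` with a single leaf and a proof whose length grows with the logarithm of the number of leaves, which is the property that makes light clients practical.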
| 122 | + |
| 123 | +## EVM Memory Model |
| 124 | + |
| 125 | +During transaction execution, the EVM provides **ephemeral** data areas: |
| 126 | + |
| 127 | +1. **Memory** |
| 128 | + |
| 129 | +- A contiguous, byte-addressable array that starts empty for each message call and is discarded when that call ends. |
| 130 | +- Typically used for **intermediate** data manipulation or ABI encoding/decoding. |
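| 131 | +- Cost grows with **the highest memory offset touched**, charged per 32-byte word: roughly linear for small sizes, with an added quadratic term that makes large allocations increasingly expensive. |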
| 132 | + |
| 133 | +2. **Stack** |
| 134 | + |
| 135 | +- A LIFO stack for pushing/popping 256-bit words. |
| 136 | +- Used for **operands** of arithmetic, logical operations, etc. |
| 137 | + |
| 138 | +3. **Transient vs. Persistent** |
| 139 | + |
| 140 | +- **Memory and stack** are transient: they exist only for the duration of a single call’s execution. |
| 141 | +- **Contract storage** is persistent: changes remain after the transaction ends. |
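The way memory cost scales can be sketched with the formula from the Ethereum Yellow Paper, where active memory of `a` words costs `3 * a + a² / 512` gas (integer division). The function names below are illustrative, and the constants are the ones I understand to be current; check the specification before relying on them.

```python
def memory_cost(size_in_bytes):
    """Total gas attributed to `size_in_bytes` of active memory in one call."""
    words = (size_in_bytes + 31) // 32  # round up to whole 32-byte words
    return 3 * words + words * words // 512


def expansion_cost(old_size, new_size):
    """Gas charged when an access grows memory from old_size to new_size bytes."""
    return max(0, memory_cost(new_size) - memory_cost(old_size))
```

Only growth is charged: touching an offset inside already-active memory costs nothing extra here, while the quadratic term makes very large allocations disproportionately expensive.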