Deep Dive
deep-dive
Foundations
Amortized Analysis
Appending to a Python list or JavaScript array almost always feels instantaneous, yet every so often the structure allocates a fresh buffer and copies every element into it, an `O(n)` operation. So why do textbooks still call append `O(1)`? Because amortized over a long sequence of appends, the rare expensive operation is paid for by many cheap ones. **Amortized Analysis** is the formal machinery that makes this kind of argument rigorous. This lesson teaches the three classical methods. The **aggregate method** sums the total cost of `n` operations and divides by `n` for a per-operation bound. The **accounting (banker's) method** charges each operation a fixed amount and stores credits on the data structure to pay for future expensive ones. The **potential (physicist's) method** defines a potential function `Phi` and computes amortized cost as actual cost plus the change in `Phi`. You will apply all three to dynamic array doubling, stacks with multipop, and binary counters, and see how amortized `O(1)` differs from worst-case `O(1)` and from average-case analysis. This lesson stands on the shoulders of **Big-O Notation (Upper Bound)** and **Time Complexity Analysis Techniques**. From those lessons you can already analyze a single operation; amortized analysis is what you reach for when the cost of *one* operation is misleading and the right object of study is a *sequence* of operations. Next you will move to **Recurrence Relations**, the other major analysis tool, used when an algorithm's cost is most naturally written recursively rather than as a running total.
Not Started
0%
Complexity Classes (Conceptual Overview)
Some problems have algorithms that finish in seconds on inputs of size a million. Others, like finding the shortest tour through 30 cities, can defeat the world's fastest computers no matter how cleverly you code. The boundary between these two worlds is studied formally in **complexity theory**, and the rough map of that boundary is given by **complexity classes**: `P`, `NP`, `NP-Complete`, and `NP-Hard`. This lesson introduces those classes conceptually. You will see how `P` collects problems solvable in polynomial time, how `NP` collects problems whose solutions can be *verified* in polynomial time (even if finding them might be slow), and how `NP-Complete` and `NP-Hard` capture the hardest problems in `NP` and beyond. You will meet famous examples (Traveling Salesman, SAT, the knapsack problem), get a light introduction to **polynomial-time reductions** as the tool used to prove a new problem is `NP-Complete`, and confront the legendary `P vs NP` question along with why it matters for cryptography and the entire structure of computer science. This lesson assumes you are fluent with all the basic asymptotic machinery from **Asymptotic Analysis Fundamentals**, **Big-O Notation (Upper Bound)**, **Time Complexity Analysis Techniques**, **Space Complexity Fundamentals**, and **Little-o and Little-omega Notations**. Complexity classes use those same growth-rate ideas to draw lines between "polynomial" and "exponential" rather than between specific functions like `n^2` and `n log n`. From here you will move into **Amortized Analysis**, shifting from broad problem classification back to the fine-grained per-operation analysis that complex data structures demand.
Not Started
0%
Data Structures
B-Trees and B+ Trees
A modern hard drive reads about 100 MB per second sequentially, but a single random seek still costs around 10 milliseconds; that gap is roughly seven orders of magnitude. A binary tree of a billion keys has 30 levels, which means 30 random seeks per lookup. **B-Trees and B+ Trees** keep each node fat enough to fill a disk page, branch into hundreds of children, and answer the same lookup in three or four seeks. That is why every relational database you have used relies on this family for its indexes. This lesson covers the B-Tree minimum-degree parameter `t` and the resulting key-and-child counts at every node, the proactive split during insert that keeps the height bounded by `O(log_t n)`, the merge-or-redistribute logic for delete, and the B+ Tree refinement that pushes all data to the leaf level and links those leaves together so range queries become a sequential scan after a single descent. In **Binary Search Tree (BST)**, every node held a single key and at most two children; that geometry is fine for memory but disastrous for disk because each comparison forces a separate I/O. B-Trees flip the design: keep every node wide so a single I/O resolves many comparisons at once. Next, **Splay Tree** explores a third self-balancing strategy in which the tree reshapes itself around access patterns rather than maintaining strict structural invariants.
Not Started
0%
Splay Tree
Real workloads are rarely uniform. Web caches, compiler symbol tables, and network routers all access the same handful of items thousands of times in a row before moving on. A **Splay Tree** turns that observation into a balancing strategy: every access (search, insert, or delete) drags the touched node to the root through a sequence of rotations, so frequently used keys stay shallow and effectively pay `O(1)` after the first hit. This lesson covers the three splay rotation cases (zig, zig-zig, and zig-zag) that move a target node toward the root in pairs of levels, the modified search, insert, and delete operations that always end with a splay, and the amortized analysis that yields `O(log n)` per operation across any sequence of `m` operations even though individual operations can be `O(n)` in the worst case. You will also meet the working set theorem, which is the formal version of the temporal-locality intuition. In **Binary Search Tree (BST)**, accesses left the tree shape unchanged; balanced variants enforced strict structural invariants. The splay tree refuses both options and instead lets the access pattern itself reshape the tree, with no balance factor or color bit stored anywhere. Next, **Treap (Randomized BST)** keeps the same single-bit-of-extra-state simplicity but uses random priorities (rather than recent access) to control balance.
Not Started
0%
Algorithms
Pattern Matching Algorithms
Boyer-Moore is the algorithm `grep` actually uses, and it has a counterintuitive property: longer patterns make it _faster_, not slower, because the bad-character rule lets it skip whole chunks of the text without inspecting them. The best case is `O(n / m)`, sublinear in the text length. The same insight (start from the right, jump aggressively on mismatch) defines a different family of pattern-matching algorithms from the linear-scan family of KMP and Z. **Pattern Matching Algorithms** rounds out the string-search toolkit. You will implement Boyer-Moore with both the bad-character rule and the good-suffix rule, build a deterministic finite automaton (DFA) from a pattern and use the resulting state-transition table for `O(n)` matching after `O(m * |alphabet|)` preprocessing, and extend pattern search into two dimensions row by row. The lesson closes with a head-to-head comparison of every string-matching algorithm you have seen so far (KMP, Z, Rabin-Karp, Manacher, Aho-Corasick, Boyer-Moore, DFA), so you can pick the right tool for any matching task: long patterns, multi-pattern, streaming, or 2D. In **String Algorithms**, you mastered the failure-function family of linear-time matchers. This lesson adds the right-to-left family and the automaton-construction family. From here, **Advanced Greedy & Data Structures** turns to monotonic stacks.
Not Started
0%
