Definition

A linear data structure that stores elements of the same type in contiguous memory locations, accessed directly via an integer index. Arrays allocate a fixed block of memory at creation, enabling pointer arithmetic for instant element retrieval.

Types / Variants
  • Static Array: Fixed size declared at compile time (e.g. int arr[5])
  • Dynamic Array: Resizable at runtime — std::vector (C++), ArrayList (Java), list (Python)
  • Multi-dimensional Array: 2D/3D arrays for matrices, grids, tensors
  • Circular Buffer: Fixed-size array with wrap-around — used in streaming and I/O buffers
Time Complexity
  • Access by index: O(1)
  • Search (unsorted): O(n)
  • Search (sorted — Binary Search): O(log n)
  • Insertion at end (dynamic): Amortized O(1)
  • Insertion at middle: O(n) — requires shifting
  • Deletion: O(n) — requires shifting
Why Useful
  • Constant-time random access: O(1) read/write by index
  • Cache-friendly: Contiguous memory means CPU cache lines are fully utilized, boosting iteration speed
  • Memory efficient: No overhead for pointers — just raw data
  • Predictable performance: No hash collisions, no pointer chasing
Software Applications
  • Image Processing: Images stored as 2D pixel arrays (RGB matrices)
  • Databases: Fixed-schema row storage, column-oriented stores (Apache Parquet)
  • Operating Systems: Page tables, interrupt vector tables, process scheduling queues
  • Game Development: Tile maps, sprite sheets, frame buffers
  • Machine Learning: Tensors (multi-dimensional arrays) in NumPy, TensorFlow, PyTorch
  • Audio/Signal Processing: PCM audio samples stored as float arrays
Real-world Analogy

Apartment building with numbered flats — you can go directly to Flat #203 without checking others. The building (memory block) is allocated once, and each flat (index) has a fixed location.

Interview Tips
  • Two-pointer technique: Classic for sorted array problems (Two Sum, Container With Most Water)
  • Sliding window: Subarray problems — max sum, longest substring
  • Prefix sums: Range query problems in O(1) after O(n) preprocessing
  • Know when NOT to use: Frequent mid-insertions/deletions → use linked list or balanced BST
// Array - Static & Dynamic #include <iostream> #include <vector> using namespace std; int main() { // Static array int arr[5] = {10, 20, 30, 40, 50}; // Dynamic array (vector) vector<int> vec = {1, 2, 3, 4, 5}; // Access - O(1) cout << "arr[2]: " << arr[2] << endl; cout << "vec[2]: " << vec[2] << endl; return 0; }
// Array - Static & Dynamic import java.util.ArrayList; public class ArrayExample { public static void main(String[] args) { // Static array int[] arr = {10, 20, 30, 40, 50}; // Dynamic array (ArrayList) ArrayList<Integer> list = new ArrayList<>(); list.add(1); list.add(2); list.add(3); // Access - O(1) System.out.println("arr[2]: " + arr[2]); System.out.println("list[2]: " + list.get(2)); } }
# Array - List in Python # Static-like list arr = [10, 20, 30, 40, 50] # Dynamic list vec = [1, 2, 3, 4, 5] vec.append(6) # Access - O(1) print(f"arr[2]: {arr[2]}") print(f"vec[2]: {vec[2]}")
Definition

A linear data structure where elements (nodes) are stored in non-contiguous memory. Each node contains data and a pointer (reference) to the next node. Unlike arrays, elements are linked via pointers rather than physical adjacency.

Types / Variants
  • Singly Linked List: Each node points to the next — one-direction traversal only
  • Doubly Linked List: Each node has prev and next pointers — bidirectional traversal
  • Circular Linked List: Last node points back to head — useful for round-robin scheduling
  • Skip List: Multi-level linked list with express lanes — O(log n) search, used in Redis
Time Complexity
  • Access by index: O(n) — must traverse from head
  • Search: O(n)
  • Insert at head: O(1)
  • Insert at tail (with tail pointer): O(1)
  • Insert at middle (given pointer): O(1)
  • Deletion (given pointer): O(1)
Why Useful
  • Efficient insertions/deletions: O(1) when you have a reference to the node — no shifting
  • Dynamic size: Grows and shrinks on demand, no wasted pre-allocated memory
  • No memory reallocation: Unlike dynamic arrays, never needs to copy to a bigger block
  • Foundation for other structures: Stacks, queues, graphs (adjacency lists) are often built on linked lists
Applications
  • Music Playlists: Next/Previous song navigation (doubly linked)
  • Undo/Redo: Action history in editors (doubly linked list of states)
  • Memory Management: OS free block lists — malloc/free use linked lists internally
  • Browser History: Back/Forward navigation between pages
  • LRU Cache: Doubly linked list + HashMap for O(1) eviction (used in every major cache system)
  • Blockchain: Each block points to the previous block's hash — essentially a singly linked list
Real-world Analogy

Train coaches — each coach is connected to the next. You can easily attach/detach a coach anywhere without moving the others. But to reach Coach #5, you must walk through coaches 1–4.

Interview Tips
  • Fast & slow pointer: Detect cycles (Floyd's algorithm), find middle node
  • Reverse a linked list: Most common interview question — practice iterative & recursive
  • Merge two sorted lists: Foundation for merge sort on linked lists
  • Dummy node trick: Simplifies edge cases (empty list, single node) in insert/delete operations
// Linked List - Node Structure #include <iostream> using namespace std; struct Node { int data; Node* next; Node(int val) : data(val), next(nullptr) {} }; int main() { Node* head = new Node(10); head->next = new Node(20); head->next->next = new Node(30); Node* temp = head; while (temp != nullptr) { cout << temp->data << " -> "; temp = temp->next; } cout << "NULL" << endl; return 0; }
// Linked List - Node Structure public class LinkedListExample { static class Node { int data; Node next; Node(int val) { data = val; next = null; } } public static void main(String[] args) { Node head = new Node(10); head.next = new Node(20); head.next.next = new Node(30); Node temp = head; while (temp != null) { System.out.print(temp.data + " -> "); temp = temp.next; } System.out.println("NULL"); } }
# Linked List - Node Structure class Node: def __init__(self, val): self.data = val self.next = None head = Node(10) head.next = Node(20) head.next.next = Node(30) temp = head while temp: print(temp.data, end=" -> ") temp = temp.next print("NULL")
Principle

LIFO — Last In, First Out. Elements are added (pushed) and removed (popped) from the same end, called the top. Only the top element is accessible at any time.

Core Operations
  • push(x): Add element to top — O(1)
  • pop(): Remove top element — O(1)
  • peek() / top(): View top element without removing — O(1)
  • isEmpty(): Check if stack is empty — O(1)
Time Complexity
  • Push / Pop / Peek: O(1)
  • Search: O(n) — must pop elements to find
  • Space: O(n)
Why Useful
  • Natural backtracking: "Last done, first undone" — perfect for undo, recursion, DFS
  • Memory management: Call stack manages function frames automatically
  • Parsing & evaluation: Compilers rely on stacks to evaluate expressions and check syntax
Applications
  • Function Call Stack: Every programming language uses a call stack to manage function invocations, local variables, and return addresses
  • Browser Back Button: Pages pushed on navigation, popped on back
  • Compilers: Infix → Postfix conversion, expression evaluation (Shunting Yard algorithm)
  • Undo/Redo: Text editors maintain undo stack and redo stack
  • DFS (Depth-First Search): Iterative DFS uses an explicit stack
  • Balanced Parentheses: Check valid brackets in code/expressions — classic stack problem
  • Backtracking Algorithms: Maze solving, N-Queens, Sudoku solvers
Real-world Analogy

Stack of plates in a cafeteria — you always take the top plate. To reach the bottom one, you must remove all plates above it first.

Interview Tips
  • Monotonic stack: Next Greater Element, Stock Span, Largest Rectangle in Histogram
  • Min stack: Design a stack that supports getMin() in O(1) — very common interview question
  • Two stacks → Queue: Implement a queue using two stacks
  • Recursion = implicit stack: Any recursive solution can be converted to iterative using an explicit stack
// Stack - LIFO Structure #include <iostream> #include <stack> using namespace std; int main() { stack<int> st; st.push(10); st.push(20); st.push(30); cout << "Top: " << st.top() << endl; // 30 st.pop(); cout << "After pop: " << st.top() << endl; // 20 cout << "Size: " << st.size() << endl; return 0; }
// Stack - LIFO Structure import java.util.Stack; public class StackExample { public static void main(String[] args) { Stack<Integer> st = new Stack<>(); st.push(10); st.push(20); st.push(30); System.out.println("Top: " + st.peek()); // 30 st.pop(); System.out.println("After pop: " + st.peek()); // 20 System.out.println("Size: " + st.size()); } }
# Stack - LIFO using list st = [] st.append(10) st.append(20) st.append(30) print(f"Top: {st[-1]}") # 30 st.pop() print(f"After pop: {st[-1]}") # 20 print(f"Size: {len(st)}")
Principle

FIFO — First In, First Out. Elements are added at the rear (enqueue) and removed from the front (dequeue). This ensures the element that has been waiting the longest is processed first.

Types / Variants
  • Simple Queue: Basic FIFO — enqueue at rear, dequeue from front
  • Circular Queue: Wraps around using modular arithmetic — no wasted space
  • Double-ended Queue (Deque): Insert/remove from both ends — used in sliding window problems
  • Priority Queue: Elements dequeued by priority, not arrival order — backed by a heap
  • Blocking Queue: Thread-safe queue that blocks on empty/full — core of producer-consumer pattern
Time Complexity
  • Enqueue / Dequeue: O(1)
  • Peek (front): O(1)
  • Search: O(n)
  • Priority Queue — enqueue: O(log n)
  • Priority Queue — dequeue (min/max): O(log n)
Why Useful
  • Fairness: First come, first served — prevents starvation
  • Order preservation: Maintains the sequence of events/requests
  • Decoupling: Producers and consumers can operate at different speeds (message queues)
  • BFS foundation: Breadth-First Search is impossible without a queue
Applications
  • CPU Scheduling: Round-robin scheduling uses a circular queue of processes
  • Print Spooler: Print jobs queued in order of submission
  • Networking: Router packet buffers, TCP send/receive queues
  • BFS (Breadth-First Search): Level-order traversal of trees/graphs
  • Message Queues: RabbitMQ, Apache Kafka, AWS SQS — backbone of distributed systems
  • Web Servers: Request queues handle incoming HTTP connections
  • Task Scheduling: Cron jobs, CI/CD pipelines, thread pool task queues
Real-world Analogy

Queue at a ticket counter — first person in line is served first. New arrivals join at the back. Cutting the line is like a priority queue — VIP access.

Interview Tips
  • BFS = Queue: Always use a queue for level-order traversal and shortest path in unweighted graphs
  • Sliding window maximum: Use a deque to solve in O(n) — very popular hard problem
  • Implement queue using stacks: Classic design question — amortized O(1) per operation
  • Circular queue implementation: Use array with front/rear pointers and modulo arithmetic
// Queue - FIFO Structure #include <iostream> #include <queue> using namespace std; int main() { queue<int> q; q.push(10); q.push(20); q.push(30); cout << "Front: " << q.front() << endl; // 10 q.pop(); cout << "After pop: " << q.front() << endl; // 20 cout << "Size: " << q.size() << endl; return 0; }
// Queue - FIFO Structure import java.util.LinkedList; import java.util.Queue; public class QueueExample { public static void main(String[] args) { Queue<Integer> q = new LinkedList<>(); q.offer(10); q.offer(20); q.offer(30); System.out.println("Front: " + q.peek()); // 10 q.poll(); System.out.println("After poll: " + q.peek()); // 20 System.out.println("Size: " + q.size()); } }
# Queue - FIFO using deque from collections import deque q = deque() q.append(10) q.append(20) q.append(30) print(f"Front: {q[0]}") # 10 q.popleft() print(f"After pop: {q[0]}") # 20 print(f"Size: {len(q)}")
Definition

A hierarchical, non-linear data structure consisting of nodes connected by edges. Each tree has a single root node, and every other node has exactly one parent. Nodes with no children are called leaves. The number of edges from root to a node is its depth; the longest root-to-leaf path is the tree's height.

Types / Variants
  • Binary Tree: Each node has at most 2 children
  • Binary Search Tree (BST): Left child < parent < right child — enables O(log n) search
  • AVL Tree: Self-balancing BST — guarantees O(log n) height via rotations
  • Red-Black Tree: Self-balancing BST used internally by std::map (C++), TreeMap (Java)
  • B-Tree / B+ Tree: Multi-way balanced trees — backbone of database indexes (MySQL InnoDB, PostgreSQL)
  • Heap (Min/Max): Complete binary tree with heap property — used in priority queues
  • Trie (Prefix Tree): Tree of characters — autocomplete, spell check, IP routing
  • Segment Tree / Fenwick Tree: Range query and update in O(log n) — competitive programming staple
Time Complexity (Balanced BST)
  • Search: O(log n)
  • Insert: O(log n)
  • Delete: O(log n)
  • Traversal (in/pre/post/level): O(n)
  • Unbalanced BST worst case: O(n) — degenerates to a linked list
Why Useful
  • Efficient ordered operations: O(log n) search, insert, delete in balanced trees
  • Natural hierarchy: Perfect for data with parent-child relationships
  • Sorted iteration: In-order traversal of BST gives sorted output
  • Divide and conquer: Binary trees naturally split problems in half
Applications
  • File Systems: Directory trees — every OS uses tree structures for files
  • Databases: B-Trees/B+ Trees for indexing — MySQL, PostgreSQL, MongoDB all rely on them
  • HTML DOM: Web page structure — every element is a node in a tree
  • Compilers: Abstract Syntax Trees (AST) represent parsed code
  • Autocomplete: Tries power search suggestions in Google, IDEs, phone keyboards
  • Networking: DNS hierarchy — root servers → TLD → domain → subdomain
  • AI / Game Theory: Decision trees, minimax trees for game AI (chess, tic-tac-toe)
  • Compression: Huffman coding builds optimal prefix-free codes using a binary tree
Real-world Analogy

Family tree — grandparents at the root, parents as children, and you as a leaf. Corporate org charts work the same way: CEO → VPs → Managers → Engineers.

Interview Tips
  • Master all 4 traversals: In-order, Pre-order, Post-order (DFS), Level-order (BFS)
  • Recursion is your friend: Most tree problems have elegant recursive solutions
  • LCA (Lowest Common Ancestor): Very common — know both BST and general binary tree solutions
  • Serialize/Deserialize: Convert tree to string and back — tests your understanding deeply
  • Diameter, height, balanced check: Practice these foundational problems
// Binary Tree - Node Structure #include <iostream> using namespace std; struct TreeNode { int data; TreeNode* left; TreeNode* right; TreeNode(int val) : data(val), left(nullptr), right(nullptr) {} }; int main() { TreeNode* root = new TreeNode(10); root->left = new TreeNode(5); root->right = new TreeNode(15); root->left->left = new TreeNode(3); root->left->right = new TreeNode(7); cout << "Root: " << root->data << endl; cout << "Left: " << root->left->data << endl; return 0; }
// Binary Tree - Node Structure public class TreeExample { static class TreeNode { int data; TreeNode left, right; TreeNode(int val) { data = val; } } public static void main(String[] args) { TreeNode root = new TreeNode(10); root.left = new TreeNode(5); root.right = new TreeNode(15); root.left.left = new TreeNode(3); root.left.right = new TreeNode(7); System.out.println("Root: " + root.data); System.out.println("Left: " + root.left.data); } }
# Binary Tree - Node Structure class TreeNode: def __init__(self, val): self.data = val self.left = None self.right = None root = TreeNode(10) root.left = TreeNode(5) root.right = TreeNode(15) root.left.left = TreeNode(3) root.left.right = TreeNode(7) print(f"Root: {root.data}") print(f"Left: {root.left.data}")
Definition

A non-linear data structure consisting of vertices (nodes) connected by edges. Unlike trees, graphs can have cycles, multiple paths between nodes, and nodes with any number of connections. Formally: G = (V, E) where V is a set of vertices and E is a set of edges.

Types / Variants
  • Directed Graph (Digraph): Edges have direction — A→B doesn't mean B→A
  • Undirected Graph: Edges are bidirectional — A–B means both can reach each other
  • Weighted Graph: Edges have costs/weights — used in shortest path problems
  • Unweighted Graph: All edges have equal weight
  • DAG (Directed Acyclic Graph): No cycles — used in task scheduling, build systems, Git history
  • Bipartite Graph: Nodes split into 2 sets with edges only between sets — matching problems
Representations
  • Adjacency List: Array of lists — space O(V+E), best for sparse graphs
  • Adjacency Matrix: 2D boolean array — space O(V²), best for dense graphs, O(1) edge lookup
  • Edge List: Simple list of (u, v, weight) tuples — used in Kruskal's algorithm
Key Algorithm Complexities
  • BFS / DFS: O(V + E)
  • Dijkstra's (shortest path): O((V + E) log V) with min-heap
  • Bellman-Ford: O(V × E) — handles negative weights
  • Floyd-Warshall (all-pairs): O(V³)
  • Topological Sort: O(V + E) — only for DAGs
  • Kruskal's / Prim's (MST): O(E log E)
Why Useful
  • Models real relationships: Any network of connected entities is a graph
  • Pathfinding: Shortest path, cheapest route, fewest hops
  • Dependency resolution: Topological sort determines execution order
  • Connectivity analysis: Find connected components, bridges, articulation points
Applications
  • Google Maps / Uber: Dijkstra's / A* for shortest route between locations
  • Social Networks: Friend suggestions (mutual connections), degrees of separation
  • Internet Routing: BGP protocol routes packets through the graph of routers
  • Recommendation Engines: Netflix, YouTube — user-item bipartite graphs
  • Package Managers: npm, pip, Maven resolve dependency DAGs
  • Compilers: Control flow graphs for optimization, data flow analysis
  • Network Security: Detecting anomalous connections, community detection
  • Knowledge Graphs: Google Knowledge Graph, Neo4j — entity relationships
Real-world Analogy

City map — intersections are nodes, roads are edges. One-way streets are directed edges, and road distances are weights. Finding the fastest route from home to office is literally Dijkstra's algorithm.

Interview Tips
  • BFS for shortest path: In unweighted graphs, BFS always finds the shortest path
  • DFS for cycle detection: Track visited + recursion stack (directed) or parent (undirected)
  • Topological sort: Know both Kahn's (BFS) and DFS approaches
  • Union-Find: Efficient connected components — key for Kruskal's MST and dynamic connectivity
  • Graph coloring: Bipartite check, scheduling problems
// Graph - Adjacency List #include <iostream> #include <vector> using namespace std; int main() { int V = 5; vector<vector<int>> adj(V); adj[0].push_back(1); adj[1].push_back(0); adj[0].push_back(2); adj[2].push_back(0); adj[1].push_back(3); adj[3].push_back(1); for (int i = 0; i < V; i++) { cout << i << " -> "; for (int j : adj[i]) cout << j << " "; cout << endl; } return 0; }
// Graph - Adjacency List import java.util.*; public class GraphExample { public static void main(String[] args) { int V = 5; List<List<Integer>> adj = new ArrayList<>(); for (int i = 0; i < V; i++) adj.add(new ArrayList<>()); adj.get(0).add(1); adj.get(1).add(0); adj.get(0).add(2); adj.get(2).add(0); adj.get(1).add(3); adj.get(3).add(1); for (int i = 0; i < V; i++) { System.out.print(i + " -> "); for (int j : adj.get(i)) System.out.print(j + " "); System.out.println(); } } }
# Graph - Adjacency List from collections import defaultdict adj = defaultdict(list) adj[0].append(1); adj[1].append(0) adj[0].append(2); adj[2].append(0) adj[1].append(3); adj[3].append(1) for i in range(5): print(f"{i} -> {' '.join(map(str, adj[i]))}")
Definition

A data structure that stores key-value pairs and uses a hash function to compute an index into an array of buckets. The hash function maps keys to array positions, enabling near-instant lookups. When two keys hash to the same index, a collision occurs, resolved via chaining (linked lists) or open addressing (probing).

Types / Variants
  • HashMap / HashTable: Unordered key-value store — unordered_map (C++), HashMap (Java), dict (Python)
  • LinkedHashMap: Maintains insertion order — Java's LinkedHashMap, Python's dict (3.7+)
  • TreeMap / OrderedMap: Keys sorted — O(log n) operations, backed by Red-Black Tree
  • HashSet: Only keys, no values — used for membership testing
  • Concurrent HashMap: Thread-safe variant for multi-threaded applications
Collision Resolution
  • Separate Chaining: Each bucket holds a linked list of entries — simple, handles high load
  • Open Addressing (Linear Probing): Find next empty slot — better cache performance
  • Double Hashing: Use a second hash function to determine probe step size
  • Robin Hood Hashing: Steal from rich slots — reduces variance in probe length
Time Complexity
  • Insert (average): O(1)
  • Search (average): O(1)
  • Delete (average): O(1)
  • Worst case (all collisions): O(n) — degenerates to a linked list
  • Space: O(n) — plus overhead for empty buckets (load factor typically 0.75)
Why Useful
  • Fastest lookups: O(1) average — nothing else comes close for unordered data
  • Frequency counting: Count occurrences of anything in a single pass
  • Deduplication: HashSets instantly tell you if you've seen an element before
  • Mapping relationships: Any "given X, find Y" problem is a HashMap problem
Applications
  • Caching: Memcached, Redis — fast key-based data retrieval
  • Databases: Hash indexes for exact-match queries (e.g., PostgreSQL hash index)
  • Compilers: Symbol tables mapping variable names → memory addresses
  • Web Development: HTTP headers, cookies, query parameters — all key-value pairs
  • Load Balancers: Consistent hashing distributes requests across servers
  • Blockchain: SHA-256 hashes identify blocks and validate integrity
  • Spell Checkers: HashSet of valid words for O(1) lookup
  • Counting & Analytics: Word frequency, vote tallying, API rate limiting
Real-world Analogy

Library card catalog — each book has a unique call number (hash). Given the call number, you go directly to the shelf. If two books have the same call number (collision), they're placed together and you scan the small group.

Interview Tips
  • Two Sum: The most classic HashMap problem — store complement as you iterate
  • Group Anagrams: Use sorted string as key → group words
  • LRU Cache: HashMap + Doubly Linked List — O(1) get and put
  • Subarray sum equals K: Prefix sum + HashMap — very common pattern
  • Design questions: Know how HashMap resizing works (rehashing when load factor exceeded)
// HashMap - unordered_map #include <iostream> #include <unordered_map> using namespace std; int main() { unordered_map<string, int> map; map["Alice"] = 25; map["Bob"] = 30; map["Charlie"] = 35; cout << "Alice: " << map["Alice"] << endl; if (map.find("Bob") != map.end()) cout << "Bob found!" << endl; cout << "Size: " << map.size() << endl; return 0; }
// HashMap - Java import java.util.HashMap; public class HashMapExample { public static void main(String[] args) { HashMap<String, Integer> map = new HashMap<>(); map.put("Alice", 25); map.put("Bob", 30); map.put("Charlie", 35); System.out.println("Alice: " + map.get("Alice")); if (map.containsKey("Bob")) System.out.println("Bob found!"); System.out.println("Size: " + map.size()); } }
# HashMap - dict in Python map = { "Alice": 25, "Bob": 30, "Charlie": 35 } print(f"Alice: {map['Alice']}") if "Bob" in map: print("Bob found!") print(f"Size: {len(map)}")

YouTube processes 500+ hours of video uploaded every minute, serves 1 billion hours of watch time daily, and runs one of the most sophisticated recommendation engines in the world. Here's exactly how each data structure powers the platform behind the scenes.

Arrays — Video Frames, Feeds & Pixel Data

Every video is fundamentally a sequence of frames — each frame is a 2D array of pixels (RGB values). A 1080p frame = 1920×1080 = ~2 million pixels, each storing 3 color values. The home feed is rendered as a paginated array of video thumbnail objects, and search results come back as ranked arrays from the inverted index.

  • Video frames: Decoded as arrays of pixel buffers — GPU processes them in parallel
  • Comments pagination: Each page loads an array of 20 comment objects via API
  • Thumbnail grid: Home feed = 2D array of cards, laid out via CSS Grid backed by array data
  • Audio samples: Audio track stored as array of PCM float values at 44.1kHz sample rate
  • Search results: Elasticsearch returns ranked arrays of video document IDs
// Simulating YouTube video frame storage #include <iostream> #include <vector> using namespace std; struct Pixel { uint8_t r, g, b; }; int main() { int width = 1920, height = 1080, fps = 30; int totalFrames = fps * 10; // 10-second video // Each frame = 2D array of pixels vector<vector<vector<Pixel>>> video(totalFrames, vector<vector<Pixel>>(height, vector<Pixel>(width))); // Access frame 15, row 540, col 960 (center pixel) video[15][540][960] = {255, 0, 0}; // Red pixel cout << "Frames: " << totalFrames << endl; cout << "Pixels per frame: " << width * height << endl; return 0; }
// Simulating YouTube video frame storage public class VideoFrames { static class Pixel { int r, g, b; Pixel(int r, int g, int b) { this.r = r; this.g = g; this.b = b; } } public static void main(String[] args) { int width = 1920, height = 1080, fps = 30; int totalFrames = fps * 10; // Each frame = 2D array of pixels Pixel[][][] video = new Pixel[totalFrames][height][width]; // Set center pixel of frame 15 video[15][540][960] = new Pixel(255, 0, 0); System.out.println("Frames: " + totalFrames); System.out.println("Pixels/frame: " + width * height); } }
# Simulating YouTube video frame storage import numpy as np width, height, fps = 1920, 1080, 30 total_frames = fps * 10 # 10-second video # Each frame = HxW array of RGB pixels video = np.zeros((total_frames, height, width, 3), dtype=np.uint8) # Set center pixel of frame 15 to red video[15][540][960] = [255, 0, 0] print(f"Frames: {total_frames}") print(f"Pixels/frame: {width * height:,}") print(f"Total memory: {video.nbytes / 1e9:.1f} GB")
Linked Lists — Playlists & Watch History

YouTube playlists are doubly linked lists — each video node points to the next and previous video, enabling seamless forward/backward navigation. The "Up Next" autoplay queue, watch history, and recently viewed channels all use linked list structures for efficient insertion and removal without shifting.

  • Playlists: Doubly linked — reorder by moving pointers, not copying data
  • Watch history: New videos prepended at head in O(1) — latest first
  • Autoplay queue: Next recommended videos linked in sequence, dynamically updated
  • Recently visited channels: Circular linked list for the sidebar — wraps around after N entries
  • Video editor timeline: Clips on the timeline are linked nodes — drag to reorder = pointer swap
// YouTube Playlist as Doubly Linked List #include <iostream> #include <string> using namespace std; struct VideoNode { string title; string videoId; VideoNode* prev; VideoNode* next; VideoNode(string t, string id) : title(t), videoId(id), prev(nullptr), next(nullptr) {} }; class Playlist { VideoNode* head; VideoNode* current; public: Playlist() : head(nullptr), current(nullptr) {} void addVideo(string title, string id) { VideoNode* node = new VideoNode(title, id); if (!head) { head = current = node; return; } VideoNode* tail = head; while (tail->next) tail = tail->next; tail->next = node; node->prev = tail; } void playNext() { if (current && current->next) { current = current->next; cout << "Now playing: " << current->title << endl; } } void playPrev() { if (current && current->prev) { current = current->prev; cout << "Now playing: " << current->title << endl; } } };
// YouTube Playlist as Doubly Linked List public class Playlist { static class VideoNode { String title, videoId; VideoNode prev, next; VideoNode(String t, String id) { title = t; videoId = id; } } VideoNode head, current; void addVideo(String title, String id) { VideoNode node = new VideoNode(title, id); if (head == null) { head = current = node; return; } VideoNode tail = head; while (tail.next != null) tail = tail.next; tail.next = node; node.prev = tail; } void playNext() { if (current != null && current.next != null) { current = current.next; System.out.println("Now playing: " + current.title); } } void playPrev() { if (current != null && current.prev != null) { current = current.prev; System.out.println("Now playing: " + current.title); } } }
# YouTube Playlist as Doubly Linked List class VideoNode: def __init__(self, title, video_id): self.title = title self.video_id = video_id self.prev = None self.next = None class Playlist: def __init__(self): self.head = None self.current = None def add_video(self, title, vid): node = VideoNode(title, vid) if not self.head: self.head = self.current = node return tail = self.head while tail.next: tail = tail.next tail.next = node node.prev = tail def play_next(self): if self.current and self.current.next: self.current = self.current.next print(f"Now playing: {self.current.title}") def play_prev(self): if self.current and self.current.prev: self.current = self.current.prev print(f"Now playing: {self.current.title}") playlist = Playlist() playlist.add_video("Intro to DSA", "abc123") playlist.add_video("Arrays Deep Dive", "def456") playlist.add_video("Linked Lists", "ghi789") playlist.play_next() # Arrays Deep Dive playlist.play_next() # Linked Lists playlist.play_prev() # Arrays Deep Dive
Stack — Undo/Redo, Navigation & Recursion

YouTube Studio's video editor uses a dual-stack undo/redo system. Every edit (trim, cut, filter) is pushed onto the undo stack. Pressing undo pops it and pushes to the redo stack. Browser back/forward navigation is also a two-stack system. Nested comment thread rendering uses the call stack recursively.

  • Undo/Redo in editor: Two stacks — undo stack for past actions, redo stack for undone actions
  • Page navigation: Back button pops from history stack, pushes to forward stack
  • Nested comments: Recursive DFS rendering — call stack depth = reply nesting depth
  • Expression parsing: YouTube's search query parser uses a stack to handle nested operators and quotes
// YouTube Studio Undo/Redo System #include <iostream> #include <stack> #include <string> using namespace std; class VideoEditor { stack<string> undoStack; stack<string> redoStack; public: void doAction(string action) { undoStack.push(action); while (!redoStack.empty()) redoStack.pop(); cout << "Action: " << action << endl; } void undo() { if (undoStack.empty()) { cout << "Nothing to undo" << endl; return; } string action = undoStack.top(); undoStack.pop(); redoStack.push(action); cout << "Undo: " << action << endl; } void redo() { if (redoStack.empty()) { cout << "Nothing to redo" << endl; return; } string action = redoStack.top(); redoStack.pop(); undoStack.push(action); cout << "Redo: " << action << endl; } };
// YouTube Studio Undo/Redo System import java.util.Stack; public class VideoEditor { Stack<String> undoStack = new Stack<>(); Stack<String> redoStack = new Stack<>(); void doAction(String action) { undoStack.push(action); redoStack.clear(); System.out.println("Action: " + action); } void undo() { if (undoStack.isEmpty()) return; String action = undoStack.pop(); redoStack.push(action); System.out.println("Undo: " + action); } void redo() { if (redoStack.isEmpty()) return; String action = redoStack.pop(); undoStack.push(action); System.out.println("Redo: " + action); } }
# YouTube Studio Undo/Redo System class VideoEditor: def __init__(self): self.undo_stack = [] self.redo_stack = [] def do_action(self, action): self.undo_stack.append(action) self.redo_stack.clear() print(f"Action: {action}") def undo(self): if not self.undo_stack: return action = self.undo_stack.pop() self.redo_stack.append(action) print(f"Undo: {action}") def redo(self): if not self.redo_stack: return action = self.redo_stack.pop() self.undo_stack.append(action) print(f"Redo: {action}") editor = VideoEditor() editor.do_action("Trim clip at 0:05") editor.do_action("Add filter: warm") editor.do_action("Insert text overlay") editor.undo() # Undo: Insert text overlay editor.undo() # Undo: Add filter: warm editor.redo() # Redo: Add filter: warm
Queue — Upload Pipeline, Buffering & Notifications

When you upload a video, it enters a multi-stage processing queue: transcoding (converting to multiple resolutions), thumbnail extraction, content analysis (copyright detection via Content ID), subtitle generation, and finally indexing for search. The video player uses a frame buffer queue — pre-fetched frames are enqueued and dequeued for smooth playback. YouTube pushes notifications to millions of subscribers via distributed message queues (similar to Kafka).

  • Upload pipeline: Video → transcode queue → thumbnail queue → Content ID queue → index queue
  • Frame buffering: Player maintains ~5s of pre-fetched frames in a circular buffer queue
  • Notification fan-out: "New video" alert enqueued per subscriber — processed by worker pools
  • Ad serving: Pre-roll, mid-roll, post-roll ads queued in order per watch session
  • Live stream chat: Messages queued in FIFO order, rate-limited per user
// YouTube Video Upload Processing Queue #include <iostream> #include <queue> #include <string> using namespace std; struct UploadJob { string videoId; string stage; // transcode, thumbnail, contentId, index }; int main() { queue<UploadJob> pipeline; // Video uploaded — enters pipeline pipeline.push({"vid_001", "transcode"}); pipeline.push({"vid_001", "thumbnail"}); pipeline.push({"vid_001", "contentId"}); pipeline.push({"vid_001", "index"}); // Workers process jobs in order while (!pipeline.empty()) { UploadJob job = pipeline.front(); pipeline.pop(); cout << "Processing " << job.videoId << " - Stage: " << job.stage << endl; } return 0; }
// YouTube Video Upload Processing Queue import java.util.LinkedList; import java.util.Queue; public class UploadPipeline { static class UploadJob { String videoId, stage; UploadJob(String v, String s) { videoId = v; stage = s; } } public static void main(String[] args) { Queue<UploadJob> pipeline = new LinkedList<>(); pipeline.offer(new UploadJob("vid_001", "transcode")); pipeline.offer(new UploadJob("vid_001", "thumbnail")); pipeline.offer(new UploadJob("vid_001", "contentId")); pipeline.offer(new UploadJob("vid_001", "index")); while (!pipeline.isEmpty()) { UploadJob job = pipeline.poll(); System.out.println("Processing " + job.videoId + " - Stage: " + job.stage); } } }
# YouTube Video Upload Processing Queue from collections import deque pipeline = deque() # Video uploaded — enters multi-stage pipeline stages = ["transcode", "thumbnail", "contentId", "index"] for stage in stages: pipeline.append({"video_id": "vid_001", "stage": stage}) # Workers process jobs FIFO while pipeline: job = pipeline.popleft() print(f"Processing {job['video_id']} - Stage: {job['stage']}") # Output: # Processing vid_001 - Stage: transcode # Processing vid_001 - Stage: thumbnail # Processing vid_001 - Stage: contentId # Processing vid_001 - Stage: index
Trees — Comments, Categories & Search Indexes

YouTube's comment system is a tree — each top-level comment is a root, and replies form child nodes. A reply to a reply creates deeper nesting. The category system (Music → Pop → K-Pop → BTS) is a tree hierarchy. Behind the scenes, B+ Trees power the database indexes that make searching across billions of videos possible in milliseconds.

  • Comment threads: Tree of comments — root comments with nested reply subtrees
  • Category taxonomy: Music → Electronic → House → Deep House — tree of categories
  • Content moderation: Decision tree classifiers — safe / age-restricted / removed
  • Database indexes: B+ Trees index video titles, tags, descriptions for search
  • DOM rendering: YouTube's web UI is a tree of React/Polymer components
  • Huffman encoding: Video compression uses Huffman trees for entropy coding
// YouTube Comment Thread as Tree #include <iostream> #include <vector> #include <string> using namespace std; struct Comment { string user, text; int likes; vector<Comment*> replies; Comment(string u, string t, int l) : user(u), text(t), likes(l) {} }; void renderThread(Comment* c, int depth = 0) { for (int i = 0; i < depth; i++) cout << " "; cout << c->user << ": " << c->text << " [" << c->likes << " likes]" << endl; for (Comment* r : c->replies) renderThread(r, depth + 1); } int main() { Comment* root = new Comment("Alice", "Great video!", 142); Comment* r1 = new Comment("Bob", "Totally agree!", 23); Comment* r2 = new Comment("Charlie", "@Bob same here", 5); root->replies.push_back(r1); r1->replies.push_back(r2); renderThread(root); return 0; }
// YouTube Comment Thread as Tree import java.util.*; public class CommentThread { static class Comment { String user, text; int likes; List<Comment> replies = new ArrayList<>(); Comment(String u, String t, int l) { user = u; text = t; likes = l; } } static void render(Comment c, int depth) { System.out.println(" ".repeat(depth) + c.user + ": " + c.text + " [" + c.likes + " likes]"); for (Comment r : c.replies) render(r, depth + 1); } public static void main(String[] args) { Comment root = new Comment("Alice", "Great video!", 142); Comment r1 = new Comment("Bob", "Totally agree!", 23); Comment r2 = new Comment("Charlie", "@Bob same", 5); root.replies.add(r1); r1.replies.add(r2); render(root, 0); } }
# YouTube Comment Thread as Tree class Comment: def __init__(self, user, text, likes=0): self.user = user self.text = text self.likes = likes self.replies = [] def add_reply(self, reply): self.replies.append(reply) def render_thread(comment, depth=0): indent = " " * depth print(f"{indent}{comment.user}: {comment.text} [{comment.likes} likes]") for reply in comment.replies: render_thread(reply, depth + 1) root = Comment("Alice", "Great video!", 142) r1 = Comment("Bob", "Totally agree!", 23) r2 = Comment("Charlie", "@Bob same here", 5) root.add_reply(r1) r1.add_reply(r2) render_thread(root) # Alice: Great video! [142 likes] # Bob: Totally agree! [23 likes] # Charlie: @Bob same here [5 likes]
Graphs — Recommendations, Social & Related Videos

YouTube's recommendation engine is built on a massive user-video bipartite graph. Users and videos are nodes; edges represent watches, likes, and subscriptions. Collaborative filtering traverses this graph: "users who watched A also watched B." The subscriber network is a directed graph (A subscribes to B ≠ B subscribes to A). The "related videos" sidebar is generated by traversing co-watch edges — videos frequently watched together are strongly connected.

  • Recommendation engine: Bipartite user-video graph + deep learning embeddings
  • Subscriber network: Directed graph — A→B means A subscribes to B
  • Related videos: Co-watch graph — edge weight = number of users who watched both
  • Creator collaborations: Undirected graph linking channels that appear in each other's videos
  • Content ID network: Graph matching uploaded content against copyrighted reference material
  • Community detection: Identify clusters of related channels (gaming, cooking, tech)
// YouTube Recommendation — Co-watch Graph #include <iostream> #include <unordered_map> #include <vector> #include <algorithm> using namespace std; class RecommendationGraph { // video -> [(related_video, co_watch_count)] unordered_map<string, vector<pair<string, int>>> adj; public: void addCoWatch(string a, string b, int count) { adj[a].push_back({b, count}); adj[b].push_back({a, count}); } vector<string> getRelated(string videoId, int topK) { auto& edges = adj[videoId]; sort(edges.begin(), edges.end(), [](auto& a, auto& b) { return a.second > b.second; }); vector<string> result; for (int i = 0; i < min(topK, (int)edges.size()); i++) result.push_back(edges[i].first); return result; } };
// YouTube Recommendation — Co-watch Graph import java.util.*; public class RecommendationGraph { Map<String, List<int[]>> adj = new HashMap<>(); Map<String, String> names = new HashMap<>(); void addCoWatch(String a, String b, int count) { adj.computeIfAbsent(a, k -> new ArrayList<>()) .add(new int[]{b.hashCode(), count}); adj.computeIfAbsent(b, k -> new ArrayList<>()) .add(new int[]{a.hashCode(), count}); } List<String> getRelated(String videoId, int topK) { // Sort by co-watch count descending List<int[]> edges = adj.getOrDefault(videoId, Collections.emptyList()); edges.sort((a, b) -> b[1] - a[1]); // Return top-K related video IDs List<String> result = new ArrayList<>(); for (int i = 0; i < Math.min(topK, edges.size()); i++) result.add(String.valueOf(edges.get(i)[0])); return result; } }
# YouTube Recommendation — Co-watch Graph from collections import defaultdict class RecommendationGraph: def __init__(self): # video_id -> [(related_id, co_watch_count)] self.adj = defaultdict(list) def add_co_watch(self, a, b, count): self.adj[a].append((b, count)) self.adj[b].append((a, count)) def get_related(self, video_id, top_k=5): edges = sorted(self.adj[video_id], key=lambda x: x[1], reverse=True) return [vid for vid, _ in edges[:top_k]] graph = RecommendationGraph() graph.add_co_watch("dsa_intro", "arrays_deep", 15000) graph.add_co_watch("dsa_intro", "linked_lists", 12000) graph.add_co_watch("dsa_intro", "big_o", 18000) graph.add_co_watch("dsa_intro", "python_basics", 5000) related = graph.get_related("dsa_intro", top_k=3) print(f"Related videos: {related}") # Related videos: ['big_o', 'arrays_deep', 'linked_lists']
HashMap — Caching, Sessions & Deduplication

YouTube serves billions of requests daily, and HashMaps are everywhere. Video metadata (title, views, likes, channel) is cached in-memory as a HashMap keyed by video ID — avoiding database hits on every page load. User sessions map session tokens to user objects. Content ID uses perceptual hashing to map fingerprints to copyrighted content. Rate limiting tracks request counts per IP using HashMap with TTL.

  • Video metadata cache: videoId → {title, views, likes, channel, duration} — billions of entries in Redis/Memcached
  • User sessions: sessionToken → {userId, email, preferences} — O(1) auth lookup
  • Content ID fingerprints: audioHash → copyrightOwner — detects re-uploads instantly
  • Rate limiting: userIP → {requestCount, windowStart} — throttle abusive clients
  • URL shortener: shortCode → fullVideoURL — youtu.be links are HashMap lookups
  • A/B testing: userId → experimentGroup — determines which UI variant you see
// YouTube Video Metadata Cache #include <iostream> #include <unordered_map> #include <string> using namespace std; struct VideoMeta { string title, channel; long long views, likes; int durationSec; }; int main() { unordered_map<string, VideoMeta> cache; // Cache video metadata — O(1) insert cache["dQw4w9WgXcQ"] = { "Never Gonna Give You Up", "Rick Astley", 1500000000LL, 15000000LL, 213 }; // Lookup — O(1) average string id = "dQw4w9WgXcQ"; if (cache.count(id)) { auto& v = cache[id]; cout << v.title << " by " << v.channel << endl; cout << "Views: " << v.views << endl; } return 0; }
// YouTube Video Metadata Cache import java.util.HashMap; public class MetadataCache { static class VideoMeta { String title, channel; long views, likes; int durationSec; VideoMeta(String t, String c, long v, long l, int d) { title=t; channel=c; views=v; likes=l; durationSec=d; } } public static void main(String[] args) { HashMap<String, VideoMeta> cache = new HashMap<>(); cache.put("dQw4w9WgXcQ", new VideoMeta( "Never Gonna Give You Up", "Rick Astley", 1500000000L, 15000000L, 213)); if (cache.containsKey("dQw4w9WgXcQ")) { VideoMeta v = cache.get("dQw4w9WgXcQ"); System.out.println(v.title + " by " + v.channel); System.out.println("Views: " + v.views); } } }
# YouTube Video Metadata Cache cache = {} # Cache video metadata — O(1) insert cache["dQw4w9WgXcQ"] = { "title": "Never Gonna Give You Up", "channel": "Rick Astley", "views": 1_500_000_000, "likes": 15_000_000, "duration_sec": 213, } # O(1) lookup video_id = "dQw4w9WgXcQ" if video_id in cache: v = cache[video_id] print(f"{v['title']} by {v['channel']}") print(f"Views: {v['views']:,}") # Rate limiting example rate_limits = {} # ip -> request_count def check_rate(ip, limit=100): rate_limits[ip] = rate_limits.get(ip, 0) + 1 return rate_limits[ip] <= limit print(check_rate("192.168.1.1")) # True
Heaps / Priority Queues — Trending, Top-K & Ad Auctions

YouTube's Trending page is powered by a max-heap — videos are ranked by an engagement score (views × recency × interaction rate), and the heap efficiently maintains the top-N videos as scores change. Finding the top K most-liked videos out of billions uses a min-heap of size K — any video with more likes than the heap's minimum replaces it. Ad auctions use priority queues to select the highest-bidding advertiser per impression in real time.

  • Trending tab: Max-heap of videos ranked by engagement score — top element = #1 trending
  • Top-K search results: Min-heap of size K maintains the K most relevant results efficiently
  • Ad auction: Priority queue ranks ads by bid × quality score — highest wins the impression
  • Live stream packets: Min-heap reorders out-of-sequence network packets by timestamp
  • Task scheduler: Internal job scheduler uses priority queue — highest-priority jobs (transcoding) run first
  • Notification priority: Urgent notifications (live stream starting) get higher priority than weekly digests
// YouTube Trending — Top-K Videos using Min-Heap #include <iostream> #include <queue> #include <vector> #include <string> using namespace std; struct Video { string title; long long score; bool operator>(const Video& o) const { return score > o.score; } }; int main() { int K = 3; // Top 3 trending // Min-heap of size K priority_queue<Video, vector<Video>, greater<Video>> minHeap; vector<Video> videos = { {"DSA Tutorial", 50000}, {"Cat Compilation", 200000}, {"Music Video", 500000}, {"Tech Review", 150000}, {"Comedy Skit", 350000}, }; for (auto& v : videos) { minHeap.push(v); if (minHeap.size() > K) minHeap.pop(); } cout << "Top " << K << " Trending:" << endl; while (!minHeap.empty()) { cout << " " << minHeap.top().title << " (score: " << minHeap.top().score << ")" << endl; minHeap.pop(); } return 0; }
// YouTube Trending — Top-K Videos using Min-Heap import java.util.*; public class Trending { public static void main(String[] args) { int K = 3; // Min-heap by score PriorityQueue<long[]> minHeap = new PriorityQueue<>((a, b) -> Long.compare(a[1], b[1])); String[] titles = {"DSA", "Cats", "Music", "Tech", "Comedy"}; long[] scores = {50000, 200000, 500000, 150000, 350000}; for (int i = 0; i < titles.length; i++) { minHeap.offer(new long[]{i, scores[i]}); if (minHeap.size() > K) minHeap.poll(); } System.out.println("Top " + K + " Trending:"); while (!minHeap.isEmpty()) { long[] v = minHeap.poll(); System.out.println(" " + titles[(int)v[0]] + " (score: " + v[1] + ")"); } } }
# YouTube Trending — Top-K Videos using Min-Heap import heapq videos = [ (50000, "DSA Tutorial"), (200000, "Cat Compilation"), (500000, "Music Video"), (150000, "Tech Review"), (350000, "Comedy Skit"), ] K = 3 # heapq is a min-heap — keeps K largest top_k = [] for score, title in videos: heapq.heappush(top_k, (score, title)) if len(top_k) > K: heapq.heappop(top_k) # Remove smallest print(f"Top {K} Trending:") for score, title in sorted(top_k, reverse=True): print(f" {title} (score: {score:,})") # Top 3 Trending: # Music Video (score: 500,000) # Comedy Skit (score: 350,000) # Cat Compilation (score: 200,000)