Data Structures

Set Basics

Difficulty: Beginner

Drop a million IDs into a Set, ask set.has(id) for any one of them, and the answer comes back in expected constant time without any ordering, sorting, or scanning. That single primitive (membership testing) is what makes a Set the quiet...

0%
BEGINNER
Complexity varies

Set Basics

Drop a million IDs into a Set, ask set.has(id) for any one of them, and the answer comes back in expected constant time without any ordering, sorting, or scanning. That single primitive (membership testing) is what makes a Set the quiet workhorse behind deduplication, visited-node tracking in graph traversal, and almost every cache-invalidation routine you will read.

This lesson covers the uniqueness invariant of a set, the four core operations add, delete, has, and iteration in both JavaScript's Set and Python's set, and the four classical set-theory operations: union, intersection, difference, and symmetric difference. You will see why a set is implemented as a hash map whose values are unused, and you will practice the visited-set pattern that reappears in BFS, DFS, and cycle detection.

In Hash Map (Dictionary) Basics, you learned that a hash function maps keys to bucket indices for O(1) get and set. A set keeps only the keys, drops the values, and exposes the resulting structure as a pure membership data type.

Next, Trees: Binary Tree Fundamentals moves from flat unordered collections to hierarchical structures, where parent and child relationships unlock entirely new traversal patterns.

Beginner
40 min
6 Objectives
5 Sections

Topics:

Set
Hashing
Data Structures
Beginner
Free
Time Complexity
Visited Set
Comparison

What You'll Learn

By the end of this lesson, you will be able to:

Explain the uniqueness property of sets and how they differ from arrays and hash maps.

Perform add, delete, and has (contains) operations on a set in both JavaScript and Python.

Implement union, intersection, difference, and symmetric difference between two sets.

Describe how sets are backed by a hash map internally and why this gives O(1) average operations.

Identify when to choose a set over an array or hash map for a given problem.

Apply sets to solve deduplication and membership-testing problems efficiently.

Why This Matters

01

Sets are the simplest way to enforce uniqueness -- adding a duplicate is silently ignored, making deduplication a one-liner instead of a multi-step process.

02

Membership testing in O(1) is critical for the visited-set pattern used in BFS, DFS, and cycle detection -- you will use sets in almost every graph and tree algorithm.

03

Set-theory operations (union, intersection, difference) appear constantly in interviews, database query planning, and data analysis pipelines.

Key Terms

7 terms you'll encounter in this lesson

1

Set

An unordered collection of unique elements. Adding a duplicate has no effect; the set always contains at most one copy of each value.

2

Membership Test (has/contains)

Checking whether a value exists in the set. Runs in O(1) average time because the set uses hashing internally.

3

Union

A set operation that combines all elements from two sets into a new set, removing duplicates. Written as A ∪ B.

4

Intersection

A set operation that returns only the elements present in both sets. Written as A ∩ B.

5

Difference

A set operation that returns elements in the first set that are not in the second. Written as A \ B or A - B.

6

Symmetric Difference

A set operation that returns elements in either set but not in both. Written as A △ B. Equivalent to (A ∪ B) - (A ∩ B).

7

Hash-Based Set

A set implementation backed by a hash map where elements are stored as keys with dummy values, inheriting O(1) average-case performance for add/remove/has.

Related Problems

Coding problems that put this lesson's concepts into practice

Heads Up

Common misconceptions to watch out for

"Sets maintain insertion order"

In Python, sets are unordered (iteration order is not guaranteed). In JavaScript, Set does preserve insertion order by spec, but you should not rely on order for algorithmic correctness -- treat sets as unordered collections.

"Adding a duplicate to a set throws an error"

Adding a duplicate is a silent no-op -- the set simply ignores the value because it already exists. No error is thrown. This is what makes sets useful for deduplication.

"Sets are always better than arrays"

Sets do not support indexing (no set[0]) and cannot store duplicates. If you need ordered access by position, allow duplicates, or iterate in a specific order, arrays are the better choice. Sets excel specifically at uniqueness and fast membership checks.

"Set operations like union are O(1)"

Individual operations (add, delete, has) are O(1) average. But set-theory operations like union, intersection, and difference must iterate over elements, so they are O(n + m) where n and m are the set sizes.