Prefix Sum vs Difference Array: Master Range Queries and Range Updates in DSA
If you’ve ever solved a coding problem where you had to find the sum of a subarray, or update a range of values in an array, you’ve probably encountered a Time Limit Exceeded (TLE) error.
The naive approach—iterating through the array for every query—takes $O(N)$ time. When you have $Q$ queries over an array of size $N$, the total time becomes $O(N \times Q)$. For large inputs, this is unacceptably slow.
Enter two of the most elegant and powerful array manipulation techniques in DSA: Prefix Sums and Difference Arrays. They are two sides of the same coin. * Prefix Sum: Optimizes Range Queries (asking a question about a range). * Difference Array: Optimizes Range Updates (modifying a range of values).
Here is how they work.
Part 1: Prefix Sum (The Range Query Savior)
A Prefix Sum array stores the cumulative sum of elements up to a given index. Once built, it allows you to calculate the sum of any subarray in $O(1)$ time.
The Concept
Imagine you have an array arr = [3, 1, 4, 1, 5].
We create a prefix array where prefix[i] is the sum of all elements from index 0 to i-1. (Using a 0 at the start makes the math much easier).
Index: 0 1 2 3 4 5
arr: [ 3 , 1 , 4 , 1 , 5 ]
prefix: [ 0 , 3 , 4 , 8 , 9 , 14 ]
↑ ↑ ↑ ↑ ↑ ↑
sum sum sum sum sum sum
of 0 of 1 of 2 of 3 of 4 of 5
elems elems elems elems elems elems
The Formula: Sum of Range [L, R]
To find the sum of elements from index L to R (inclusive):
Sum(L, R) = prefix[R + 1] - prefix[L]
Example: Sum from index 1 to 3 ([1, 4, 1]).
Sum(1, 3) = prefix[4] - prefix[1]
Sum(1, 3) = 9 - 3 = 6. (Which is 1 + 4 + 1). Magic!
The Code
def build_prefix_sum(arr):
n = len(arr)
prefix = [0] * (n + 1)
for i in range(n):
prefix[i + 1] = prefix[i] + arr[i]
return prefix
def range_sum(prefix, L, R):
# Returns sum of arr from index L to R inclusive
return prefix[R + 1] - prefix[L]
# Usage
arr = [3, 1, 4, 1, 5]
prefix = build_prefix_sum(arr)
print(range_sum(prefix, 1, 3)) # Output: 6
print(range_sum(prefix, 0, 4)) # Output: 14 (sum of all elements)
Complexity: * Build Time: $O(N)$ * Query Time: $O(1)$ * Space: $O(N)$
⚠️ Note: If the array elements can be negative, you can also use a Prefix Sum to find if a subarray with a sum of
0exists (by checking for duplicate values in the prefix array using a Hash Set).
Part 2: Difference Array (The Range Update Wizard)
Now flip the problem. You have an array of zeros. You are given $Q$ updates. Each update says: "Add 5 to all elements from index L to R." At the end, print the final array.
Naively, each update takes $O(N)$. But with a Difference Array, each update takes $O(1)$ time.
The Concept
A Difference array diff stores the difference between consecutive elements of the target array.
diff[i] = arr[i] - arr[i-1]
If we want to add a value X to a range [L, R], we don't need to touch every element in that range. We only need to mark the start and the end of the range.
1. Add X to diff[L] (This starts the increment for everything from L onwards).
2. Subtract X from diff[R + 1] (This stops the increment from affecting elements after R).
The Visual Flow
graph TD
A[Start with diff array of zeros] --> B[Apply Range Updates in O 1 ]
B --> C[Take Prefix Sum of diff array]
C --> D[Final Array is reconstructed!]
style B fill:#4CAF50,color:#fff
style C fill:#2196F3,color:#fff
Example: Array size 5. Update: Add 5 to range [1, 3].
- Initial
diff:[0, 0, 0, 0, 0, 0](Size N+1 to prevent out-of-bounds) - Apply Update:
diff[1] += 5->[0, 5, 0, 0, 0, 0]diff[4] -= 5->[0, 5, 0, 0, -5, 0]
- Reconstruct Final Array (by taking the running sum / prefix sum of
diff):arr[0] = 0arr[1] = 0 + 5 = 5arr[2] = 5 + 0 = 5arr[3] = 5 + 0 = 5arr[4] = 5 + (-5) = 0- Final Array:
[0, 5, 5, 5, 0]
The Code
def apply_range_update(diff, L, R, X):
# Adds X to all elements from index L to R inclusive
diff[L] += X
diff[R + 1] -= X
def reconstruct_array(diff, n):
arr = [0] * n
current_sum = 0
for i in range(n):
current_sum += diff[i]
arr[i] = current_sum
return arr
# Usage
n = 5
diff = [0] * (n + 1) # Initialize with zeros
# Apply multiple updates in O(1) each
apply_range_update(diff, 1, 3, 5) # Add 5 to indices 1,2,3
apply_range_update(diff, 2, 4, 2) # Add 2 to indices 2,3,4
final_arr = reconstruct_array(diff, n)
print(final_arr)
# Output: [0, 5, 7, 7, 2]
Complexity: * Update Time: $O(1)$ * Reconstruct Time: $O(N)$ * Space: $O(N)$
Summary: When to Use Which?
| Feature | Prefix Sum | Difference Array |
|---|---|---|
| Primary Use Case | Querying subarray sums. | Updating subarray ranges. |
| Array Mutability | Array must be static (read-only). If you change one element, you must rebuild the whole prefix array in $O(N)$. | Array is dynamic (write-heavy). You defer the $O(N)$ cost to the very end. |
| Core Operation | prefix[R+1] - prefix[L] |
diff[L] += X, diff[R+1] -= X |
| Relationship | Prefix Sum is the inverse of Difference Array. | Difference Array is the inverse of Prefix Sum. |
The Golden Rule
If an interviewer asks you to optimize a problem involving arrays: 1. If you are doing many reads on a static array: Think Prefix Sum. 2. If you are doing many writes/updates on ranges, and only reading once at the end: Think Difference Array.
Master these two patterns, and you'll instantly turn $O(N^2)$ TLE solutions into sleek $O(N)$ accepted submissions.