Skip to main content

Prefix Sum vs Difference Array: Master Range Queries and Range Updates in DSA

Prefix Sum vs Difference Array: Master Range Queries and Range Updates in DSA

If you’ve ever solved a coding problem where you had to find the sum of a subarray, or update a range of values in an array, you’ve probably encountered a Time Limit Exceeded (TLE) error.

The naive approach—iterating through the array for every query—takes $O(N)$ time. When you have $Q$ queries over an array of size $N$, the total time becomes $O(N \times Q)$. For large inputs, this is unacceptably slow.

Enter two of the most elegant and powerful array manipulation techniques in DSA: Prefix Sums and Difference Arrays. They are two sides of the same coin. * Prefix Sum: Optimizes Range Queries (asking a question about a range). * Difference Array: Optimizes Range Updates (modifying a range of values).

Here is how they work.


Part 1: Prefix Sum (The Range Query Savior)

A Prefix Sum array stores the cumulative sum of elements up to a given index. Once built, it allows you to calculate the sum of any subarray in $O(1)$ time.

The Concept

Imagine you have an array arr = [3, 1, 4, 1, 5]. We create a prefix array where prefix[i] is the sum of all elements from index 0 to i-1. (Using a 0 at the start makes the math much easier).

Index:      0     1     2     3     4     5
arr:      [ 3  ,  1  ,  4  ,  1  ,  5  ]
prefix:   [ 0  ,  3  ,  4  ,  8  ,  9  ,  14  ]
            ↑     ↑     ↑     ↑     ↑     ↑
           sum  sum   sum   sum   sum   sum
          of 0  of 1  of 2  of 3  of 4  of 5
          elems elems elems elems elems elems

The Formula: Sum of Range [L, R]

To find the sum of elements from index L to R (inclusive): Sum(L, R) = prefix[R + 1] - prefix[L]

Example: Sum from index 1 to 3 ([1, 4, 1]). Sum(1, 3) = prefix[4] - prefix[1] Sum(1, 3) = 9 - 3 = 6. (Which is 1 + 4 + 1). Magic!

The Code

def build_prefix_sum(arr):
    n = len(arr)
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + arr[i]
    return prefix

def range_sum(prefix, L, R):
    # Returns sum of arr from index L to R inclusive
    return prefix[R + 1] - prefix[L]

# Usage
arr = [3, 1, 4, 1, 5]
prefix = build_prefix_sum(arr)

print(range_sum(prefix, 1, 3)) # Output: 6
print(range_sum(prefix, 0, 4)) # Output: 14 (sum of all elements)

Complexity: * Build Time: $O(N)$ * Query Time: $O(1)$ * Space: $O(N)$

⚠️ Note: If the array elements can be negative, you can also use a Prefix Sum to find if a subarray with a sum of 0 exists (by checking for duplicate values in the prefix array using a Hash Set).


Part 2: Difference Array (The Range Update Wizard)

Now flip the problem. You have an array of zeros. You are given $Q$ updates. Each update says: "Add 5 to all elements from index L to R." At the end, print the final array.

Naively, each update takes $O(N)$. But with a Difference Array, each update takes $O(1)$ time.

The Concept

A Difference array diff stores the difference between consecutive elements of the target array. diff[i] = arr[i] - arr[i-1]

If we want to add a value X to a range [L, R], we don't need to touch every element in that range. We only need to mark the start and the end of the range. 1. Add X to diff[L] (This starts the increment for everything from L onwards). 2. Subtract X from diff[R + 1] (This stops the increment from affecting elements after R).

The Visual Flow

graph TD
    A[Start with diff array of zeros] --> B[Apply Range Updates in O 1 ]
    B --> C[Take Prefix Sum of diff array]
    C --> D[Final Array is reconstructed!]

    style B fill:#4CAF50,color:#fff
    style C fill:#2196F3,color:#fff

Example: Array size 5. Update: Add 5 to range [1, 3].

  1. Initial diff: [0, 0, 0, 0, 0, 0] (Size N+1 to prevent out-of-bounds)
  2. Apply Update:
    • diff[1] += 5 -> [0, 5, 0, 0, 0, 0]
    • diff[4] -= 5 -> [0, 5, 0, 0, -5, 0]
  3. Reconstruct Final Array (by taking the running sum / prefix sum of diff):
    • arr[0] = 0
    • arr[1] = 0 + 5 = 5
    • arr[2] = 5 + 0 = 5
    • arr[3] = 5 + 0 = 5
    • arr[4] = 5 + (-5) = 0
    • Final Array: [0, 5, 5, 5, 0]

The Code

def apply_range_update(diff, L, R, X):
    # Adds X to all elements from index L to R inclusive
    diff[L] += X
    diff[R + 1] -= X

def reconstruct_array(diff, n):
    arr = [0] * n
    current_sum = 0
    for i in range(n):
        current_sum += diff[i]
        arr[i] = current_sum
    return arr

# Usage
n = 5
diff = [0] * (n + 1) # Initialize with zeros

# Apply multiple updates in O(1) each
apply_range_update(diff, 1, 3, 5)  # Add 5 to indices 1,2,3
apply_range_update(diff, 2, 4, 2)  # Add 2 to indices 2,3,4

final_arr = reconstruct_array(diff, n)
print(final_arr) 
# Output: [0, 5, 7, 7, 2]

Complexity: * Update Time: $O(1)$ * Reconstruct Time: $O(N)$ * Space: $O(N)$


Summary: When to Use Which?

Feature Prefix Sum Difference Array
Primary Use Case Querying subarray sums. Updating subarray ranges.
Array Mutability Array must be static (read-only). If you change one element, you must rebuild the whole prefix array in $O(N)$. Array is dynamic (write-heavy). You defer the $O(N)$ cost to the very end.
Core Operation prefix[R+1] - prefix[L] diff[L] += X, diff[R+1] -= X
Relationship Prefix Sum is the inverse of Difference Array. Difference Array is the inverse of Prefix Sum.

The Golden Rule

If an interviewer asks you to optimize a problem involving arrays: 1. If you are doing many reads on a static array: Think Prefix Sum. 2. If you are doing many writes/updates on ranges, and only reading once at the end: Think Difference Array.

Master these two patterns, and you'll instantly turn $O(N^2)$ TLE solutions into sleek $O(N)$ accepted submissions.