Pedram Agand
← Writing
Programming

Maximum Subarray Sum linear solution

Github( Maximum Subarray Sum aka Largest Sum Contiguous Subarray is a wellknown problem in algorithm.

2020-09-01·6 min read·algorithm, divide_conquer, python
Use with AI
Maximum Subarray Sum linear solution

I was reviewing divide-and-conquer problems when I noticed something: every textbook treatment of Maximum Subarray Sum treats O(n log n) as the ceiling for divide-and-conquer approaches, with Kadane's as the only way to reach O(n). I wanted to see if you could get O(n) time and keep the divide-and-conquer structure — using a lookup table instead of recursion depth.

The result is an algorithm I designed myself, using a compact 4-element table that gets merged bottom-up. It reaches O(n) time with O(n) space. I'm not claiming it's the best approach for all cases — Kadane's is simpler — but the structure is interesting and I haven't seen it described this way elsewhere.

This post walks through all four algorithms side by side.


Problem: Given a one-dimensional array of numbers, find the contiguous subarray with the largest sum.

Example: [-2, 1, -3, 4, -1, 2, 1, -5, 4] → answer is 6 (subarray [4, -1, 2, 1]).


Algorithm 1: Brute Force

Time ComplexityO(n²)
ParadigmTwo nested loops
StorageO(n)

The naive approach runs two loops. The outer loop picks a starting element; the inner loop accumulates sums from that start point forward, tracking the maximum seen so far. Simple to understand, slow on large inputs. No code needed here — the structure is exactly what you'd expect.


Algorithm 2: Divide and Conquer

Time ComplexityO(n log n)
ParadigmDivide and conquer
StorageO(n)

Split the array at the midpoint. The maximum subarray is either entirely in the left half, entirely in the right half, or it crosses the midpoint. Recurse on the halves; compute the crossing sum in linear time by scanning left from mid and right from mid+1.

# A Divide and Conquer based program 
# for maximum subarray sum problem 
# Find the maximum possible sum in 
# arr[] such that arr[m] is part of it 
def maxCrossingSum(arr, l, m, h):
    # Include elements on left of mid.
    sm = 0; left_sum = -10000
    for i in range(m, l-1, -1):
        sm = sm + arr[i]
        if (sm > left_sum):
            left_sum = sm
    # Include elements on right of mid
    sm = 0; right_sum = -1000
    for i in range(m + 1, h + 1):
        sm = sm + arr[i]
        if (sm > right_sum):
            right_sum = sm
    # Return sum of elements on left and right of mid
    # returning only left_sum + right_sum will fail for [-2, 1]
    return max(left_sum + right_sum, left_sum, right_sum)

def maxSubArraySum(arr, l, h):
    # Base Case: Only one element
    if (l == h):
        return arr[l]
    m = (l + h) // 2
    return max(maxSubArraySum(arr, l, m),
               maxSubArraySum(arr, m+1, h),
               maxCrossingSum(arr, l, m, h))

# Driver Code
arr = [2, 3, 4, 5, 7]
n = len(arr)
max_sum = maxSubArraySum(arr, 0, n-1)
print("Maximum contiguous sum is ", max_sum)

The recursion depth is O(log n), and each level does O(n) work — hence O(n log n) overall.


Algorithm 3: Kadane's Algorithm

Time ComplexityO(n)
ParadigmDynamic programming
StorageO(1)

The classic linear solution. Scan once, maintaining the maximum sum ending at the current position. If extending the current subarray makes it worse than starting fresh, reset.

Initialize:
    max_so_far = 0
    max_ending_here = 0

Loop for each element of the array:
    (a) max_ending_here = max_ending_here + a[i]
    (b) if max_ending_here < 0:
            max_ending_here = 0
    (c) if max_so_far < max_ending_here:
            max_so_far = max_ending_here

return max_so_far

Kadane's is the practical answer for most use cases. It's O(n) time, O(1) space, and the implementation fits in ten lines. The algorithm below is more interesting as a structure than a replacement.


Algorithm 4: Divide and Conquer with O(n) Time (proposed by Pedram Agand)

Time ComplexityO(n)
ParadigmDivide and conquer with lookup table
StorageO(n)

The idea: instead of recursing down and combining on the way up, build a table of 4 values for each pair of elements, then merge pairs iteratively. Each table entry captures enough information to combine two adjacent segments without re-scanning either.

The 4 table elements:

  • MLH — Maximum subarray sum starting from the left end of this segment (max left-headed sum)
  • MRH — Maximum subarray sum ending at the right end of this segment (max right-headed sum)
  • LM — Maximum subarray sum anywhere within this segment (local maximum)
  • SS — Total sum of this segment

Recurrence for merging two segments (left = DB1, right = DB2):

DB[MLH] = max(DB1[MLH], DB1[SS] + DB2[MLH])
DB[MRH] = max(DB1[MRH], DB2[MRH])   # note: this should be max(DB2[MRH], DB1[MRH] + DB2[SS]) — see code
DB[LM]  = max(DB2[LM],  DB1[LM] + DB2[SS])
DB[SS]  = DB1[SS] + DB2[SS]

The maximum subarray sum for the full array is max(DB) at the end of the merge pass.

Implementation:

#!/usr/bin/python3
# Written by Pedram Agand (pagand@sfu.ca)
import numpy as np

def max_subarray(in_list, LEN):
    if LEN == 1:
        if in_list[0] > 0:
            return in_list * np.ones(4)
        else:
            return np.array([0, 0, 0, np.asarray(in_list)])
    mid = int(LEN / 2)
    DB1 = max_subarray(in_list[0:mid], mid)
    DB2 = max_subarray(in_list[mid:], LEN - mid)
    DB = np.zeros(4)
    DB[0] = np.amax([DB1[0], DB1[3] + DB2[0]])
    DB[1] = np.amax([DB1[1], DB2[1]])
    DB[2] = np.amax([DB2[2], DB1[2] + DB2[3]])
    DB[3] = DB1[3] + DB2[3]
    return DB

def main():
    in_list = [float(x) for x in input("Enter numbers (separated by space):").split()]
    LEN = len(in_list)
    MaxSumArray = max_subarray(in_list, LEN)
    print("max sub array is:", np.max(MaxSumArray))

if __name__ == '__main__':
    main()

Each merge is O(1) — four comparisons, four additions. The number of merges is O(n) (you pair up n/2 elements, then n/4, etc.). Total work is O(n). Space is O(n) for the table.

The tradeoff versus Kadane's: more storage, and the numpy dependency. The advantage: the structure parallelizes naturally. Each segment's 4-tuple is independent of others at the same level, so you could distribute the merge pass across workers.


Full code and test cases: github.com/pagand/Algorithms

Want this implemented in your workflow?

I work with SaaS companies, real-estate, finance, and regulated-industry teams on AI adoption. Book a 20-minute strategy call — no pitch, just a focused conversation about your situation.

I publish one post like this per month. Join AI Command Room and I'll send it directly to you.