正在切换页面...

Game Theory DP: Nim, Prediction, and Stones

hardDynamic ProgrammingGame DPGame TheoryUpdated

Predict the Winner (LeetCode 486)

Two players pick numbers from either end of an array. Does the first player have a winning strategy (score $\ge$ opponent's score)?

State: dp[i][j] = The maximum advantage (First player score minus Second player score) a player can achieve relative to the other from the subarray nums[i..j].

public boolean predictTheWinner(int[] nums) {
    int n = nums.length;
    int[][] dp = new int[n][n];
    
    // Base case: only one element, first player takes it
    for (int i = 0; i < n; i++) dp[i][i] = nums[i];
    
    for (int len = 2; len <= n; len++) {
        for (int i = 0; i + len - 1 < n; i++) {
            int j = i + len - 1;
            // Option 1: Pick nums[i], subtract the advantage the opponent gets from [i+1, j]
            // Option 2: Pick nums[j], subtract the advantage the opponent gets from [i, j-1]
            dp[i][j] = Math.max(nums[i] - dp[i + 1][j], 
                                nums[j] - dp[i][j - 1]);
        }
    }
    // If the net advantage start-to-end is positive, player 1 wins
    return dp[0][n - 1] >= 0;
}

Stone Game (LeetCode 877)

Similar to the above, but with specific constraints: even length and total sum is odd.

Mathematical Insight: Since the sum is odd and length is even, Player 1 can always guarantee a win by choosing all even-indexed piles or all odd-indexed piles (whichever group is larger).

public boolean stoneGame(int[] piles) {
    // Pure logic: first player is guaranteed to win given constraints
    return true; 
}

// DP approach (identical Logic to Predict the Winner)
public boolean stoneGameDP(int[] piles) {
    int n = piles.length;
    int[][] dp = new int[n][n];
    for (int i = 0; i < n; i++) dp[i][i] = piles[i];
    for (int len = 2; len <= n; len++) {
        for (int i = 0; i + len - 1 < n; i++) {
            int j = i + len - 1;
            dp[i][j] = Math.max(piles[i] - dp[i + 1][j], 
                                piles[j] - dp[i][j - 1]);
        }
    }
    return dp[0][n - 1] > 0;
}

Stone Game VII (LeetCode 1690)

Score is equal to the sum of remaining stones after picking one from an end.

public int stoneGameVII(int[] stones) {
    int n = stones.length;
    int[] prefix = new int[n + 1];
    for (int i = 0; i < n; i++) prefix[i + 1] = prefix[i] + stones[i];
    
    int[][] dp = new int[n][n];
    for (int len = 2; len <= n; len++) {
        for (int i = 0; i + len - 1 < n; i++) {
            int j = i + len - 1;
            // Current player's potential scores after removing one end
            int scoreExLeft = prefix[j + 1] - prefix[i + 1];
            int scoreExRight = prefix[j] - prefix[i];
            
            // max difference = current score - (opponent's max difference on remaining range)
            dp[i][j] = Math.max(scoreExLeft - dp[i + 1][j], 
                                scoreExRight - dp[i][j - 1]);
        }
    }
    return dp[0][n - 1];
}

At a Glance Comparison Table

Problem	DP Meaning	Transition Core
Prediction	Net score advantage	`max(pick[end] - opponentAdvantage)`
Stone Game I	Net score advantage	Guaranteed win for P1 (mathematics)
Stone Game VII	Advantage where gain is `sum(remaining)`	Subtract relative advantage of remaining range
筋