Jump to content

Talk:Alpha–beta pruning

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Pseudocode for Fail-Hard/Fail-Soft Incorrect?

[edit]

According to the StackOverflow question "Alpha-beta pruning: Fail-hard VS. Fail-soft" and first answer the pseudocode here is incorrect inasmuch as both versions (allegedly) demonstrate fail-soft. This should be confirmed and the psuedocode corrected if necessary. JoshuaGreen1981 (talk) 00:55, 7 February 2022 (UTC)[reply]

It is not correct indeed. Both versions shown on wikipedia are the fail soft variant. 81.164.156.146 (talk) 20:55, 3 October 2023 (UTC)[reply]

Merge with minimax?

[edit]

Does this really deserve its own page? Alpha-beta pruning isn't a search algorithm at all, it's an optimization for the minimax algorithm. 193.130.71.68 (talk) 11:53, 25 May 2011 (UTC)[reply]

Query about AB pruning description

[edit]

In the description for this article, this appears: "If the move ordering for the search is optimal (meaning the best moves always searched first)"

As far as I can tell, the optimal search order for AB pruning is worst-first. The algorithm is looking to eliminate a subtree as early as possible, and the way to do this is to find a move in it that's worse than the current Alpha value; worst-first gives the best chance of doing this. Am I missing something? I'll update the description if I don't hear otherwise. --=== Jez === 11:48, 28 May 2007 (UTC)[reply]

No, it's best-first. Though it's best-first for the player whose move is under consideration *at that level of the tree*, which obviously alternates each move. The way it works is this: when trying to decide on a move for player A, consider the first move (ideally this is the best move, but not always -- if the move ordering was perfect there wouldn't be a need for the search). Arrive at a score for the first move. Then go the second move. The idea is to determine as quickly as possible that this move is worse for A than the first move. The quickest way to do this is to consider B's best response to that move. Now consider trying it in the reverse order. Suppose the first move tried is worse than the second. Then, when searching the second, we never encounter a move by B that proves it worse than the first move, so we have to consider all of B's possible responses instead of just one. HTH, let me know if it doesn't make sense. Evand 23:36, 28 May 2007 (UTC)[reply]
Ahh, I understand now. You're calling it B's best move, I was thinking of it as A's worst scenario, as it's usually A that's doing the search. At least my diagram is still correct. I think I'm right in thinking that AB pruning can only happen on a minimizer's move? --=== Jez === 10:15, 29 May 2007 (UTC)[reply]
When it extends more than two plies deep it gets a little harder to follow :) For example, when A is considering the first move, there are no established bounds, so the algorithm ends up the same as if B was searching for their best possible move from a position one ply deeper. (This is part of why it makes sense to consider the algorithm as symmetric, with the player doing the searching switching each ply.) The effect is actually the same it the alpha-beta window isn't tight enough to cause a cutoff on later moves A tries -- that is, if those moves are actually better than the first move, or if B hasn't yet found a good enough response to prove they aren't. BTW, for your diagram, it might make it clearer what's happening if some of the moves that got cut off were better than the alternatives that were searched. Evand 20:02, 29 May 2007 (UTC)[reply]

Incorrect

[edit]

The article is incorrect. Alpha beta pruning pertains to TWO types of pruning whereof the author only describes one.

I think all search strategies except minimax and alpha beta should be put in a single page - Arvindn

I respectfully disagree - search is an interesting problem, and there's more than enough info on each algorithm for a page each. Besides, what would the page be called? "All search algorithms (other than minimax and alpha beta)"? Surely A* at least deserves its own page? Sorry. GeorgeBills 15:48, 14 Jun 2005 (UTC)

I put ordering heuristics here because they derive their power only from their interaction with alpha-beta search -- they have no advantage otherwise. The Anome


The article claims: The optimisation typically reduces the effective branching factor by two compared to simple Minimax, or equivalently doubles the number of ply that can be searched in a given time. Since the number of nodes to be searched (i.e. time required) increases exponentially with each ply, I don't think this is correct. Here's an example, with math... take a tree which had an effective branching factor of 6 before applying alpha-beta pruning... Assuming I had time to search 4 plys deep previous to pruning, I would have had time for 6^1+6^2+6^3+6^4=1554 nodes. If alpha-beta reduces the branching factor to 3 now, the article's claim is that I should have time to search 8 plys deep. However, that would be 3^1+3^2+3^3+...+3^8=9840 nodes. Clearly, reducing the effective branching factor by two does not buy one the time to search twice as many plies deep. Have I made an error in this reasoning? Unless I hear otherwise, I'll update the article. --Ds13 22:31, 19 January 2006 (UTC)[reply]

Hmm. Using the formula for the sum of a geometric progression with m=1 and taking x and y as the two bases:
So what is x in terms of y, if ? What speedup can you recieve i.e. what is n in terms of m? r3m0t talk 07:39, 20 January 2006 (UTC)[reply]
Yes, you are correct that the current description is in error. Alpha-beta pruning reduces the number of searches required under the minimax algorithm by the square-root of what the total number would otherwise be. In effect, this allows exactly 1-ply deeper (using chess terminology) to be completed, without error or loss of important information, within the same time. --AceVentura
Actually, it allows you to search twice as many plies (assuming perfect ordering, of course) -- w ^ d = sqrt(w ^ (2 * d)). Evand 06:30, 22 January 2006 (UTC)[reply]
I think the implicit assumptions are that only end nodes are evaluated and that evaluation takes much longer than move generation. Stephen B Streater 14:15, 23 June 2006 (UTC)[reply]
With an optimal ordering of the moves the complexity can indeed be reduced to - but I do find it noteworthy that optimal ordering is impossible (otherwise this order could be used for decision making without the need for any minimaxing...). AFAIK Moore and Knuth first examined alpha-beta pruning in 1975, concluding an average asymptotic complexity of for small b, converging to for . Maebert 2:22, 7 December 2006 (UTC)

The pseudo-code has been changed back-and-forth several times (see the *** below), so perhaps it's better to add some discussion here rather than to add to the article itself.

 evaluate (node, alpha, beta)
   if node is a leaf
       return the heuristic value of node
   if node is a minimizing node
       for each child of node
           beta = min (beta, evaluate (child, alpha, beta))
           if beta <= alpha
               return *** alpha or beta? ***
       return beta
   if node is a maximizing node
       for each child of node
           alpha = max (alpha, evaluate (child, alpha, beta))
           if beta <= alpha
               return *** alpha or beta? ***
       return alpha

The pseudocode is correct, regardless of whether alpha or beta is returned at the cut-offs. In either case, the parent node knows (implicitly) that a cut-off occurred, and that the cut-off node is equal to or worse than the current best child node, and that therefore it must retain the current best child node. —Unknown

Wait... how does the parent "implicitly" know that a cut-off occured? It could be returning a cut-off, or it could be returning an actual known value. I agree that it does not matter whether alpha or beta are returned in the cut-off case, but I think it is important to either a) return an additional property which says "equal to or worse" (as opposed to just equal), OR b) make it clear that the parent must favour earlier results to later ones (that is, if one node returns "2" and another node also returns "2", this second return "implicitly" means "2 or worse", while the first one actually means "2", so it must favour the first one). —EatMyShortz 17:27, 21 November 2006 (UTC)[reply]
I agree, this pseudo-code seems to be correct. I've been able to implement it in an unbeatable tic-tac-toe player. However, I've not been able to verify the current condensed pseudo-code in the actual article. Ceran 01:38, 13 January 2007 (UTC)[reply]
The difference between the two examples only matters when you have modified to algorithm to return additional information (like a move index) along with the score. However, this is almost always unnecessary. A typical implementation will use a simple numeric alpha beta search, where the search function only returns numbers. The top level function will call this search on each possible move from the current state, remember the highest score it has seen and apply the associated move index. So, in the actual low level search function, it does not matter whether you return alpha or beta as either one will be ignored.

Pseudo code is broken

[edit]

The pseudo code is *WRONG* in the same way as negamax (which is identical, clearly something is wrong): the heurestic evaluation must be negated whenever player 2 is to play. Also, if the root call was for player 2, the final result must be negated again. —The preceding unsigned comment was added by 208.65.183.75 (talk) 07:39, 4 March 2007 (UTC).[reply]

Not true, although the description should be clearer. The heuristic is always from the perspective of the player executing the search, not the current player.

Could someone provide a working line-by-line real code implementation of the pseudocode? I have done so, and the current code on this page is wrong. JSB73 (talk) 22:43, 28 April 2009 (UTC)[reply]

The code in the Talk page appears to be an implementation of MiniMax with Alpha Beta pruning. The code in the actual Wiki page appears to be an implementation of NegaMax. The difference is that MiniMax alternates between taking minimums and maximums, while NegaMax negates Alpha and Beta and the returned result so as to keep using maximums. It's simpler but quite possibly harder to understand. -- wadetb at gmail dot com

I think the code is wrong indeed. Consider the following case: two nodes with values of 10 and 20 at depth of 1 (that is just after the current state). Running the code with depth of 1 (examing only one move forward), would return the state with value of 10, while it should be 20. In my opinion the evalution function has to be negated. Jqer (talk) 11:14, 9 January 2009 (UTC)[reply]

I implemented the pseudo code, and it does not work. My assumptions are as follow - the first time we call the routine, I assume the player making the first move is the "first" player. I assume the heuristic scores are such that a higher score is always better for the first player. I also assume the alpha and beta value parameters are non volatile (the are not modified on return.) Assuming the simplest case of two terminal nodes with scores of 1 and 2, I get the return value of -1 when I code the psuedo code as it's written. It's as if something is missing from the code, or there's specific behavior when return heuristic evaluations. When I looked up negamax, which looks almost identical, it passes in a parameter called 'color', which seems like it might fix this problem!—Preceding unsigned comment added by 15.203.233.76 (talk) 19:51, 29 March 2010 (UTC)[reply]

I added another variable to the function that should fix the psudo code. An ugly fix is much better than leaving it broken. 79.179.3.108 (talk) 19:53, 13 November 2010 (UTC)[reply]

Improvement to image

[edit]

The tree image at the top is quite helpful (especially compared with earlier versions), but would it be possible to show the values of Alpha and Beta somewhere? In the MiniMax page there is a description to go along with the image that describes the processing at each step. Something similar would help quite a lot on the Alpha-beta page.

I agree. There should be something like an algorithm walkthrough.
I edited it to include the values of alpha and beta at each step. It has too much visual clutter to be used tho. Plus I used vertical text. Xjcl (talk) 17:03, 1 April 2016 (UTC)[reply]

pseudocode - added from negamax

[edit]

it was quite beutyful and short ;) ---- ca1 84.16.123.194 (talk) 16:05, 3 April 2008 (UTC)[reply]

It certainly is short, it may be beautiful. But it most certainly is NOT explanatory. It makes no mention of variable scope, which the pseudocode leaves ambiguous. XcepticZP (talk) 13:41, 23 March 2009 (UTC)[reply]

Correct Pseudocode

[edit]

The link below for the correct pseudocode from alpha-beta. It was taken from page 218 of "Algorithms in a Nutshell", http://oreilly.com/catalog/9780596516246

Pseudocode - http://cardforge.googlecode.com/files/alpha-beta%20pseudocode.gif

Mtgrares (talk) 20:28, 19 November 2009 (UTC)[reply]

Luck?

[edit]

Could be make some mention of the use of alpha-beta pruning in games involving luck (such as dice)? Can alpha-beta pruning be used on such games? --Doradus (talk) 14:02, 18 January 2010 (UTC)[reply]

It can't usually be used much, because you have no control over outcome of the dice. It can be used for the players moves though, once you have calculated a probabilistic average score for a particular ply.- Wolfkeeper 17:15, 18 January 2010 (UTC)[reply]

heuristic value of node

[edit]

Hi,

The example code fragment says "return the heuristic value of node". It is somewhat ambiguous if the value must be calculated seen from the player-at-that-depth point of view or the the player at the root-level of the search tree. — Preceding unsigned comment added by Flok (talkcontribs) 12:37, 19 July 2013 (UTC)[reply]

Roughly  ?

[edit]

Since is a set, the function "average number of nodes evaluated" would be either be in the set or not in it, not "roughly" contained. The link to "Human Level AI Is Harder Than It Seemed in 1955" seems dead (at least for me, now), so I can't investigate the exact language McCarthy uses and am reluctant to make a call one way or the other. Can somebody with access to the original slides clarify this? 192.160.117.141 (talk) 22:24, 23 January 2014 (UTC)[reply]

I have tried to address this; see the new text. Csaba Szepesvari (talk) 00:19, 17 November 2017 (UTC)[reply]

Swapping and negating α and β

[edit]

A shorter but equivalent variant of the pseudocode in the article should be obtained if α and β are swapped, and α, β and the heuristic value are negated, between each depth, like this:

01 function alphabeta(node, depth, α, β, heuristicFactor)
02      if depth = 0 or node is a terminal node
03          return the heuristic value of node multiplied by heuristicFactor
04      for each child of node
05          α := max(α, -alphabeta(child, depth - 1, -β, -α, -heuristicFactor))
06          if β ≤ α
07              break (* β cut-off *)
08      return α
(* Initial call *)
alphabeta(origin, depth, -, +, 1)

Swapping and negating α and β effectively negates the interval [α, β], and each depth can focus only on maximizing the lower bound of the interval, instead of having to either maximize the lower bound or minimize the upper bound depending on who's turn it is. Hence saving code. But then the heuristic values also has to be negated between each depth; hence the heuristic factor and the minus sign in the max function.

Could anyone confirm that this code actually works? I haven't had a chance to test it. In that case we could include it as a variant in the article, as this is just about half the number of lines of code of the pseudocode that is currently in the article. —Kri (talk) 13:29, 6 November 2014 (UTC)[reply]

Hm, I see this is covered already in Negamax#NegaMax with Alpha Beta Pruning. The algorithms seem to be equivalent, except from that the algorithm covered there returns the best value found in the current function call, whereas the version I wrote here returns the maximum of that value and the α value it was given as input. Which is to prefer, I don't know; the pseudocode currently given in this article does the same thing as the code I wrote here.
Maybe we could create a subsection of the pseudocode section where a NegaMax variant is included? —Kri (talk) 13:43, 6 November 2014 (UTC)[reply]
By the way, has anyone checked the code in the article to see that it works, or whether it matters which of the variables α and bestValue it is that is returned if they aren't equal (i.e. α > bestValue)? According to Artificial Intelligence: A Modern Approach (Third Edition) it is bestValue that should be returned, not α or β as is the case in this article. —Kri (talk) 14:40, 6 November 2014 (UTC)[reply]
It doesn't matter with a pure alpha-beta implementation whether an out of bounds bestValue (fail soft alpha-beta) or the appropriate α or β bound (fail hard alpha-beta) is returned. The algorithm discards these return values and such values will not affect the alpha-beta result for the root node.
The article's code has since been revised to show bestValue (v in the article's code) return. I believe this is the better alternative, since the extent that bestValue exceeds an α and β bound is useful with alpha-beta optimization methods. GamePlayerAI (talk) 17:24, 6 March 2015 (UTC)[reply]
[edit]

Hello fellow Wikipedians,

I have just added archive links to one external link on Alpha–beta pruning. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

checkY An editor has reviewed this edit and fixed any errors that were found.

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—cyberbot IITalk to my owner:Online 23:59, 27 February 2016 (UTC)[reply]

[edit]

Hello fellow Wikipedians,

I have just added archive links to 3 external links on Alpha–beta pruning. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

checkY An editor has reviewed this edit and fixed any errors that were found.

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—cyberbot IITalk to my owner:Online 18:53, 22 March 2016 (UTC)[reply]

Citation 2 is incorrect.

[edit]

It leads to a page that doesn't provide any of the information indicated. — Preceding unsigned comment added by 67.168.176.62 (talk) 08:13, 24 September 2016 (UTC)[reply]

The terms "fail-soft" and "fail-hard" are used without much explanation

[edit]

The pseudo-code is casually described as being "fail-soft," before the definition of "fail-soft" is actually given:

   The pseudo-code for the fail-soft variation of alpha-beta pruning is as follows:

"Fail-soft" is defined afterwards, but initially, the pseudo-code section is written as if the reader should already knows what the term "fail-soft" means. I think it would make sense to open the pseudo-code section with something like this:

Implementations of alpha-beta pruning can often be delineated by whether they are "fail-soft," or "fail-hard." A fail-soft alpha-beta function may return values (v) that exceed (v < α or v > β) the α and β bounds set by its function call arguments. In comparison, fail-hard alpha-beta limits its function return value into the inclusive range of α and β.

Some additional explanation of the implications of "fail-soft" and "fail-hard" would also be nice.

Unsatisfactory handling of heuristic suboptimality

[edit]

The article states that the algorithm is optimal and selects the same move as the standard minimax algorithm, but this depends on some unspoken assumptions. If the values of the leaf nodes are known, it will select the optimal move. But the usual context is a limited-depth search, where some (or most) of the leaf nodes have uncertain value. Alpha-beta will select the same move as the standard minimax algorithm if the search depth is the same, but if (as in usual applications) a bigger depth is allowed within the same time because of the pruning, the deeper search might reveal better insights that makes it choose another move. — Preceding unsigned comment added by 37.44.138.159 (talk) 13:03, 12 March 2018 (UTC)[reply]

Negamax variation section

[edit]

The section needs considerable rework, replace with reference to Negamax#Negamax_with_alpha_beta_pruning, or remove. Among the issues:

Missing commentary about alpha / beta.
Missing references.
Negamax description unclear and of questionable accuracy.
Negamax with alpha/beta pruning code inaccurate.

The pseudo code for the section, in this article's context, should resemble the following:

function alphabeta(node, depth, α, β, color) is
    if depth = 0 or node is a terminal node then
        return color × the heuristic value of node
    value := −∞
    for each child of node do
        value := max(value, −alphabeta(child, depth − 1, −β, −α, −color))
        α := max(α, value)
        if α ≥ β then
            break (* cut-off *)
    return value

GamePlayerAI (talk) 17:14, 30 May 2020 (UTC)[reply]

(* Initial call *)
alphabeta(origin, depth, −, +, 1)

GamePlayerAI (talk) 15:14, 1 June 2020 (UTC)[reply]

Rather than leave inaccurate information in the main article, I'm removing the disputed section. I'll leave the revisions here for the section's disputed code, as above. The subject (Alpha-beta pruning in Negamax) is also covered in Negamax#Negamax_with_alpha_beta_pruning.

I'll also note Alpha Beta Pruning is best understood with the Minimax algorithm. Although Alpha Beta Pruning is also applicable with Negamax, it's far less intuitive in describing and understanding it with Negamax.

GamePlayerAI (talk) 18:30, 21 June 2020 (UTC)[reply]

Minimax versus Alphabeta tree diagram

[edit]

Hi, I've created an image which shows the difference between a minimax and alpha-beta pruned search using trees diagrams. Could this be added to the article?

https://i.imgur.com/9AXbblz.png

Sgriffin53 (talk) 13:26, 2 July 2021 (UTC)[reply]

Pseudocode for fail-hard vs fail-soft alpha-beta

[edit]

It seems that the pseudocode, illustrating the fail-hard version of alpha-beta pruning, is wrong and is actually equivalent to the fail-soft version (strict inequality aside). The reference to the Russell & Norvig's book seems irrelevant too, because the book doesn't mension fail-hard vs fail-soft variations and implements the fail-soft version.

It is written that "The main difference between fail-soft and fail-hard implementations is whether α and β are updated before or after the cutoff check. If they are updated before the check, then they can exceed initial bounds and the algorithm is fail-soft.", but it does not make a difference whether we update α/β if there is a cutoff -- we don't use it anymore in both scenarios.

There is even a discussion about this at SO: https://stackoverflow.com/questions/70050677/alpha-beta-pruning-fail-hard-vs-fail-soft

As far as I understand, the fail-hard version must return the bound (β for a MAX node and α for a MIN node respectively) in the case of a cutoff (see e.g. https://www.chessprogramming.org/Alpha-Beta#Fail_hard), and the correct version of the fail-hard pseudocode should be

function alphabeta(node, depth, α, β, maximizingPlayer) is
    if depth == 0 or node is terminal then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
            if value ≥ β then
                return β (* hard β cutoff *)
            α := max(α, value)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
            if value ≤ α then
                return α (* hard α cutoff *)
            β := min(β, value)
        return value

Alexander Chekmenev (talk) 16:59, 7 August 2024 (UTC)[reply]