# Artificial Intelligence Search Algorithms

Gửi bởi:

**Khoa CNTT - HCEM**vào ngày 2021-04-01 02:18:16 || Kiểu file:

**36**| Lượt Download:

**0**| File size: 1.070364 Mb

## Nội dung tài liệu Xem trước tài liệu

Link tài liệu:

## Các tài liệu liên quan

## Thông tin tài liệu

Artiﬁcial Intelligence Search

Algorithms

22.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1

22.2 Problem Space Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-3

22.3 Brute-Force Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-3

Breadth-First Search • Uniform-Cost Search • Depth-First

Search • Depth-First Iterative-Deepening • Bidirectional

Search • Frontier Search • Disk-Based Search Algorithms •

Combinatorial Explosion

22.4 Heuristic Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7

Heuristic Evaluation Functions • Pattern Database

Heuristics • Pure Heuristic Search • A∗ Algorithm •

Iterative-Deepening-A∗ • Depth-First Branch-and-Bound •

Complexity of Finding Optimal Solutions • Heuristic Path

Algorithm • Recursive Best-First Search

22.5 Interleaving Search and Execution . . . . . . . . . . . . . . . . . . . . . . . . . 22-11

Minimin Search • Real-Time-A∗ • Learning-Real-Time-A∗

22.6 Game Playing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-13

Minimax Search • Alpha-Beta Pruning • Other Techniques •

Signiﬁcant Milestones • Multiplayer Games, Imperfect and

Hidden Information

22.7 Constraint-Satisfaction Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-16

Chronological Backtracking • Intelligent Backtracking •

Constraint Recording • Local Search Algorithms

22.8 Research Issues and Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-19

Research Issues • Summary

Richard E. Korf

University of California, Los Angeles

22.1

22.9 Further Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Deﬁning Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

22-20

22-20

22-21

22-21

Introduction

Search is a universal problem-solving mechanism in artiﬁcial intelligence (AI). In AI problems, the

sequence of steps required for the solution of a problem is not known a priori, but often must be

determined by a trial-and-error exploration of alternatives. The problems that have been addressed

by AI search algorithms fall into three general classes: single-agent pathﬁnding problems, game

playing, and constraint-satisfaction problems.

Classic examples in the AI literature of single-agent pathﬁnding problems are the sliding-tile

puzzles, including the 3 × 3 eight puzzle (see Figure 22.1) and its larger relatives the 4 × 4 ﬁfteen

22-1

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 1 2009-10-6

22-2

Special Topics and Techniques

1 2 3

8

4

7 6 5

1

3

8 2 4

7 6 5

1 2 3

8 4

7 6 5

1 2 3

8 6 4

7

5

1 2 3

8 4

7 6 5

1 3

8 2 4

7 6 5

1 3

8 2 4

7 6 5

1 2

8 4 3

7 6 5

1 2 3

8 4 5

7 6

1 2 3

8 6 4

7 5

1 2 3

8 6 4

7 5

2 3

1 8 4

7 6 5

1 2 3

7 8 4

6 5

8 1 3

2 4

7 6 5

1 3 4

8 2

7 6 5

1

2

8 4 3

7 6 5

1 2 3

8 4 5

7

6

1 2 3

6 4

8 7 5

1 2 3

8 6

7 5 4

2

3

1 8 4

7 6 5

1 2 3

7 8 4

6

5

FIGURE 22.1 Eight puzzle search tree fragment.

puzzle and 5 × 5 twenty-four puzzle. The eight puzzle consists of a 3 × 3 square frame containing

eight numbered square tiles and an empty position called the blank. The legal operators are to

slide any tile that is horizontally or vertically adjacent to the blank into the blank position. The

problem is to rearrange the tiles from some random initial conﬁguration into a particular desired

goal conﬁguration. The sliding-tile puzzles are common testbeds for research in AI search algorithms

because they are very simple to represent and manipulate, yet ﬁnding optimal solutions to the N × N

generalization of the sliding-tile puzzles is NP-complete [38]. Another well-known example of a

single-agent pathﬁnding problem is Rubik’s cube. Real-world examples include theorem proving,

the traveling salesman problem, and vehicle navigation. In each case, the task is to ﬁnd a sequence

of operations that maps an initial state to a goal state.

A second class of search problems include games, such as chess, checkers, Othello, backgammon,

bridge, poker, and Go. The third category is constraint-satisfaction problems, such as the N-queens

problem or Sudoku. The task in the N-queens problem is to place N queens on an N × N chessboard,

such that no two queens are on the same row, column, or diagonal. The Sudoku task is to ﬁll each

empty cell in a 9 × 9 matrix with a digit from zero through nine, such that each row, column, and

nine 3 × 3 submatrices contain all the digits zero through nine. Real-world examples of constraintsatisfaction problems are ubiquitous, including boolean satisﬁability, planning, and scheduling

applications.

We begin by describing the problem-space model on which search algorithms are based. Bruteforce searches are then considered including breadth-ﬁrst, uniform-cost, depth-ﬁrst, depth-ﬁrst

iterative-deepening, bidirectional, frontier, and disk-based search algorithms. Next, we introduce

heuristic evaluation functions, including pattern databases, and heuristic search algorithms including pure heuristic search, the A∗ algorithm, iterative-deepening-A∗ , depth-ﬁrst branch-and-bound,

the heuristic path algorithm, and recursive best-ﬁrst search. We then consider single-agent algorithms that interleave search and execution, including minimin lookahead search, real-time-A∗ ,

and learning-real-time-A∗ . Next, we consider game playing, including minimax search, alpha-beta

pruning, quiescence, iterative deepening, transposition tables, special-purpose hardware, multiplayer

games, and imperfect and hidden information. We then examine constraint-satisfaction algorithms,

such as chronological backtracking, intelligent backtracking techniques, constraint recording, and

local search algorithms. Finally, we consider open research problems in this area. The performance

of these algorithms, in terms of the costs of the solutions they generate, the amount of time the

algorithms take to execute, and the amount of computer memory they require are of central concern

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 2 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-3

throughout. Since search is a universal problem-solving method, what limits its applicability is the

eﬃciency with which it can be performed.

22.2

Problem Space Model

A problem space is the environment in which a search takes place [32]. A problem space consists of

a set of states of the problem, and a set of operators that change the state. For example, in the eight

puzzle, the states are the diﬀerent possible permutations of the tiles, and the operators slide a tile

into the blank position. A problem instance is a problem space together with an initial state and a

goal state. In the case of the eight puzzle, the initial state would be whatever initial permutation the

puzzle starts out in, and the goal state is a particular desired permutation. The problem-solving task

is to ﬁnd a sequence of operators that map the initial state to a goal state. In the eight puzzle the goal

state is given explicitly. In other problems, such as the N-Queens Problem, the goal state is not given

explicitly, but rather implicitly speciﬁed by certain properties that must be satisﬁed by any goal state.

A problem-space graph is often used to represent a problem space. The states of the space

are represented by nodes of the graph, and the operators by edges between nodes. Edges may be

undirected or directed, depending on whether their corresponding operators are reversible or not.

The task in a single-agent path-ﬁnding problem is to ﬁnd a path in the graph from the initial node

to a goal node. Figure 22.1 shows a small part of the eight puzzle problem-space graph.

Although most problem spaces correspond to graphs with more than one path between a pair of

nodes, for simplicity they are often represented as trees, where the initial state is the root of the tree.

The cost of this simpliﬁcation is that any state that can be reached by two diﬀerent paths will be

represented by duplicate nodes in the tree, increasing the size of the tree. The beneﬁt of a tree is that

the absence of multiple paths to the same state greatly simpliﬁes many search algorithms.

One feature that distinguishes AI search algorithms from other graph-searching algorithms is the

size of the graphs involved. For example, the entire chess graph is estimated to contain over 1040

nodes. Even a simple problem like the Twenty-four puzzle contains almost 1025 nodes. As a result,

the problem-space graphs of AI problems are never represented explicitly by listing each state, but

rather are implicitly represented by specifying an initial state and a set of operators to generate new

states from existing states. Furthermore, the size of an AI problem is rarely expressed as the number

of nodes in its problem-space graph. Rather, the two parameters of a search tree that determine the

eﬃciency of various search algorithms are its branching factor and its solution depth. The branching

factor is the average√

number of children of a given node. For example, in the eight puzzle the average

branching factor is 3, or about 1.732. The solution depth of a problem instance is the length of a

shortest path from the initial state to a goal state, or the length of a shortest sequence of operators

that solves the problem. For example, if the goal were in the bottom row of Figure 22.1, the depth of

the problem instance represented by the initial state at the root would be three moves.

22.3

Brute-Force Search

The most general search algorithms are brute-force searches, since they do not require any domainspeciﬁc knowledge. All that is required for a brute-force search is a state description, a set of

legal operators, an initial state, and a description of the goal state. The most important brute-force

techniques are breadth-ﬁrst, uniform-cost, depth-ﬁrst, depth-ﬁrst iterative-deepening, bidirectional,

and frontier search. In the descriptions of the algorithms in the following text, to generate a node

means to create the data structure corresponding to that node, whereas to expand a node means to

generate all the children of that node.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 3 2009-10-6

22-4

Special Topics and Techniques

0

1

2

3

7

4

8

9

5

10

11

6

12

13

14

FIGURE 22.2 Order of node generation for breadth-ﬁrst search.

22.3.1 Breadth-First Search

Breadth-ﬁrst search expands nodes in order of their depth from the root, generating one level of the

tree at a time until a solution is found (see Figure 22.2). It is most easily implemented by maintaining

a ﬁrst-in ﬁrst-out queue of nodes, initially containing just the root, and always removing the node

at the head of the queue, expanding it, and adding its children to the tail of the queue.

Since it never generates a node in the tree until all the nodes at shallower levels have been generated,

breadth-ﬁrst search always ﬁnds a shortest path to a goal. Since each node can be generated in

constant time, the amount of time used by breadth-ﬁrst search is proportional to the number of

nodes generated, which is a function of the branching factor b and the solution depth d. Since the

number of nodes in a uniform tree at level d is bd , the total number of nodes generated in the worst

case is b + b2 + b3 + · · · + bd , which is O(bd ), the asymptotic time complexity of breadth-ﬁrst search.

The main drawback of breadth-ﬁrst search is its memory requirement. Since each level of the tree

must be stored in order to generate the next level, and the amount of memory is proportional to

the number of nodes stored, the space complexity of breadth-ﬁrst search is also O(bd ). As a result,

breadth-ﬁrst search is severely space-bound in practice, and will exhaust the memory available on

typical computers in a matter of minutes.

22.3.2 Uniform-Cost Search

If all edges do not have the same cost, then breadth-ﬁrst search generalizes to uniform-cost search.

Instead of expanding nodes in order of their depth from the root, uniform-cost search expands nodes

in order of their cost from the root. At each step, the next node n to be expanded is one whose cost

g(n) is lowest, where g(n) is the sum of the edge costs from the root to node n. The nodes are stored

in a priority queue. This algorithm is similar to Dijkstra’s single-source shortest-path algorithm [7].

The main diﬀerence is that uniform-cost search runs until a goal node is chosen for expansion, while

Dijkstra’s algorithm runs until every node in a ﬁnite graph is chosen for expansion.

Whenever a node is chosen for expansion by uniform-cost search, a lowest-cost path to that node

has been found. The worst-case time complexity of uniform-cost search is O(bc/e ), where c is the

cost of an optimal solution, and e is the minimum edge cost. Unfortunately, it also suﬀers the same

memory limitation as breadth-ﬁrst search.

22.3.3 Depth-First Search

Depth-ﬁrst search removes the space limitation of breadth-ﬁrst and uniform-cost search by always

generating next a child of the deepest unexpanded node (see Figure 22.3). Both algorithms can be

implemented using a list of unexpanded nodes, with the diﬀerence that breadth-ﬁrst search manages

the list as a ﬁrst-in ﬁrst-out queue, whereas depth-ﬁrst search treats the list as a last-in ﬁrst-out stack.

More commonly, depth-ﬁrst search is implemented recursively, with the recursion stack taking the

place of an explicit node stack.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 4 2009-10-6

22-5

Artiﬁcial Intelligence Search Algorithms

0

1

8

2

3

5

4

6

12

9

7

10

11

13

14

FIGURE 22.3 Order of node generation for depth-ﬁrst search.

The advantage of depth-ﬁrst search is that its space requirement is only linear in the maximum

search depth, as opposed to exponential for breadth-ﬁrst search. The reason is that the algorithm only

needs to store a stack of nodes on the path from the root to the current node. The time complexity

of a depth-ﬁrst search to depth d is O(bd ), since it generates the same set of nodes as breadth-ﬁrst

search, but simply in a diﬀerent order. Thus, as a practical matter, depth-ﬁrst search is time-limited

rather than space-limited.

The primary disadvantage of depth-ﬁrst search is that it may not terminate on an inﬁnite tree,

but simply go down the left-most path forever. For example, even though there are a ﬁnite number

of states of the eight puzzle, the tree fragment shown in Figure 22.1 can be inﬁnitely extended down

any path, generating an inﬁnite number of duplicate nodes representing the same states. The usual

solution to this problem is to impose a cutoﬀ depth on the search. Although the ideal cutoﬀ is the

solution depth d, this value is rarely known in advance of actually solving the problem. If the chosen

cutoﬀ depth is less than d, the algorithm will fail to ﬁnd a solution, whereas if the cutoﬀ depth is

greater than d, a large price is paid in execution time, and the ﬁrst solution found may not be an

optimal one.

22.3.4 Depth-First Iterative-Deepening

Depth-ﬁrst iterative-deepening (DFID) combines the best features of breadth-ﬁrst and depth-ﬁrst

search [18,46]. DFID ﬁrst performs a depth-ﬁrst search to depth one, then starts over, executing a

complete depth-ﬁrst search to depth two, and continues to run depth-ﬁrst searches to successively

greater depths, until a solution is found (see Figure 22.4).

Since it never generates a node until all shallower nodes have been generated, the ﬁrst solution

found by DFID is guaranteed to be via a shortest path. Furthermore, since at any given point it is

executing a depth-ﬁrst search, saving only a stack of nodes, and the algorithm terminates when it

ﬁnds a solution at depth d, the space complexity of DFID is only O(d).

Although DFID spends extra time in the iterations before the one that ﬁnds a solution, this extra

work is usually insigniﬁcant. To see this, note that the number of nodes at depth d is bd , and each of

these nodes are generated once, during the ﬁnal iteration. The number of nodes at depth d−1 is bd−1 ,

0

1, 3, 9

4, 10

11

12

2, 6, 16

5, 13

14

15

7, 17

18

19

8, 20

21

22

FIGURE 22.4 Order of node generation for depth-ﬁrst iterative-deepening search.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 5 2009-10-6

22-6

Special Topics and Techniques

and each of these are generated twice, once during the ﬁnal iteration, and once during the penultimate

iteration. In general, the number of nodes generated by DFID is bd + 2bd−1 + 3bd−2 + · · · + db.

This is asymptotically O(bd ) if b is greater than one, since for large values of d the lower order terms

are insigniﬁcant. In other words, most of the work goes into the ﬁnal iteration, and the cost of the

previous iterations is relatively small. The ratio of the number of nodes generated by DFID to those

generated by breadth-ﬁrst search on a tree is approximately b/(b−1). In fact, DFID is asymptotically

optimal in terms of time and space among all brute-force shortest-path algorithms on a tree [18].

If the edge costs diﬀer from one another, then one can run an iterative deepening version of

uniform-cost search, where the depth cutoﬀ is replaced by a cutoﬀ on the g(n) cost of a node. At

the end of each iteration, the threshold for the next iteration is set to the minimum cost of all nodes

generated but not expanded on the previous iteration.

On a graph with multiple paths to the same node, however, breadth-ﬁrst search may be much more

eﬃcient than depth-ﬁrst or depth-ﬁrst iterative-deepening search. The reason is that a breadth-ﬁrst

search can easily detect all duplicate nodes, whereas a depth-ﬁrst search can only check for duplicates

along the current search path. Thus, the complexity of breadth-ﬁrst search grows only as the number

of states at a given depth, while the complexity of depth-ﬁrst search depends on the number of paths

of a given length. For example, in a square grid, the number of nodes within a radius r of the origin

is O(r2 ), whereas the number of paths of length r is O(3r ), since there are three children of every

node, not counting its parent. Thus, in a graph with a large number of very short cycles, breadth-ﬁrst

search is preferable to depth-ﬁrst search, if suﬃcient memory is available. For two approaches to the

problem of pruning duplicate nodes in depth-ﬁrst search, see [8,47].

22.3.5 Bidirectional Search

Bidirectional search is a brute-force algorithm that requires an explicit goal state instead of simply

a test for a goal condition [36]. The main idea is to simultaneously search forward from the initial

state, and backward from the goal state, until the two search frontiers meet. The path from the initial

state is then concatenated with the inverse of the path from the goal state to form the complete

solution path.

Bidirectional search still guarantees optimal solutions. Assuming that the comparisons for identifying a common state between the two frontiers can be done in constant time per node, by hashing

for example, the time complexity of bidirectional search is O(bd/2 ) since each search need only

proceed to half the solution depth. Since at least one of the searches must be breadth-ﬁrst in order

to ﬁnd a common state, the space complexity of bidirectional search is also O(bd/2 ). As a result,

bidirectional search is space bound in practice.

22.3.6 Frontier Search

Best-ﬁrst searches, such as breadth-ﬁrst, uniform-cost, and bidirectional search, store both a closed

list of expanded nodes, and an open list of nodes that have been generated but not yet expanded.

Another approach to the memory limitation of these algorithms, called frontier search, is to store

only the open list, and delete nodes once they are expanded [25]. To keep from regenerating the

parent of a node, frontier search stores with each node a bit for each operator that would generate

an expanded node. This reduces the space complexity of a breadth-ﬁrst search, for example, from

the size of the problem space to the width of the problem space, or the maximum number of nodes

at any given depth. For example, the size of a two dimensional grip graph is quadratic in the radius,

while the width of the graph is only linear. Alternatively, the immediate parents of those nodes on

the open list can be stored [49].

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 6 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-7

The main drawback of frontier search is that once the search completes, the solution path is not

available, since the expanded nodes have been deleted. One way to reconstruct the solution path is

to perform a bidirectional frontier search, from both the initial state and the goal state. Once the

two search frontiers meet, then we have a middle node approximately half-way along an optimal

solution path. Then, we can use divide and conquer frontier search to recursively generate a path

from the initial state to the middle state, and from the middle state to the goal state. Frontier search

can also be used to reduce the memory of heuristic best-ﬁrst searches described in the following text.

22.3.7 Disk-Based Search Algorithms

Another approach to the memory limitation of graph-search algorithms is to store search nodes on

magnetic disk instead of main memory. Currently, two terabyte disks can be bought for about $300,

which is almost three orders of magnitude cheaper than semiconductor memory. The drawback of

disk storage, however, is that the latency to access a single byte can be as much as ten milliseconds,

which is about ﬁve orders of magnitude slower than semiconductor memory. The main reason to

store generated nodes is to detect duplicate nodes, which is usually accomplished by a hash table in

memory. A hash table cannot be directly implemented on magnetic disk, however, because of the

long latencies for random access.

An alternative technique is called delayed duplicate detection. Rather than checking for duplicate

nodes every time a node is generated, duplicate detection is delayed, for example, until the end

of a breadth-ﬁrst search iteration. At that point the queue of nodes on disk can be sorted by

their state representation, using only sequential access algorithms, bringing duplicate nodes to

adjacent locations. Then, a single sequential pass over the sorted queue can merge duplicate nodes

in preparation for the next iteration of the search. This technique can also be combined with frontier

search to further reduce the storage needed. See [27] for more details.

Another disk-based search algorithm is called structured duplicate detection [53].

22.3.8 Combinatorial Explosion

The problem with all brute-force search algorithms is that their time complexities often grow

exponentially with problem size. This is called combinatorial explosion, and as a result, the size of

problems that can be solved with these techniques is quite limited. For example, while the eight

puzzle, with about 105 states, is easily solved by brute-force search, and the ﬁfteen puzzle, with over

1013 states, can be searched exhaustively using brute-force frontier search, storing nodes on disk

[26], the 5 × 5 twenty-four puzzle, with 1025 states, is completely beyond the reach of brute-force

search.

22.4

Heuristic Search

In order to solve larger problems, domain-speciﬁc knowledge must be added to improve search

eﬃciency. In AI, heuristic search has a general meaning, and a more specialized technical meaning.

In a general sense, the term heuristic is used for any advice that is often eﬀective, but is not guaranteed

to work in every case. Within the heuristic search literature, however, the term heuristic usually refers

to the special case of a heuristic evaluation function.

22.4.1 Heuristic Evaluation Functions

In a single-agent path-ﬁnding problem, a heuristic evaluation function estimates the cost of an

optimal path between a pair of states. For a ﬁxed goal state, a heuristic evaluation h(n) is a function

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 7 2009-10-6

22-8

Special Topics and Techniques

of a node n, which estimates the distance from node n to the goal state. For example, the Euclidean

or airline distance is an estimate of the highway distance between a pair of locations. A common

heuristic function for the sliding-tile puzzles is called the Manhattan distance. It is computed by

counting the number of moves along the grid that each tile is displaced from its goal position, and

summing these values over all tiles.

The key properties of a heuristic evaluation function are that it estimate actual cost, and that

it is eﬃciently computable. For example, the Euclidean distance between a pair of points can be

computed in constant time. The Manhattan distance between a pair of states can be computed in

time proportional to the number of tiles.

Most heuristic functions are derived by generating a simpliﬁed version of the problem to be

solved, then using the cost of an optimal solution to the simpliﬁed problem as a heuristic evaluation

function for the original problem. For example, if we remove the constraint that we must drive

on the roads, the cost of an optimal solution to the resulting helicopter navigation problem is the

Euclidean distance. In the sliding-tile puzzles, if we remove the constraint that a tile can only be

slid into the blank position, then any tile can be moved to any adjacent position at any time. The

optimal number of moves required to solve this simpliﬁed version of the problem is the Manhattan

distance.

Since they are derived from relaxations of the original problem, such heuristics are also lower

bounds on the costs of optimal solutions to the original problem, a property referred to as admissibility. For example, the Euclidean distance is a lower bound on road distance between two points,

since the shortest path between a pair of points is a straight line. Similarly, the Manhattan distance is

a lower bound on the actual number of moves necessary to solve an instance of a sliding-tile puzzle,

since every tile must move at least as many times as its Manhattan distance to its goal position, and

each move moves only one tile.

22.4.2 Pattern Database Heuristics

Pattern databases are heuristics that are precomputed and stored in lookup tables for use during

search [1]. For example, consider a table with a separate entry for each possible conﬁguration of a

subset of the tiles in a sliding tile puzzle, called the pattern tiles. Each entry stores the minimum

number of moves required to move the pattern tiles from the corresponding conﬁguration to their

goal conﬁguration. Such a value is a lower bound on the number of moves needed to reach the goal

from any state where the pattern tiles are in the given conﬁguration, and can be used as an admissible

heuristic during search.

To construct such a table, we perform a breadth-ﬁrst search backward from the goal state, where a

state is deﬁned by the conﬁguration of the pattern tiles. As each such conﬁguration is reached for the

ﬁrst time, we store in the pattern database the number of moves needed to reach that conﬁguration.

This table is computed once before the search, and can be used for multiple searches that share the

same goal state. The number of tiles in a pattern is limited by the amount of memory available to

store an entry for each possible conﬁguration of pattern tiles.

If when computing a pattern database we only count moves of the pattern tiles, then we can add

the values from diﬀerent pattern databases based on disjoint sets of tiles, without overestimating

actual cost [24]. Thus, we can partition all the tiles in a problem into disjoint sets of pattern tiles,

and sum the values from the diﬀerent databases to get a more accurate but still admissible heuristic

function.

A number of algorithms make use of heuristic evaluations, including pure heuristic search, the

A∗ algorithm, iterative-deepening-A∗ , depth-ﬁrst branch-and-bound, the heuristic path algorithm,

recursive best-ﬁrst search, and real-time-A∗ . In addition, heuristic evaluations can be employed in

bidirectional search, and are used in two-player games as well.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 8 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-9

22.4.3 Pure Heuristic Search

The simplest of these algorithms, pure heuristic search, expands nodes in order of their heuristic

values h(n) [9]. As a best-ﬁrst search, it maintains a closed list of those nodes that have been expanded,

and an open list of those nodes that have been generated but not yet expanded. The algorithm begins

with just the initial node on the open list. At each step, a node on the open list with the minimum

h(n) value is expanded, generating all of its children, and is placed on the closed list. The heuristic

function is applied to each of the children, and they are placed on the open list, sorted by their

heuristic values. The algorithm continues until a goal node is chosen for expansion.

In a graph with cycles, multiple paths will be found to the same node, and the ﬁrst path found

may not be the shortest. When a shorter path is found to an open node, the shorter path is saved

and the longer one discarded. When a shorter path to a closed node is found, the node is moved to

open, and the shorter path is associated with it. The main drawback of pure heuristic search is that

since it ignores the cost of the path so far to node n, it does not ﬁnd optimal solutions.

22.4.4 A∗ Algorithm

The A∗ algorithm [14] combines features of uniform-cost search and pure heuristic search to

eﬃciently compute optimal solutions. A∗ is a best-ﬁrst search in which the cost associated with a

node is f (n) = g(n) + h(n), where g(n) is the cost of the path from the initial state to node n, and

h(n) is the heuristic estimate of the cost of a path from node n to a goal. At each point a node with

lowest f value is chosen for expansion. Ties among nodes of equal f value are broken in favor of

nodes with lower h values. The algorithm terminates when a goal node is chosen for expansion.

A∗ ﬁnds an optimal path to a goal if the heuristic function h(n) is admissible, meaning it never

overestimates actual cost. For example, since airline distance never overestimates actual highway

distance, and Manhattan distance never overestimates actual moves in the sliding-tile puzzles, A∗

using these evaluation functions will ﬁnd optimal solutions to these problems. In addition, A∗ makes

the most eﬃcient use of a given heuristic function in the following sense: among all shortest-path

algorithms using a given heuristic function h(n), A∗ expands the fewest number of nodes [5].

The main drawback of A∗ , and indeed of any best-ﬁrst search, is its memory requirement. Since

the open and closed lists are stored in memory, A∗ is severely space-limited in practice, and will

exhaust the available memory on current machines in minutes. For example, while it can be run

successfully on the eight puzzle, it cannot solve most random instances of the ﬁfteen puzzle before

exhausting memory. Frontier-A∗ only stores the open list, which ameliorates the memory limitation,

but does not eliminate it. An additional drawback of A∗ is that managing the open and closed lists

is complex, and takes time.

22.4.5 Iterative-Deepening-A∗

Just as depth-ﬁrst iterative-deepening solved the space problem of breadth-ﬁrst search, iterativedeepening-A∗ (IDA∗ ) eliminates the memory constraint of A∗ , without sacriﬁcing solution optimality [18]. Each iteration of the algorithm is a depth-ﬁrst search that keeps track of the cost,

f (n) = g(n) + h(n), of each node generated. As soon as a node is generated whose cost exceeds a

threshold for that iteration, it is deleted, and the search backtracks before continuing along a diﬀerent

path. The cost threshold is initialized to the heuristic estimate of the initial state, and is increased in

each successive iteration to the lowest cost of all the nodes that were generated but not expanded

during the previous iteration. The algorithm terminates when a goal state is reached whose cost does

not exceed the current threshold.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 9 2009-10-6

22-10

Special Topics and Techniques

Since IDA∗ performs a series of depth-ﬁrst searches, its memory requirement is linear in the

maximum search depth. In addition, if the heuristic function is admissible, IDA∗ ﬁnds an optimal

solution. Finally, by an argument similar to that presented for depth-ﬁrst iterative-deepening, IDA∗

expands the same number of nodes, asymptotically, as A∗ on a tree, provided that the number of

nodes grows exponentially with solution cost. These facts, together with the optimality of A∗ , imply

that IDA∗ is asymptotically optimal in time and space over all heuristic search algorithms that ﬁnd

optimal solutions on a tree. The additional beneﬁts of IDA∗ are that it is much easier to implement,

and often runs faster than A∗ , since it does not incur the overhead of managing the open and closed

lists. Using appropriate admissible heuristic functions, IDA∗ can optimally solve random instances

of the ﬁfteen puzzle [18], the twenty-four puzzle [24], and Rubik’s cube [22].

22.4.6 Depth-First Branch-and-Bound

For many problems, the maximum search depth is known in advance, or the search tree is ﬁnite. For

example, the traveling salesman problem (TSP) is to visit each of a given set of cities and return to

the starting city in a tour of shortest total distance. The most natural problem space for this problem

consists of a tree where the root node represents the starting city, the nodes at level one represent

all the cities that could be visited ﬁrst, the nodes at level two represent all the cites that could be

visited second, and so on. In this tree, the maximum depth is the number of cities, and all candidate

solutions occur at this depth. In such a space, a simple depth-ﬁrst search guarantees ﬁnding an

optimal solution using space that is only linear in the depth.

The idea of depth-ﬁrst branch-and-bound (DFBnB) is to make this search more eﬃcient by

keeping track of the best solution found so far. Since the cost of a partial tour is the sum of the costs

of the edges traveled so far, whenever a partial tour is found whose cost equals or exceeds the cost

of the best complete tour found so far, the branch representing the partial tour can be pruned, since

all its descendents must have equal or greater cost. Whenever a lower-cost complete tour is found,

the cost of the best tour is updated to this lower cost. In addition, an admissible heuristic function,

such as the cost of the minimum spanning tree of the remaining unvisited cities, can be added to

the cost so far of a partial tour to increase the amount of pruning. Finally, by ordering the children

of a given node from smallest to largest estimated total cost, a low-cost solution can be found more

quickly, further improving the pruning eﬃciency.

Interestingly, IDA∗ and DFBnB exhibit complementary behavior. Both are guaranteed to return

an optimal solution using only linear space, assuming that their cost functions are admissible. In

IDA∗ , the cost threshold is always a lower bound on the optimal solution cost, and increases in each

iteration until it reaches the optimal cost. In DFBnB, the cost of the best solution found so far is

always an upper bound on the optimal solution cost, and decreases until it reaches the optimal cost.

While IDA∗ never expands any nodes whose cost exceeds the optimal cost, its overhead consists of

expanding some nodes more than once. While DFBnB never expands any node more than once,

its overhead consists of expanding some nodes whose cost exceed the optimal cost. For problems

whose search trees are of bounded depth, or for which it is easy to construct a good solution, such

as the TSP, DFBnB is usually the algorithm of choice for ﬁnding an optimal solution. For problems

with inﬁnite search trees, or for which it is diﬃcult to construct a low-cost solution, such as the

sliding-tile puzzles or Rubik’s cube, IDA∗ is usually the best choice. Both IDA∗ and DFBnB suﬀer

the same limitation of all linear-space search algorithms, however, which is that they can generate

exponentially more nodes on a graph with many short cycles, relative to a best-ﬁrst search.

22.4.7 Complexity of Finding Optimal Solutions

The time complexity of a heuristic search algorithm depends on the accuracy of the heuristic function.

For example, if the heuristic evaluation function is an exact estimator, then A∗ runs in linear time,

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 10 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-11

expanding only those nodes on an optimal solution path. Conversely, with a heuristic that returns

zero everywhere, A∗ becomes brute-force uniform-cost search, with exponential complexity. In

general, the eﬀect of a good heuristic function is to reduce the eﬀective depth of search required [23].

22.4.8 Heuristic Path Algorithm

Since the complexity of ﬁnding optimal solutions to these problems is still exponential, even with a

good heuristic function, in order to solve signiﬁcantly larger problems, the optimality requirement

must be relaxed. An early approach to this problem was the heuristic path algorithm (HPA) [35].

HPA is a best-ﬁrst search, where the cost of a node n is computed as f (n) = g(n) + w ∗ h(n). Varying

w produces a range of algorithms from uniform-cost search (w = 0), through A∗ (w = 1), to pure

heuristic search (w = ∞). Increasing w beyond 1 generally decreases the amount of computation,

while increasing the cost of the solution generated. This trade-oﬀ is often quite favorable, with small

increases in solution cost yielding huge savings in computation [21]. Furthermore, it can be shown

that the solutions found by this algorithm are guaranteed to be no more than a factor of w greater

than optimal [2], but often are signiﬁcantly better.

Breadth-ﬁrst search, uniform-cost search, pure heuristic search, A∗ , and the heuristic path algorithm are all special cases of best-ﬁrst search. In each step of a best-ﬁrst search, the node that is best

according to some cost function is chosen for expansion. These best-ﬁrst algorithms diﬀer only in

their cost functions: the depth of node n for breadth-ﬁrst search, g(n) for uniform-cost search, h(n)

for pure heuristic search, g(n) + h(n) for A∗ , and g(n) + w ∗ h(n) for the heuristic path algorithm.

22.4.9 Recursive Best-First Search

The memory limitation of the heuristic path algorithm can be overcome simply by replacing the

best-ﬁrst search with IDA∗ using the same weighted evaluation function. However, with w ≥ 1, IDA∗

is no longer a best-ﬁrst search, since the total cost of a child can be less than that of its parent, and thus

nodes are not necessarily expanded in best-ﬁrst order. An alternative algorithm is recursive best-ﬁrst

search (RBFS) [21]. RBFS simulates a best-ﬁrst search in space that is linear in the maximum search

depth, regardless of the cost function used. Even with an admissible cost function, RBFS generates

fewer nodes than IDA∗ , and is generally superior to IDA∗ , except for a small increase in the cost per

node generation.

It works by maintaining on the recursion stack the complete path to the current node being

expanded, as well as all siblings of nodes on that path, along with the cost of the best node in the

subtree explored below each sibling. Whenever the cost of the current node exceeds that of some

other node in the previously expanded portion of the tree, the algorithm backs up to their closest

common ancestor, and continues the search down the new path. In eﬀect, the algorithm maintains

a separate threshold for each subtree diverging from the current search path. See [21] for full details

on RBFS.

22.5

Interleaving Search and Execution

In the discussion above, it is assumed that a complete solution can be computed, even before the ﬁrst

step of the solution is executed. This is in contrast to the situation in two-player games, discussed

below, where because of computational limits and uncertainty due to the opponent’s moves, search

and execution are interleaved, with each search determining only the next move to be made. This

paradigm is also applicable to single-agent problems. In the case of autonomous vehicle navigation,

for example, information is limited by the horizon of the vehicle’s sensors, and it must physically

move to acquire more information. Thus, only a few moves can be computed at a time, and those

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 11 2009-10-6

22-12

Special Topics and Techniques

moves must be executed before computing the next. Below we consider algorithms designed for this

scenario.

22.5.1 Minimin Search

Minimin search determines individual single-agent moves in constant time per move [19]. The

algorithm searches forward from the current state to a ﬁxed depth determined by the informational

or computational resources available. At the search horizon, the A∗ evaluation function f (n) =

g(n) + h(n) is applied to the frontier nodes. Since all decisions are made by a single agent, the value

of an interior node is the minimum of the frontier values in the subtree below that node. A single

move is then made to a neighbor of the current state with the minimum value.

Most heuristic functions obey the triangle inequality characteristic of distance measures. As a

result, f (n) = g(n) + h(n) is guaranteed to be monotonically nondecreasing along a path. Furthermore, since minimin search has a ﬁxed depth limit, we can apply depth-ﬁrst branch-and-bound to

prune the search tree. The performance improvement due to branch-and-bound is quite dramatic,

in some cases extending the achievable search horizon by a factor of 5 relative to brute-force minimin

search on sliding-tile puzzles [19].

Minimin search with branch-and-bound is an algorithm for evaluating the immediate neighbors

of the current node. As such, it is run until a best child is identiﬁed, at which point the chosen move

is executed in the real world. We can view the static evaluation function combined with lookahead

search as simply a more accurate, but computationally more expensive, heuristic function. In fact,

it provides an entire spectrum of heuristic functions trading oﬀ accuracy for cost, depending on the

search horizon.

22.5.2 Real-Time-A∗

Simply repeating minimin search for each move ignores information from previous searches and

results in inﬁnite loops. In addition, since actions are committed based on limited information, often

the best move may be to undo the previous move. The principle of rationality is that backtracking

should occur when the estimated cost of continuing the current path exceeds the cost of going back

to a previous state, plus the estimated cost of reaching the goal from that state. Real-time-A∗ (RTA∗ )

implements this policy in constant time per move on a tree [19].

For each move, the f (n) = g(n) + h(n) value of each neighbor of the current state is computed,

where g(n) is now the cost of the edge from the current state to the neighbor, instead of from the

initial state. The problem solver moves to a neighbor with the minimum f (n) value, and stores in

the previous state the best f (n) value among the remaining neighbors. This represents the h(n) value

of the previous state from the perspective of the new current state. This is repeated until a goal is

reached. To determine the h(n) value of a previously visited state, the stored value is used, while for

a new state the heuristic evaluator is called. Note that the heuristic evaluator may employ minimin

lookahead search with branch-and-bound as well.

In a ﬁnite problem space in which there exists a path to a goal from every state, RTA∗ is guaranteed

to ﬁnd a solution, regardless of the heuristic evaluation function [19]. Furthermore, on a tree, RTA∗

makes locally optimal decisions based on the information available at the time of the decision.

22.5.3 Learning-Real-Time-A∗

If a problem is to be solved repeatedly with the same goal state but diﬀerent initial states, we would

like our algorithm to improve its performance over time. Learning-real-time-A∗ (LRTA∗ ) is such

an algorithm. It behaves almost identically to RTA∗ , except that instead of storing the second-best f

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 12 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-13

value of a node as its new heuristic value, it stores the best value instead. Once one problem instance

is solved, the stored heuristic values are saved and become the initial values for the next problem

instance. While LRTA∗ is less eﬃcient than RTA∗ for solving a single problem instance, if it starts

with admissible initial heuristic values, over repeated trials its heuristic values eventually converge

to their exact values, at which point the algorithm returns optimal solutions.

22.6

Game Playing

The second major application of heuristic search algorithms in AI is game playing. One of the original

challenges of AI, which in fact predates the term artiﬁcial intelligence, was to build a program that

could play chess at the level of the best human players [48], a goal ﬁnally achieved in 1997 [18].

The earliest work in this area focussed on two-player perfect information games, such as chess and

checkers, but recently the ﬁeld has broadened considerably.

22.6.1 Minimax Search

The standard algorithm for two-player perfect-information games is minimax search with heuristic

static evaluation [44]. The simplest version of the algorithm searches forward to a ﬁxed depth in

the game tree, limited by the amount of time available per move. At this search horizon, a heuristic

evaluation function is applied to the frontier nodes. In this case, the heuristic is a function that

takes a board position and returns a number that indicates how favorable that position is for one

player relative to the other. For example, a very simple heuristic evaluator for chess would count the

total number of pieces on the board for one player, weighted by their relative strength, and subtract

the weighted sum of the opponent’s pieces. Thus, large positive values would correspond to strong

positions for one player, called MAX, whereas negative values of large magnitude would represent

advantageous situations for the opponent, called MIN.

Given the heuristic evaluations of the frontier nodes, values for the interior nodes in the tree are

recursively computed according to the minimax rule. The value of a node where it is MAX’s turn to

move is the maximum of the values of its children, while the value of a node where MIN is to move

is the minimum of the values of its children. Thus, at alternate levels of the tree, the minimum or

the maximum values of the children are backed up. This continues until the values of the immediate

children of the current position are computed, at which point one move is made to the child with

the maximum or minimum value, depending on whose turn it is to move.

22.6.2 Alpha-Beta Pruning

One of the most elegant of all AI search algorithms is alpha-beta pruning. It was invented in the

late 1950s, and a thorough treatment of the algorithm can be found in [28]. The idea, similar to

branch-and-bound, is that the minimax value of the root of a game tree can be determined without

examining all the nodes at the search frontier.

Figure 22.5 shows some examples of alpha-beta pruning. Only the pictured nodes are generated

by the algorithm, with the heavy black lines indicating pruning. At the square nodes MAX is to

move, while at the circular nodes it is MIN’s turn. The search proceeds depth-ﬁrst to minimize

the memory required, and only evaluates a frontier node when necessary. First, nodes e and f are

statically evaluated at 4 and 5, respectively, and their minimum value, 4, is backed up to their parent

node d. Node h is then evaluated at 3, and hence the value of its parent node g must be less than or

equal to 3, since it is the minimum of 3 and the unknown value of its right child. The value of node

c must be 4 then, because it is the maximum of 4 and a value that is less than or equal to 3. Since

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 13 2009-10-6

22-14

Special Topics and Techniques

a 4

<2

m–

b 4

c 4

d 4

4

e

5

f

i >

–6

g <

–3

j 6

3

h

6

k

7

l

n –

<2

o <

–1

<2

q–

1

p

2

r

FIGURE 22.5 Alpha-beta pruning example.

we have determined the minimax value of node c, we do not need to evaluate or even generate any

more children of node g.

Similarly, after statically evaluating nodes k and l at 6 and 7, respectively, the backed up value of

their parent node j is 6, the minimum of these values. This tells us that the minimax value of node i

must be greater than or equal to 6, since it is the maximum of 6 and the unknown value of its right

child. Since the value of node b is the minimum of 4 and a value that is greater than or equal to 6, it

must be 4, and hence we prune the remaining children of node i.

The right half of the tree shows an example of deep pruning. After evaluating the left half of the

tree, we know that the value of the root node a is greater than or equal to 4, the minimax value of

node b. Once node p is evaluated at 1, the value of its parent node o must be less than or equal to

1. Since the value of the root is greater than or equal to 4, the value of node o cannot propagate to

the root, and hence we need not generate any more children of node o. A similar situation exists

after the evaluation of node r as 2. At that point, the value of node o is less than or equal to 1, and

the value of node q is less than or equal to 2; hence, the value of node n, which is the maximum of

the values of nodes o and q, must be less than or equal to 2. Furthermore, since the value of node m

is the minimum of the value of node n and that of its brothers, and node n has a value less than or equal

to 2, the value of node m must also be less than or equal to 2. This causes the remaining children of

node m to be pruned, since the value of the root node a is greater than or equal to 4. Thus, we compute

the minimax value of the root of the tree to be 4 by generating only seven leaf nodes in this case.

Since alpha-beta pruning performs a minimax search while pruning much of the tree, its eﬀect

is to allow a deeper search in the same amount of time. This raises the question of how much does

alpha-beta improve performance? The eﬃciency of alpha-beta pruning depends upon the order in

which nodes are encountered at the search frontier. For any set of frontier node values, there exists

some ordering of the values such that alpha-beta will not perform any cutoﬀs at all. In that case, all

frontier nodes must be evaluated, and the time complexity is O(bd ).

On the other hand, there is an optimal or perfect ordering in which every possible cutoﬀ is

realized. In that case, the asymptotic time complexity is reduced from O(bd ) to O(bd/2 ). Another

way of viewing the perfect ordering case is that for the same amount of computation, one can search

twice as deep with alpha-beta pruning as without. Since the search tree grows exponentially with

depth, doubling the search horizon is a dramatic improvement.

In between worst-possible ordering and perfect ordering is random ordering, which is the average

case. Under random ordering of the frontier nodes, alpha-beta pruning reduces the asymptotic time

complexity to approximately O(b3d/4 ) [33]. This means that one can search 4/3 times as deep with

alpha-beta than with simple minimax, yielding a 33% improvement in search depth.

In practice, however, the time complexity of alpha-beta is closer to the best case of O(bd/2 ) due to

node ordering. The idea of node ordering is that instead of generating the children of a node in an

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 14 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-15

arbitrary order, we can order the tree based on static evaluations of the interior nodes. In particular,

the children of MAX nodes are expanded in decreasing order of their static values, while the children

of MIN nodes are expanded in increasing order of their static values.

22.6.3 Other Techniques

A wide variety of additional techniques are employed in modern game-playing programs, including

quiescent search, iterative deepening, transposition tables, opening books, end-game databases, and

special-purpose hardware. We consider each of these in turn.

22.6.3.1

Quiescence

The idea of quiescence is that the static evaluator should not be applied to positions whose values

are unstable, such as those occurring in the middle of a piece trade. In those positions, a small

secondary search is conducted until the static evaluation becomes more stable. In games such as

chess or checkers, this can be achieved by not statically evaluating any position that allows capture

moves, but exploring capture moves one level deeper.

22.6.3.2

Iterative Deepening

Iterative deepening is used to solve the problem of where to set the search horizon [45], and in fact

predated its use as a memory-saving device in single-agent search. In a tournament game, there is

a limited amount of time allowed for moves. Unfortunately, it is very diﬃcult to accurately predict

how long it will take to perform an alpha-beta search to a given depth. The solution is to perform

a series of searches to successively greater depths. When time runs out, the move recommended by

the last completed search is made.

22.6.3.3

Transposition Tables

The search graphs of most games contain multiple paths to the same node, often reached by making

the same moves in a diﬀerent order, referred to as a transposition of the moves. Since alpha-beta

is a depth-ﬁrst search, it does not detect such duplicate nodes. A transposition table is a table of

previously encountered game states, together with their backed-up minimax values. Whenever a

new state is generated, if it is stored in the transposition table, its stored value is used instead of

researching the tree below the node. Transposition tables can also be used in single-agent depth-ﬁrst

searches to detect some duplicate nodes.

22.6.3.4

Opening Books

Just like people, programs often learn and store the best ﬁrst few moves of a game, since many games

start from a common initial state. Such an opening book can be hand-coded by a human game

expert, or automatically learned from deep oﬀ-line searches.

22.6.3.5

End-Game Databases

In many games, such as checkers, the number of positions near the end of the game is small relative

to the mid-game. This allows the precomputation and storage of end-game databases, which contain

the true value, such as win, lose, or draw, of a large number of positions near the end of the game.

For example, the Chinook checkers program stores on disk a database of all positions with 10 or

fewer pieces on the board, along with their exact values [42]. When a position in the database is

encountered during search, the stored value is used, rather than searching the position any further.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 15 2009-10-6

22-16

22.6.3.6

Special Topics and Techniques

Special-Purpose Hardware

While the basic algorithms are described above, much of the performance advances in computer

chess in previous decades came from faster hardware. The faster the machine, the deeper it can search

in the time available, and the better it plays. Despite the rapidly advancing speed of general-purpose

computers, the best machines in the 1980s and 1990s included special-purpose hardware designed

and built only to play chess. For example, IBM’s DeepBlue could evaluate about 200 million chess

positions per second [17].

22.6.4 Signiﬁcant Milestones

The most signiﬁcant milestones in this area include defeating the human world champion at chess,

and solving the game of checkers.

In May 1997, IBM’s DeepBlue defeated Gary Kasparov, the world champion, in a six-game

exhibition match, achieving a long-anticipated goal in artiﬁcial intelligence [18]. Currently, the best

chess machines are general-purpose computers that rely entirely on software for their performance,

and play comparably to the best humans.

In April 2007, Jonathan Schaeﬀer’s Chinook group at the University of Alberta announced that

they had solved the game of checkers [43], proving that when both sides play perfectly, checkers is a

draw. This is by far the most complex game solved to date, using two dozen computers running for

about 18 years. The combination of its opening book, its mid-game search, and 10-piece end-game

databases allows Chinook to play perfect checkers.

22.6.5 Multiplayer Games, Imperfect and Hidden Information

Minimax search with static evaluation and alpha-beta pruning is most appropriate for two-player

games with perfect information and alternating moves among the players. This paradigm extends

in a straightforward way to more than two players, but alpha-beta becomes much less eﬀective [20].

Games with chance elements, such as the roll of the dice in backgammon, for example, tend to

limit search algorithms because of the need to search over all possible chance outcomes. In addition

to chance, card games have information that is available to some players but hidden from others,

such as cards in diﬀerent hands in Bridge. Poker is a very diﬃcult challenge in this area, combining

all of the above complexities, as well as active deception and the need to model the opponents.

One technique that has been eﬀective in handling hidden information is Monte-Carlo sampling

[12]. Given a decision to be made, such as the play of a card in bridge, we can randomly generate a

set of hands that is consistent with the information known about those hands at the current point

in time. Given this particular set of hands, we then use perfect-information techniques, such as

alpha-beta minimax, to determine the optimal play in this case. We then repeat the experiment,

generating another random set of hands, consistent with the information available, and compute

the optimal move in that case. After generating about 100 diﬀerent random hands, we then play the

card that was most often the optimal card to play over all the randomly generated hands.

22.7

Constraint-Satisfaction Problems

In addition to single-agent path-ﬁnding problems and game playing, the third major application

of heuristic search is constraint-satisfaction problems. The N-queens problem and the popular

Sudoku puzzle are classic examples. Other examples include graph coloring, boolean satisﬁability,

and scheduling problems.

Constraint-satisfaction problems are usually modelled as follows: There is a set of variables, a set

of values for each variable, and a set of constraints on the values that the variables can be assigned.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 16 2009-10-6

22-17

Artiﬁcial Intelligence Search Algorithms

A unary constraint on a variable speciﬁes a subset of all possible values that can be assigned to

that variable. A binary constraint between two variables speciﬁes which possible combinations of

assignments to the pair of variables is legal. For example, in the N-queens problem, the variables would

represent individual queens, and the values would be their positions on the board. The constraints

are binary constraints on each pair of queens that prohibit them from occupying positions on the

same row, column, or diagonal.

22.7.1 Chronological Backtracking

The brute-force approach to constraint satisfaction is called chronological backtracking. One selects

an order for the variables, and an order for the values, and starts assigning values to the variables

one at a time. Each assignment is made so that all constraints involving any of the instantiated

variables are satisﬁed. The reason for this is that once a constraint is violated, no assignment to the

remaining variables can possibly satisfy that constraint. Once a variable is reached which has no

remaining legal assignments, then the last variable that was assigned is reassigned to its next legal

value. The algorithm continues until either one or all complete, consistent assignments are found,

resulting in success, or all possible assignments are shown to violate some constraint, resulting in

failure. Figure 22.6 shows the tree generated by chronological backtracking to ﬁnd all solutions to

the four-queens problem. The tree is searched depth-ﬁrst to minimize memory requirements.

22.7.2 Intelligent Backtracking

One can improve the performance of chronological backtracking using a number of techniques,

such as variable ordering, value ordering, backjumping, and forward checking.

The order in which variables are instantiated can have a large eﬀect on the size of a search tree. The

idea of variable ordering is to order the variables from most constrained to least constrained [10,37].

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

Q

FIGURE 22.6 Tree generated to solve the four queens problem.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 17 2009-10-6

22-18

Special Topics and Techniques

For example, if any variable has only a single value remaining that is consistent with the previously

instantiated variables, it should be assigned that value immediately. In general, the variables should

be instantiated in increasing order of the size of their remaining domains. This can either be done

statically at the beginning of the search, or dynamically, reordering the remaining variables each

time a variable is assigned a new value.

The order in which the values of a given variable are chosen determines the order in which the tree

is searched. Since it does not eﬀect the size of the tree, it makes no diﬀerence if all solutions are to be

found. If only a single solution is required, however, value ordering can decrease the time required

to ﬁnd a solution. Thus, one should order the values from least constraining to most constraining,

in order to minimize the time required to ﬁnd a ﬁrst solution [6,13].

The idea of backjumping is that when an impasse is reached, instead of simply undoing the last

decision made, the decision that actually caused the failure should be modiﬁed [11]. For example,

consider a three-variable problem where the variables are instantiated in the order x, y, z. Assume

that values have been chosen for both x and y, but that all possible values for z conﬂict with the value

chosen for x. In chronological backtracking, the value chosen for y would be changed, and then all

the possible values for z would be tested again, to no avail. A better strategy in this case is to go back

to the source of the failure, and change the value of x, before trying diﬀerent values for y and z.

When a variable is assigned a value, forward checking checks each remaining uninstantiated

variable to make sure that there is at least one assignment for each of them that is consistent with the

current assignments. If not, the original variable is reassigned to its next value. Forward checking

subsumes backjumping.

22.7.3 Constraint Recording

In a constraint-satisfaction problem, some constraints are explicitly speciﬁed, and others are implied

by explicit constraints. Implicit constraints may be discovered either during a backtracking search,

or in advance in a preprocessing phase. The idea of constraint recording is that once these implicit

constraints are discovered, they should be saved explicitly so that they do not have to be rediscovered.

A simple example of constraint recording in a preprocessing phase is called arc consistency [10,

29,31]. For each pair of variables x and y that are related by a binary constraint, we remove from the

domain of x any values that do not have at least one corresponding consistent assignment to y, and

vice versa. In general, several iterations may be required to achieve complete arc consistency. Path

consistency is a generalization of arc consistency where instead of considering pairs of variables,

we examine triples of constrained variables. The eﬀect of performing arc or path consistency before

backtracking is that the resulting search space can be dramatically reduced. In some cases, this

preprocessing of the constraints can eliminate the need for search entirely.

22.7.4 Local Search Algorithms

Backtracking searches a space of consistent partial assignments to variables, in the sense that all

constraints among instantiated variables are satisﬁed, and looks for a complete consistent assignment

to the variables, or in other words a solution. An alternative approach is to search a space of

inconsistent but complete assignments to the variables, until a consistent complete assignment is

found. This approach is known as heuristic repair [30] or more generally stochastic local search.

For example, in the N-queens problem, this algorithm places all N queens on the board at the

same time, and then moves queens one at a time until a solution is found. The natural heuristic,

called min-conﬂicts, moves a queen that is in conﬂict with the most other queens, to a position

where it conﬂicts with the fewest other queens. What is surprising about this simple strategy is how

well it performs, relative to backtracking. While backtracking techniques can solve on the order of

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 18 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-19

hundred-queen problems, heuristic repair can solve million-queen problems, often with only about

50 individual queen moves!

This strategy has been extensively explored in the context of boolean satisﬁability. The problem of

boolean satisﬁability starts with a formula in propositional logic in conjunctive normal form. This is

an AND of a set of clauses, each of which is an OR of a set of literals, each of which is either an atomic

proposition or a negated atomic proposition. For example, the boolean formula (a + b) · (a + b )

is in conjunctive normal form if + represents OR, · represents AND, and represents negation. The

problem is to determine if there exists some assignment to the variables such that the entire formula

evaluates to true, or in other words, all the clauses are satisﬁed. For example, the above formula is

satisﬁable, since assigning a to true and b to false will satisfy both clauses. If each clause has three

literals, called 3-SAT, the problem is NP-complete.

The greedy satisﬁability (GSAT) algorithm [40] starts with a random assignment of boolean values

to the variables of a satisﬁability problem, and ﬂips the assignment of a variable that results in the

largest net increase in the number of satisﬁed clauses. It continues until either a solution is found, or

until a speciﬁed number of ﬂips are performed, at which point it starts over with a diﬀerent random

initial assignment. WalkSAT [41] adds another random element to this algorithm. With a certain

probability, it ﬂips a variable that GSAT would ﬂip, and with one minus that probability, it chooses

a random unsatisﬁed clause, and ﬂips a randomly chosen variable in that clause.

The main drawback of these local-search approaches is that they are not complete, in that they are

not guaranteed to ﬁnd a solution in a ﬁnite amount of time, even if one exists. If there is no solution,

these algorithms will run forever, whereas complete backtracking algorithms will eventually discover

that a problem is not solvable.

The best complete algorithms for boolean satisﬁability are based on the DPLL algorithm [3]. These

algorithms assign values to variables one at a time, and propagate the consequences of assignments

using unit propagation, backtracking when a partial assignment cannot be extended any further. Both

complete algorithms and stochastic local search are areas of active research for boolean satisﬁability.

Currently, local search algorithms seem to perform best on diﬃcult random problems that are

satisﬁable, while the complete algorithms tend to perform better on more structured problems.

While constraint-satisfaction problems appear somewhat diﬀerent from single-agent path-ﬁnding

problems and two-player games, there is a strong similarity among the algorithms employed. For

example, backtracking can be viewed as a form of branch-and-bound, where a node is pruned when

any constraint is violated. Similarly, heuristic repair can be viewed as a heuristic search where the

evaluation function is the total number of constraints that are violated, and the goal is to ﬁnd a state

with zero constraint violations.

22.8

Research Issues and Summary

22.8.1 Research Issues

A primary research problem in this area is the development of faster algorithms. All the above

algorithms are limited by eﬃciency either in the size of problems that they can solve optimally, or

in the quality of the decisions they can make, or solutions they can ﬁnd within practical time limits.

Thus, there is a continual demand for faster algorithms.

One approach to faster algorithms is parallel search. Most search algorithms have a tremendous

amount of potential parallelism, since the basic step of node generation and evaluation is often

performed billions or trillions of times. As a result, many such algorithms are readily parallelized

with nearly linear speedups. The algorithms that are diﬃcult to parallelize are branch-and-bound

algorithms, such as alpha-beta pruning, because the results of searching one part of the tree determine

whether another part of the tree needs to be examined at all.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 19 2009-10-6

22-20

Special Topics and Techniques

Since the performance of a search algorithm depends critically on the quality of the heuristic

evaluation function, another important research area is the automatic generation of such functions.

This was pioneered in the area of two-player games by Arthur Samuel’s landmark checkers program

that learned to improve its evaluation function through repeated play [39]. The development of

pattern databases, described above, is another approach to automatic heuristic construction.

While the ﬁrst games to be addressed with AI techniques were two-player perfect information

games such as chess and checkers, the ﬁeld has branched out to consider games with random

elements, such as backgammon, hidden information, such as card games, and multiple players.

Recently the game of poker has seen a great deal of activity, both from search-based and gametheoretic approaches. The game of Go continues to be very challenging for computers, both because

of its large branching factor, and the diﬃculty of evaluating nonterminal positions.

Another important area in computer game-playing is developing automated opponents for video

games. The electronic entertainment industry is larger than the motion-picture industry, and most

of these games have agents that must be computer controlled. In addition to low-level issues such as

path planning for multiple agents, these games require high-level strategy as well.

22.8.2 Summary

We have described search algorithms for three diﬀerent classes of problems. In single-agent pathﬁnding problems, the task is to ﬁnd a sequence of operators that map an initial state to a desired goal

state. Much of the work in this area has focussed on ﬁnding optimal solutions to such problems,

often making use of admissible heuristic functions to speed up the search without sacriﬁcing solution

optimality. In the area of game playing, ﬁnding optimal solutions is infeasible, and research has

focussed on algorithms for making the best move decisions possible given a limited amount of

computing time. This approach has also been applied to single-agent problems as well. In constraintsatisfaction problems, the task is to ﬁnd a state that satisﬁes a set of constraints. While all three of

these types of problems are diﬀerent, the same set of ideas, such as brute-force search and heuristic

evaluation functions, can be applied to all three.

22.9

Further Information

The classic reference in this area is [34]. A number of papers were collected in an edited volume

devoted to search [16]. The constraint-satisfaction area is covered by [4], and stochastic local

search is covered in [15]. Most new research in this area initially appears in the Proceedings

of the Association for the Advancement of on Artiﬁcial Intelligence (AAAI) or the Proceedings

of the International Joint Conference on Artiﬁcial Intelligence (IJCAI). The Preeminent journals

in this area include Artiﬁcial Intelligence (AIJ), and the Journal of Artiﬁcial Intelligence Research

(JAIR).

Deﬁning Terms

Admissible: A heuristic is said to be admissible if it never overestimates actual distance from a given

state to a goal. An algorithm is said to be admissible if it always ﬁnds an optimal solution to a problem

if one exists.

Branching factor: The average number of children of a node in a problem-space graph.

Constraint-satisfaction problem: A problem where the task is to identify a state that satisﬁes a set

of constraints.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 20 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-21

Depth: The length of a shortest path from the initial state to a goal state.

Heuristic evaluation function: A function from a state to a number. In a single-agent problem, it

estimates the distance from the state to a goal. In a game, it estimates the merit of the position with

respect to one player.

Node expansion: Generating all the children of a given state.

Node generation: Creating the data structure that corresponds to a problem state.

Operator: An action that maps one state into another state, such as a twist of Rubik’s cube.

Problem instance: A problem space together with an initial state of the problem and a desired set

of goal states.

Problem space: A theoretical construct in which a search takes place, consisting of a set of states

and a set of operators.

Problem-space graph: A graphical representation of a problem space, where states are represented

by nodes, and operators are represented by edges.

Search: A trial-and-error exploration of alternative solutions to a problem.

Search tree: A problem-space graph with a unique path to each node.

Single-agent path-ﬁnding problem: A problem where the task is to ﬁnd a sequence of operators

that map an initial state to a goal state.

State: A conﬁguration of a problem, such as the arrangement of the parts of a Rubik’s cube at a given

point in time.

Acknowledgment

This work was supported in part by NSF Grant IIS-0713178.

References

1. Culberson, J. and J. Schaeﬀer, Pattern databases, Computational Intelligence, 14(3), 318–334, 1998.

2. Davis, H.W., A. Bramanti-Gregor, and J. Wang, The advantages of using depth and breadth components in heuristic search, in Methodologies for Intelligent Systems 3, Z.W. Ras and L. Saitta (Eds.),

North-Holland, Amsterdam, the Netherlands, 1989, pp. 19–28.

3. Davis, M., G. Logemann, and D. Loveland, A machine program for theorem proving, Journal of the

Association for Computing Machinery, 5(7), 394–397, 1962.

4. Dechter, R., Constraint Processing, Morgan-Kaufmann, San Francisco, CA, 2003.

5. Dechter, R. and J. Pearl, Generalized best-ﬁrst search strategies and the optimality of A∗ , Journal of

the Association for Computing Machinery, 32(3), 505–536, July 1985.

6. Dechter, R. and J. Pearl, Network-based heuristics for constraint-satisfaction problems, Artiﬁcial

Intelligence, 34(1), 1–38, 1987.

7. Dijkstra, E.W., A note on two problems in connexion with graphs, Numerische Mathematik, 1,

269–271, 1959.

8. Dillenburg, J.F. and P.C. Nelson, Improving the eﬃciency of depth-ﬁrst search by cycle elimination,

Information Processing Letters, 45(1), 5–10, 1993.

9. Doran, J.E. and D. Michie, Experiments with the graph traverser program, Proceedings of the Royal

Society A, 294, 235–259, 1966.

10. Freuder, E.C., A suﬃcient condition for backtrack-free search, Journal of the Association for Computing Machinery, 29(1), 24–32, 1982.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 21 2009-10-6

22-22

Special Topics and Techniques

11. Gaschnig, J., Performance measurement and analysis of certain search algorithms, PhD thesis.

Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA, 1979.

12. Ginsberg, M.L., GIB: Imperfect information in a computationally challenging game, Journal of Artiﬁcial Intelligence Research, 14, 303–358, 2001.

13. Haralick, R.M. and G.L. Elliott, Increasing tree search eﬃciency for constraint satisfaction problems,

Artiﬁcial Intelligence, 14, 263–313, 1980.

14. Hart, P.E., N.J. Nilsson, and B. Raphael, A formal basis for the heuristic determination of minimum

cost paths, IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107, 1968.

15. Hoos, H.H. and T. Stutzle, Stochastic Local Search: Foundations and Applications, Morgan-Kaufmann,

San Francisco, CA, 2004.

16. Kanal, L. and V. Kumar (Eds.), Search in Artiﬁcial Intelligence, Springer-Verlag, New York, 1988.

17. Campbell, M., A.J. Hoane, and F. Hsu, Deep blue, Artiﬁcial Intelligence, 134(1–2), 57–83, 2002.

18. Korf, R.E., Depth-ﬁrst iterative-deepening: An optimal admissible tree search, Artiﬁcial Intelligence,

27(1), 97–109, 1985.

19. Korf, R.E., Real-time heuristic search, Artiﬁcial Intelligence, 42(2–3), 189–211, 1990.

20. Korf, R.E., Multi-player alpha-beta pruning, Artiﬁcial Intelligence, 48(1), 99–111, 1991.

21. Korf, R.E., Linear-space best-ﬁrst search, Artiﬁcial Intelligence, 62(1), 41–78, 1993.

22. Korf, R.E., Finding optimal solutions to Rubik’s cube using pattern databases, Proceedings of the

Fourteenth National Conference on Artiﬁcial Intelligence (AAAI-97), Providence, RI, July 1997,

pp. 700–705.

23. Korf, R.E., M. Reid, and S. Edelkamp, Time complexity of Iterative-Deepening-A∗ , Artiﬁcial Intelligence, 129(1–2), 199–218, 2001.

24. Korf, R.E. and A. Felner, Disjoint pattern database heuristics, Artiﬁcial Intelligence, 134(1–2), 9–22,

2002.

25. Korf, R.E, W. Zhang, I. Thayer, and H. Hohwald, Frontier search, Journal of the Association for

Computing Machinery, 52(5), 715–748, 2005.

26. Korf, R.E. and P. Schultze, Large-scale, parallel breadth-ﬁrst search, Proceedings of the National

Conference on Artiﬁcial Intelligence (AAAI-05), Pittsburgh, PA, July 2005, pp. 1380–1385.

27. Korf, R.E., Linear-time disk-based implicit graph search, Journal of the ACM, 55(6), 26-1–26-40,

2008.

28. Knuth, D.E. and R.E. Moore, An analysis of alpha-beta pruning, Artiﬁcial Intelligence, 6(4), 293–326,

1975.

29. Mackworth, A.K., Consistency in networks of relations. Artiﬁcial Intelligence, 8(1), 99–118, 1977.

30. Minton, S., M.D. Johnston, A.B. Philips, and P. Laird, Minimizing conﬂicts: A heuristic repair method

for constraint satisfaction and scheduling problems, Artiﬁcial Intelligence, 58(1–3), 161–205, 1992.

31. Montanari, U., Networks of constraints: Fundamental properties and applications to picture processing, Information Science, 7, 95–132, 1974.

32. Newell, A. and H.A. Simon, Human Problem Solving, Prentice-Hall, Englewood Cliﬀs, NJ, 1972.

33. Pearl, J., The solution for the branching factor of the alpha-beta pruning algorithm and its optimality,

Communications of the Association of Computing Machinery, 25(8), 559–564, 1982.

34. Pearl, J., Heuristics, Addison-Wesley, Reading, MA, 1984.

35. Pohl, I., Heuristic search viewed as path ﬁnding in a graph, Artiﬁcial Intelligence, 1, 193–204, 1970.

36. Pohl, I., Bi-directional search, in Machine Intelligence 6, B. Meltzer and D. Michie (Eds.), American

Elsevier, New York, 1971, pp. 127–140.

37. Purdom, P.W., Search rearrangement backtracking and polynomial average time, Artiﬁcial Intelligence, 21(1–2), 117–133, 1983.

38. Ratner, D. and M. Warmuth, Finding a shortest solution for the NxN extension of the 15-puzzle is

intractable, Proceedings of the Fifth National Conference on Artiﬁcial Intelligence (AAAI-86), Philadelphia, PA, 1986, pp. 168–172.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 22 2009-10-6

Artiﬁcial Intelligence Search Algorithms

22-23

39. Samuel, A.L., Some studies in machine learning using the game of checkers, in Computers and

Thought, E. Feigenbaum and J. Feldman (Eds.), McGraw-Hill, New York, 1963, pp. 71–105.

40. Selman, B., H. Levesque, and D. Mitchell, A new method for solving hard satisﬁability problems,

Proceedings of the Tenth National Conference on Artiﬁcial Intelligence (AAAI-92), San Jose, CA, July

1992, pp. 440–446.

41. Selman, B., H. Kautz, and B. Cohen, Noise strategies for improving local search, Proceedings of

the Twelfth National Conference on Artiﬁcial Intelligence (AAAI-94), Seattle, WA, August 1994.

pp. 337–343.

42. Schaeﬀer, J., Y. Bjornsson, N. Burch, R. Lake, P. Lu, and S. Sutphen, Building the checkers 10-piece

endgame databases, in H.J. van den Herik, Iida, H., and Heinz, E.A., editors, Advances in Computer

Games 10, Kluwer, Boston, MA, 2003, pp. 193–210.

43. Schaeﬀer, J., N. Burch, Y. Bjornsson, A. Kishimoto, M. Mueller, R. Lake, P. Lu, and S. Sutphen,

Checkers is solved, Science, 317, 1518–1522, 2007.

44. Shannon, C.E., Programming a computer for playing chess, Philosophical Magazine, 41, 256–275,

1950.

45. Slate, D.J. and L.R. Atkin, CHESS 4.5 - The Northwestern University chess program, in Chess Skill in

Man and Machine, P.W. Frey (Ed.), Springer-Verlag, New York, 1977, pp. 82–118.

46. Stickel, M.E. and W.M. Tyson, An analysis of consecutively bounded depth-ﬁrst search with

applications in automated deduction, Proceedings of the International Joint Conference on Artiﬁcial Intelligence (IJCAI-85), Los Angeles, CA, August 1985, pp. 1073–1075.

47. Taylor, L. and R.E. Korf, Pruning duplicate nodes in depth-ﬁrst search, Proceedings of the National

Conference on Artiﬁcial Intelligence (AAAI-93), Washington, DC, July 1993, pp. 756–761.

48. Turing, A.M., Computing machinery and intelligence, Mind, 59, 433–460, 1950. Also in Computers

and Thought, E. Feigenbaum and J. Feldman (Eds.), McGraw-Hill, New York, 1963.

49. Zhou, R. and E.A. Hansen, Sparse-memory graph search, Proceedings of the 18th International Joint

Conference on Artiﬁcial Intelligence (IJCAI-03), Acapulco, Mexico, August 2003, pp. 1259–1266.

50. Zhou, R. and E.A. Hansen, Structured duplicate detection in external-memory graph search,

Proceedings of the 19th National Conference on Artiﬁcial Intelligence (AAAI-04), San Jose, CA,

July 2004, pp. 683–688.

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 23 2009-10-6

Atallah/Algorithms and Theory of Computation Handbook: Second Edition C820X_C022 Finals Page 24 2009-10-6

Có thể đăng nhập bằng tài khoản Olm.vn, Hoc24.vn, Bingbe.com