Beam Search

Written by Sudipto Paul | Jun 27, 2023 8:46:00 PM

What is beam search?

Beam search is a heuristic search algorithm that expands the most promising node in a limited set for exploring a graph. This technique uses a breadth-first search algorithm to build a tree data structure and generate successors of each level's current state.

Natural language processing (NLP) software applications and speech recognition models use beam search as a decision-making layer. The goal here is to choose the best output for target variables like next output character or maximum probability.

The beam search algorithm selects tokens for a particular position in a sequence based on conditional probability. It adjusts the beam width (β) to choose various hyperparameter n alternatives.

Unlike greedy search, beam search widens the search to explore possible fits for words in a sequence. Beam search gets its name from how the algorithm casts a wide search beam.

Types of beam search

Beam search has different variants, depending on the algorithm in use, such as depth-first search (DFS), limited discrepancy search, and memory-based search.

Beam stack search is a search algorithm that uses beam stack as a data structure to combine depth-first search or chronological backtracking with beam search. Users can also connect it with a divide-and-conquer algorithm — a technique that recursively breaks down a problem into two or more sub-problems — to create a divide-and-conquer beam-stack search.
Depth-first beam search uses a depth-first search algorithm for searching graph data structures. It uses additional memory or a stack to track nodes explored along a specific branch and backtrack the graph.
Local beam search tracks or maintains up to k randomly generated states or numbers of assignments instead of one. It chooses k best neighbors of current individuals at each stage. The algorithm reports success only after finding a satisfying assignment. In case of ties, it’ll pick random values.
Stochastic beam search selects k of the individuals randomly instead of the best k individuals. This beam search variation uses the probability of being chosen as a function of the evaluation function. The better the evaluation, the higher the chances of being chosen.
Beam search using limited discrepancy backtracking (BULB) is a memory-based search method that solves larger search problems within a reasonable runtime. BULB searches do so by using larger beam widths and finding shorter paths without losing memory.
Flexible beam search temporarily increases the beam width to include all child nodes with the same heuristic value.

Beam search components

Beam search has three components:

A problem that can be a graph or may contain a set of nodes representing a goal.
Heuristic rules refer to the problem domain-specific guidelines used for removing unfavorable nodes from memory and focusing on the problem domain.
Memory that stores the beam but has limited available capacity. When the memory is full, the algorithm automatically deletes the costliest node to avoid exceeding the memory limit.

Benefits of using beam search

Beam search attempts to find the optimal complex in a heuristical way. The benefits of using beam search for solving optimization problems are:

Reduced computation: Beam search uses a limited number of nodes or resources at a time. As a result, it needs less computation and memory consumption compared to other search methods. Developing effective heuristic rules for pruning is crucial in realizing this advantage, which can be challenging without domain knowledge.
Improved suboptimal choices: Greedy search considers each position in isolation. On the contrary, beam search considers more than one option at a time and improves suboptimal choices.

Disadvantages of using beam search

The main disadvantage of using a beam search is that users may not reach an optimal goal in the end. Most beam search algorithms terminate in two scenarios: either the user has yet to reach a goal node and there's no node left to be explored, or has reached the required goal node.

Moreover, beam search can be incomplete due to its probabilistic nature. Despite these disadvantages, beam search algorithms have seen success in speech recognition, vision, planning, and machine learning.

Beam search uses

Beam search has multiple uses in machine translation systems, NLP, and speech recognition. Large systems with insufficient memory to store the entire search tree frequently turn to beam search to remain manageable.

Neural machine translation (NMT) systems use beam search to choose the best translation and process each part. They keep the best translations with the right sentence structures and discard the rest. The translator then checks the translations and picks the best one that meets the goals. In 1976, Carnegie-Mellon University’s Harpy Speech Recognition System was the first to use a beam search.

Beam search vs. greedy search vs. best-first search

A greedy search follows the problem-solving heuristic to build a solution gradually, one step at a time. The system prioritizes selecting the next piece that provides clear and immediate advantages. Greedy algorithms work best for problems where it’s important to choose locally optimal solutions.

Beam search optimizes best-first search to reduce memory requirements.

Both greedy and beam searches use sequence-to-sequence models to generate sequence outputs or tokens from a neural network. However, the former evaluates each position independently, and the latter considers more than one at a time.

Best-first search is a graph search technique that sorts and orders partial solutions based on heuristics. However, beam search considers a predetermined number of best partial solutions as candidates. That’s why beam search is treated as a greedy algorithm.

Looking to interpret data in a digestible manner? Explore the best natural language generation (NLG) software for processing data sets.

View full post