Brute-force search: Difference between revisions

Content deleted Content added
m Reverted 5 edits by 103.58.117.120 (talk) to last revision by Ardub23. (TW)
→‎Combinatorial explosion: archiving Lomonosov Endgame
 
(29 intermediate revisions by 26 users not shown)
Line 1:
{{Short description|Problem-solving technique and algorithmic paradigm}}
{{about|the problem-solving technique in computer science|similarly named methods in other disciplines|Brute force (disambiguation)}}
{{RefimproveMore citations needed|date=February 2008}}
In [[computer science]], '''brute-force search''' or '''exhaustive search''', also known as '''generate and test''', is a very general [[problem-solving]] technique and [[algorithmic paradigm]] that consists of [[Iteration#Computing|systematically enumeratingchecking]] all possible candidates for thewhether solutionor and checking whethernot each candidate satisfies the problem's statement.
 
A brute-force algorithm tothat findfinds the [[divisor]]s of a [[natural number]] ''n'' would enumerate all integers from 1 to n, and check whether each of them divides ''n'' without remainder. A brute-force approach for the [[eight queens puzzle]] would examine all possible arrangements of 8 pieces on the 64-square chessboard, and, for each arrangement, check whether each (queen) piece can attack any other.<ref>{{Cite web|date=2020-01-06|title=Brute Force Algorithms Explained|url=https://www.freecodecamp.org/news/brute-force-algorithms-explained/|access-date=2021-04-11|website=freeCodeCamp.org|language=en}}</ref>
 
While a brute-force search is simple to [[Software|implement]], and will always find a solution if it exists, itsimplementation costcosts isare proportional to the number of candidate solutions{{snd}}which in many practical problems tends to grow very quickly as the size of the problem increases ([[#Combinatorial explosion|combinatorial§Combinatorial explosion]]).<ref>{{cite web |title=Complexity of brute force search |url=https://www.coursera.org/learn/ml-clustering-and-retrieval/lecture/5R6q3/complexity-of-brute-force-search |website=coursera |accessdateaccess-date=14 June 2018}}</ref>. Therefore, brute-force search is typically used when the problem size is limited, or when there are problem-specific [[heuristic (computer science)|heuristic]]s that can be used to reduce the set of candidate solutions to a manageable size. The method is also used when the simplicity of implementation is more important than processing speed.
 
This is the case, for example, in critical applications where any errors in the [[algorithm]] would have very serious consequences; or when [[automated theorem proving|using a computer to prove a mathematical theorem]]. Brute-force search is also useful as a baseline method when [[benchmarking]] other algorithms or [[metaheuristic]]s. Indeed, brute-force search can be viewed as the simplest [[metaheuristic]]. Brute force search should not be confused with [[backtracking]], where large sets of solutions can be discarded without being explicitly enumerated (as in the textbook computer solution to the eight queens problem above). The brute-force method for finding an item in a table{{snd}}namely, check all entries of the latter, sequentially{{snd}}is called [[linear search]].
 
==Implementing the brute-force search==
 
===Basic algorithm===
#In ''next'' (''P'', ''c''): generate the nextorder candidate for ''P'' after the current one ''c''.
In order to apply brute-force search to a specific class of problems, one must implement four [[Subroutine|procedures]], ''first'', ''next'', ''valid'', and ''output''. These procedures should take as a parameter the data ''P'' for the particular instance of the problem that is to be solved, and should do the following:
# ''first'' (''P''): generate a first candidate solution for ''P''.
# ''next'' (''P'', ''c''): generate the next candidate for ''P'' after the current one ''c''.
# ''valid'' (''P'', ''c''): check whether candidate ''c'' is a solution for ''P''.
# ''output'' (''P'', ''c''): use the solution ''c'' of ''P'' as appropriate to the application.
The ''next'' procedure must also tell when there are no more candidates for the instance ''P'', after the current one ''c''. A convenient way to do that is to return a "null candidate", some conventional data value Λ that is distinct from any real candidate. Likewise the ''first'' procedure should return Λ if there are no candidates at all for the instance ''P''. The brute-force method is then expressed by the algorithm
''c'' &larr; ''first''(''P'')
'''while''' ''c'' &ne; Λ '''do'''
'''if''' ''valid''(''P'',''c'') '''then'''
''output''(''P'', ''c'')
''c'' &larr; ''next''(''P'', ''c'')
'''end while'''
For example, when looking for the divisors of an integer ''n'', the instance data ''P'' is the number ''n''. The call ''first''(''n'') should return the integer 1 if ''n'' &ge; 1, or Λ otherwise; the call ''next''(''n'',''c'') should return ''c'' + 1 if ''c'' < ''n'', and Λ otherwise; and ''valid''(''n'',''c'') should return '''true''' if and only if ''c'' is a divisor of ''n''. (In fact, if we choose Λ to be ''n'' + 1, the tests ''n'' &ge; 1 and ''c'' < ''n'' are unnecessary.)The brute-force search algorithm above will call ''output'' for every candidate that is a solution to the given instance ''P''. The algorithm is easily modified to stop after finding the first solution, or a specified number of solutions; or after testing a specified number of candidates, or after spending a given amount of [[central processing unit|CPU]] time.
 
==Combinatorial explosion==
The main disadvantage of the brute-force method is that, for many real-world problems, the number of natural candidates is prohibitively large. For instance, if we look for the divisors of a number as described above, the number of candidates tested will be the given number ''n''. So if ''n'' has sixteen decimal digits, say, the search will require executing at least 10<sup>15</sup> computer instructions, which will take several days on a typical [[personal computer|PC]]. If ''n'' is a random 64-[[binary digit|bit]] natural number, which has about 19 decimal digits on the average, the search will take about 10 years. This steep growth in the number of candidates, as the size of the data increases, occurs in all sorts of problems. For instance, if we are seeking a particular rearrangement of 10 letters, then we have 10! = 3,628,800 candidates to consider, which a typical PC can generate and test in less than one second. However, adding one more letter{{snd}}which is only a 10% increase in the data size{{snd}}will multiply the number of candidates by 11, a 1000% increase. For 20 letters, the number of candidates is 20!, which is about 2.4×10<sup>18</sup> or 2.4 [[quintillion]]; and the search will take about 10 years. This unwelcome phenomenon is commonly called the [[combinatorial explosion]], or the [[curse of dimensionality]].
 
One example of a case where combinatorial complexity leads to solvability limit is in [[Solving chess|solving chess]]. Chess is not a [[solved game]]. In 2005, all chess game endings with six pieces or less were solved, showing the result of each position if played perfectly. It took ten more years to complete the tablebase with one more chess piece added, thus completing a 7-piece tablebase. Adding one more piece to a chess ending (thus making an 8-piece tablebase) is considered intractable due to the added combinatorial complexity.<ref>{{cite web |title=Is there a freely available online 7 piece Endgame tablebase? |url=https://chess.stackexchange.com/questions/5253/is-there-a-freely-available-online-7-piece-endgame-tablebase |website=Stack Exchange}}</ref><ref>{{cite web |title=Lomonosov Endgame Tablebases |url=http://chessok.com/?page_id=27966 |website=ChessOK |archive-url= https://web.archive.org/web/20190406123519/http://chessok.com/?page_id=27966|archive-date=April 6, 2019}}</ref>
 
==Speeding up brute-force searches==
One way to speed up a brute-force algorithm is to reduce the search space, that is, the set of candidate solutions, by using [[heuristic]]s specific to the problem class. For example, in the [[eight queens puzzle|eight queens problem]] the challenge is to place eight queens on a standard [[chessboard]] so that no queen attacks any other. Since each queen can be placed in any of the 64 squares, in principle there are 64<sup>8</sup> = 281,474,976,710,656 possibilities to consider. However, because the queens are all alike, and that no two queens can be placed on the same square, the candidates are [[combinations|all possible ways of choosing]] of a set of 8 squares from the set all 64 squares; which means 64 choose 8 = 64!/(56!*8!) = 4,426,165,368 candidate solutions{{snd}}about 1/60,000 of the previous estimate. Further, no arrangement with two queens on the same row or the same column can be a solution. Therefore, we can further restrict the set of candidates to those arrangements.
As this example shows, a little bit of analysis will often lead to dramatic reductions in the number of candidate solutions, and may turn an intractable problem into a trivial one.
Line 38:
 
==Reordering the search space==
In applications that require only one solution, rather than all solutions, the [[expected value|expected]] running time of a brute force search will often depend on the order in which the candidates are tested. As a general rule, one should test the most promising candidates first. For example, when searching for a proper divisor of a random number ''n'', it is better to enumerate the candidate divisors in increasing order, from 2 to {{nowrap|''n'' − 1}}, than the other way around{{snd}}because the probability that ''n'' is divisible by ''c'' is 1/''c''. Moreover, the probability of a candidate being valid is often affected by the previous failed trials. For example, consider the problem of finding a '''1''' bit in a given 1000-bit string ''P''. In this case, the candidate solutions are the indices 1 to 1000, and a candidate ''c'' is valid if ''P''[''c''] = '''1'''. Now, suppose that the first bit of ''P'' is equally likely to be '''0''' or '''1''', but each bit thereafter is equal to the previous one with 90% probability. If the candidates are enumerated in increasing order, 1 to 1000, the number ''t'' of candidates examined before success will be about 6, on the average. On the other hand, if the candidates are enumerated in the order 1,11,21,31...991,2,12,22,32 etc., the expected value of ''t'' will be only a little more than 2.More generally, the search space should be enumerated in such a way that the next candidate is most likely to be valid, ''given that the previous trials were not''. So if the valid solutions are likely to be "clustered" in some sense, then each new candidate should be as far as possible from the previous ones, in that same sense. The converse holds, of course, if the solutions are likely to be spread out more uniformly than expected by chance.
 
==Alternatives to brute-force search==
There are many other search methods, or metaheuristics, which are designed to take advantage of various kinds of partial knowledge one may have about the solution. [[Heuristic]]s can also be used to make an early cutoff of parts of the search. One example of this is the [[minimax]] principle for searching game trees, that eliminates many subtrees at an early stage in the search. In certain fields, such as language parsing, techniques such as [[chart parsing]] can exploit constraints in the problem to reduce an exponential complexity problem into a polynomial complexity problem. In many cases, such as in [[Constraint Satisfaction Problem]]s, one can dramatically reduce the search space by means of [[Constraint propagation]], that is efficiently implemented in [[Constraint programming]] languages. The search space for problems can also be reduced by replacing the full problem with a simplified version. For example, in [[computer chess]], rather than computing the full [[minimax]] tree of all possible moves for the remainder of the game, a more limited tree of minimax possibilities is computed, with the tree being pruned at a certain number of moves, and the remainder of the tree being approximated by a [[evaluation function|static evaluation function]].
 
==In cryptography==
{{main|Brute-force attack}}
 
In [[cryptography]], a ''brute-force attack'' involves systematically checking all possible [[key (cryptography)|keys]] until the correct key is found.<ref>Mark Burnett, [http://www.cs.virginia.edu/~csadmin/gen_support/brute_force.php "Blocking Brute Force Attacks"] {{Webarchive|url=https://web.archive.org/web/20161203020306/http://www.cs.virginia.edu/~csadmin/gen_support/brute_force.php |date=2016-12-03 }}, ''UVA Computer Science'', 2007</ref> This [[strategy]] can in theory be used against any encrypted data<ref>{{cite book|url=http://www.crypto-textbook.com|page=7|title=Understanding Cryptography: A Textbook for Students and Practitioners|author1=Christof Paar |author2=Jan Pelzl |author3=Bart Preneel |publisher=Springer|year=2010|isbn=3-642-04100-0}}</ref> (except a [[one-time pad]]) by an attacker who is unable to take advantage of any weakness in an encryption system that would otherwise make his or her task easier.
 
The [[key length]] used in the encryption determines the practical feasibility of performing a brute force attack, with longer keys exponentially more difficult to crack than shorter ones. Brute force attacks can be made less effective by [[obfuscation|obfuscating]] the data to be encoded, something that makes it more difficult for an attacker to recognise when he has cracked the code. One of the measures of the strength of an encryption system is how long it would theoretically take an attacker to mount a successful brute force attack against it.
 
==References==
Line 54:
 
==See also==
 
{{Portal|Computer Science}}
* [[Sudoku solving algorithms#Sudoku brute force|A brute-force algorithm to solve Sudoku puzzles.]]
* [[Brute-force attack]]
* [[Big O notation]]
* [[Iteration#Computing]]
 
{{Algorithmic paradigms}}
==External links==
* [http://www.worldofhacker.com/2013/09/basic-idea-of-creating-password.html How a basic brute-force tool can be used to crack a command line application.]
 
{{DEFAULTSORT:Brute-Force Search}}
[[Category:Search algorithms]]
[[Category:Iteration in programming]]