Introduction
Techniques for searching arrays are normally taught during entry level programming classes after the student has had an introduction to language syntax, data types, and control structures (especially looping). While studying these techniques, students learn logical thinking skills and really learn to “think” like a computer.
There will be times when a programmer will need to search an entire array to look for specific data, whether it be integer values, floating point values, or character values. Problems requiring insertion into an array, deletion from an array, finding maximum and minimum values, and simply determining if a value is present require some method of searching.
In this article, two standard methods for searching arrays are introduced:
- linear search
- binary search
A simple problem to be solved by searching is first presented, and then examples of each search method using the C++ programming language is described.
Searching Mindset
Developing a search algorithm for an array requires a programmer to have knowledge of two skills: good sense of program logic and “loop” syntax.
An array is a bounded structure, which implies that it has a minimum storage location and a maximum storage location. Searching for an array element, normally called a “target”, will require the algorithm to start from either the minimum or maximum location and traverse to either the maximum or minimum location in pursuit of the “target”.
The logical (number of actual elements) size of the array should be known by the programmer. In most cases, it is best to declare the array size as a global constant. If we have knowledge of the beginning and ending locations of the array as well as how many elements the array has, a loop can be used as the search method.
The logic is fairly simple to comprehend once presented with the code. The programmer should attempt to mimic the “mind” of a computer when developing the search algorithm. This can be done by monitoring all variable values using pen and paper during each loop cycle. Tracing the execution paths of the program will help to fully understand the algorithm.
The Problem
Assume you are writing a program that has an array of integer values. These values represent the product identification numbers for the stock of a particular department within a store.
Your boss needs you to write code that will determine if a particular product is in stock by using the id number. This code will be embedded into a system used by sales clerks. When a customer wants to know if a product that is not on display in the store is in stock, this code will be used to determine if the product is available in stock.
Search Algorithms
For the example describe in the previous part, we’ll write the search function using the following two standard methods:
- linear search
- binary search
We’ll then which method is the most efficient for our needs. Let’s begin by examining a linear search routine.
Linear Search
When an object is said to linear, it implies that it is of, or relating to a line. From a programmer’s perspective, a linear search will sequentially evaluate each element in the array to see if it matches the target value, which is the element that needs to be found.
In regards to the search algorithm, it’s safe to think of the array as a line connected by many dots. The “dots” can be thought of as array rows. A linear search will, therefore, compare each dot (row) in the line with the “target”.
Linear searches normally begin by examining the value of array position one. When using the algorithm covered in this article, if this value matches the “target”, a value of 0 will be returned; otherwise, the search will move to array position 1 and repeat the process (return a value of 1 if the target is equal to the value in position 1; if not, move on to array position 2). Repeat this process until the target is found (return the array position where it was found) or the end of the array is reached without finding the target. If the target is not found, the function will simply return a value of -1 to indicate a failure.
The following is a C++ function definition that can be used as a linear search to handle our previous stock identification problem. The variable size represents the size of the array. For an example, if size equals a value of 100, then the array will store values in positions 0 through 99.
int linearSearch(int arr[], int size, int target)
{
int position = 0;
int result = -1;
while (result == -1 && position < size)
{
if (arr[position] == target)
{
result = position; // target position
}
position++;
}
return result;
}
When the above C++ function is called, one of two values will be returned: the target’s position in the array or a value of -1 indicating failure.
Since we’ve completed the implementation of the linear search, now let’s implement a binary search to figure out which method is most efficient for the product identification problem.
Binary Search
A requirement for a binary search is for the array to be previously sorted into ascending or descending order. Binary implies that an object consists of two components or parts. When performing binary array searches, the array being searched will be logically divided in half after each cycle. Only the half that can logically contain the target value will be searched next.
To further explain, binary searches begin by “checking” the middle element in the array. If this matches the target, the position of the element will be returned. If not, the algorithm will restrict the next search attempt to either the upper or lower half of the array being searched depending on where the “target” value falls in the sorted order of the array.
This is repeated until the target is found (can return array position where found) or every position has been logically checked with the target value not found, which indicates a failure. A failure is determined by checking for a “low” position that is greater than the “high” position. When the “low” position has exceeded the “high” position, the array has been ultimately “sealed” or searched completely.
The following C++ function definition can be used to search an array of integers previously sorted into ascending order. Like the linear search, the variable size represents the size of the array. For an example, if size equals a value of 100, then the array will store values in positions 0 through 99.
int binarySearch(int arr[], int size, int target)
{
int middlePosition;
int middleValue;
int result = -1;
int low = 0;
int high = size - 1;
while (result == -1 && low <= high)
{
middlePosition = (low + high) / 2;
middleValue = arr[middlePosition];
if (target == middleValue)
{
result = middlePosition;
}
else if (target < middleValue)
high = middlePosition - 1;
else
low = middlePosition + 1;
}
return result;
}
Similar to the linear search, when the above C++ function is called, one of two values will be returned: the target’s position in the array or a value of -1 indicating failure.
Search Efficiency
We have discussed two algorithms for searching arrays and have written code to handle both strategies. Our next step is to determine which method is more efficient for our needs.
linear search - for an array of size N, the worst possible case would need N comparisons (meaning all elements were evaluated) while searching the array. The average case is (n + 1)/2 such “attempts”.
binary search - for an array of size N, there could be at most (log base2 N) comparisons.
Consider our product identification problem. Assume we need to track 32,768 products. In such a case, we would need to declare the array to be of size 32768.
Using a linear search to find a particular product id, the average case would need
16384.5 (32768/2) comparisons. Using a binary search, we could find the product id in at most 15 (log base2 32768) comparisons. Comparing 16384.5 with 15, we can easily see a huge difference in program efficiency.
The binary search is extremely efficient compared to the linear search. The fact remains that when dealing with “small” sized arrays, there won’t be much difference in program efficiency between the two methods. However, if the array size is “huge”, the most efficient method is the binary search.
Although a little more difficult to first implement and understand, the binary method is far more beneficial than the linear method for searching arrays, especially in cases where the array to be searched is already sorted in a particular order.
Program Solution to the Problem
The following complete C++ program illustrates both the linear and binary methods for solving our product identification problem mentioned in Part 6.7.1 of this article:
Now that we have covered methods for searching arrays, we can talk about sorting array elements.
Read on for more…
Next: Shuffling Array Elements
{ 1 } Trackback
[...] An Introduction to Programming with C++ : Searching an Array [...]
Post a Comment