[Objective-C] Fast String Binary Search

Today a little tutorial to find a word in a big ordered text file using binary search in objective-c.

What is binary search? (Wikipedia -> http://en.wikipedia.org/wiki/Binary_search_algorithm)

A binary search or half-interval search algorithm locates the position of an item in a sorted array. Binary search works by comparing an input value to the middle element of the array. The comparison determines whether the element equals the input, less than the input or greater. When the element being compared to equals the input the search stops and typically returns the position of the element. If the element is not equal to the input then a comparison is made to determine whether the input is less than or greater than the element. Depending on which it is the algorithm then starts over but only searching the top or bottom subset of the array’s elements. If the input is not located within the array the algorithm will usually output a unique value indicating this. Binary search algorithms typically halve the number of items to check with each successive iteration, thus locating the given item (or determining its absence) in logarithmic time. A binary search is a dichotomic divide and conquer search algorithm.


How to do it in Objective-C language?

First of all you need a text file containing a word on each line like this snippet:

[…]
vuotato
vuotava
vuotavamo
vuotavano
vuotavate
[…]

You need to read all file and save each line in an array.

Snippet in Objecitve-C using C stdlib

[code lang=”java” autolinks=”false” collapse=”false” firstline=”1″ gutter=”true” htmlscript=”false” light=”false” padlinenumbers=”false” smarttabs=”true” tabsize=”4″ toolbar=”false”]#include <stdlib .h>
char buffer[512];
NSMutableArray *mySortedArray = [NSMutableArray array];
int charsRead = 512;
do {
  if(fscanf(file, "%s\n", buffer, &charsRead) == 1)
    [mySortedArray addObject:[NSString stringWithFormat:@"%s", buffer]];
  else
    break;
} while(charsRead == 512);
fclose(file);[/code]

Once you’ve getted all lines, you can use

[code lang=”java” autolinks=”false” collapse=”false” firstline=”1″ gutter=”true” htmlscript=”false” light=”false” padlinenumbers=”false” smarttabs=”true” tabsize=”4″ toolbar=”false”]CFArrayBSearchValues[/code]

to make a binary search in your array.
Very easy yo use.

You don’t understand this snippet?
Here the full version of binary search in objective-c.

Rif: albertopasca.it

[code lang=”java” autolinks=”false” collapse=”false” firstline=”1″ gutter=”true” htmlscript=”false” light=”false” padlinenumbers=”false” smarttabs=”true” tabsize=”4″ toolbar=”false”]- (void) BinarySearch {
//
// TIME INTERVAL
//
NSDate *STARTING = [NSDate date];

//
// OPEN ORDERED TEXT FILE
//
NSString *fileName = [[NSBundle mainBundle] pathForResource:@"test"
ofType:@"txt"];
const char *fn = [fileName UTF8String];
FILE *file = fopen(fn, "r");

//
// ADD FILE LINES TO NSMUTABLE ARRAY
//
char buffer[512];
NSMutableArray *mySortedArray = [NSMutableArray array];
int charsRead = 512;
do {
if(fscanf(file, "%s\n", buffer, &amp;charsRead) == 1)
[mySortedArray addObject:[NSString stringWithFormat:@"%s", buffer]];
else
break;
} while(charsRead == 512);
fclose(file);

//
// SEARCH STRING
//
NSString *myString = @"civettasti";

CFRange arrayRange;
arrayRange.location = 0;
arrayRange.length = [mySortedArray count];
NSLog(@"FILE LINES: %d", arrayRange.length);

//
// BINARY SEARCH
//
CFIndex stringIndex = CFArrayBSearchValues(
mySortedArray, arrayRange,
myString, CFStringCompare, kCFCompareForcedOrdering);

//
// RESULT
//
if ((stringIndex &lt; 0) || (stringIndex &gt;= [mySortedArray count])) {
NSLog(@"SOMETHING WRONG");
} else if ([myString
isEqualToString:[mySortedArray objectAtIndex:stringIndex]] == NO) {
NSLog(@"NOT FOUND");
} else {
NSLog(@"FOUNDED AT INDEX: %d", stringIndex);;
}

//
// TOTAL TIME
//
NSLog(@"SEARCH TIME: %f", [[NSDate date] timeIntervalSinceDate:STARTING]);
}[/code]

This sample code read a txt file named “test.txt” that contains 60452 words.
It open a file, add to array and search a word in this big list in 0.090847 seconds!
Very fast!!!

OUTPUT:

2011-03-11 13:17:22.924 [3700:207] FILE LINES: 60452
2011-03-11 13:17:22.925 [3700:207] FOUNDED AT INDEX: 10129
2011-03-11 13:17:22.928 [3700:207] SEARCH TIME: 0.090847

enjoy searching!!!

Rif: albertopasca.it

Alberto Pasca