Question
Bucket sort is mainly useful when input is uniformly distributed over a range. For example:
Sort a large set of floating point numbers which are in range from 0.0 to 1.0 and are uniformly distributed across the range. How do we sort the numbers efficiently.
Solution
Create n empty buckets (Or lists).
Insert each arr[i] into bucket[n*array[i]]
Sort individual buckets using insertion sort.
Concatenate all sorted buckets.
Steps 1 and 2 clearly take O(n) time. Step 4 also takes O(n) time.
The main step to analyze is step 3. This step also takes O(n) time on average if all numbers are uniformly distributed. Final time complexity: O(n).
Relationship with other sorting algorithms
There’s a algorithm within the bucket. Now if we use bucket sort itself as the sorting function, this becomes a radix sort.
If we set the bucket size as 2, then this becomes a quick sort (with potentially poor pivot choices).
Code
C++ code from G4G
void bucketSort(float arr[], int n) {
// 1) Create n empty buckets
vector<float> b[n];
// 2) Put array elements in different buckets
for (int i=0; i<n; i++)
{
int bi = n*arr[i]; // Index in bucket
b[bi].push_back(arr[i]);
}
// 3) Sort individual buckets
for (int i=0; i<n; i++)
sort(b[i].begin(), b[i].end());
// 4) Concatenate all buckets into arr[]
int index = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < b[i].size(); j++)
arr[index++] = b[i][j];
}