If I use the sort method on an array, I know the elements will be rearranged — alphabetically, if the elements are strings.
I have a solid understanding of how the sort method works, and I can see its result is not random at all, which is why people use the following code to shuffle an array:
arr.sort(() => 0.5 - Math.random());
With the above code the arr will be rearranged randomly and problem solved — but what’s the logic here? I’d like to know what’s going on and not just use code blindly, and I don’t understand this part: 0.5 - Math.random()
Math.random() returns a number from 0 (inclusive) to 1 (exclusive), but why subtract it from 0.5?
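To make the trick concrete: sort decides order by the sign of whatever the comparator returns (negative means a before b, positive means b before a, zero means leave them). Since 0.5 - Math.random() ignores a and b and lands uniformly in (-0.5, 0.5], each comparison is effectively a coin flip. A quick sketch of that idea:

```javascript
// 0.5 - Math.random() never looks at the two elements being compared;
// it just returns a value uniformly distributed in (-0.5, 0.5],
// so roughly half the comparisons say "a first" and half say "b first".
const samples = Array.from({ length: 1000 }, () => 0.5 - Math.random())

// Every sample falls in (-0.5, 0.5]
console.log(samples.every((v) => v > -0.5 && v <= 0.5)) // true

// Roughly half are negative (a before b), half positive (b before a)
const negatives = samples.filter((v) => v < 0).length
console.log(negatives) // somewhere near 500
```

Note that a comparator that answers inconsistently violates sort's contract, which is one reason the result isn't a trustworthy shuffle.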
I was curious how inefficient the Math.random approach was so I wrote a pen. (Ignoring for the moment a debate about the mathematics of randomness.)
const numItems = 10000
const numPasses = 10
const results = { fy: [], rand: [] }
const arrProto = []

// Fisher–Yates shuffle; counts iterations so we can compare work done
const shuffle = (array) => {
  let currentIndex = array.length, temporaryValue, randomIndex
  let count = 0
  while (0 !== currentIndex) {
    count++
    // Pick one of the remaining elements and swap it into place
    randomIndex = Math.floor(Math.random() * currentIndex)
    currentIndex -= 1
    temporaryValue = array[currentIndex]
    array[currentIndex] = array[randomIndex]
    array[randomIndex] = temporaryValue
  }
  return { array, count }
}

for (let i = 1; i <= numItems; i++) {
  arrProto.push(Math.random())
}

/*************/

for (let i = 0; i < numPasses; i++) {
  let count
  let arr1 = [...arrProto]
  let arr2 = [...arrProto]
  count = 0
  // sort mutates arr1 in place; count how often the comparator runs
  arr1.sort((a, b) => {
    count++
    return 0.5 - Math.random()
  })
  results.rand.push(count)
  results.fy.push(shuffle(arr2).count)
}

// yes, it's silly to take avgFy, it will always be the same as the number of elements
const avgFy = results.fy.reduce((a, c) => a + c) / results.fy.length
const avgRand = results.rand.reduce((a, c) => a + c) / results.rand.length

console.log('\n\n\nAverage iterations for FY =', avgFy)
console.log('Average iterations for rand =', avgRand)
console.log('\nWith', numItems, 'elements and', numPasses, 'trials,')
console.log('rand took', (avgRand / avgFy), 'times longer than fy.')
It’s interesting that the gap widens as the data set grows (that’s what she said). For small data sets it might not matter, but it certainly would for large ones. For 10M elements, rand was 20X slower. (That last test took quite a while.)
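Beyond speed, there’s the quality question the pen set aside: the random comparator doesn’t produce all orderings equally often, while Fisher–Yates does. A rough sketch of how you might see that yourself (helper names here are my own, not from the pen above) — shuffle [1, 2, 3] many times with each method and tally the six permutations:

```javascript
// Shuffle [1, 2, 3] repeatedly and count each resulting permutation.
// Fisher–Yates should come out close to uniform (~1/6 each); the
// random-comparator sort typically does not, because the sort algorithm
// doesn't give every pair the same number of coin flips.
const tally = (shuffleFn, trials = 60000) => {
  const counts = {}
  for (let i = 0; i < trials; i++) {
    const key = shuffleFn([1, 2, 3]).join('')
    counts[key] = (counts[key] || 0) + 1
  }
  return counts
}

const fisherYates = (array) => {
  for (let i = array.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1))
    ;[array[i], array[j]] = [array[j], array[i]]
  }
  return array
}

const randComparator = (array) => array.sort(() => 0.5 - Math.random())

console.log('Fisher–Yates:      ', tally(fisherYates))
console.log('Random comparator: ', tally(randComparator))
```

The exact skew from the comparator version depends on the engine’s sort implementation, so treat the numbers as illustrative rather than definitive.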