During spring break of 2015 at BCIT, I went and redid one of my assignments a few times in a few different languages: C++, Haskell, JavaScript, PowerShell, and Python. C++ was the course I was studying and I had done a fair bit of python before. JavaScript I’d touched a couple times, but Haskell and PowerShell were completely unfamiliar to me.
The assignment I chose was a small lab assignment to calculate which set of 7 numbers had the most permutations that were prime numbers.
Considering 3-digit numbers, as a simpler example, gives a maximum of 4 permutations. There are 3 different solutions tied with 4 permutations.
- 149, 419, 491, 941
- 179, 197, 719, 971
- 379, 397, 739, 937
3-digit numbers are easy to work with, but when you get up to 7-digit numbers it becomes more important how you choose to solve it. I really enjoyed solving this one. So much so that I resolved it in a few different languages. The code is the purpose of this article so I’m about to give away how to solve it. Stop now if you want to do it yourself.
Solutions
The general solution I used is
- Generate all the prime numbers in the range you need: e.g. [1000000:9999999]
- Write a function that will take a number in that same range, and turn it into a key based on the digits it contains, so that 1045, 1405, 5041, and 4501 all give the same key
- Group all the prime numbers that have the same key, or just keep the largest group if you don’t care about ties.
- Report the largest group(s).
One other note: Do not take the results below as a benchmark for a language speed comparison. I ran different code in each of the languages, some of which I had no knowledge of prior to starting this project and are not necessarily optimal. Haskell was run through a virtual machine, C++ through Cygwin, JavaScript through chromium.
C++
In C++ the code was quite straightforward. I had been writing in C++ recently and was able to translate the algorithm into code in less than an hour.
I was able to take advantage of the bitset type to both shrink my volume of data and probably make it faster to manipulate. Since strings are mutable in C++, most people in my class converted the number to a string and then ran quicksort on it. I felt like doing something different so I wrote a function that would generate a key using only numeric transformations. My solution would work up to 9-digit numbers, although it could be generalized for more.
- #include <iostream>
- #include <algorithm>
- #include <bitset>
- #include <vector>
- #include <map>
- using namespace std;
- #define SIZE 7
- #define LOW 1000000
- #define N 9999999
- //isprime[n] is 1 if n is prime, 0 if n is not prime
- bitset<N> isprime;
- int reorder(int n);
- void loadPrimes();
- int main() {
- map<int, vector<int>> results;
- //generate primes
- loadPrimes();
- //for all numbers LOW to HIGH
- for (size_t i=LOW; i<N; i++)
- if (isprime[i])
- results[reorder(i)].push_back(i);
- //find the max prime combinations
- pair<int, vector<int>> max{0, {}};
- for (auto& x : results) {
- if (x.second.size() > max.second.size())
- max = x;
- }
- cout << "Highest: " << endl
- << max.first << ": " << max.second.size() << endl;
- }
- //generate all prime numbers in range
- void loadPrimes() {
- isprime.set();
- for (size_t i=2; i < N; i++)
- if (isprime[i])
- for(size_t j = 2; j * i < N; j++)
- isprime[j*i]=0;
- }
- //generate a key for the given number
- int reorder(int n) {
- int result = 0;
- static int v[SIZE];
- for(int i = 0; i < SIZE; ++i) {
- v[i] = n % 10;
- n /= 10;
- }
- sort(v, v + SIZE);
- for (auto x : v) {
- result *= 10;
- result += x;
- }
- while (result < LOW)
- result *= 10;
- return result;
- }
Running the code in c++ took about 2 seconds on my laptop for a 7-digit number. No other language that followed came close to meeting that speed.
Haskell
Haskell was an interesting language to investigate. This was my first time installing it and coding in it. It’s a functional language, so it feels like all I’m doing is setting up the relationships between variables, and then, only at the very end, I’m requesting a value to print. Once that value is requested, Haskell works backwards to resolve all the dependencies and calculate the necessary intermediate values to give me what I wanted to print out.
Another thing that I found really interesting was that all the functions were pure. There were no global variables in the background being manipulated, nor any object state. It was pure functions with no side effects. It practically eliminates runtime errors, and it feels good too.
- module lab6Haskell
- where
- upperBound = 99999
- lowerBound = 10000
- myPrimes = filter (>=lowerBound) (primes upperBound)
- grouped = group myPrimes
- -- This will return an example of the largest size of prime permutation group
- mostCommon = foldr (myMax) (0, []) grouped
- main = putStrLn (show mostCommon)
- -- for testing, compile with profiling:
- -- ghc --make Main.hs -prof -fprof-auto -o Main.exe
- -- main.exe +RTS -p
- -- prime generator
- primes n = sieve [2..n]
- where
- sieve [] = []
- sieve (p:xs) = p : sieve (filter (\x -> mod x p /= 0) xs)
- -- provided quicksort
- quicksort :: (Ord a) => [a] -> [a]
- quicksort [] = []
- quicksort (x:xs) =
- let smallerSorted = quicksort [a | a <- xs, a <= x]
- biggerSorted = quicksort [a | a <- xs, a > x]
- in smallerSorted ++ [x] ++ biggerSorted
- -- key generator
- key n = quicksort (show n)
- -- grouping digits (This one is the time killer.)
- group [] = []
- group (x:xs) =
- let matches = [e | e <- xs, key x == key e]
- others = group ( [e | e <- xs, key x /= key e] )
- in (length matches + 1, x : matches) : others
- -- sorting criteria
- myMax a b =
- if fst a > fst b
- then a
- else b
Another cool thing about Haskell is that it comes with a fancy profiler. I ran it on my program and it gave me this as the result:
All in all it was a really insightful exploration into another programming language.
JavaScript
JavaScript is written to do the same thing as C++, but instead of using a bitset, it generates a list of primes. That means each prime is a Number and not a single bit. What JavaScript does bring to the table to is integration with HTML, and the ability to easily make the code visually interactive, with a GUI.
- //This is meant to find the largest cluster
- // of 7 digit primes that are all permutations
- // of each other.
- function getPrimes(upperBound, lowerBound) {
- isPrime = []
- isPrime.length = upperBound
- for (i=0; i < upperBound; ++i) {
- isPrime[i] = true;
- }
- limit = Math.ceil(Math.sqrt(10));
- for (i = 2; i < isPrime.length; ++i) {
- if (isPrime[i] == true) {
- for (j = i*i; j < upperBound; j += i) {
- isPrime[j] = false;
- }
- }
- }
- primes = []
- for (i = 2; i < isPrime.length; ++i) {
- if (isPrime[i] == true && i > lowerBound) {
- primes.push(i);
- }
- }
- return primes
- }
- function getKey(n) {
- key = 0
- while (n > 0) {
- key += Math.pow(10, (n%10))
- n = Math.floor(n / 10)
- }
- return key
- }
- function getPrimePermutations() {
- primes = getPrimes(document.getElementById("upperBound").value, document.getElementById("lowerBound").value);
- // group primes by permutations
- matches = {}
- for (i in primes) {
- if (typeof matches[getKey(primes[i])] === "undefined") {
- matches[getKey(primes[i])] = [primes[i]];
- } else {
- matches[getKey(primes[i])].push(primes[i]);
- }
- }
- // find the highest count of prime permutations
- // and all digits that match that count
- max = [[]]
- for (i in matches) {
- if (matches[i].length > max[0].length) {
- max = [matches[i]]
- } else if (matches[i].length == max[0].length) {
- max.push(matches[i])
- }
- }
- document.getElementById("result").innerHTML="<p>Max length: " + max[0].length + " primes</p><p>Number of matches: " + max.length + "</p>";
- result = "<ul>";
- for (i in max) {
- result += "<li>" + max[i] + "</li>";
- }
- result += "</ul>"
- document.getElementById("values").innerHTML=result;
- }
It takes several seconds to calculate the answer for 7 digits, considerably slower than the C++ version, but not as much slower as I expected. I was pleasantly surprised!
PowerShell
It took a little bit to wrap my head around PowerShell being built for piping commands into commands, but it is actually really cool, and powerful. I had a working product in PowerShell faster than in any other language.
- Function GetKey ($n)
- {
- $key = 0
- while ($n -gt 0)
- {
- $key += [Math]::Pow(10, $n % 10)
- $n = ([Math]::Floor($n / 10))
- }
- return $key
- }
- Function GetFilteredPrimes
- {
- Param ([int]$upper, [int]$lower)
- $primes = (0..($upper))
- $result = @()
- for ($i = 2; $i -lt $upper; ++$i)
- {
- if ($primes[$i] -ne 0)
- {
- if ($primes[$i] -ge $lower)
- {
- $result += $i
- }
- for ($j = $i*$i; $j -le $upper; $j += $i)
- {
- $primes[$j] = 0
- }
- }
- }
- return $result
- }
- GetFilteredPrimes 999999 100000 | Group-Object {getkey($_)} | Sort-Object Count | Select -Last 1
I was not patient enough to see how long it would take to calculate the 7-digit number, but 6 digits took about 4 minutes. Getting the primes was the most expensive part at 3 minutes, and grouping them took a minute more. I read a little about how to profile and optimize, but did not get far. I read that using things like [Math]::Pow() load up C# libraries, but avoiding C# did not appear to make a difference to computation times. Interesting experiment, and PowerShell is kind of fun, but I don’t have need for its strengths right now.
Python
Python is a favorite of mine, so naturally I wrote up the program here too.
- import math
- def main():
- primes = getPrimes(10000000, 1000000)
- matches = {}
- # group primes by permutations
- key = 0
- for prime in primes:
- key = getKey(prime)
- if key in matches:
- matches[key].append(prime)
- else:
- matches[key] = [prime]
- # find the highest count of prime permutations
- # and all digits that match that count
- max = [0, []]
- for match in matches.items():
- if len(match[1]) > max[0]:
- max[0] = len(match[1])
- max[1] = [match]
- elif len(match[1]) == max[0]:
- max[1].append(match)
- # report the answer, and all prime permutations
- print("max: %d permutations. (%d matches)" % (max[0], len(max[1])))
- for i in max[1]:
- print("%d:" % i[0], i[1])
- def getKey(n):
- key = 0
- while n > 0:
- key += 10 ** (n%10)
- n //= 10
- return key;
- def getPrimes(maxVal, minVal=0):
- primes = list(range(maxVal))
- limit = int(math.ceil(maxVal ** 0.5) + 1)
- for i in primes[2:limit]:
- if i != 0:
- for j in range(i*i, maxVal, i):
- primes[j] = 0
- return [i for i in primes if i != 0 and i > minVal]
- if __name__ == "__main__":
- main()
Python 2.7.6 and 3.3.3 both took about 10 seconds to calculate the result. I confess I'm a little upset that it's not running faster than the javascript edition. I like python, and may have to fiddle with this to improve it.