Saturday, September 27, 2014

4 - Palindrome Product, Heads or Tails?

Important News:

Project Euler is back running at capacity.  You can now go ahead and make an account and save progress.

Problem

A palindromic number reads the same both ways.  The largest palindrome made from the product of two 2-digit numbers is \(9009 = 91 \times 99\).

Find the largest palindrome made from the product of two 3-digit numbers.

Background


Palindromes are not unique to math, in fact there is little inherently mathematical about them.  They existed first as a linguistic construct.  A coincedence in the arrangement of letters such that they read the same one way as the other way.  Some linguistic palindromes are "racecar," "rotor,"  "star rats," the name "Mike Kim," "A man, a plan, a canal, Panama," which coincides with the centenial aniversary of the Panama Canal's opening on August 15th,  and my personal favorite "Go hang a salami, I'm a lasagna hog!"

The same concept can be applied to numbers.  \(12321\) would be an example of a Palindromic number because if you reverse the digits then it would still have the same value.  To a computer though, there isn't much a difference between a number and a character, especially when determining if it is a palindrome.  A long time ago nearly everything on a computer was written in plain text, using ASCI character designations.  Where each character took up one byte of storage (a string of 8 binary digits)
ASCII Table
The characters in this table are represented by hexadecimal notation, which is just an easy way to represent base-2 numbers. Base-16 is like base-10, only it uses 6 extra digits to represent number, the digits are represented by the letters A-F, where A is 10, B is 11, and so on to where F is 15.  The letter 'A' in hexadecimal is represented as 0x41, which is 65 in decimal \(65 = 4 \times 16 + 1\), and in binary it is 0b01000001.  People who work with binary a lot like to use hexadecimal, base-16, or octal, base-8, because each digit will corespond to 4 or 3 digits of binary, respectively.  Binary is not dense notationally, and needing to write binary all the time would increase the chance of making a mistake, which would take time to correct.

A computer stores text as a list of characters, and a word, or even whole sentances can be written as just a list of the characters.  "racecar" would look like [0x72 , 0x61, 0x63, 0x65, 0x63, 0x61, 0x72] in memory, and "12321" would look like [0x31, 0x32, 0x33, 0x32, 0x31].  Notice that both palindromes look pretty similar, they are just a list of numbers.  So it really matters more about how the characters are structured than any property of the word or number itself.

If this problem concerned letters our job would be slightly more complicated.  We would have spaces, capital letters, and maybe even punctuation to worry about.  None of which effects what we would deem a palindrome.  Thankfully we are working strictly with number, which have no spaces or capitals, so we can look exclusively at the order.

The Math

In the context of this problem it is a little more useful to regard the palindromes as words, using characters instead of numbers.  That is not to say that we cannot use math to describe palindromic numbers.  And some interesting patterns arise after looking at a few candid examples.

\(0, 1, 2, 3, 4, 5, 6, 7, 8, 9\)  These numbers are all trivial palindromes.  They are single digits, and only have one digit to repeat.

\(11, 22, 33, 44, 55, 66, 77, 88, 99\) are two digit palindromic numbers.  Each is divisable by eleven.  And 11 has some more interesting palindromic properties.

The powers of 11 are palindromes up to 11^4
$$11^0 = 1 \\
11^1 = 11 \\
11^2 = 121 \\
11^3 = 1331 \\
11^4 = 14641 \\
11^5 = 161051 $$

11^5 is a combo breaker,  and arises because of carrying extra digits.

Likewise repeating strings of ones give an interesting result
$$1 \times 1 = 1 \\
11 \times 11 = 121 \\
111 \times 111 = 12321 \\
\vdots \\
111111111 \times 111111111 = 12345678987654321 $$

And I propose that every palindromic number with an even number of digits is divisible by 11.  We can throw the odd digit numbers out right away, 101 is a prime number, so by definition it is only divisible by itself and 1.  For two digit numbers this is pretty simple to see, we only need to look at the first 9 multiples of 11.  But I want to prove it for palindromes of 4 digits and 6 digits and 100 digits.  I'm going to do this proof, wrongly at first, by a method we saw in the last blog post, it is a form of proof by contradiction, it is the proof by least counter example.

Let us assume we have a number \(abc \ldots cba\)  this can be rearranged into
$$ a(10^k + 10^0) + b(10^{k-1} + 10^1) + c(10^{k-2} + 10^2) + \ldots + n(10^{k-n} + k^n)$$ where \((k-n) - n = 1\) or \(k = 2n + 1\)  put more clearly, \(k\) is odd.

To solve this I'm going to look at a more general problem, and use that proof in this one.  I propose that any number in the form of \(10^k + 10^j\) where \(k+j\) is odd is divisible by 11.  So first we need to constrain this equation a little bit.  \(k\) and \(j\) are both integers greater than or equal to \(0\).  \(k\) is strictly greater than \(j\) and as stated before, \(k + j\) is odd, or put another way \(k+j = 2m + 1\). Where \(m\) is a nonnegative integer.  Let's assume that the number \(10^k + 10^j\) is the lowest number fitting our constraints that 11 does not divide into.  We know that \(10^1 + 10^0 = 11\) is divisible by 11, so there is at least one instance where this identity holds.

$$
(10^k + 10^j)  = 10^j (10^{k-j} + 10^0)
$$

But in the expression that we get, k-j must be an odd number because we have already asserted k+j to be odd.  We are subtracting \(2j\) from \(k+j\) and adding or subtracting an even number from another number does not change its parity.  Parity is simply a term for a number being odd or even, and there is plenty of opportunity to talk about that in another post.  Anyways, the number in the parentheses also follows the form \(10^k + 10^j\) in this case k is replaced by k-j and j is replaced by 0.  This number is smaller than our original expression, because we factored out 10^j, but we are assuming that \(10^k + 10^j\) is the smallest number that can be expressed like that without being divisible by 11, so \(10^{k-j} + 10^0\) must be divisible by 11.  This means \(10^k + 10^j\) is also divisible by 11, because one of it's factors is.  This creates a contradiction, in our initial condition, that \(10^k + 10^j\) is not divisible by 11, so the premise is incorrect, there is not smallest number such that \(10^k + 10^j\) is not divisible by 11.

I've already told you that this proof is wrong.  And you can stop reading here if you want to figure the answer out by yourself, because I'm just going to give it away in this next sentence.  \(10^{k-j} + 10^0\) is not necessarily smaller than \(10^k + 10^j\)  because j can equal 0.  We could make the proof more correct by stating that j must be greater than 0, but then we would have to prove a new base case, and we would have to factor our \(10^{j-1}\) from our number.  And then we would have to create a corollary extending the proof to be more inclusive by saying we could just divide out a 10, and none of this would be very elegant.

There was a very famous and prolific mathematician named Paul Erdos, who was so prolific that many mathematicians in academics have an Erdos number, basically their degree of separation to Paul Erdos in research papers collaborations.  Think of the system like 7 degrees of Kevin Bacon.  Erdos himself was 0, a collaborator of his on any of his papers would have an Erdos number of 1, any people who collaborated with Erdos's collaborators, but not Erdos himself have an Erdos number of 2, and it goes on. Anyways, Erdos believed that a proof should be elegant. You could prove for example that \(a^2 + b^2 = c^2\) for right triangles in a crazy mathematically rigorous sort of way in a paper that is over 100 pages long.  Or you could show that due to similar triangles the ratios of the sides demand that \(a^2 + b^2 = c^2\) Erdos believed that even if two proofs demonstrated the validity, or invalidity, of a mathematical statement, the more elegant one was always more correct.

My more correct proof involves modular arithmetic sometimes called clock arithmetic.  The modulus operator we have already discussed.  It basically gives the remainder in a division problem.  A clock is modular 12 because even though the time keeps on advancing, we express the time as one of 12 hours, at least in the United States we do.  The modulus is excellent to describe things that are cyclical, and it is easier to work with numbers that follow cyclical patters, because the range that needs to be proved is reduced.

As it turns out, 10^n follows a cyclical and predictable pattern.  And instead of modulo 12 as in our clock example, we will use modulo 11.  Firstly, lets look at at few examples.
$$
10^0 = 1 \equiv 1 (mod \ 11) \\
10^1 = 10 \equiv 10 (mod  \ 11) \\
10^0 = 100 \equiv 1 (mod \ 11) \\
10^0 1000 \equiv 10 (mod \ 11) \\
$$

In modular arithmetic sometimes nice to express numbers in the negative.  It is perfectly okay to do this.  I like to think of the modulus less of a function and more of a way to say that some numbers behave the same if you look at them in a certain way.  So 10 (mod 11) is equivalent to -1 (mod 11)  You can think of it like a clock.   Going from 12, or 0, backwards an hour brings to to 11, or -1.  The hand would still be in the same place in either case, so they can be considered equivalent.

A pattern seems to have emerged, alternating between 1 and 10 in modulo 11, or 1 and -1 put another way.  Let's show that this is always the case.

$$
10^0 \equiv 1 (mod \ 11)\\
10^1 = 10 \times 10^0 \equiv 10 \times 1 (mod \ 11) \equiv 10 (mod \ 11) \equiv -1 (mod \ 11)\\
10^2 = 10 \times 10^1 \equiv 10 \times -1 (mod \ 11) \equiv -10 (mod \ 11) \equiv 1 (mod \ 11)\\
$$

Basically, any time a power of ten is equivalent to \(1 (mod \ 11)\) then multiplying it by 10 again will change it to \(-1 (mod \ 11)\) and vice versa.  The equivalent sign \(\equiv\) is like a equal sign, in that algebraic operations can be performed across it.  It just says that the terms on either side are not exactly equal, only that the first term can be represented in a logically consistent way with the second term.   So, we can say in one simple equation \(10^n \equiv (-1)^n (mod \ 11)\)

As a side note it can also be shown  \(\forall x: x^n \equiv (-1)^n (mod x+1)\).

When we add two powers of 10 to exponents of differing parity then one power will be equal to \(-1 (mod \ 11)\) while the other is equal to \(1 (mod \ 11)\).  The sum of these two numbers would be \(0 (mod \ 11)\) meaning the number is divisible by 11.

The Code:

That was a pretty lengthy post, and a bit longer than I was expecting.  And in typical fashion of this blog, I am not using what I just talked about in my code.  There isn't a point to knowing that 6 digit long palindrome numbers are all divisible by 11.  I mean, it's cool, but I wouldn't call it vital.   But just so what I wrote about wasn't a complete waste, I'm going to see if I can use that information to my advantage to narrow down the search by a factor of 11.

The design philosophy for my code takes a few differing paths to arrive at the same answer.  One is more top down, the other bottom up.  In my python code I am iterating through multiplying all 3 digit numbers together, and checking to see if the answer is a palindrome.  For my C code I am doing the opposite, generating all of the possible palindromes, and finding out if it has two 3 digit factors.  The burning question is which algorithm is more efficient.

You might think that because the python code is more brute-force like it takes more time to run, but there are some interesting optimizations that can be done.  For instance, instead of counting up, we can count down.  That helps since the largest palindrome product is more likely to be at the higher range of numbers.

Python:


largestPrime = 2

def isPalindrome(num):
    return str(num) == str(num)[::-1]

longpal = 1

for i in range(1000):
    for j in range(1000):
        if isPalindrome(i*j) and i*j &gt longpal:
            longpal = i*j

    
    
This code is a really simple brute force code.  My method of checking whether or not a number is a palindrome is a little wonky.  I first turn it into a string, then I use python's advanced string indexing to select all of the string, and the -1 indicates that I want it indexed backwards.  A little convoluted, but it works.

A few ways that I can improve the code is to tweak the range, we only want to check 3 digit numbers.  But this alone doesn't do too much for us.  We can also count down from j = 999, and all numbers after that are guaranteed to be lower, so we can just break the loop.  We also run into the issue where we are multiplying most numbers twice.  When i goes from 999 to 998, j will still start at 999, even though that was already calculated that multiple.  So we can start at i instead of 999 each time.  And for that matter, if i*i is less then the largest palindrome then we might as well stop there, because j is going to be less than i, so any palindrome found will be smaller and rejected anyways.  Since we already know the answer past that point, there is no point in checking it.  So here is the revised code, basically the same, but a few more breaks put in.


largestPrime = 2

def isPalindrome(num):
    return str(num) == str(num)[::-1]

longpal = 1

for i in range(999, 99, -1):
    if i*i &lt longpal:
        break
    for j in range(i, 99, -1):
        if i*j &lt longpal:
            break
        if isPalindrome(i*j) and i*j &gt longpal:
            longpal = i*j
            break
    

I timed out both programs and the difference is pretty apparent.  The first, brute force search took way longer than the second one.  It took 2.36 seconds compared to the optimized .0210 seconds. Two whole orders of magnitude out of just going backwards and knowing what stuff you don't need to check.

And I did manage to make the code a bit more optimized.  You know how I said I wouldn't be using that all even digit palindromes are multiples of 11 business. I may not have been telling the whole truth.  You can put it in the j loop when defining it by "for j in range(i - i%11, 99, -11)"  This will initialize j at the first multiple of 11 below i, and step j down in increments of 11.  So at least one multiple will always be 11.  Using this technique I managed to whittle the time down by another order of magnitude, to 2.00 ms.

Edit:  I realized that this technique only works coincidentally.  To implement it right there should be two j loops, one for when i is a multiple of 11 where j steps down by 1, and when i isn't divisible by 11, then j can be decremented by 11.

C:


#include <stdio.h>
#include <stdio.h>
#include <math.h>

main()
{
    int palindromes[1800];
    
    /*Populate palindromes*/
    /*6-digit numbers*/
    char i, j, k;
    int pos = 0;
    int p = 0;
    for (i = 9; i>=1; i--){
        for (j = 9; j >=0; j--){
            for (k = 9; k>=0; k--){
                p = i*(100001) + j*(10010) + k*(1100);   /*Middle digits*/
                palindromes[pos] = p; /*pos also = 100*(9-i)+10*(9-j)+(9-k) */
                pos++;
                p = 0;
            }
        }
    }
    /*5-digit numbers, for the lulz*/
    for (i = 9; i&g;t=1; i--){
        for (j = 9; j>=0; j--){
            for (k = 9; k>=0; k--){
                p = i*(10001) + j*(10010) + k*(100);
                palindromes[pos] = p; /*pos also = 100*(9-i)+10*(9-j)+(9-k) + 900*/
                pos++;
                p = 0;
            }
        }
    }

    /*Iterate through palindromes*/
    int axis;
    int a, b;
    for (i = 0; i<1800; ++i){
        axis = floor(sqrt(palindromes[i]));

        for (a = axis; a>=100; a--){
            for (b = axis; b<1000; b++){
                if (a * b == palindromes[i]){
                    break;
                }
                else if (a*b >  palindromes[i]){
                    break;
                    /*If the product is too larger, then there is no point
                    in going any larger*/
                }
            }
            if (a*b == palindromes[i]){
                break;
            }
            else if (a*b > palindromes[i]){
                break;
                /*Likewise, if the product is too small at b's largest,
                 then there is no reason a should continue decreasing*/
            }
        }
        if (a*b == palindromes[i]){
            break;
        }
    }
    
    printf("%d = %d * %d\n", palindromes[i], a, b);
}

The first part of this code simply generates an array of all the possible palindromes.  The rest of the code is what I used to find the factors of the palindrome.  It uses similar control points to the python code.  And the method of finding the factors is pretty similar to problem 2.  What the code says isn't really that interesting.  But what it does is.  This program does just the opposite of the python code.  Instead of taking factors and finding a palindrome, we are taking palindromes and finding factors.  I like to think that this code runs faster.  Generating the palindrome list effectively takes no time at all, and since we look from the high end of the list first we can disregard everything after we find a number with two palindromic products.

I did a little bit of math to find out how large of an array is needed, but the math isn't too hard.  It turns out that there are 1800 5 or 6 digit palindromes.  And I used 5 digit palindromes because two small 3 digit numbers can multiply together to get a 5 digit number, and an interesting thing to note is that there are exactly the number of 5 digit palindromes as 6 digit palindromes, 900 each.  I will let you figure out why.  And remember to keep tuning into this blog.  I don't update it often enough.  But the sizes of these posts really get up there.  And formatting stuff is not the easiest work.  Ta!