Here’s a simple toy project – find all the word squares – a word square is a n x n matrix of letters where each row and column is a word. I used brute force approach and no recursion (to make it harder), so this algorithm may be easily improved. Let’s get the words:

words = [ w.strip() for w in open("myfavoritedictfile.txt", "r").readlines() ]

Let’s define the class for the word square. Being lazy, I will not define a class at all.

import collections
Square = collections.namedtuple('Square', ['n', 'hlist', 'vlist'])

Hlist is the list of horizontal words. Vlist is for vertical ones.

Let’s check if the new word can be added – we need to verify each cell where it crosses another word. This is symmetrical, hence we need only one function and swap the lists if needed.

def add(list1, list2, w):
    k = len(list1)
    for i, e in enumerate(list2):
        if e[k] != w[i]:
            return False
    list1.append(w)
    return True

Now the actual function that fills the word square and returns the results as the iterables.

def fill_iter(n, words):
    count = 0
    z = Square(n, [], [])
    nw = len(words)
    stk = [ 0 ]
    while 0 < len(stk) <= 2 * n:
        count += 1

        if len(stk) % 2:
            r = add(z.vlist, z.hlist, words[stk[-1]])
        else:
            r = add(z.hlist, z.vlist, words[stk[-1]])
        if count % 1000000 == 0:
            print(count, z, words[stk[-1]])           

        if r:
            if len(z.hlist) == len(z.vlist) == n:
                yield z
                r = False
                if len(stk) % 2:
                    z.vlist.pop()
                else:
                    z.hlist.pop()
            else:
                stk.append(0)
        if not r:
            stk[-1] += 1            
            while stk[-1] == nw:
                stk.pop()
                if len(stk) == 0:
                    return None
                if len(stk) % 2:
                    z.vlist.pop()
                else:
                    z.hlist.pop()

                stk[-1] += 1 
    return None

Since this approach is very slow, let’s reduce the search space a bit. Here’s how to get all the 5-letter words that remain valid words if read backwards e.g BUS-SUB. If we limit ourselves to these words our word square will be a “magic” word square (almost, since we do not check the diagonal). And finally – run!

s5 = set(w for w in words if len(w) == 5)
rw5 = [w for w in s5 if w[::-1] in s5 ]

for z in fill_iter(5, rw5):
    print(z)

Just a few selected examples in Polish:


Square(n=5, hlist=['igrom', 'glebo', 'rener', 'obelg', 'morgi'], vlist=['igrom', 'glebo', 'rener', 'obelg', 'morgi'])
Square(n=5, hlist=['momus', 'odoru', 'motor', 'urodo', 'surom'], vlist=['momus', 'odoru', 'motor', 'urodo', 'surom'])
Square(n=5, hlist=['kinaz', 'imaga', 'nagan', 'agami', 'zanik'], vlist=['kinaz', 'imaga', 'nagan', 'agami', 'zanik'])

Here’s a programming or mathematical puzzle: given integers n, m such that n>=1 and m<=2^n, compute how many “1” digits are there in ((2^n)-1)*m written in binary.

For example, if n=3 and m=2:

(2^3-1)*2=7*2 = 14 = 0b1110, so the answer is 3.

You can think about how to solve this puzzle or simply read the rest of the text.

Probably everyone who memorized the multiplication table found the multiplication by 9 relatively easy. Nine times two is eighteen, 9*2=18, and the digits of the product add up to 9, 1+8=9. The same pattern holds for 9*3=27, 2+7=9 and so on until 9*9=81, 8+1=9 and 9*10=90, 9+0=9. This little trick helps with remembering and verifying the results of multiplication of single digits by 9.

If we keep multiplying 9 by larger multipliers will the sum of the digits of the product still equal 9?

No, the pattern breaks here: 9*11=99, 9+9=18, even though it is further resumed at 12, 9*12=108. Still, we can do better if we replace 9 by a bigger number.

What probably few people realize, the number 9 is just one of the numbers with the curious property that the sum of the digits of their product by any of [1..the original number + 1] is constant. For example 99:

 99 *  1 =   99, 9+9=18
 99 *  2 =  198, 1+9+8=18
 99 *  3 =  297, 2+9+7=18
 ...
 99 * 99 = 9801, 9+8+0+1=18
 99 *100 = 9900, 9+9+0+0=18

Further such numbers are 999, 9999, etc. Multiplies of 9, 99, 999… also have a similar property which follows from the above fact. For example 4995, 4+9+9+5=27, can be multiplied by any number between 1 and 1000 and the sum of the digits will be 27, e.g. 4995*765=3821175, 3+8+2+1+1+7+5=27.

If the curious property holds for (10^n)-1 for n=1..infinity in base 10, is this true in other numerical bases?

Yes, and the pattern is similar. In base 3, the counterparts of the number above would be written as 2, 22, 222 etc. In base 16: F, FF, FFF… In base 2, i.e. binary these are 1, 11, 111, 1111 etc. For example 2^8-1=255=11111111(base 2) multiplied by any value 1..256 will still have exactly 8 ones when written in binary:

 0b11111111 * 2 = 0b111111110
 0b11111111 * 3 = 0b1011111101
 ...
 0b11111111*123 = 0b111101010000101
 0b11111111*255 = 0b1111111000000001
 0b11111111*256 = 0b1111111100000000

The obvious fact that in case of binary the count of “1” digits is equal to the sum of all digits produces the simplest answer to the puzzle at the top.

Following this very inspiring post by Andrej Karpathy a few people reported success with generating text or even music when using the published source code. I wondered whether I should have a try too since I don’t have much recent experience with neural networks. Although I used to be interested in them a long time ago.

I did take a Neural Networks course during my studies. I remember trying to teach a simple NN on an Sun workstation to recognize shapes on a very small grid, although I don’t recall if it was a success. In fact I became a bit of a sceptic regarding this area after I learned that neural networks are considered (or were at the time) little more than numerical approximation of a multi-dimensional function. So it was maths, very advanced maths… and I lost interest.

Considering that someone did the tricky math for me and went so far as to provide the source code, the answer was – go for it. As for the text selection: I wanted to try a language other than English and chose a Polish author, Nobel prize winner, Henryk Sienkiewicz. His “Trilogy” provided me with 5.5MB of XIX century prose stylized for XVII century Polish. The style is very recognizable for an educated Pole. I wondered if the generated text would be something that could be mistaken for ‘real’ Sienkiewicz.

I had to install Torch7 which requires Linux or OS X, luckily I had a Mac available. This went well apart from some minor trouble with SSL which did not affect the RNN scripts. A major problem was found when I ran the training script for the first time with the recommended parameters for the network: 2 layers of 512 nodes. Unfortunately on my machine the script reported running out of memory. Was that a LuaJIT related problem? I don’t know. The training worked with 256 nodes. But that has probably affected the accuracy of the recognition a lot. I could not use GPU which meant a long wait for the results. After the first training completed I trained another network, a 3 layer RNN with 200 nodes per layer, which was as much as I could go without getting the out-of-memory error.

Here are my observations:

  1. Small network size on my machine (due to the out-of memory errors) might make the following points invalid, so please take those with a grain of salt.
  2. Polish due to its inflection and longer words seems to be much harder to machine generate than English. At high temperatures most words are hallucinated.
  3. At low temperatures the generated text consists exclusively of single-line dialogue while the training text was a mix of dialogue and longer paragraphs.
  4. Some of the structure was preserved, for example the dialogue that in Polish is introduced by m-dash. But other quotations, introduced by a low double quotation mark and completed by a high double quotation mark were never matched in any example text I generated.
  5. To my eye the generated text still had that Sienkiewicz look, but mostly Sienkiewicz on drugs
  6. Both 2 and 3 layer networks gave results that were for me very similar

I’m concluding this post with the samples of the generated texts, although these will be of interest only for the Polish readers.

Example 1, temperature 0.3, 2-layer RNN:

— Mówić?

— Nie mówić?

— Nie chciał waćpanna — rzekł Wołodyjowski.

— Nie mówię?

— A to jest?

— Jako po wielkim było!

— A gdyby chciał się wyprawić?

— Nie mógł — rzekł Zagłoba.

— Jako mu będę się w czym zaczął. Tu też to mu chwycić!

— Nie chciał się za rękę do naszego brać.

— Jako nie mógł?

— Począł się na koniec.

— Jeśli tak przeciw to jest?

— On wystawił?

— Przeciw to mówić?

Example 2, temperature 0.3, 3-layer RNN:

— Mieszkał się na króla.

— Począł się w tym nie mógł go w dobrze, bo się na chcąc przeciw wystanie, bo z króla w koniec, ale jak nie mógł na tego nad nim było mu się za przyjechali. Też było widział się przed królewskim.

— Nie mówił — rzekł Zagłoba.

— Nie było pod król.

— Mości się na stronę do nimi rękę, ale na której podniósł potęgę, bo tak chciałem na wrócili tylko jeszcze do dobrze w nimi wybranie rękę ku niego tego będzie się na tropach.

— To mówił?

— Nie chcą! — rzekł Zagłoba.

— Nad naszego było jako waćpanna.

— To mówił?

— Jako na królewskim nie była do niego straszniejszym, ale chcąc jest? — rzekł Zagłoba.

Example 3, temperature 0.5, 2-layer RNN

— Po podobnym ujrzał coraz czasem oczy z przyjacielem, wybliżało się w drugiego powiedzieć, choćby też mi Sieczy przyjdzie go w nimi na chwilę, ale gdyby mu by nie wystrzelać. Przeciw się na brzeg, przecie miał szeroko — rzekł konia — mówiła przez chwilę jest. On się tu dobrze, bo więc jest teraz strzelał, że ona widział.

— A szlachta? — rzekł Zagłoba.

— To mówić.

Okrutnie wyradując się z nimi znów za regimentarzu.

— To jest?

— To tam tego do głowę zamknął — mówił — i w pokrywało się na koniec razy i przepominać. Jest Jan Andrzej jak bitwa, postanął w czasu pochodził. Na muszę mu?

Example 4, temperature 0.5, 3-layer RNN

— Bóg chce, niechże w królewskiego znajdzie było się na grzmienie, mieszkało się nie mógł się król pod strzymanie nie mogli. Młody największej wielką przez takim chciała.

— A na czym było wydało się w najwięcej, co jak mu czekać.

Pan Zagłoba bronił chwycił się przez razem przy chwilę, gdyby po książęcej chwilę nawet się było wszystkich przy uczynili. Nie mógł nad koniec, że wyglądał się chcących. Tu zdrowie wystać.

— Nie chcę się nad drugim nie wybaczył, jak chwila się za brany — na komendanie na zasłużby wszystkich bardziej w rzekł:

— Nie miał przeszczęście. Wielkiego widział jegomość, bo której najlepszego i bogusław.

— Rozprzyjaciele!

— Jeśli było to się dwóch, bo już w straszniejszych światłym za głowę, że się chce widzieli. Ale za obiec przybył pod żołnierzem.

— Największym jak do Chmielnickich — mówił Kmicic.

Example 5, temperature 0.8, 2-layer RNN

— To nagrodzili go czasu.

Lecz uczynił się drugie, więc tymczasem gardła na nich.

Trzeba Tuhaj-bejowicz. Tłumu ona nie zamiarał szwedzkich roczym, jak na tej odstawiały bardzo nie mogło na kompańskiego poczętych na księcia na Lepiej, ale mnie na wielki powierzył, jeśli niż uczekał się wymienać.

Zagłoba chciała konie na chorągwią staroszać. Ale co ku jeźdźców go wprawdą i zbliżał i po waszego namiestnikowskich stawały.

Niechże pierwszy bronić i nie nie majdzie nie poświecimów. Ale ze mną obsadzili. Boże witał król w takiego i jego chwilę komendanko jak w kolichu, rąków, to w śmiercię było też zwalując z takich burza. Obotem zbudziły się zawsze coraz nie wielkich zaraz chorągwie oddali z gotów i dla głosami poszli niezmiernym…

Example 6, temperature 1.0, 3-layer RNN

— Wyskadził? — brzymała następnie ów mili obiechałem, że jednak lasie było piersia, i przeznaczenie i wiesz; bo zapowiedział się rozwarszyć ciągnąć za nich szepny sama ziemi. Tak zabyt, gdzie które streszcie tego gałakiem. Kości Rasztwierz łzy nie z gach, słusznie, który duchen jej rozścigi mu i chwało chciał widziało ja tęgen. Przecie tam, młody? Przez własnym tylko do kniahina pewnym, i wśród na cztar radach wród, prawdzie pierwsze wydawać i z wszystkim kilkadzielników. Przyczewniejszy wraz bardzo skąd się nie po nasz się od pułkowniku, od żałach. Właśnie z wylertu kaniec. Może się.

Pan Krzysią i ultaniałło się do fortuni? — rzekł — wszyscy to mojej między blichem, by nie go nastał. Wołodyjowskiego ludzie — i przesłyszał, na Kosztura moim ksiądz lepszy zabył i jeno zdawała się wydumienia się i przyszła Żwędy obszache za ławych zamieść. O białym z tych za dniego mam!… Odryszkali, bo jeździ się nie nagle duszne, przyjdzie zdawił zostawią!”

The Cheryl’s Birthday puzzle that took the social networks by storm is a variant of the Sum And Product puzzle. Note that these articles contain the solutions.

Let’s imagine that two mathematicians are told independently and in secret the sum and the product of two different numbers from [2, 100] range. The one that knows the sum is able to declare that the other one does not know the original numbers either. Having heard this one that knows the product states that now he knows the secret pair. In response to that the first mathematician also declares the knowledge of the original numbers.

Upon seeing this puzzle I decided to solve it in Python in a logical and readable way. While the solution works, I’m not happy with the readability. In fact it turned out that the Wikipedia Suma Nad Product puzzle page contains a Python solution that seems to be simpler than mine. Oh well.

Here’s my code for the curious.

import itertools
import collections
import sys

r = [ (x, y) for x in range(2, 101)
             for y in range(2, 101)
             if x + y <= 100 and x < y ]

def allSums(r):
    return set(elem[0] + elem[1] for elem in r)

def allProducts(r):
    return set(elem[0] * elem[1] for elem in r)

def havingSum(s, r):
    return [e for e in r if e[0] + e[1] == s]

def havingProduct(p, r):
    return [e for e in r if e[0] * e[1] == p]

def flatten(z):
    return list(v for v in itertools.chain.from_iterable(z))

allSums0 = allSums(r)

# "sum" mathematician does not know so the sum is ambiguous
sStmt0 = flatten([ havingSum(s, r) for s in allSums0 if len(havingSum(s, r)) > 1 ])

allSums1 = allSums(sStmt0)

selSums1 = [ s for s in allSums1
        if min(len(havingProduct(e[0] * e[1], sStmt0)) for e in havingSum(s, sStmt0)) >= 2 ]

# "product" mathematician cannot know so these are the numbers for which products are ambiguous 
sStmt1 = flatten(havingSum(s, sStmt0) for s in selSums1)

# "product" mathematician now knows, so these are the filtered products for which solutions are unique

selProd1 = sorted([ p for p in allProducts(sStmt1) if len(havingProduct(p, sStmt1)) == 1 ])

# and here are the pairs

pStmt2 = sorted(flatten([havingProduct(p, sStmt1) for p in selProd1]))

sumCount = collections.Counter([ e[0] + e[1] for e in pStmt2 ])

solutionSumList = [ x for x in sumCount if sumCount[x] == 1 ]

# finally the solution

if len(solutionSumList) == 1:
    solutionSum = solutionSumList[0]
    solutionList = havingSum(solutionSum, pStmt2)
    if len(solutionList) == 1:
        solution = solutionList[0]
        print("Solution:", solution)
        sys.exit(0)

print("Solution unknown")
sys.exit(1)

While reading excellent Professor Stewart’s Cabinet of Mathematical Curiosities I came across the 12 coins problem (again). This problem is usually stated like this: you have 12 coins out of which a single one is a counterfeit, either lighter or heavier than the regular ones. You have access to a balance scale and need to determine which coin is fake and if it weighs more or less compared to genuine coins. You only have  three weighings to accomplish this.

I must have seen this problem before and maybe even have solved this previously, however I could not remember the details of the answer. So I started anew especially as I was seeing a way to solve this neatly.

Obviously in each of three weighings every coin might be put on the left pan, right pan, or put away. Let’s code the left pan by 1, right pan by 2 and off the scale as 0. Obviously to differentiate between the coins each needs to have a different pattern of being placed into the left, right and no pan. For three weighings there are 3^3=27 such codes available, although (0, 0, 0) or 000 is not acceptable since we need to measure each coins at least once to determine if it’s heavier or lighter. So we have 001, 002, 010, 011, 012, 020…, …, 221, 222 to choose from.

Conveniently if we assume that each of the coins is weighed exactly twice we will end up with twelve available codes:

 1: 110
 2: 120
 3: 210
 4: 220
 5: 101
 6: 102
 7: 201
 8: 202
 9: 011
10: 012
11: 021
12: 022

For example the 1st coin would be put on the left pan in both first and second weighing step and put away for the third one. If the left pan would be heavier in both 1st and 2nd attempt and there would be an equilibrium in the third one, the first coin would be a heavier counterfeit. Or would it? Equally the 4th coin might be a lighter counterfeit since it was placed in the pan opposite to the 1st coin. My assumption was wrong. The code assignment as given above is NOT a correct solution.

Given this failure I resorted to a more clumsy solution involving a decision tree. It involved weighing a different number of coins each time, which convinced me that there must not be a solution that has 4 +4 coins on the scale each time. Having completed this I looked up the answer in Professor Stewart’s book. To my astonishment he gave a neat answer basing on a coding similar to mine! How was it even possible? Oh, there was a detail I overlooked. You get more usable codes if you weigh some coins just once and some three times!

There are some questions that arise at this point.

How to find a coding that will always give a correct solution?

Is the solution given in the book the only such coding? (Up to the order and left/right symmetry).

Is there a coding that puts only 1, 2, 3, 5 or 6 coins into each pan at some weighing step?

Let’s find out the size of the problem first. We have 27-1=26 codes to choose from and we need to assign one to each coin. The number of such assignments is C(26, 12)=9657700 (C stands for combinations, binomial coefficient). This is small enough to check them all by brute force.

An assignment must fulfill some conditions to be considered a correct solution. First of all, there must be equal numbers of coins in both pans at each of three weighing steps. So if 4 codes have “1′ as the first digit, which stands for the left scale pan in the first step, the remaining chosen codes must have 4 codes starting with “2” (right pan in the first step) and the remaining four require “0”.  Function equal_counts verifies this condition.

The other important condition is the “no symmetric coins” as shown in the example of my failed coding above. If a coin is given a code of 112 (left, left, right) any other coin cannot be assigned 221 (right, right, left). Otherwise you would not be able to determine which of the two is a counterfeit and whether it is heaver or lighter. This is checked by has_symmetric_coins function.

The brute force method finds the correct codings in less than a minute on a modern laptop. After removing duplicates – rejecting left/right symmetrical solutions and these solutions where only the order of weighing steps changes – there are 26 unique solutions left. What’s interesting each of them has 4+4 coins weighed at every step; this means that e.g. weighing 5+5 in any step would destroy too much  information. An example solution is shown here. As you see the first coin is weighed only once, on the left scale pan in the third batch. On the other hand coin number 12 is weighed three times, as right, right, left. Thus if the scale would balance towards right, right and left in the three steps, the 12th coin would be a heavier fake; balance towards left, left and right would mean 12th coin is lighter, while balance towards left, left and equal (110, opposite of 220) would mean 11th coin is lighter. And so on.

 1: 001
 2: 010
 3: 011
 4: 012
 5: 100
 6: 102
 7: 121
 8: 122
 9: 202
10: 210
11: 220
12: 221

The brute force program is given below. I’m sure there must be a more sophisticated and faster algorithm. Finding it is left as an exercise for the reader.

 

import itertools

left_right_map = {}

def equal_counts(tuple_of_tuples, weigh_cnt):
    # weigh_cnt == len(tuple_of_tuples[0])
    for i in range(weigh_cnt):
        left = sum([ 1 for x in tuple_of_tuples if x[i] == 1 ])
        right = sum([ 1 for x in tuple_of_tuples if x[i] == 2 ])
        if left != right:
            return False
    return True

def has_symmetric_coins(tuple_of_tuples, weigh_cnt):
    global left_right_map
    m = [ 0, 2, 1]
    if len(left_right_map) == 0:
        for t in itertools.product([0, 1, 2], repeat=weigh_cnt):

            equivalent_t = tuple([m[x] for x in t])
            left_right_map[t] = equivalent_t
    for t in tuple_of_tuples:
        equivalent_t = left_right_map[t]
        if equivalent_t in tuple_of_tuples:
            return True
    return False

def check(coins, weigh_cnt):
    pos = [ t for t in itertools.product([0, 1, 2], repeat=weigh_cnt)
            if sum(t) > 0 and sum(t) < weigh_cnt * 2 # remove no weighing and always right cases
          ]  

    i = 0
    sol = []
    for t in itertools.combinations(pos, coins):

        i += 1
        if i % 100000 == 0:
            print(i)
        if not equal_counts(t, weigh_cnt):
            continue
        if has_symmetric_coins(t, weigh_cnt):
            continue
        sol.append(t)
        print(i, len(sol), t)
    return sol

def is_unique_solution(tot, solutions, weigh_cnt):
    lrmaps = [[0, 1, 2], [ 0, 2, 1]] # 0: no change, 1: 1->2, 2->1
    for pm in itertools.permutations(range(weigh_cnt)):
        for lrmap in lrmaps:
            eqsol = []
            for t in tot:
                equivalent_t = tuple([ lrmap[t[pm[i]]] for i in range(weigh_cnt) ])
                eqsol.append(equivalent_t)
            eqsol.sort()
            eqsol = tuple(eqsol)
            if eqsol in solutions:
                return False
    return True
        

def main():

    COINS = 12
    COUNT = 3
    solutions = check(COINS, COUNT)
    uniq = []
    for sol in solutions:
        if is_unique_solution(sol, uniq, COUNT):
            uniq.append(sol)
    print(len(uniq))
    for u in uniq:
        print(u)
        for i in range(COUNT):
            print(i + 1, ":", [t[i] for t in u].count(0), end=", ")
        print()

if __name__ == "__main__":
    main()

Just a heads up on the last things I’ve been doing – some old school graphical effects in plain old C without using floating point routines. I have even added dithering, again without floats. Something different, very refreshing.

dither

A challenge from my friend – writing a spider that will walk a certain blog and gather statistics on the comments. Sure, why not?

So here’s the idea, what else you need?

Of course Python, 3.2 in this case. Why not the latest one? I’m trying to find the sweet spot between the goodness of language’s built-in functionality and the richness of the external libraries which ten to lag lag, some have never crossed the Python3 barrier.

Packages like pip or easy_install help a lot when adding the libraries. Follow the instructions here:

http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows

After I got pip in my system I used it to install the extremely forgiving HTML parser, BeautifulSoup.

pip install beautifulsoup4

The built-in Python3 library urllib.request was used to download the HTML pages while BeautifulSoup provided tools for looking up tags. 100 lines later I had a working spider. I left it happily churning the stats from the blog website and left the room.

After a while I noticed that my computer looks like being idle. And it was. The spider has crashed because of an out of memory error. Oh well.

What was the cause? Again there was a library for that.

pip install objgraph

It turned out that Python’s GC does not work very well with BeautifulSoup’s thicket of inter-object dependencies. Every webpage parse left a solid lump of unused data in the memory until the 32-bit process ran out of the allotted 2GB.

I tried the trick with the soup.decompose(). It helped initially. But the more the tags I touched, the more memory was leaked. Soon my spider hit the 2GB wall again.

I could try to improve the GC further and hope for the best, or use the solution suggested on the web – use multiprocessing to start the parsing in a different process. In this way the memory leaks would be contained.

import multiprocessing

And several lines later I had a spider that did not leak. It parsed the pages perfectly, the only problem was that it did not send the stats to the main thread. Here’s why. The stats were a complex hierarchical dictionary and needed to go via multiprocessing.Queue to the main thread. The problem was that the data needed to be pickled in order to be put onto the queue. And the pickle module did not like my structure and threw a seemingly unrelated error. Oh well.

I could have redesigned my project or use shared memory, but chose the simplest solution ready at hand. Another library. Luckily json module liked my structures and I manually pickled them to strings.

import json

Very soon after I had a working spider that did not leak the memory and gathered the statistics. And worked a whole lot slower due to the pickling. Well, I had to live with that. After all it was only 150 lines (rewritten several times).

Follow

Get every new post delivered to your Inbox.