Macro Macro Man

I love meta-programming. I really do. Making the computer not only do what I program it to do, but making it writing my programs just gives you the feel of power and makes you want to shout It’s alive in disbelieving ecstasy. Or something like that.

But that being said, there are some precautions that make meta-programming often quite different from normal programming.

  • Care how it works: If you don’t know how a process normally works, you can not dream of interfering with it in a meaningful manner. Sometimes this is rather easy, as for example the macro system of modern LISPs, or in Ruby, sometimes it is quite different from the way you’d normally do, as in Java, and sometimes, it seems more or less impossible – yeah, I’m looking at you, C.
  • There is no copy-and-paste: Depending on how easy the programming process is, there might be few if any tutorials and helpful sites out there. Ironically, the more you’d need a helpful hand, the less likely it is that you’ll find it.
  • Your code probably won’t be all that self-explanatory, so document it well. Few people are familiar with the Craft, so explain what you do. Depending on your background, you’d do that anyway, but others write their code self-documenting, which relies on others getting the basic idea of what you are doing.
  • Expect the unexpected. Ok, this is quite similar to what you would normally do, but in my experience, the WTF per minute rise significantly when dealing with meta-programming.

So… why would you even deal with this? The answer is the usual: to save time and to make hard things easier.

I once wrote a decorator for Python, that automated the generation of a test suite, so instead of setting up these things by hand, you’d just write @Test in front of any test method, leading to more concise and arguably more readable code. A simple note @commutative could generate a test case to check, if f(x,y) is f(y,x) (for a given example at least). Or imagine a Ruby meta class, that automates the generation of a wrapper class by translating most calls with the help of a dictionary. So meta-programming can help you to jump over the boring parts and get to the hopefully less boring parts.

I hope, I encouraged someone to give the meta-weirdness a try.

Cheerio,

Zombiecalypse

Posted in Uncategorized | Leave a comment

The Quad Tree (I)

Today I’m going to introduce an interesting data structure for graphical applications: The Quad Tree.

Imaging a game with many more or less rectangular objects, with huge levels, in which only a small part actually has to be drawn – the rectangular screen. The naive idea of checking all objects if they intersect with the screen is nice – you don’t have to draw all objects, but that can be quite difficult to compute if you have thousands of rectangles.

The question the quad tree tries to answer is the following: Given this set of rectangles, is there a faster way to find all the rectangles that intersect with this one, than to iterate over all of them? Lets call that rectangles.allIntersecting(R)

The basic idea behind the Quad Tree (QT) is to sort out parts of the picture, that can not possibly contain any rectangles hitting R.

Take the rectangle [[0,0],[1,1]] as an example. Can R=[[0.1,0.1],[0.25,0.25]] hit the top right quadrant? Nope, so never mind even checking if there are rectangles in there, that hit R.

A QT finding all intersections with a rectangle

The quadtree only checks part of the subtrees for intersections.

Every QT is either a leaf or has exactly four subtrees. Because of the ease of computation, the subtrees divide their bounds in four equal parts. The rectangles are stored in the trees, not the leaves, which are essentially nil.

Let’s start with

tree.allIntersecting(R):=
    hits := rects in self.items that hit R
    for all subtrees s overlapping with R:
        hits.addAll(s.allIntersecting(R))
    return hits

What happens, when a new rectangle is added to the tree? Let’s call that

tree.addRect(rect):=
   if rect hits all subtrees
       self.items.add(rect)
    else
        for all subtrees s getting hit by rect
            s.addRect(rect)

By this, the rect possibly gets stored in multiple places, but that can’t be helped, else one couldn’t be sure, there is no rectangle lingering in another quadrant, that intersects with the screen, but wouldn’t been found.

A tiny rectangle is added

Adding a tiny rectangle leads to many steps down the tree.

Does that solve the problem? That – as always – depends: The worst case for allIntersecting is the bounds of the tree, thus recursing all the way down. That makes it O(n*4maxdepth), which is worse than the O(n) of the naive approach. Even worse: there is no upper bound to inserting a rectangle: Imagine adding a very small rectangle [[0,0][epsilon,epsilon]] to the tree. Now it takes O(log2(1/epsilon)) steps until the rectangle hits a tree’s centre. And that function approaches infinity as epsilon gets small. That’s why QT’s should implement a maximal recursion depth, after which all pending rectangles are just inserted into the tree, no matter if they hit the centre.

However in the average case, there is a fixed size screen that is much smaller than the initial tree. Let’s assume the size of the screen is 2k times smaller than the tree’s bounds, now we can discard 2-3 trees each step down until the bounds of the next trees are as big as the screen (that is after k steps). We get O(2-kn4(maxdepth-k)) runtime – a huge improvement to the naive approach. If we set the maxdepth so that maxdepth=k, we don’t get to feel the problems of the tree at all!

Next time, I’ll show you how I implemented it in Clojure, so stay tuned!

zombiecalypse

Posted in Uncategorized | Leave a comment

How to Represent a Real Number on a Computer

One of the first things we learn in programming is that it is not possible to use a real number on a limited machine as a computer. It turns out that this is not quite true.

Theory first

When we talk about the real numbers [latex]\mathbb{R}[/latex], we tend to think of an infinitely long decimal representation as for example [latex]\pi = 3.1415926535897932384626433\ldots[/latex]. But in fact, a real number is just the limes of a sequence of rational numbers or you could say the sequence itself. To reformulate the example: [latex]\pi = \lim_{n\rightarrow \infty} \sum^n_{i=0}\frac{a_i}{10^i}[/latex] where [latex](a_i)_{i \in \mathbb{N}} = (3,1,4,1,5,\ldots)[/latex] represents pi as a limes, but just giving [latex](a_i)_{i \in \mathbb{N}}[/latex] and the representation scheme would be quite the same information.

Pi as normalized continued fraction

Pi as normalized continued fraction (Wikipedia)

But positional notation is not the only way. A very interesting one are continued fractions: ((the article is very interesting and covers all you need to understand this and quite a bit more)) every real number r is just a converging sequence of fractions known as the convergents. These convergents are optimally approximating the given number, i.e. there is no fraction with equally small numbers, that is a better approximation for r. I will use normalized continued fractions in this article for the sake of simplicity, but it should be obvious, how this approach could be applied to the non-normalized kind. On the other hand, there is the rather new technique of infinite lazy sequences as for example used in Clojure. It uses lazy evaluation to simulate infinite sequences, computing as many elements as needed, thus providing a seemingly never ending list, while effectively using only finite space and time. So by combining these two, we get actual real numbers on a finite machine!

Practical Continued Fractions

Continued Fractions (CF) are most often represented by a notation like [2;1,2,1,1,4,1,1,6,...] which would expand to [latex]2+\frac{1}{1+\frac{1}{2+\frac{1}{1+\frac{1}{1+\frac{1}{4+\frac{1}{1+\frac{1}{1+\frac{1}{6+\ddots}}}}}}}}[/latex] and is incidentally e. With the sequence [a0;a1,a2,...] we can formulate the recursive relationship between ak, the numerators pk and the denominators qk. [latex]p_k = a_kp_{k-1}+p_{k-2}[/latex] [latex]q_k = a_kq_{k-1}+q_{k-2}[/latex] where [latex]p_0=a_0, q_0 = 1, p_1 = a_0a_1+1, q_1 = a_1[/latex]. ((For the proof of many important properties, go check this page. It is not an easy topic.)) I will use this iteration to get a sequence of convergents.

Clojure

My language of choice is the before mentioned Clojure for its simple support of lazy sequences and its easy syntax. Still there are some barriers to understand the code: The syntax is LISPs S-expressions: (f argument1 argument2 ...) calls the function f.

(defn name [args] ...) defines a function.

(let [[name1 name2] [value1 value2]] ...) locally defines the
names as the values (a closure).

(iterate f value) creates a lazy sequence [value (f value)
(f (f value)) ...]
.

first, second, nth and
rest work on sequences and return the corresponding element or in the case of rest the rest of the sequence without the first element.

(map f [1 2 3 ...]) returns a lazy sequence [(f 1) (f 2) (f 3) ...] as in other languages.

#(foo (+ % %)) is an anonymous function that substitutes % for its argument.

Putting it Together

As we have just seen, we can only use a single argument for the function to be iterated, but our recursion will use 5 variables:

  • the list [latex]l = [a_0;a_1,\ldots][/latex]
  • [latex]p_0[/latex] and [latex]q_0[/latex]
  • [latex]p_1[/latex] and [latex]q_1[/latex]

But we can just compose the whole state as a list.  Lets reformulate the iteration step:

(defn kettenstep [state]
  (let [p0 (first state)
	q0 (second state)
	p1 (nth state 2)
	q1 (nth state 3)
	L (nth state 4)]
  (list p1 q1
        (+ (* (first L) p1) p0)
        (+ (* (first L) q1) q0)
    (rest L))))

Iterating this over the bracket notation of a CF gives all convergents.

(defn create-fraction [state]
  (/ (first state) (second state)))
(defn continued-fraction [l]
  (map create-fraction
    (let [[a0 a1] [(first l) (second l)]]
    (iterate kettenstep
             (list
	       a0;                ;p0
	       1;                 ;q0
	       (+ 1 (* a0 a1));   ;p1
	       a1;                ;q1
	       (rest (rest l))))) ;l

Testing it

As seen earlier, e can be calculated as [latex]e=[2;1,2,1,1,4,1,1,6,\ldots,1,1,2n,\ldots][/latex], which is fortunately easy to reproduce:

(def euler
  (cons 2 (cons 1
                (interleave (iterate #(+ 2 %) 2) (repeat 1) (repeat 1)))))

now running

(defn testEuler [n]
  (doseq [fraction (take n (continued-fraction euler))]
	   (println (list fraction (float fraction)))))

shows how the fractions quickly converge towards e:

(3 3.0)
(8/3 2.6666667)
(11/4 2.75)
(19/7 2.7142856)
(87/32 2.71875)
(106/39 2.7179487)
(193/71 2.7183099)
(1264/465 2.7182796)
(1457/536 2.7182837)

Limitations

Even though this approach can be used to work with many irrational numbers to arbitrary precision, it is by no means a perfect idea.

First off, it is not possible to use it to calculate any real number. In fact, since the representation must still be written in a finite way (i.e. the program code to generate an infinite sequence is finite), nearly all irrational numbers can’t be encoded this way. This flaw is inherit to all possible representations, so one can not get around this. In addition, many interesting numbers that can not be expressed in positional notation, have a regular CF representation.

Second is efficiency: since modern CPUs are designed for fast floating point operations, calculations involving ratios won’t perform as well. Adding to this, the maps will add up, possibly leading to stack overflows after many calculations.

Third is representation. I choose the representation by convergents because of its fast convergence and the relative ease of computation (a p-adic system runs into problems with carrying when adding or subtracting), however when applying functions to the CF, it is not guaranteed that the result is still the best approximation and thus might not be convergents themselves.

Conclusion

Lazy evaluation enables us to define infinite structures as for example real numbers. This allows us to use arbitrarily precise numbers without fixing the precision beforehand. The use of continued fractions in this efford ensures a relatively fast convergence and less complexity than similar approaches.

See you,
zombiecalypse

Posted in Uncategorized | Leave a comment

Functional Performance

There was once a famous dispute of Greek philosophers about a ship that was said to have been owned by an hero centuries ago. Of course, the boat had changed: every bit of wood, every rusty nail and every string of linen had to be exchanged during all that time. Now the thinkers were wondering whether or not it was still the same ship.The morale of this little tale is that things can get complicated if stuff changes. But that’s just the way it is, right? That things change is the whole point of programming. Or maybe not… most think of programming as changing states of a machine, however it can be shown, that just calling functions is equivalent to our beloved Turing Machine. And I mean functions as in mathematical functions – they always return the same output for the same input.

So you can program without ever changing anything. But that begs the question: Why would you do that? It seems quite counter-intuitive. Yeah, welcome to the wondrous world of functional programming, where nothing ever changes!

Techniques

Memory

What is the clockrate of the computer you are working on? Chances are, it’s about 2.5GHz. However you probably got at least two cores, the newer the more. That’s the trend, you don’t get faster processors, but you get more. There are problems with that. If you can do more things in parallel, but want to do something as fast as possible, your CPU power does not just add. And if something changes, it can not spread very well across processors, because different parts of the program might change the same value at the same time making them inconsistent. Or you lock access while you access it, but then you’d have to wait.

Or you don’t change it at all and copy it in a smart way. You’d expect this to make your code even slower, but scientist did a great job inventing datastructures, that make this easy. In fact, you’ll probably never see a case that would make your code more than a factor of log(n) slower, as experiments have shown.

As memory access is the performance bottleneck today, being able to perform it in parallel helps quite a bit.

Remember your Calls

Another cool bit is memoization. To show students why recursion sucks, professors like to use the following example

Recursive version

def fibonacci(n):
	if n < 2:
		return 1
	else:
		return fibonacci(n-1)+fibonacci(n-2)

Iterative version

def fibonacci(n):
	a = b = 1
	for i in range(n):
		a,b = b,a+b
	return a

The first one is usually slow as hell, because you keep computing the same numbers over and over again. But as fibonacci is just dependent on n, you could just make a table for it and only compute the number, if it hasn’t been done already. That makes the performance the same as the iterative approach. We call that memoization and you can perform that only if your function is pure, because otherwise, there might be other influences on the outcome, which would not be taken into account with the caching. If the function is pure however, you can use it for all of dynamic programming without real performance penalty.

Being Lazy pays off

Quite possibly my favorite part is laziness. If the return value of a function call does not depend on its environment and thus the precise moment, in which it is called, you don’t need to compute it right away. Wait ’til you actually use the value and maybe, you don’t even need to calculate it at all. This extends so far, that you can simply make an infinitely big datastructure, for example a list of all natural numbers.

That’s it for today, but there might be other articles following up for this topic.

Have fun!

Posted in Uncategorized | 1 Comment

What I learned about learning

Since I got interested in programming, I learned quite a few languages – at least 5 during the last few years and counting. I wouldn’t say, I mastered them, but I’m able to program enough in them for practical means. Here I will present the approach, I found to be the most efficient for me.

Try to remember how you learned to ride a bike, to add numbers or even to write a good essay. Did you read Introduction to riding bikes, learned the Peano axioms or read Voltaire to get the hang of it? Hopefully not.

I propose a practical approach, based on your own interest. In short, you do what you want to do, but you increase the size of your project in every step. You wouldn’t be able to pull that off without help of course. So what I would do is

  • Read about the typical use of the topic and try to get inspired by that.
  • Find a way to figure out, how things are done. That might be a online tutorial, a book on the topic, a reference, a help function, etc. You don’t have to read it just now, just keep it close.
  • Start fooling around. Try out easy thing that you are interested in. When I started with Ruby, I would define a function, that takes a block and a start value and find a fixed point by reapplying the block. I had no idea how to use blocks at the time, but that is what I wanted to do. So I pulled out my trusty reference and looked up the use of blocks, how to define functions and the basic control structures and soon, I was able to call fixpoint(heron(2.0),1) to get the square root of two.

I could just have followed the book, I’m sure, it is a good one, but that would not have given me the opportunity to find it out by myself, and just that improves your ability to learn it.

Greez, zombiecalypse

Posted in Uncategorized | Leave a comment

Evolutionary Ideas!

Prelude

In the beginning, there was nothing and god said Let there be light and there was light. He then continued to construct earth and sky, and stars, and animals, and plants, and finally humans. And he saw that it was good. In retrospective, we should have been worried about the lack of unit testing and documentation, but all in all that’s how a software project should work. Only that it involves so much thinking. Not only you have to know, how your creations will interact with the environment, but also with the other creatures.

As far as nature is concerned, chances are, that this is not what happened. More likely is the following: In the beginning, there was nothing, then some milliard years later, there was the earth and even later, there started some complex chemical compounds to evolve, forming life. The lifeforms better fitted for their environment would reproduce, the more unfortunate would bite the dust. Thus the surviving organisms would be better fitted for their surroundings. This goes on since then, with the latest idea of nature being human able to think about all these things.

Nowadays, we often come to problems, where the solution is not quite obvious and often enough, it is not possible to find the perfect solution by thinking alone. That’s even true for numeric optimization problems. Then some smart guys thought about this: What if you could just use the same trick, nature used to come up with a good enough solution to the problem What’s the best way to survive on planet Earth?

This is where genetic Algorithms come into play.

The Generic Genetic Algorithm

So a genetic algorithm imitates the behavior of systems of organisms to find a good enough solution for a problem. But what is good enough? And what is a solution?

Natural Born Killers

For a genetic algorithm to work, we need the general form of our solution. That is, a set of parameters. Suppose for example, you programmed an Ego-shooter and now you need to come up with an AI, that can compete with a human player. You already parametrized the behavior on some scales:

  • Aggressive vs Passive
  • Moving fast vs Shooting precise
  • Favorite Weapon

Now all you need to do is to find the most efficient combination of the parameters (or rather some very good ones). That is, as a mathematician might put it, highly non-trivial, AKA f_cking hard. However since you already finished the game logic… it should be easy, to set Bots against Bots.

So you begin with some random Bots (say 100), that probably wouldn’t outperform a 2 year old in tactical knowledge. However you can put them in the arena and make them fight to the death. Besides giving you the possibility to laugh diabolically and feel like a roman emperor it shows you what parameters are best. You laugh even more diabolically and kill the 90 worst competitors off. Now you have a set of 10 rather good parameters, but you can surely do better. You could just produce new random guesses and always filter out the best, but that would ignore the fact, that you already have some good guesses. Instead you combine them to get a new generation – you could think of that as making them mate, you dirty boy/girl.

Just chose random pairs and combine them in some reasonable way. So the child of an very aggressive and a rather passive parent might be a little more aggressive than the mean and the favorite weapon is randomly chosen from the parents. Do that until you have enough (for example 100, because that’s where you started).

Now you can just repeat that until you’re happy

However there are other possibilities: You could introduce mutations, that is randomly changing some parameters marginally. That could help, if your first guesses were not quite as good as you might need.

As you can easily imagine, this principle runs into a problem, if there are several highs, you will stick with one and that might not be the best. You are prone to find a local maximum instead of a global one. It seams reasonable, that aggressive, accurate bots and passive, fast bots would perform well, but interbreeding them would produce not very useful ones. There are ways around that, but for now, we’ll keep it simple.

Cooking the Primordial Soup

As we’ve seen, there are some simple steps, that can give you a very simple genetic algorithm:

  1. find the parameters of your solution.
  2. figure out a function, that chooses the best individuals out of a population of parameter sets.
  3. come up with a reasonable way to combine some individuals to produce new ones.
  4. start with random values and iterate.

Now you can plug in what ever you like.

Have fun.
Posted in Uncategorized | Leave a comment

Decorative Unit Testing in Python

So after some experiments in C++, this week I got back to where I once started: Python! The idea was, that I’d produce a quick prototype for something, so that I could translate it into another language. However, I got distracted as I often do.

First of all, I wanted to be able to test things in my prototype, so I wanted some kind of unit testing, so instead of actually starting to write code, I headed for the intertubes and googled for unit testing in Python.

In fact, there is the standard module unittest, just here. That’s really cool, I thought, I’d be looking for hours to find something (as I did with C++), but this way, I could just begin. What I found rather annoying was that you normally have to put test in front of your test method names. My free spirit rebelled against such puny laws – you gotta start small with rebelling, right? Anyway, there was of course the alternative to write a test suite and just manually add all my test methods. Well… no.

But on the other hand, whenever you get an API to do something, you might as well automate the something as good as you can, so I started with the familiar JUnit syntax @Test to annotate test methods.

For those, who know the more obscure features of python, it should be clear, that this is a decoration in Python – a function modifying functions. In this case, I just left a note for me in the func_dict:

def Test(f):
	f.func_dict["Test"] = True
	return f

On the other hand, there was the problem, that no one would care for this fancy note, so I wrote an InSuite decorator for TestCase classes.

class InSuite(object):
	def __init__(self,testsuite):
		if testsuite not in __TESTSUITES__:
			__TESTSUITES__.append(testsuite)
		self.test = testsuite
	def __call__(self,f):
		for i in f.__dict__:
			if self.isTest(f.__dict__[i]):
				self.test.addTest(f(i))
		return f
	def isTest(self,x):
		try:
			return x.Test
		except AttributeError:
			return False

This would add each annotated method to the given TestSuite, and all TestSuites that are somewhere mentioned, to a global list. These two decorators are a great team, just take a look at my tests:

FunctorTestSuite = unittest.TestSuite()

@InSuite(FunctorTestSuite)
class FunctorTestCase(unittest.TestCase):
        def setUp(self):
		...
        @Test
        def shouldSquareResult(self):
		...

        @Test
        def shouldNotTakeTooMuchArguments(self):
		...

Tada! No more worries about naming conventions and InSuite could easily be extended to check for other annotations as well. Splendid, that was a rather useful waste of time!

Posted in Uncategorized | Leave a comment

Thoughts on Neuronal Networks

This has been on my ToDo Heap for some time now: I started to implement some libraries for neuronal networks in nearly all languages I know. I’m no expert on this topic and I haven’t actually been able to implement all the fascinating stuff so far, but there is still enough time left until the entropy death of the universe.

First of all, I should explain, what a (artificial) neuronal network is. Well, obviously it’s about wiring neurons together. I like to imagine them as persistent functions: They provide a value, which can be computed with some inputs. As with functions, the input might itself be the value of a neuron, else one couldn’t write for example abs(sin(x)) to compute the absolute value of the sine of a value x. On the other hand, often the input stays the same, so why not just keep the old value without recalculating, if the input didn’t change?

And of course when the value of one neuron changes, there is a wave of updates: A changes, B has A as input and thus changes, C has B as input, etc. Quite like a normal program actually, you do some calculation, until your line of dependencies ends.

That’s a boring program! It’s just a straight line, we don’t have any loops yet. But we could do that anyway, and there breaks the function metaphor: We could just wire them back to themselves.

Take for example the neuron A with the simple rule add your inputs. Now we connect

1 => A
A => A

and maybe add some things that do something useful. Now A might start off as 0 and update. It becomes 1 and update. 2 and update… That’s just an infinite loop counting up, but one can easily imagine, that it is much better with neurons like 1 if the first input is greater than the second or 0 else.

For now that’s where I am. There is of course more to it, as it is a possibility for creating artificial intelligence and dealing with computer learning, but you should probably read more about this, if you are interested. I might write more about this, once I have more time.

Posted in Uncategorized | 3 Comments

Hello World

(println "Hello World")

Hello out there. I decided to start a blog as a way to document my thoughts and sharing them with the world, which probably doesn’t care about them. Anyway, I will mostly write about my experiences with programming, math and stuff. Have fun!

Posted in Uncategorized | Leave a comment