emergent behavior: 2009

Todays entry is about a fun little bug that a customer reported the other day. The cause is very straight forward, if you know a little trivia about Windows.

Without going into too much detail, the product I work is a type of document management system. You add files to the document management system and can do various things to them. For various reasons the document management system sometimes creates icon files for different document types and stores these icon files in the local file system. For example if you added a file called 'abc.mp3' to the system then an icon might get created on the local file system called 'mp3.ico'. Seems reasonable right?

However a customer reported a bug that had our support team stumped. They were getting an error when they tired to add documents that had the file extension 'con'. If you know a bit about windows you have probably guessed the problem already. Since the very begining of time MS-DOS, and later Windows, have had a list of several file names that are reserved in all folders. From here:

Do not use the following reserved device names for the name of a file:
CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9

Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended.

Of course once you know this the cause of the bug is obvious. The user adds a file named 'abc.con' to the system, which then creates a file named 'con.ico', and then Windows gets upset. The reserved words are of course names of devices, they exist so you can use devices like files à la UNIX. The strange thing is they are reserved in all directories.

Anyway the lessons are:

1. The more general lesson: As the files names were essentially determined by user input they should have been sanitized at some point.

2. The more specific lesson: Be aware that there are some less obvious rules about valid file names in Windows.

After reading Chaos by James Gleick the other day I was inspired to write a program to draw the bifucation diagram of the logistic map.

The logistic map is an iterative function which can be used to model population growth. It looks like this:

The Wikipedia article gives a good explanation on how the logistic map works. You start by picking a value of x between 0 and 1 and plugging into the equation. This gives you a result that you plug back in as you new value of x. You repeat the process several times and see what happens.

The equation is not particularly complex, so you might not expect anything to exciting to happen. Indeed for values of r less then about 3.45 the value of x will bounce around a few times and quickly converge on a value. However if we keep increasing the value of r things become more interesting. As r approaches approximately 3.45 x starts to bounce around a lot more before settling on a value. As r continues to increase x will no longer settle to a single value, but instead oscillate between two values. If we keep increasing the value of r we will find that x now starts to oscillate between 4 values, and then 8 then 16 and so on. At about r = 3.57 the values of x become chaotic, never settling in to a predictable oscillation. However if we keep increasing r we will occasionally see that some values of r again seem to show more ordered behaviour.

If we draw the long term values of x we end up with a dramatic bifurcation diagram that illustrates the complex behaviour contained in the seemingly simple equation.

So anyway I decided to try writing my own program to draw the bifurcation diagram, which turned out to be a relatively simple task. Here is my first result:

Not bad, but there are a few problems. Firstly the diagram indicates that x settles on several values before the first bifurcation. This is incorrect, this is actually an artefact caused by not iterating enough times before drawing the value(s) of x. Secondly, this diagram doesn't capture a lot of the intricate details we would like to see.

Zooming in and increasing the number of dots draw helps to bring out more detail:

Lets try increasing the number of iterations performed before we draw a dot, and also drawing a lot more dots:

Oh dear. While we have gotten rid of the artefacts around the bifurcations mentioned before, we have also draw so many dots that we have wiped out most of the detail. I looked on the web for a way to increase the detail. One solution others had tried was to count how many times a dot is drawn in a particular point and to draw points that were drawn many times as brighter then those drawn only a few times. This seems like a good solution, but I was feeling lazy so I decided to try a dumber approach first. I simply made points that are drawn in later iterations brighter. My theory was that points that are drawn a lot are more likely to be drawn in a later iteration. I wasn't really expecting this approach to work very well, but to my surprise the results were pretty good:

Above you can see that we are getting nice clean bifurcations, and we can see some nice details in the chaotic parts of the diagram.

I wondered what would happen if we drew all iterations, instead of only drawing dots after a certain number of iterations. This idea was inspired in part by the way the Mandelbrot set is often rendered. A rendering of the Mandelbrot set is also created using an iterative function. Most rendering of the Mandelbrot set draw points that are outside the actual Mandelbrot set. These points are draw in different colours, depending on how many times the function is iterated before it is determined that the point is not actually part of the Mandelbrot set. When I tried this with the logistic map I got the following result:

As you can see some new lines appear. For the non chaotic parts of the diagram these lines trace the values that x visits before settling into an oscillation.

Zooming out we see the logistic map seems to exhibit interesting behaviour for values of r between approximately -2 and 4:

Lets add some colour to make things look nicer, and zoom in on some interesting areas:

The initial value of x used does not affect the bifurcation diagram itself. However it does effect the values that x visits before reaching a stable oscillation (for those values of r where is does reach a stable oscillation). Here is an example of what happens when we vary the initial value of x (zoomed in on the first bifurcation again):