IS-LM 2: Gross Domestic Product and Aggregate Demand

December 27th, 2011 § Comments Off § permalink

If you haven’t read the first entry in this series, you can find it here.

Background

Before we can discuss the IS-LM model, we first need to review some basic macroeconomic ideas that will be important once we get to actually modeling the economy.  The first thing we need to understand is what GDP is, and how it is measured.

GDP stands for Gross Domestic Product. Fundamentally, GDP is a measure of all of the goods and services produced within the economy, usually measured quarterly or annually. There are three basic ways of measuring GDP: Total Spending , Total Income, and Total Output (Production). While the IS-LM model focuses on calculating GDP through the mechanism of spending, it is important to note that the same measurement also reflects the total of all incomes as well as the total of all production.

In order to clarify this relationship, consider that all products and services bought and sold by consumers count as incomes for firms. When the government spends money to purchase goods and services, that money is either income for firms (purchases of goods) or income for workers (purchases of services).

The relationship between total output and total spending is  bit more difficult to explain.  Essentially, total production is measured in terms of value added within the economy. So, if iPads are manufactured in China, shipped to U.S., repackaged, transported, and sold, only U.S. production is counted.  That U.S. production is performed by workers, and the value of the production is measured by the amount of money spent on the final product, subtracting the value of the imported product. The same valuation occurs with products manufactured domestically from imported raw materials, etc.

While these relationships might be difficult to wrap your head around, the important thing to remember is that GDP is simultaneously equal to total spending, total income, and total output. For the duration of this project I will use the terms total spending, total income, and output interchangeably when discussing GDP.

If you would like to know more about different ways of calculating GDP, the article at Wikipedia is fairly elaborate.

Aggregate Demand

In order to construct the first part of our model, we need to think in terms of aggregate demand, that is, the sum total of all goods and services demanded in the economy at any given time.  One way of thinking about aggregate demand is to posit the question, “for every level of income, how much production would it take to fulfill the need for goods and services?” If we take a minute to think about that question, we must realize that aggregate demand is not a single number, but rather a range of possible numbers over a range of possible incomes.

More importantly than formulating an abstract answer to the somewhat koan-like question posed, is the question, “how do we calculate aggregate demand?” The answer to that question is somewhat more specific: Consumer Demand (C) + Investment Demand (I) + Government Demand (G) + Demand for Net Exports minus Demand for Imports (X – M). In other words:

AD = C + I + G + (X – M)

For the sake of clarity, we’ll discuss Consumer Demand (C) last, since it is the most complicated of the terms. The other three are rather easy to calculate at any given point in time.

Investment Demand (I) is based on the planned level of investment by all private sector firms (Ip).  In other words, the demand generated by firms that wish to build factories, buy raw materials, etc.  While we won’t go into detail about the word planned in the above explanation, it should suffice to say that there are unplanned investments as well, and that these do not usually count toward the calculation of aggregate demand.

Government Demand (G) is based on the expected demand of all government entities (local, state, and federal are all included). Unlike the other types of demand, government expenditures are somewhat direct and easy to track as a result of the budgeting process. In order to calculate this component of aggregate demand we total up all expected government expenditures for the given period of time.

Net exports (Nx) is sometimes written as exports minus imports (X – M).  In order to calculate this portion of AD we total the expenditures on exports (X) and subtract the expenditures on imports (M). The resulting number is the expected net exports (Nx).  In the U.S. this number is usually negative as we import far more than we export. More on this can be found by looking into the balance of payments.

At any given point in time, these three factors can be considered constant – they are independent of income.  Regardless of the level of income (GDP), the expected demand for these three factors will not change. While they will change over time, in the near term (one or two months time), such changes are highly unlikely. Of course, there are exceptions, such as war, natural disasters, etc. However, for the purposes of constructing the model, we are only looking at expected aggregate demand – unexpected factors will be brought into the model at a later point.

The final component of aggregate demand is consumer demand (C). Consumer demand is further split into two separate components: autonomous consumption and induced consumption.

Consumer Demand

The first part of consumer demand is autonomous consumption, i.e the level of consumption that occurs regardless of changes in income. Essentially, there is a minimum level of spending that must occur in the economy, whether or not incomes raise or lower. Even if incomes were to drop dramatically, people would still have to consume food and electricity, for instance. One way to think of autonomous consumption is as subsistence demand – that is, the level of demand expected as the result of people simply living.

Because autonomous consumption is unlikely to change, it is similar to government spending or planned investment spending in that it is independent of income.

The more interesting part of consumer demand is induced consumption. Induced consumption is consumption that occurs as the result of increases in disposable income. Every time you buy a magazine or toys for your children, it is counted as part of induced consumption. As your income rises, the amount you spend on consumer goods also raises, but not at a one-to-one ratio.

Imagine, for instance, that you get a one hundred dollar a week raise. Before you can spend any of your new income, taxes and transfer payments are withheld, so your one hundred dollars is now seventy five dollars. Of that seventy five dollars, some of it will be saved or used to pay down debts. The proportion of the seventy five dollars that you will actually spend is called your marginal propensity to consume (MPC). If we were to use this data to attempt to predict what your consumption patterns would be at any given level of income, we would derive the following formula:

C = c0 + MPC (yd – t)

In the above formula, C is total consumption, c0 is autonomous consumption, MPC is the marginal propensity to consume, yd is disposable income, and t is taxes. In aggregate, we can then say that expected consumer consumption for all households follows the same rules that were described for a single household above. As incomes rise, the level of consumption will also rise at the rate of the MPC. MPC is a ratio constrained somehere between 0 (a change in after tax income is all saved) and 1 (a change in after tax income is entirely spent). The estimated MPC for consumers in the U.S. as of 2008 ranges between .9 and .98 – that is, of every dollar of after tax, disposable income Americans receive they are likely to spend 90 to 98 cents.

To demonstrate this, I have graphed the relationship below:

Simply put, as incomes increase, consumer demand increases as well, though more slowly than income. It should go without saying that consumption cannot rise faster than income (MPC cannot be greater than 1) and that an increase in incomes should not ever result in a decrease in consumption (MPC <= 0).

Putting It All Together

The last step in calculating aggregate demand is rather easy – we add all of these factors together (C + I + G + Nx) to obtain our Aggregate Demand curve (AD) which gives us our expected demand at every level of income (y). One discrepancy in the graph below is that I have represented Net Exports (Nx) as a positive value. In reality, this value would be negative thus subtracting from the aggregate demand rather than adding. I have presented the value as positive simply for the purpose of illustration, as presenting the value as negative would be much harder to show.

 

The next step in this process will be the construction of our first analytic tool, the Keynesian Multiplier Model. More on this in the next post: IS-LM 3: The Keynesian Cross.

IS-LM 1: Introduction to Short Run Macroeconomics and the IS-LM Model

December 22nd, 2011 § Comments Off § permalink

Introduction

In macroeconomics, the IS-LM model is used in order to analyze the impacts of various forces at work in the economy.  It is critical for answering questions about why different forces create the economic conditions that we experience everyday.  This series of articles is my attempt to raise the level of literacy about the IS-LM model in the hopes that readers will be better able to perform their own macroeconomic analysis when discussing current events.

It is not my goal to be considered an authority on the subject of macroeconomic forecasting. As such, these articles will not focus on econometrics (the use of mathematical techniques to quantify economic forces). Rather, I hope that this series can serve as a primer for how economists model the forces at work without the need to bog down readers with equations and derivations.

That being said, some level of quantitative reasoning is necessary to understand any economic model. Models, by their very nature, rely on cause-effect relationships and strict definitions. Because of this, models can provide a powerful tool for understanding the way the world works, but care must be taken to observe the rules of the model if we want to be able to state our conclusions with any degree of certainty.

Positive Vs. Normative Economics

The IS-LM model is a form of positive analysis. This means that it seeks to explain the data without drawing conclusions about issues of right and wrong. This should be contrasted against a normative analysis in which the analyst’s notions of right and wrong are used to justify a position.

For this series of articles, I will attempt to avoid making normative statements – that is, statements that suggest an appropriate course of action based upon my personal beliefs. If I do find it necessary to make a normative statement, I will make an effort to clearly point out that I am injecting my personal views into the discourse.  In most cases where I suggest a course of action, it will be for the purpose of illustrating an aspect of how the IS-LM model works.  For instance, I might say, “According to the model in this situation, lower unemployment could be created by raising government spending.”  I will not say, “Lower unemployment is good.” or “Government spending is bad.”

 Why You Should Understand IS-LM

For many years, we have been in the clutches of an intractable economic crisis. At the time of this writing, unemployment is above 9% nationally, while GDP growth has been stagnant. Millions of Americans are saddled with high levels of debt as the result of the collapse of the housing bubble, while millions more face foreclosure or bankruptcy. In this environment, we are often faced with assertions about how best to deal with the situation coming from the media and political figures. However, without a proper understanding of basic macroeconomic theory, many Americans will go to the polls armed with information that can be misleading or even completely false.

Making matters worse, many people have the impression that macroeconomics is about guesswork and ideology rather than about serious analysis driven by facts and research. Economists themselves are in many ways to blame for this state of affairs as over the past 50 years there has been a general trend in economics toward esoteric mathematical models that fail to describe reality. Additionally, economists are often portrayed in the media as being divided on issues in order to meet the media’s desire to present a “balanced” point of view.

Despite the perceptions, economics is more hard science than conjecture. While some economists certainly try to use their credentials to push their ideological views, in reality their number is small. The vast majority of economists, like the vast majority of physicists, spend their time performing very complex analysis based upon tested mathematical models. IS-LM is  perhaps the most well established of these models, and has been very accurate in predicting short run impacts of different economic events over the past 20 years.

Many economists have lamented the fact that during this economic crisis, some of the more visible and prominent members of the profession abandoned this model in favor of obscure conjectures which failed to perform nearly so well as IS-LM. Because of the failure of these individuals, the entire profession has suffered damage to its reputation.

It is my hope that by providing background on the IS-LM model, more people will be able to see the real work of macroeconomics, and be better informed about the world in which they live.

In order to accomplish this goal, it is first necessary to give a basic overview of the model.

Overview: What is the IS-LM Model?

To begin with, IS-LM stands for “Investment Saving - Liquidity preference Money supply.” In basic terms, this means that the first part of the model (IS) is used to determine the level of output from the “Real Sector” of the economy, while the second part of model (LM) determines the interest rates by analyzing the “Financial Sector” of the economy.  The hyphen between the two suggests that these two analyses interact to create a “General Equilibrium,” which indicates the expected level of output and interest rates based on the data. That might seem very complicated, but by elaborating a bit we should be able to digest it all in small bits.

The real sector of the economy can be thought of as the economy most people deal with every day. It revolves around the production and consumption of goods and services in the traditional ways we think of: buying cars, selling iPods, food, gas, etc.

The financial sector is best summarized by thinking in terms of bond markets and cash money transfers. While some folks prefer not to think of the financial sector at all, it is nonetheless omnipresent in our daily lives, determining the interest rates on our mortgages, the outcomes of our retirement funds, and the ebb and flow of our stock portfolios.

These two sectors are divided mostly as a matter of convenience, based upon a chain of cause-effect relationships.  This causal chain is at the heart of the IS-LM model, and can be described as follows:

In the real sector, the interest rate determines the level of  business investment and spending. A change in the interest rate, therefore, results in a change in the level of business spending.  The level of aggregate spending (which is composed of business, government, and personal spending, as well as net exports) determines the level of output for the real sector. Therefore, a change in aggregate spending results in a change in output.  If you are having trouble keeping up, don’t worry – we’ll go into much greater detail later.

In the financial sector, the level of output dictates the demand for money.  The demand for money interacts with the supply of money (set by the Federal Reserve Bank) to produce a specific interest rate.

So basically, as the interest rate changes, the level of business spending changes. This changes output, resulting in a change in the demand for money.  A shift in the demand for money, results in a change in the interest rate, which brings us full circle. This relationship can be simplified further by stating it this way:

IS:      Δr => ΔI => Δy

LM:     Δy => ΔMD => Δr

I know, I know – that doesn’t seem simple at all! Luckily, however, this is just an introduction to and an overview of the model and I haven’t yet started to explain what any of this means.  For now, it’s probably best to just make a note to come back and re-read this later. In the meantime, we can begin to start understanding what all this means by moving on to the next article in the series – IS-LM 2: Gross Domestic Product and Aggregate Demand.

 

Weekly Build

July 31st, 2011 § Comments Off § permalink

Just a quick post of my latest build.  A few notes:

1.  As previously mentioned, I’m now designing for a joystick.  You should be able to play with the control set up and get something worthwhile, but if not, sorry.  I’m not going to do alternative key bindings until sometime in the future.  That being said, if you do your own control set up and really think it works, let me know what it is and I’ll jot it down for future reference.

2.  There’s a Windows version and Mac version.  I’ve tested the Windows version, but I can’t guarantee the Mac version.  Hopefully someone with a Mac can tell me if it works.

3.  As always, the game’s in a pre-alpha state, so don’t expect awesomeness.  You can fly around and look at the moon – that’s about it for now.  If you’re interested in what’s outside of the shield (which is nothing, btw) you can exit the area by flying through the grey tunnel in the center of the crater.

Cheers!

PJ

Game Update

July 26th, 2011 § Comments Off § permalink

A few people are goofing around with my spaceship game.  Since I’ve had to take a day or two off from research due to circumstances which will remain undiscussed for the moment, I thought I’d tweak and play with the game.  You can try out the most recent build here.

You’ll need to extract the executable and data folder to the same directory, then just double-click the .exe.

A few things to note:

1.  This is a Windows version.  Period.  I’m not going to try to build with the Web Player anymore, and I won’t be building for Mac or Linux until I have something much more polished.  That being said, if you don’t have access to a Windows box and absolutely MUST try it out, send me an e-mail and I might do an extra build on the weekend.

2.  It’s configured for a flight stick at the moment.  You can change input devices with the loader when you first run the executable.  I’m going to assume that it will be played with a joystick in the future, but will set up a backup “keyboard/mouse” default when there’s more to play with.

3.  The ship goes REALLY FAST.  The intention is to be maneuverable, but not necessarily precise.  I’ll tweak this after I have a full level in place.

4.  There is no “exit” option.  When you are done playing, just hit Alt-F4 to quit.  (or play in windowed mode).

Enjoy!

 

 

 

 

 

 

 

Testing, Testing, 1, 2, … , n

July 25th, 2011 § Comments Off § permalink

After a somewhat grueling couple of weeks, Godwin is about to go live.  This application is (by far) the most complicated software design I’ve ever managed to pull off, and I’m pretty relieved to have all systems testing out.   I’m simultaneously nervous, or course, since I’m going to start getting results which may or may not pan out.

When I finish my live tests I’ll post up the source.  I doubt anyone will have a use for the specific implementation, but it’s an awesome source (pardon the pun) for Java implementations that can be troublesome.  To give an idea of what the code has to deal with:

1.  Asynchronously networking up to 100 clients to a central server.

2.  Farming out processor intense operations via a central controller.

3.  Dynamic threading.

4.  File I/O.

5.  Translating .bmp image files into int arrays and back.

6.  Cellular Automaton optimization using lookup tables.  (Currently my home PC can manage a 1024 x 1024 map (client side) at 10 iterations per second while simultaneously running the server – faster if I turn off updates).

All in all I feel I’ve picked up more programming skill on this project than in my entire programming sequence combined, so it should be fun to pass some of that along.

Game Test (Physics/Destructible/Controls)

July 17th, 2011 § Comments Off § permalink

This is just for testing a game I’m working on.  The controls are as follows:

Mouse look to change direction.

“w” key accelerates in the direction you are facing.

left mouse button fires weapon (hold for auto fire)

“1″ select autocannons

“2″ select missiles.

pewpewlaserbeams

 

To be clear, the boxes are destructible (they take about 15 hits with autocannon).  The other ships are duplicates of your ship.  If you find yourself out of control due to ramming another object, wait for a few seconds and your stabilizers should bring you back to normal.

 

Enjoy!

Computational Deduction of Geological Process Models: An Experimental Design Primer

July 12th, 2011 § Comments Off § permalink

Overview

The purpose of this project will be to develop a cellular automaton (CA) rule set (called a “genome”) that simulates a complex geological process.  This document will serve as a complete design document for the project, containing all details of the implementation and methods to be used.

Heightmaps and Terrain

Almost every three dimensional terrain can be broken down into a heightmap, a two-dimensional array of values, each of which contains the third-dimensional coordinate.  Below is one such heightmap for a sample terrain that was generated artificially in a program called Terragen.

The heightmap, seen on the lower right, can be read similar to a heat map.  In this case, “hot” areas are higher (represented by white and red) while “cooler” areas (represented by blues and greens) are lower.  In the upper left is a 3-dimensional rendering from one perspective of this terrain.

An essential element of heightmaps that makes them ideal for this project is that they exist as a 2-dimensional array.  Each position in the array corresponds to an exact coordinate on the terrain, and in that coordinate position is a single value of the range 0-255, representing the altitude of the terrain at that point.  The following image contains a magnification of a small section of the heightmap that demonstrates the exact nature of the map.

As is clear to see, this map is made up of millions of individual blocks.  In order to manipulate these maps, artists often use image filters to approximate various effects, similar to how one might use blurring or sharpening in Photoshop to modify an image to get a more desirable result.  While these filters are useful for those who wish to generate and modify virtual terrains for the purpose of illustration, they are not useful in the actual modeling of a geological system, though they may have use in the visual approximation of outcomes, such as for demonstration purposes.

Cellular Automaton 1 (CA1)

For this project, the transformation of a heightmap will be done by a cellular automaton (CA).  A CA is computer program that takes an array of blocks (such as a heightmap) and, based upon complex sets of rules, transforms that array into a new array.  While a complete analysis of cellular automata is beyond the scope of this presentation, the specific CA that is implemented, CA1, is specified here.

CA1 works by taking a 2-dimensional array and analyzing each individual block, one at a time.  This analysis proceeds by considering that block’s “neighborhood”, the series of blocks immediately surrounding it.  In this implementation we will be using a special case of the so-called “von Neumann neighborhood” with a range value of 1.  This means that the outcome of each analysis will be based on its most immediate neighbor, as illustrated here:

CA1’s implementation will use a very special case for evaluation which we are calling a “differential analysis.”  The computer creates two “scores” for each cell.  The first score consists of twice the average difference in height value between the “y” coordinates, while the second score consists of twice the average difference in height between the “x” coordinates.  The calculations are as follows, where the value z(x, y) represents the value stored in the coordinate position (x, y).

Yval =2 *(| z(x, y) – z(x, y-1) |+ | z(x, y) – z(x, y+1) |) / n

Xval =2 * (| z(x, y) – z(x-1, y) |+ | z(x, y) – z(x+1, y) |) / n

Where n is equal to 2 in non-boundary cases.

This calculation will return a value between 0 and 510 for Xval and Yval.  Essentially this will provide a value representing the net change in height across the y-neighborhood and the x-neighborhood.

This convention was chosen specifically for a number of reasons, but one key concern is that of boundaries.  In a conventional cellular automaton, the edge of the map presents considerable problems for evaluation.  In this case, the boundary can be disregarded, as the score returned can be calculated only based upon two coordinates rather than three.

 

Look-Up Table and Transformation

Once the block is evaluated and scored, CA1 consults a lookup table.  This table consists of a 2-dimensional array with values ranging from 0 to 510 along both x and y axes.  Each coordinate holds a value ranging in the range -5 to +5.  This “transform” corresponds to a specific (Xval, Yval) pair.  CA1 then adds this value to the original value of the cell being analyzed, and places the resultant value into a corresponding cell on a new heightmap array.

Once the new heightmap array has been completely populated, it replaces the previous array, and the iteration count is incremented by 1.  The graphic below visualizes the outcome of this process after 10 iterations using a randomly generated lookup table.  It was rendered for the purpose of visualization only and is not necessarily representative of specific outcomes.

 

Godwin – the Genetic Algorithm

The underlying mechanics of this cellular automaton model are such that a human-controlled process could not hope to develop a realistic model.  To give some understanding of the scope, the number of possible rules controlling this process is 1.47 x 1054.  That is 147 followed by fifty-two zeroes.  Within that set of possible combinations are likely to exist millions of rule sets that could generate a useful process, however the likelihood of a human ever finding one is remote even in a thousand lifetimes of searching.  Because of this, we must rely instead on a carefully written computer program called a genetic algorithm (GA).

A genetic algorithm is a process that creates large colonies of similar “organisms,” tests them for fitness, and destroys those that are unfit, while mutating and breeding those that are more fit.  Over many generations of this process, the GA hopes to find a specific organism that is most fit for completing the task set out by the programmers.

The specific implementation of this concept in this case is a GA we lovingly call Godwin.  Godwin operates based on a very simple principle, but the details of that operation are quite significant.

Defining the Genome

The goal of the Godwin program is to find a specific set of rules that will transform one terrain into another specified terrain.  The rule sets (which in this case take the form of a 2-dimensional lookup table) are known as the “genome.”  This use of the word genome is identical to the use in biological sciences.

Each specific lookup table is known as a “genotype,” consisting of the rules that make up a specific automaton.  The patterns of behavior that emerge from applying a specific rule set, independent of starting conditions, are known as the “phenotype.”  While the latter term will not be used often, it is nonetheless a key concept in interpreting the results of this experiment.

At the outset of this process, Godwin will pseudo-randomly generate 20,000 unique genotypes.

Initial Conditions and Target Conditions

An initial condition, in this case, consists of a specific heightmap representing a baseline terrain.  An example might be data collected from a flood plain or earthquake region prior to the geological event we are hoping to model.  An additional set of heightmap data taken from the same region post-event will represent the target condition.

Godwin will assign each client machine to apply a unique genotype to the starting condition for n iterations (the value of n is to be determined).  After each iteration, the client will compare the new map against the target condition and generate a number representing the distance from the target.  After a time, this will generate a series of numbers which will converge, diverge, or deviate randomly.

In mathematical terms, one can think of each new CA state as an entry in an infinite sequence.  From this we can derive a rate at which the sequence is converging or diverging by fitting a continuous function to the sequence of values and taking the derivative of that function (technically this entire process is handled by the client, but it should be detailed here).  While this process may not be perfect, it should provide a suitable means of evaluating each genome.

After the initial twenty thousand genotypes have been exhausted, Godwin will isolate one-half of one percent of the genotypes which demonstrate the highest degree of convergence and eradicate the other 99.5 percent.

Godwin will record the degree of convergence of this .5 percent for future reference.

Mutation and Breeding

Beginning with the one hundred “survivor” genotypes, Godwin will then randomly modify each unique genotype 100 different ways to create a new library of ten thousand unique genotypes.  These genotypes will be labeled as “M” types in order to make clear the process by which they were created.

An additional ten thousand genotypes will be generated by composing each survivor genotype with each other survivor genotype.  This process will take one genotype and, quite literally, add it to another, to create a new unique genotype.  These genotypes will be labeled as part of the “B” group.

Generations

The new library of genotypes will be labeled “generation 1” or g1.  Godwin will commence to repeat the process from before, exhausting each genotype and evaluating for survivors, before mutating and breeding survivors into a new generation, g2.  This process will continue indefinitely with subsequent generations labeled g3, g4, …, gn.

Review and Revision

Following the collection of data and the analysis of results, the experiment will be modified to refine the outcomes.  It may be necessary to change the fitness function (the method of evaluating divergence/convergence) the range of the genome (narrowing or widening from +/-5) or the way in which new generations are propagated (based on a statistical analysis of correlation between M and P groups and outcomes).

Epochs

Once a satisfactory genome has been developed for a specific starting condition/target condition pair, the experiment will be re-run multiple times with new pairs that represent the same geological process.  Each of these new runs is called an “epoch.”

Once each epoch has a satisfactory genome, these genomes can be cross-applied to the other sets and analyzed in terms of generalizability.  The ambitious hope is that through the study and refinement of these disparate genomes, a master genome can be created that will accurately model the geological process being evaluated across a large range of initial conditions.  Such a master genome could then be critically evaluated, possibly granting insight into the geological process itself.

Thus, the goal of this experiment is to create a computer system which can accurately and independently simulate a natural process, from which we can draw conclusions about the natural world.

Pretty Pictures

The following images were rendered in Terragen based on the heightmaps used in this paper.

Cheers!

Research Diary 1

June 25th, 2011 § Comments Off § permalink

So, it’s coming to the end of the first week of research, and I thought I’d just take the time to recap the progress I’ve made so far… originally I’d intended to do daily updates, but the process is simply too involved for me to manage that.

So, we all met Monday morning to set our benchmarks and learn the general form the summer will take. Immediately after the meeting I began getting organized. So far I have a giant binder with print-outs of research articles, though I need to figure out a way to organize them for reference. I’m noticing a major hole in the research I’m reading, but I’ll get to that later.

Most importantly, I’ve completed the architecture of the first generation of Godwin. In a nutshell:

Godwin is driven by a central server that maintains records on the results of millions of automata. It communicates with any number of client programs, each of which does the heavy lifting of computation. Godwin is currently going to be able to manage 3000 connections simultaneously, though I doubt I’ll have anywhere near enough resources to handle that many clients.

The client works by receiving a bundle of parameters from Godwin, which include the size of the automaton, the rules for it, any beginning state data, and the number of iterations Godwin wishes to be performed. This architecture is intentionally open so that the same setup can be used for any simulation I want to run, regardless of whether the automata are one-dimensional or two-dimensional, or whether the neighborhood is large or small.

Rather than hard-coding the analysis process as well, I’m keeping each neighborhood pattern as an array, and then providing rules by which the array can be compared to the neighborhood and a result returned. This means that Godwin can provide any size or orientation of pattern to the clients.

Finally, each client will be able to run as many operations as it has resources for by creating dedicated threads for each automaton.

The downside to this set-up is that it trades flexibility for speed. The network load and memory load will be larger than normal, but automata are such small constructs that I’m not too concerned yet. As long as I don’t add load to the processor it should be mostly unnoticeable in the final implementation.

The hardest part of this design stage will be figuring out Godwin’s internal process for determining fitness, and this is where the hole in the literature seems to appear.

Everyone who does this type of research discusses the idea, the experiment, and the results, but no one discusses the nuts and bolts of the internals. The ultimate question that no one seems to answer overtly is what exactly does their algorithm measure when evaluating fitness? I’ve known from the beginning that this was going to be the hardest question to answer, and as implementation rolls on I’m desperately searching for an answer.

I suppose that’s why it’s called research.

Summer Plans

June 10th, 2011 § Comments Off § permalink

So, it’s 3 a.m. and my last final (Linear Algebra) is in 7 hours. Instead of studying, I thought I’d write a quick post about my summer research since I’ll be using my blog here to keep a research journal. I’m not sure about everything that I have to do for the McNair portion of my research – i.e. benchmarks and timecards and such. For the actual scientific bits, though, I’m pretty excited.

I’m going to start by building a cellular automaton. The best way to think of it is as a chessboard with white and black squares. The board will have a set of rules that describe what any individual block is expected to do. Every ‘turn’ each block can turn white or black, depending on its neighbors. While this might seem arbitrary, each “block” actually represents a digit in a binary number. Since the board is 2-dimensional, each row represents a single numerical value, while each column also represents a value.

So, what can you do with it? Well, what I intend to do is use it as a very special type of calculator called a Touring machine. Theoretically, it should be able to calculate anything that can be calculated, it’s just a matter of “programming” it with the right rules. The hard part is figuring out what those rules are, which is why I’m not going to do it myself – I’m going to leave that to my friend Godwin.

Godwin’s not a real person, it’s a computer program I’m building called an “evolutionary algorithm.” What it does is creates millions of these automata with different rules and analyzes them to figure out which are more or less desirable for what I want to accomplish. For instance, if I want to come up with a way of calculating a function f(x) = sin(x), Godwin would see which, among its million random attempts, was getting closest. It then takes those automata that are doing badly and kills them off. Those that remain get “bred” together and “mutated” to create a new group of millions. This process continues until, hopefully, Godwin finds the perfect match for what I want to accomplish.

The name Godwin is an homage to Darwin (father of evolution), fused with God due to the fact that it has an ultimate plan, selecting for a purpose, rather than at random.

The eventual hope is that Godwin will be able to produce discrete processes that can be used to calculate differential equations. Differential equations are a sort of mathematical equation that is used to measure things like the formation of gas clouds in space or the Black-Scholes model in economics.

When I explain all of this, people inevitably ask things like, “Don’t we already have computer programs that solve these problems?” The answer is “Yes, but…”

The but is a big one in this case (hee hee). The purpose of this research isn’t really to find discrete approximations for these equations, it’s to gather supporting evidence for the conjecture that these equations are, in fact, continuous approximations of a discrete universe. In other words, I’m suggesting that the computer program is how the universe really operates, and the mathematical equations we’ve created are imperfect ways of describing reality.

I’ll let that sink in for a while until my next post.

Threads or Semaphores…

May 25th, 2011 § Comments Off § permalink

I’m currently navigating Java threads, which I enjoy, and C semaphores, which I don’t.

I think I know why so few programmers take advantage of threading in modern application building. It seems to me (and admittedly I may be wrong) that object-oriented frameworks such as Java and C++ lend themselves to architectures that are better for handing distributed processing. When you can pass threads around, focusing on issues of deadlock and starvation, an application framework seems to make a certain sort of sense. But once you break away from an o-o architecture’s encapsulation rules, and start using C’s system of forking processes, the logic rapidly becomes massively complex.

While Java may not be more efficient than C, one thing it can do well is maintain the integrity of threaded processes. Yes, there is added overhead (see Vector vs. ArrayList, for instance), but the development time to implement code is far diminished from that of C. I like C for many things, but sometimes I think that people’s love for C is based in its elegance and its philosophies, rather than a realistic economic assessment of its true costs and benefits given particular implementations.