A Samba of Decision Makers

It was during half-time of the Brazil-Argentina final that I started thinking about decision making and reinforcement learning. Suppose that one intends to write a program to play soccer. What kind of state representation should it use? The representation must carry enough information to discriminate between two states that call for different actions (seeing the problem from the classification side – separability of classes and …), or be informative enough to let the agent separate two state-action pairs with different values (like the abstraction theorem in MaxQ or …).
For instance, the player's position, the ball's relative position, and a few nearby players with their tags (opponent/teammate) would be a rational choice. However, one may ask how many nearby players must be selected. Two? Three? The more state information you use, the better the best possible answer to this POMDP becomes. But note that the size of the state space grows exponentially with the number of state variables – at least in the most common representations. What should we do?
I guess we must look for newer (richer) state representations. Here, I am not talking about general function approximators or hierarchical RL, which are useful in their own right. I am talking about a wise selection of the state representation: a dynamic and automatic generation of states is crucial. As an example, suppose that you are a player in the middle of the field and the ball is with an opponent very close to you (a few meters). The most important factors (read: the state) for your decision are your distance relative to the opponent and, if you are a good player, the movement of his limbs. It is not "that" important to know the exact position of a teammate who is 20 meters away. However, when you are close to the opponent's penalty area, not only is your position relative to your opponents important, but your teammates' positions might also be critical for a good decision, e.g. passing to a teammate may lead to a goal.
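
To make the soccer example a bit more concrete, here is a minimal Python sketch of a state description whose content and dimension depend on the situation. Everything in it is an assumption made for illustration: the function name `select_features`, the coordinate convention, and the thresholds `near` and `attack_zone` are hypothetical and hand-picked, whereas the point of the post is that such a selection should be found automatically.

```python
import math

def select_features(me, ball_holder, teammates, opponents, goal,
                    near=5.0, attack_zone=20.0):
    """Build a situation-dependent state description (variable size).

    Positions are (x, y) tuples in meters; `near` and `attack_zone`
    are hypothetical thresholds chosen only for illustration.
    """
    def rel(p):
        # Position of p relative to the deciding player.
        return (p[0] - me[0], p[1] - me[1])

    features = {}
    if math.dist(me, ball_holder) <= near:
        # Close duel: the ball holder's relative position dominates.
        features["holder_rel"] = rel(ball_holder)
    if math.dist(me, goal) <= attack_zone:
        # Near the opponent's goal: teammates and opponents matter as well.
        features["teammates_rel"] = [rel(p) for p in teammates]
        features["opponents_rel"] = [rel(p) for p in opponents]
    return features

# Midfield duel versus a position close to the opponent's goal.
midfield = select_features((50, 30), (52, 30), [(70, 10)], [(52, 30)], (100, 30))
attack = select_features((90, 30), (60, 30), [(95, 25)], [(92, 30)], (100, 30))
```

In the midfield case the state contains only the ball holder's relative position; near the goal it contains the teammates and opponents instead, so the two states do not even have the same dimension.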

I believe that there must be a method for automatically selecting the important features for each state. Different states need not have the same kind of representation or the same dimension. In some situations, the current sensory information might be sufficient; in others, predictions of other agents' sensory information might be necessary, and … . An extended Markov property may apply to this situation: given n state variables S1…Sn, I guess it is possible to reduce the state transition of the MDP environment in this way: P(Si(t+1), …, Sj(t+1), …, Sk(t+1) | S1(t), …, Sn(t)) = P(Si(t+1), …, Sj(t+1), …, Sk(t+1) | Sp(t), …, Sq(t), …, Sr(t)) for some p, …, q, …, r, i.e. there is some conditional independence here.
As far as I know, the well-known research most similar to this idea is the work of McCallum: Utile Suffix Memory and Nearest Sequence Memory. Nevertheless, those methods consider only a single kind of state, which is simpler than what I have in mind.
Well … Brazil won the game in great style with those samba dancers! Congratulations to Marcelo!
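
The independence I have in mind can be illustrated with a toy example. The sketch below assumes three binary state variables and a made-up transition rule `p_next_s1` in which the next value of the first variable ignores the third one; the helper `depends_on` is hypothetical and simply checks by brute force which variables can influence the transition probability. It only works when the model is known and tiny, so it is an illustration of the property, not a method for discovering it.

```python
from itertools import product

def p_next_s1(s1, s2, s3):
    """Probability that S1(t+1) = 1 given the full state at time t.

    Hypothetical dynamics: s3 is accepted but deliberately ignored.
    """
    return 0.9 if s1 == s2 else 0.2

def depends_on(prob_fn, index, n_vars=3):
    """True if flipping variable `index` can change prob_fn's output."""
    for state in product([0, 1], repeat=n_vars):
        flipped = list(state)
        flipped[index] = 1 - flipped[index]
        if prob_fn(*state) != prob_fn(*flipped):
            return True
    return False

# Indices of the variables that the transition of S1 really needs.
relevant = [i for i in range(3) if depends_on(p_next_s1, i)]
```

Here `relevant` contains only the indices of the first two variables, which is the reduced conditioning set on the right-hand side of the equation above.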

Dreaming of John Nash

Last night, I dreamt of John Nash!
It was a conference in Japan or somewhere like it – and if I had to name the conference, it was something like IROS04 (maybe because it was my only international conference). My advisor and I were trying to find rooms in a hotel that was almost full. After leaving our luggage in the hotel, we went down. During this search for rooms, I saw many famous people whom I recognized then but cannot remember right now (I guess those people do not actually exist in the real world). Well … outside the hotel, my advisor became curious to see John Nash. He did not know him, but I did (again: I had not seen his real face until an hour ago. Emmm … not very similar.). Someone introduced us to him. He greeted us. My advisor tried to speak to him. I don't know what happened, but I remember that the conversation continued between John Nash and me. I wanted to persuade him that interpreting a neural network as a multi-person game would be interesting (e.g. seeing learning as a cooperative game and … ), and I urged him to do some research on artificial intelligence. I felt that he became interested in the subject. Unfortunately, my dream ended and we could not discuss it any further.
Isn't it interesting that I dream this way? Am I going mad?!

Iran Presidential Election and Predictability

1) Friday was a very surprising day for us. Iran's presidential election produced a quite unbelievable result. The reformist candidate, Mostafa Moin – who was expected to get many votes – was humbled by a fifth-place finish. On the other hand, Mahmoud Ahmadinejad – the fundamentalist mayor of Tehran – got many votes and finished in second place. He and Rafsanjani will continue their fight next week in the second round of the election.
I do not intend to analyze the situation or even tell you about the dark ages we may face if this Ahmadinejad becomes our president. Instead, I want to concentrate on the predictability of such elections.
2) No one in the blogosphere predicted this situation. They guessed that Moin would take first or second place. But it did not happen, and all of us are in great shock. I hypothesize the following:

The Internet community does not reflect what is going on in society. Internet polls may differ from reality by an order of magnitude. I guess this becomes truer (in the Fuzzy sense?!) when the society is pre-modern or still developing.

Let me discuss it in the language of statistical learning theory: consider the whole society as a set A0. Those people who use the Internet considerably and discuss their opinions in that medium make up a subset of A0; call it A1. If we consider the general belief as X, we can define a probability measure P0 over the set A0 that gives the probability of each instance of X. The same is true for A1 and P1. I hypothesize that one may not estimate E[f(X)] under P0 by drawing i.i.d. samples from A1 according to the probability measure P1. Well … a very evident fact?! Yeap! OK!
Anyway, be careful when you gauge public opinion through Internet media.
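
A small simulation shows the effect. All the numbers below are invented for illustration and are not election data: I assume a population A0 of 1000 people, 200 of whom form the Internet subset A1, I take P0 and P1 to be uniform over A0 and A1, and I let f(X) be 1 if a person supports some candidate and 0 otherwise.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population A0 as (uses_internet, supports_candidate) pairs.
A0 = ([(True, 1)] * 140 + [(True, 0)] * 60 +
      [(False, 1)] * 80 + [(False, 0)] * 720)
# A1: the subset of A0 that is active on the Internet.
A1 = [person for person in A0 if person[0]]

def f(person):
    """The quantity of interest: 1 if the person supports the candidate."""
    return person[1]

def estimate(population, n=2000):
    """Monte Carlo estimate of E[f(X)] from n i.i.d. uniform draws."""
    return sum(f(rng.choice(population)) for _ in range(n)) / n

true_mean = sum(f(p) for p in A0) / len(A0)  # E[f(X)] under P0
est_from_A0 = estimate(A0)                   # sampling the whole society
est_from_A1 = estimate(A1)                   # sampling only Internet users
```

With these made-up proportions the true support in A0 is 22%, while 70% of A1 supports the candidate, so drawing more samples from A1 only makes the wrong answer more precise.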

3) A few days ago, I thought about building something like a Fuzzy Cognitive Map to model the society and predict its behavior. I am not very familiar with the society modelling literature, but it might be an interesting subject to work on. One may talk about the model's predictive power and the effect of the model's error on the prediction. Is the society that must be modeled chaotic, or does it settle to a stable fixed point? In other words, can we build a model that predicts well even if it has some errors?
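
For readers who have not seen one, here is a minimal sketch of a Fuzzy Cognitive Map in Python. It assumes the common formulation in which each concept has an activation between 0 and 1 and a signed weight `weights[j][i]` states how strongly concept j pushes concept i, with a sigmoid squashing function. The three concepts and the weight matrix `W` are arbitrary placeholders, not a model of any real society.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fcm_step(state, weights):
    """One synchronous update: A_i <- sigmoid(sum_j weights[j][i] * A_j)."""
    n = len(state)
    return [sigmoid(sum(weights[j][i] * state[j] for j in range(n)))
            for i in range(n)]

def run_fcm(state, weights, tol=1e-9, max_steps=200):
    """Iterate until the activations stop changing or max_steps is reached."""
    for _ in range(max_steps):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state, True
        state = new_state
    return state, False

# Placeholder map with three abstract concepts (weights are made up).
W = [[0.0, -0.6, -0.3],
     [0.0,  0.0,  0.7],
     [0.0,  0.2,  0.0]]
final_state, converged = run_fcm([0.8, 0.5, 0.5], W)
```

This small map settles to a fixed point because its weights are weak relative to the slope of the sigmoid. With stronger weights or a steeper squashing function, such maps can instead fall into cycles or irregular behavior, which is exactly the stability question raised above.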