Archive for July, 2004

Approximate Reward report writing

Monday, July 26th, 2004

Today I came to the Control Lab to write a technical report about approximate reward in RL. I wrote something, but I was not very efficient: for example, you can get pulled into a long conversation and be unable to escape! :D Anyway …
While writing, I found that there might be a fallacy in agnostic learning: the policy would change once the agnostic reinforcement signal changes. I am not sure whether my result is correct.
If I can prove that the policy does not change the value function, everything will be OK! That is not true in general, but it may hold in some situations, e.g. when every state-action pair is guaranteed to be visited infinitely often; then V -> V* and so the policy is irrelevant. Hmm … this needs more thought!
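As a quick sanity check of that last point, here is a minimal Python sketch. It is not taken from the report. It runs tabular Q-learning, which I am assuming as the off-policy update, on a made-up deterministic 3-state MDP under two different behaviour policies (uniform random and epsilon-greedy). The MDP itself, the names NEXT, REWARD, value_iteration, q_learning and max_gap, and the step size of 1 are all assumptions for illustration. Since both behaviour policies keep visiting every state-action pair, both runs should end up at the same Q* that value iteration finds.

```python
import random

# Hypothetical toy MDP (an assumption for illustration, not from the report).
# NEXT[s][a] is the successor state, REWARD[s][a] the immediate reward.
NEXT = [[1, 2], [2, 0], [0, 1]]
REWARD = [[0.0, 1.0], [0.5, 0.0], [2.0, 0.0]]
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2


def value_iteration(sweeps=500):
    """Compute Q* by repeated synchronous Bellman optimality backups."""
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(sweeps):
        q = [[REWARD[s][a] + GAMMA * max(q[NEXT[s][a]])
              for a in range(N_ACTIONS)]
             for s in range(N_STATES)]
    return q


def q_learning(epsilon, steps=20000, seed=0):
    """Tabular Q-learning while following an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    s = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(N_ACTIONS)  # explore: uniform random action
        else:
            a = max(range(N_ACTIONS), key=lambda b: q[s][b])  # exploit
        s_next = NEXT[s][a]
        # Step size 1 is enough here only because the toy MDP is deterministic.
        q[s][a] = REWARD[s][a] + GAMMA * max(q[s_next])
        s = s_next
    return q


def max_gap(q1, q2):
    """Largest absolute difference between two Q-tables."""
    return max(abs(x - y) for r1, r2 in zip(q1, q2) for x, y in zip(r1, r2))


if __name__ == "__main__":
    q_star = value_iteration()
    for eps in (1.0, 0.2):  # 1.0 = uniform random, 0.2 = mostly greedy
        print(eps, max_gap(q_learning(eps), q_star))
```

This only illustrates that the learned optimal values do not depend on which exploring policy gathered the data. It says nothing yet about how the greedy policy moves when the reward signal itself is replaced by an approximate one, which is the part that still needs thought.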

Behavior learning in SSA: a mid-work report

Sunday, July 25th, 2004

PACness of MDP

Thursday, July 22nd, 2004

Post those-busy-days era: Chaos control and Co-evolution

Tuesday, July 20th, 2004

non-singular situation

Tuesday, July 20th, 2004

These busy days!

Friday, July 16th, 2004