Eduardo Alonso
Reader in Computing
Senator
Room A309D
Department of Computing
School of Informatics
City University London
London EC1V 0HB
E.Alonso@city.ac.uk
tel: +44 20 7040 4049
fax: +44 20 7040 0244
FULLY-FUNDED DOCTORAL STUDENTSHIPS:
Click here for general details, and here on how to submit an application. You can also contact me directly to discuss your proposal and my work at E.Alonso@city.ac.uk. Requirements: Good programming skills (preferably but not limited to Java/C++; MATLAB) and expertise in two of the following areas: neuroscience (learning and behaviour), machine learning, control and optimisation, dynamic systems. Applicants also need a strong mathematical background.
- Simulators of psychological models of conditioning:
It is widely accepted that conditioning is at the basis of most learning phenomena. It is thus paramount that we develop accurate simulators of models of conditioning.
In collaboration with Esther Mondragon at the Centre for Computational and Animal Learning Research and Alberto Fernandez-Gil (Universidad Rey Juan Carlos), I am developing simulators of classical conditioning models. We have just released version 3 of our Rescorla and Wagner Simulator (CAL-RWSim, a Java simulator of Rescorla and Wagner, 1972), and plan to extend it with other models (Mackintosh 1975; Pearce and Hall, 1980) in the near future. We are also considering the simulation of operant conditioning models (for instance, Staddon, 1979).
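For readers unfamiliar with the model, the Rescorla and Wagner (1972) learning rule can be sketched in a few lines. This is only an illustrative Python sketch, not code from CAL-RWSim: the function name, the trial format and the use of a single beta for reinforced and non-reinforced trials are simplifying assumptions made here.

```python
def rescorla_wagner(trials, alphas, beta=0.1, lam_reinforced=1.0, lam_nonreinforced=0.0):
    """Apply the Rescorla-Wagner update over a sequence of trials.

    trials: list of (cues_present, reinforced) pairs, e.g. (("A", "B"), True)
    alphas: dict mapping each cue name to its salience (alpha)
    beta:   learning-rate parameter of the US (assumed equal on all trials)
    Returns the final associative strengths and their trial-by-trial history.
    """
    V = {cue: 0.0 for cue in alphas}
    history = []
    for cues, reinforced in trials:
        lam = lam_reinforced if reinforced else lam_nonreinforced
        # Prediction error: asymptote minus the summed strength of present cues.
        error = lam - sum(V[c] for c in cues)
        # Only cues present on the trial are updated.
        for c in cues:
            V[c] += alphas[c] * beta * error
        history.append(dict(V))
    return V, history


# Hypothetical blocking design: A alone is trained first, then the compound AB.
blocking = [(("A",), True)] * 200 + [(("A", "B"), True)] * 50
strengths, _ = rescorla_wagner(blocking, {"A": 0.5, "B": 0.5}, beta=0.5)
```

Because A already predicts the outcome by the time B is introduced, the shared error term is close to zero and B acquires almost no strength, which is the blocking effect the model is known for.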
We are developing a Java program for Temporal Difference, and are simulating results obtained by Charlotte Bonardi (University of Nottingham) and Domhnall Jennings (University of Newcastle), on the role of time variance and uncertainty in overshadowing. Details to come.
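As a rough illustration of the Temporal Difference idea, the sketch below learns a prediction of the US at each time step within a trial using tabular TD(0). It is a generic textbook-style example, not the Java program mentioned above, and it does not model time variance or overshadowing; the function name, the trial layout and all parameter values are assumptions chosen for the example.

```python
def td_prediction(n_steps=10, us_time=8, n_trials=2000, alpha=0.1, gamma=0.95):
    """Tabular TD(0) over the time steps of a trial.

    Each time step is treated as a state; a reward of 1 is delivered on the
    transition into step `us_time`. Returns the learned prediction per step.
    """
    # One value per time step, plus a terminal value that is never updated.
    V = [0.0] * (n_steps + 1)
    for _ in range(n_trials):
        for t in range(n_steps):
            reward = 1.0 if t + 1 == us_time else 0.0
            # TD error: reward plus discounted next prediction, minus current one.
            delta = reward + gamma * V[t + 1] - V[t]
            V[t] += alpha * delta
    return V


predictions = td_prediction()
```

After enough trials the prediction just before the US approaches 1, and earlier steps approach that value discounted by gamma for each step of delay.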
- Reinforcement Learning and Adaptive Dynamic Programming:
Michael Fairbank's PhD work extends Temporal Difference methods to continuous state spaces and deterministic environments: he has invented a new algorithm, Value Gradient Learning-VGL(lambda), that learns gradients of values rather than values, and proved that it converges to local optimality under certain conditions when lambda=1; he has also proved that VGL(1) is equivalent to Policy Gradient Learning and to Backpropagation through time (BPTT), and shown some interesting experimental results (arXiv:1101.0428v1; arXiv:1107.4606v1; see NEWS below). We would like to complement this line of research and investigate new Adaptive Dynamic Programming methods that make use of model functions --in line with Pontryagin's maximum principle. The long-term aim is to develop a set of Reinforcement Learning methods that apply both to traditional stochastic model-free problems and to model-based deterministic ones. In this, we are collaborating with Danil Prokhorov, from Toyota Research Institute, Michigan.
We are also working with Shuhui Li at The University of Alabama, and Don Wunsch at the Missouri University of Science and Technology, in applying VGL to real-life control problems.
- Formal models of evolution and learning:
I am interested in exploring variational principles in learning and behaviour. I am investigating classical conditioning and instrumental conditioning using calculus of variations and optimal control methods. Such methods have proved useful in expressing extremal principles that reflect constitutive and conservation laws as well as underlying symmetries in Nature. This project is to be developed within the recently founded Mathematical and Computational Behaviour and Evolution Research Group (MCBE) that I co-lead with Mark Broom.
- Reinforcement Learning and Associative Learning:
Reinforcement Learning methods have been proposed as computational models of learning and behaviour. We follow a complementary agenda and investigate how to enhance Reinforcement Learning algorithms by borrowing concepts and techniques from associative learning. More specifically, we have developed Driven-Q learning, a variation of Q-learning where agents are provided with internal drives against which they assess the value of states according to a similarity function (instead of ad hoc rewards). We have proved convergence to optimality and obtained preliminary yet encouraging generalisation results, which we plan to investigate further (see NEWS below).
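To give a flavour of the idea, here is a toy Python sketch in which the reward term of standard tabular Q-learning is replaced by the similarity between the reached state and an internal drive. It is not the published Driven-Q algorithm: the chain environment, the linear similarity function, the uniformly random exploration and every name and parameter value are hypothetical choices made for this illustration.

```python
import random


def similarity(state, drive, span):
    """Hypothetical similarity: 1 when the state matches the drive, 0 at the far end."""
    return 1.0 - abs(state - drive) / span


def driven_q_sketch(n_states=5, drive=4, steps=20000, alpha=0.1, gamma=0.9, seed=0):
    """Q-learning on a chain where similarity to a drive stands in for reward."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 moves left, action 1 moves right
    Q = [[0.0, 0.0] for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        action = rng.randrange(2)  # uniformly random exploration (off-policy)
        next_state = min(max(state + moves[action], 0), n_states - 1)
        # The agent scores the reached state against its drive instead of
        # receiving an externally defined reward.
        signal = similarity(next_state, drive, n_states - 1)
        target = signal + gamma * max(Q[next_state])
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state
    return Q


Q = driven_q_sketch()
greedy = [max(range(2), key=lambda a: q[a]) for q in Q]
```

In this toy setting the greedy policy moves towards the state that best matches the drive; how drives and similarity are actually defined is described in the ICAART-2012 paper listed under NEWS.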
NEWS
- Accepted paper in Neural Computation: Efficient Calculation of the Gauss-Newton Approximation of the Hessian Matrix in Neural Networks, with Michael Fairbank.
- Accepted chapter in Frank Lewis and Derong Liu (Eds.), Handbook of Learning and Approximate Dynamic Programming, Volume 2, Wiley-IEEE Press: Approximating Optimal Control with Value Gradient Learning, with Michael Fairbank and Danil Prokhorov.
- Accepted paper for The Fourth International Conference on Agents and Artificial Intelligence (ICAART-2012), Internally Driven Q-learning: Convergence and Generalization Results, with Esther Mondragon and Niclas Kjall-Ohlsson. Vilamoura, Portugal, February 2012.
- Accepted paper for The 38th ABAI (Association for Behavior Analysis International) Annual Convention, on variational principles of classical and operant conditioning. Seattle, WA, May 2012.
- Awarded a Royal Society Research Grant, The British History of Artificial Intelligence --as we speak I am writing the history of The Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB), of which I served as vice-chair between 2003 and 2006 and as co-editor of the AISB Journal.
- Appointed Member of the Engineering and Physical Sciences Research Council (EPSRC) Peer Review College for 2009-2012.
- Editing, with Nestor Schmajuk (Duke University), a Special Issue on Computational Models of Classical Conditioning for Learning & Behavior.
- Acting as reviewer for Artificial Intelligence.
- I have edited with Esther Mondragon the book Computational Neuroscience for Advancing Artificial Intelligence: Models, Methods and Applications, IGI Global, 2011.
- I am contributing to The Cambridge Handbook of Artificial Intelligence, Cambridge University Press, Keith Frankish and William Ramsey (Eds.) with a chapter on Actions and Agents.