Multi-Reward Policies for Medical Applications: Anthrax Attacks and Smart Wheelchairs

Medical decisions are often difficult; they involve uncertain information, multiple-objectives and debatable outcomes. In this work, we discuss the application of the multi-reward partially-observable Markov decision process (MR-POMDP) and NSGA2-LS, a hybridised multi-objective evolutionary solver, to two problems in the medical domain: anthrax re- sponse and smart-wheelchair control. For the first problem, we use a discrete model and analyse the trade-offs between the best solutions (in the form of finite-state controllers) found by our evolutionary algorithm. For the second, we contribute an extension of our method to the continuous space and optimising recurrent neural networks (RNNs) for use on medical robots such as smart wheelchairs.

Download | ACM Digital Library Link