Nonapproximability Results for Partially Observable Markov Decision Processes
Nonapproximability Results for Partially Observable Markov Decision Processes
We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies are unlikely to or simply don't have guarantees of finding policies within a constant factor or a constant summand of optimal. Here ``unlikely'' means ``unless some complexity classes collapse,'' where the collapses …