UMDPS

Acronym | Definition
UMDPS | University of Maryland Department of Public Safety
References in periodicals archive:
Since UMDPs are POMDPs, the same hardness proof as in Theorem 4.15 applies for the time-dependent policy existence problem for POMDPs.
The stationary, time-dependent and history-dependent policy existence problems for MDPs and UMDPs with nonnegative rewards are NL-complete.
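The excerpts above concern UMDPs (unobservable MDPs), where the agent receives no observations, so a time-dependent policy is just a fixed action sequence. A minimal sketch of why flat-UMDP policy evaluation is easy: the expected reward of an action sequence can be computed by pushing the state distribution forward one matrix product per step. All names here (`evaluate_umdp_policy`, `P`, `R`, `mu0`) are illustrative, not taken from the cited paper.

```python
import numpy as np

def evaluate_umdp_policy(P, R, mu0, actions):
    """Expected total reward of the action sequence `actions` in a UMDP.
    P[a] is the |S|x|S| transition matrix for action a,
    R[a] is the length-|S| reward vector for action a,
    mu0 is the initial state distribution."""
    mu = np.asarray(mu0, dtype=float)
    total = 0.0
    for a in actions:
        total += mu @ R[a]   # expected reward collected this step
        mu = mu @ P[a]       # push the state distribution forward
    return total

# Two states, two actions: action 0 stays put, action 1 swaps the states.
P = {0: np.eye(2), 1: np.array([[0., 1.], [1., 0.]])}
R = {0: np.array([1., 0.]), 1: np.array([0., 0.])}
print(evaluate_umdp_policy(P, R, [1.0, 0.0], [0, 1, 0]))  # prints 1.0
```

Each step costs one matrix-vector product, so evaluating a length-T sequence on a flat UMDP takes time polynomial in T and |S|, consistent with the low (NL-complete) complexity of the flat existence problems quoted above.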
The policy existence problems for compressed UMDPs are simpler than for compressed MDPs.
The stationary policy existence problem for compressed UMDPs is complete for NP^PP.
By Theorem 4.9, there is a polynomial-time computable two-parameter function f and a polynomial p, such that for every x, x ∈ A if and only if there exists a y of length p(|x|) such that f(x, y) outputs a positive instance (M, π) of the stationary policy evaluation problem for compressed UMDPs. We fix an x.
The time-dependent policy existence problem for compressed UMDPs is NP^PP-complete.
The stationary, time-dependent and history-dependent policy existence problems for compressed MDPs, compressed POMDPs, or compressed UMDPs with nonnegative rewards are NP-complete.
Moreover, because the horizon is exponential in the size of the POMDP's description, even time-dependent policies for UMDPs cannot be efficiently specified.
As a consequence, we can prove the long-term policy evaluation problem for compressed UMDPs is exponentially more complex than for the flat ones.
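In a compressed (succinctly represented) UMDP the horizon can be exponential in the description size, yet the distribution after 2^k steps under a fixed stationary choice is still computable with only k matrix squarings, which is the kind of iterated-squaring idea underlying PSPACE upper bounds. A minimal sketch, assuming a single fixed transition matrix `P` (the function name `distribution_after` is illustrative):

```python
import numpy as np

def distribution_after(mu0, P, k):
    """State distribution after 2**k steps under one fixed action,
    computed with k repeated squarings of the transition matrix
    instead of 2**k individual multiplications."""
    M = np.asarray(P, dtype=float)
    for _ in range(k):
        M = M @ M              # M now represents twice as many steps
    return np.asarray(mu0, dtype=float) @ M

# Swap the two states each step; after 2**10 (an even number of) steps
# the chain is back where it started.
P = np.array([[0., 1.], [1., 0.]])
print(distribution_after([1., 0.], P, 10))  # prints [1. 0.]
```

The squared matrices themselves can be exponentially large objects in general, so this trick alone does not make the compressed problems easy; it only illustrates how an exponential horizon can be handled with polynomially many reusable intermediate results.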
The long-term stationary policy evaluation problem for compressed UMDPs is PSPACE-complete.
The long-term stationary policy evaluation problem for compressed UMDPs with nonnegative rewards is PSPACE-complete.
The long-term stationary policy existence problem for compressed UMDPs is PSPACE-complete.