to more complex environments with (i) reachability, (ii) safety-constrained reachability, or (iii) discounted-reward