Multi-Module Reinforcement Learning from User Feedback

We designed the state/action abstraction, the sub-policy structure, and the multi-objective reward model that let a photo-taking robot learn from explicit human feedback through reinforcement learning, refining its movement, timing, and interaction strategy.
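As a rough illustration of what multi-objective reward modeling over several sub-policies could look like, the sketch below turns per-module user ratings into both per-module rewards and a single weighted scalar. Everything in it is an assumption made for illustration: the module names simply mirror the three aspects named above, and the rating range, the `Feedback` type, the weights, and the weighted-sum scalarization are hypothetical and not taken from the described system.

```python
from dataclasses import dataclass
from typing import Dict, Mapping

# Hypothetical module names, mirroring the three aspects named in the text.
MODULES = ("movement", "timing", "interaction")


@dataclass
class Feedback:
    """Explicit user ratings, one per module (assumed to lie in [-1, 1])."""
    movement: float
    timing: float
    interaction: float


def per_module_rewards(feedback: Feedback) -> Dict[str, float]:
    """Route each rating to the sub-policy it is assumed to evaluate."""
    return {name: getattr(feedback, name) for name in MODULES}


def scalarize(feedback: Feedback, weights: Mapping[str, float]) -> float:
    """Weighted-sum scalarization, normalized by the total weight."""
    total = sum(weights[name] for name in MODULES)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * getattr(feedback, name) for name in MODULES) / total


# Illustrative values only: the user liked the movement, was neutral on
# timing, and disliked the interaction.
example = Feedback(movement=1.0, timing=0.0, interaction=-1.0)
example_weights = {"movement": 2.0, "timing": 1.0, "interaction": 1.0}
combined = scalarize(example, example_weights)
```

Keeping the per-module rewards separate lets each sub-policy be updated from the feedback that concerns it, while the scalar form is one common way to track overall progress. A weighted sum is only one possible scalarization, chosen here for simplicity.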