<aside> ❗

This was originally a project building on my master's dissertation, Planning Fast and Slow: Trajectory Synthesis via Conditional Flow Matchings. It is now a side project, and I'm prioritizing Generative World Models instead.

I've decided to de-prioritize this because my original proposal was ill-posed and, to be transparent, I didn't conduct an in-depth literature review. The lessons learned are documented in Retro From My First Three Months.

</aside>

Abstract


Recent advances in test-time inference (TRM, HRM) have opened new avenues for improving model performance. In parallel, generative models such as diffusion and flow-based models have emerged as effective policy parameterizations in robotics. In this work, we argue that these are complementary components of a unified system. We extend the reasoning process by framing it as an agent interacting with its internal world model. We present Recursive Flow Policy (RFP), a framework that integrates test-time compute into continuous control. To our knowledge, this is the first work to reframe planning as a reasoning process that iteratively refines trajectories.

Personal Notes


23 Dec 2025

22 Dec 2025

18 Dec 2025

17 Dec 2025

16 Dec 2025

15 Dec 2025

11 Dec 2025

10 Dec 2025

9 Dec 2025

7 Dec 2025

5 Dec 2025

3 Dec 2025

1 Dec 2025