Response Example

Table of contents

Comparing PRMs: Math-psa (Ours) vs. Math-Shepherd
Justifying RL Training
Exploring Test-time Computation

Comparing PRMs: Math-psa (Ours) vs. Math-Shepherd

QA 1 QA 2

Justifying RL Training

QA 3 QA 4

Exploring Test-time Computation

QA 5 QA 6 QA 7