Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key
Tianle Wang, Zhaoyang Wang, Guangchen Lan, Xinpeng Wei, Sipeng Zhang, Guanwen Qiu, Abulhair Saparov