CONCLUSION
This study establishes that reinforcement learning (RL), combined with physically realistic dynamics, can automatically discover fuel-efficient and stable descent strategies comparable to classical optimal-control results.
Through repeated interaction with a physics-based environment, the PPO agent learned to minimize terminal velocity with minimal propellant expenditure, reproducing the behavioral patterns of real lunar landers.
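A minimal sketch of such a setup is given below. It is illustrative only: the paper's actual simulator, state variables, and reward weights are not reproduced in this section, so apart from the lunar gravity constant, every numerical value (masses, thrust, exhaust velocity, reward coefficients), the LunarDescent1D name, and the use of Gymnasium with stable-baselines3 are assumptions.

```python
# Illustrative 1-D powered-descent environment. All constants below are
# assumed for the sketch; only G_MOON is a physical fact.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

G_MOON = 1.62  # lunar surface gravity, m/s^2


class LunarDescent1D(gym.Env):
    """Vertical descent with throttleable thrust and finite propellant."""

    def __init__(self):
        # Observation: [altitude m, vertical velocity m/s, propellant kg]
        self.observation_space = spaces.Box(
            low=np.array([0.0, -500.0, 0.0], dtype=np.float32),
            high=np.array([5000.0, 500.0, 300.0], dtype=np.float32),
            dtype=np.float32,
        )
        # Action: throttle fraction in [0, 1]
        self.action_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)
        self.dt = 0.1              # integration step, s (assumed)
        self.dry_mass = 700.0      # kg (assumed)
        self.max_thrust = 4000.0   # N (assumed)
        self.v_exhaust = 3000.0    # effective exhaust velocity, m/s (assumed)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.h, self.v, self.fuel = 1000.0, -30.0, 300.0  # assumed initial state
        return self._obs(), {}

    def _obs(self):
        return np.array([self.h, self.v, self.fuel], dtype=np.float32)

    def step(self, action):
        throttle = float(np.clip(action[0], 0.0, 1.0)) if self.fuel > 0.0 else 0.0
        thrust = throttle * self.max_thrust
        mass = self.dry_mass + self.fuel
        # Semi-implicit Euler integration: update velocity, then altitude.
        self.v += (thrust / mass - G_MOON) * self.dt
        self.h += self.v * self.dt
        burned = thrust / self.v_exhaust * self.dt  # mdot = F / v_e
        self.fuel = max(self.fuel - burned, 0.0)

        terminated = self.h <= 0.0
        # Per-step penalty on propellant use; terminal penalty on impact speed.
        reward = -0.1 * burned
        if terminated:
            reward += 100.0 - 10.0 * abs(self.v)  # assumed weights
        return self._obs(), reward, terminated, False, {}


# Training sketch (assumes stable-baselines3 is installed).
from stable_baselines3 import PPO

model = PPO("MlpPolicy", LunarDescent1D(), verbose=0)
model.learn(total_timesteps=200_000)
```

The per-step propellant penalty and the terminal penalty on impact speed encode exactly the trade-off described above: touching down slowly while burning as little fuel as possible.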
The resulting control curves exhibited adaptive throttle modulation, smooth braking, and steady touchdown profiles, without any explicit programming of thrust laws.
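Continuing the sketch above (and assuming its trained model is available), control curves of this kind can be recovered by rolling out the policy and plotting altitude, velocity, and throttle against time; the figure layout is illustrative, not the paper's.

```python
# Roll out the trained policy from the sketch above and plot the
# descent profiles (altitude, velocity, throttle vs. time).
import matplotlib.pyplot as plt

env = LunarDescent1D()
obs, _ = env.reset()
altitude, velocity, throttle = [], [], []
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, _, done, _, _ = env.step(action)
    altitude.append(obs[0])
    velocity.append(obs[1])
    throttle.append(float(action[0]))

t = [i * env.dt for i in range(len(altitude))]
fig, axes = plt.subplots(3, 1, sharex=True)
for ax, series, label in zip(axes, (altitude, velocity, throttle),
                             ("altitude [m]", "velocity [m/s]", "throttle [-]")):
    ax.plot(t, series)
    ax.set_ylabel(label)
axes[-1].set_xlabel("time [s]")
plt.show()
```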
This work shows that experience-driven learning can converge to control behavior consistent with Newtonian dynamics, combining computational intelligence with physical reasoning. These results indicate that reinforcement learning is a strong candidate for next-generation autonomous spacecraft guidance: adaptable in uncertain situations and resilient to the limitations inherent in more deterministic controllers.