{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T07:14:01Z","timestamp":1774682041105,"version":"3.50.1"},"reference-count":42,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,2,10]],"date-time":"2020-02-10T00:00:00Z","timestamp":1581292800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51705514"],"award-info":[{"award-number":["51705514"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/R026173\/1"],"award-info":[{"award-number":["EP\/R026173\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2016YFC0300401"],"award-info":[{"award-number":["2016YFC0300401"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Mobile manipulation has a broad range of applications in robotics. However, it is usually more challenging than fixed-base manipulation due to the complex coordination of a mobile base and a manipulator. Although recent works have demonstrated that deep reinforcement learning is a powerful technique for fixed-base manipulation tasks, most of them are not applicable to mobile manipulation. This paper investigates how to leverage deep reinforcement learning to tackle whole-body mobile manipulation tasks in unstructured environments using only on-board sensors. 
A novel mobile manipulation system which integrates the state-of-the-art deep reinforcement learning algorithms with visual perception is proposed. It has an efficient framework decoupling visual perception from the deep reinforcement learning control, which enables its generalization from simulation training to real-world testing. Extensive simulation and experiment results show that the proposed mobile manipulation system is able to grasp different types of objects autonomously in various simulation and real-world scenarios, verifying the effectiveness of the proposed mobile manipulation system.<\/jats:p>","DOI":"10.3390\/s20030939","type":"journal-article","created":{"date-parts":[[2020,2,11]],"date-time":"2020-02-11T09:25:21Z","timestamp":1581413121000},"page":"939","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":79,"title":["Learning Mobile Manipulation through Deep Reinforcement Learning"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2316-2331","authenticated-orcid":false,"given":"Cong","family":"Wang","sequence":"first","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China"},{"name":"Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"},{"name":"School of Engineering &amp; Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK"}]},{"given":"Qifeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China"},{"name":"Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, 
China"}]},{"given":"Qiyan","family":"Tian","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China"},{"name":"Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, China"}]},{"given":"Shuo","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China"},{"name":"Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, China"}]},{"given":"Xiaohui","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China"},{"name":"Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, China"}]},{"given":"David","family":"Lane","sequence":"additional","affiliation":[{"name":"School of Engineering &amp; Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1596-289X","authenticated-orcid":false,"given":"Yvan","family":"Petillot","sequence":"additional","affiliation":[{"name":"School of Engineering &amp; Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK"}]},{"given":"Sen","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Engineering &amp; Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK"}]}],"member":"1968","published-online":{"date-parts":[[2020,2,10]]},"reference":[{"key":"ref_1","unstructured":"Urakami, Y., Hodgkinson, A., Carlin, C., Leu, R., Rigazio, L., and Abbeel, P. (2019). DoorGym: A Scalable Door Opening Environment And Baseline Agent. 
CoRR, Available online: https:\/\/arxiv.org\/pdf\/1908.01887.pdf."},{"key":"ref_2","unstructured":"Lynch, C., Khansari, M., Xiao, T., Kumar, V., Tompson, J., Levine, S., and Sermanet, P. (2019). Learning Latent Plans from Play. CoRR, Available online: https:\/\/arxiv.org\/pdf\/1903.01973.pdf."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V., and Hutter, M. (2019). Learning agile and dynamic motor skills for legged robots. Sci. Robot., 4.","DOI":"10.1126\/scirobotics.aau5872"},{"key":"ref_4","first-page":"1334","article-title":"End-to-end training of deep visuomotor policies","volume":"17","author":"Levine","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1177\/0278364917710318","article-title":"Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection","volume":"37","author":"Levine","year":"2018","journal-title":"Int. J. Rob. Res."},{"key":"ref_6","unstructured":"Akkaya, I., Andrychowicz, M., Chociej, M., Litwin, M., McGrew, B., Petron, A., Paino, A., Plappert, M., Powell, G., and Ribas, R. (2019). Solving Rubik\u2019s Cube with a Robot Hand. CoRR, Available online: https:\/\/arxiv.org\/pdf\/1910.07113.pdf."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Matl, M., Satish, V., Danielczuk, M., DeRose, B., McKinley, S., and Goldberg, K. (2019). Learning ambidextrous robot grasping policies. Sci. Rob., 4.","DOI":"10.1126\/scirobotics.aau4984"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1002\/rob.21683","article-title":"The DARPA Robotics Challenge Finals: Results and Perspectives","volume":"34","author":"Krotkov","year":"2017","journal-title":"J. 
Field Rob."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1002\/rob.21851","article-title":"Journal of Field Robotics special issue on MBZIRC 2017 Challenges in Autonomous Field Robotics","volume":"36","author":"Dias","year":"2019","journal-title":"J. Field Rob."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1342","DOI":"10.1002\/rob.21825","article-title":"Deployment of an autonomous mobile manipulator at MBZIRC","volume":"35","author":"Carius","year":"2018","journal-title":"J. Field Robot."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1002\/rob.21826","article-title":"Team NimbRo at MBZIRC 2017: Autonomous valve stem turning using a wrench","volume":"36","author":"Schwarz","year":"2019","journal-title":"J. Field Robot."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1109\/MRA.2010.938502","article-title":"ROS on the PR2 [ROS Topics]","volume":"17","author":"Cousins","year":"2010","journal-title":"IEEE Rob. Autom. Mag."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Yamamoto, T., Terada, K., Ochiai, A., Saito, F., Asahara, Y., and Murase, K. (2018, January 1\u20135). Development of the Research Platform of a Domestic Mobile Manipulator Utilized for International Competition and Field Test. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593798"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3687","DOI":"10.1109\/LRA.2019.2927955","article-title":"Whole-Body MPC for a Dynamically Stable Mobile Manipulator","volume":"4","author":"Minniti","year":"2019","journal-title":"IEEE Rob. Autom. 
Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1109\/TII.2018.2879426","article-title":"Dexterous Grasping by Manipulability Selection for Mobile Manipulator With Visual Guidance","volume":"15","author":"Chen","year":"2019","journal-title":"IEEE Trans. Ind. Inf."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Klamt, T., Rodriguez, D., Schwarz, M., Lenz, C., Pavlichenko, D., Droeschel, D., and Behnke, S. (2018, January 1\u20135). Supervised Autonomous Locomotion and Manipulation for Disaster Response with a Centaur-Like Robot. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems, IROS 2018, Madrid, Spain.","DOI":"10.1109\/IROS.2018.8594509"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0278364919887447","article-title":"Learning Dexterous In-Hand Manipulation","volume":"39","author":"Andrychowicz","year":"2020","journal-title":"Int. J. Rob. Res."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Haarnoja, T., Ha, S., Zhou, A., Tan, J., Tucker, G., and Levine, S. (2019). Learning to Walk Via Deep Reinforcement Learning. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1812.11103.pdf.","DOI":"10.15607\/RSS.2019.XV.011"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1049","DOI":"10.1109\/TRO.2014.2316022","article-title":"Catching Objects in Flight","volume":"30","author":"Kim","year":"2014","journal-title":"IEEE Trans. Rob."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1109\/TRO.2016.2536749","article-title":"A Dynamical System Approach for Softly Catching a Flying Object: Theory and Experiment","volume":"32","author":"Salehian","year":"2016","journal-title":"IEEE Trans. Rob."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zeng, A., Song, S., Lee, J., Rodriguez, A., and Funkhouser, T.A. (2019). TossingBot: Learning to Throw Arbitrary Objects with Residual Physics. 
arXiv, Available online: https:\/\/arxiv.org\/pdf\/1903.11239.pdf.","DOI":"10.15607\/RSS.2019.XV.004"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1613\/jair.3451","article-title":"Learning and Reasoning with Action-Related Places for Robust Mobile Manipulation","volume":"43","author":"Stulp","year":"2012","journal-title":"J. Artif. Intell. Res."},{"key":"ref_23","unstructured":"Li, C., Xia, F., Martin, R.M., and Savarese, S. (2019). HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1910.11432.pdf."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Welschehold, T., Dornhege, C., and Burgard, W. (2017, January 24\u201328). Learning mobile manipulation actions from human demonstrations. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8206152"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1109\/TMECH.2017.2717461","article-title":"Reinforcement Learning of Manipulation and Grasping Using Dynamical Movement Primitives for a Humanoidlike Mobile Manipulator","volume":"23","author":"Li","year":"2018","journal-title":"IEEE\/ASME Trans. Mechatron."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1177\/0278364911435515","article-title":"One-shot visual appearance learning for mobile manipulation","volume":"31","author":"Walter","year":"2012","journal-title":"Int. J. Rob. Res."},{"key":"ref_27","unstructured":"Sutton, R.S., McAllester, D.A., Singh, S.P., and Mansour, Y. (December, January 29). Policy Gradient Methods for Reinforcement Learning with Function Approximation. Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Todorov, E., Erez, T., and Tassa, Y. 
(2012, January 7\u201312). MuJoCo: A physics engine for model-based control. Proceedings of the 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal.","DOI":"10.1109\/IROS.2012.6386109"},{"key":"ref_29","unstructured":"Coumans, E., and Bai, Y. (2019, December 12). PyBullet, a Python Module for Physics Simulation for Games, Robotics and Machine Learning. Available online: http:\/\/pybullet.org."},{"key":"ref_30","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1707.06347.pdf."},{"key":"ref_31","unstructured":"Schulman, J., Levine, S., Abbeel, P., Jordan, M.I., and Moritz, P. (2015, January 6\u201311). Trust Region Policy Optimization. Proceedings of the 32nd International Conference on Machine Learning ICML, Lille, France."},{"key":"ref_32","unstructured":"Schulman, J., Moritz, P., Levine, S., Jordan, M.I., and Abbeel, P. (2018). High-Dimensional Continuous Control Using Generalized Advantage Estimation. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1506.02438.pdf."},{"key":"ref_33","unstructured":"Tremblay, J., To, T., Sundaralingam, B., Xiang, Y., Fox, D., and Birchfield, S. (2018). Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1809.10790.pdf."},{"key":"ref_34","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1606.01540.pdf."},{"key":"ref_35","unstructured":"Liang, E., Liaw, R., Nishihara, R., Moritz, P., Fox, R., Goldberg, K., Gonzalez, J., Jordan, M.I., and Stoica, I. (2018, January 10\u201315). RLlib: Abstractions for Distributed Reinforcement Learning. 
Proceedings of the 35th International Conference on Machine Learning, ICML, Stockholmsm\u00e4ssan, Stockholm, Sweden."},{"key":"ref_36","unstructured":"Moritz, P., Nishihara, R., Wang, S., Tumanov, A., Liaw, R., Liang, E., Elibol, M., Yang, Z., Paul, W., and Jordan, M.I. (2018, January 8\u201310). Ray: A Distributed Framework for Emerging AI Applications. Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, Carlsbad, CA, USA."},{"key":"ref_37","unstructured":"Liaw, R., Liang, E., Nishihara, R., Moritz, P., Gonzalez, J.E., and Stoica, I. (2018). Tune: A Research Platform for Distributed Model Selection and Training. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1807.05118.pdf."},{"key":"ref_38","unstructured":"Kingma, D.P., and Ba, J. (2017). Adam: A Method for Stochastic Optimization. arXiv, Available online: https:\/\/arxiv.org\/pdf\/1412.6980.pdf."},{"key":"ref_39","unstructured":"Fujimoto, S., van Hoof, H., and Meger, D. (2018, January 10\u201315). Addressing Function Approximation Error in Actor-Critic Methods. Proceedings of the 35th International Conference on Machine Learning, Stockholmsm\u00e4ssan, Stockholm, Sweden."},{"key":"ref_40","unstructured":"Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T.P., Harley, T., Silver, D., and Kavukcuoglu, K. (2016, January 19\u201324). Asynchronous Methods for Deep Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_41","unstructured":"Quigley, M., Conley, K., Gerkey, B.P., Faust, J., Foote, T., Leibs, J., Wheeler, R., and Ng, A.Y. (2019, December 12). ROS: An open-source Robot Operating System. ICRA Workshop on Open Source Software. 
Available online: http:\/\/www.willowgarage.com\/sites\/default\/files\/icraoss09-ROS.pdf."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1109\/70.34770","article-title":"A new technique for fully autonomous and efficient 3D robotics hand\/eye calibration","volume":"5","author":"Tsai","year":"1989","journal-title":"IEEE Trans. Rob. Autom."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/3\/939\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T08:56:33Z","timestamp":1760172993000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/3\/939"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,10]]},"references-count":42,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,2]]}},"alternative-id":["s20030939"],"URL":"https:\/\/doi.org\/10.3390\/s20030939","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,2,10]]}}}