A key challenge in the LEMURS project is the availability of an accurate and efficient simulation environment. Such a simulator allows robotic manipulation tasks to be learned and validated in a virtual setting before they are transferred to real robotic platforms, which reduces risk, cost, and development time.
Task T3.3 – Virtual Environment Implementation focuses on developing a realistic simulation framework that supports the evaluation and optimization of machine learning-based manipulation algorithms. The task calls for a virtual environment, implemented in the Stonefish simulator, that models the upgraded Girona 500 I-AUV. The environment must be designed to:
- Support the evaluation and optimization of ML-based algorithms
- Define appropriate reward functions (sparse and/or dense) for manipulation tasks
- Be calibrated through standard benchmark tasks such as reaching, pushing, and pick-and-place
- Ensure sufficient realism to enable effective Sim2Real transfer
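To make the distinction between sparse and dense rewards concrete, the following is a minimal sketch for a reaching task. It is an illustration only, not the reward used in LEMURS: the function names, the 5 cm tolerance, and the representation of positions as (x, y, z) tuples in metres are all assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3-D points given as (x, y, z) tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def sparse_reward(ee_pos, goal_pos, tolerance=0.05):
    """Return 1.0 only when the end effector is within `tolerance` of the goal.

    The tolerance value (0.05 m) is a hypothetical choice for illustration.
    """
    return 1.0 if distance(ee_pos, goal_pos) <= tolerance else 0.0


def dense_reward(ee_pos, goal_pos):
    """Return the negative distance to the goal, giving a signal at every step."""
    return -distance(ee_pos, goal_pos)
```

A sparse reward is simple to specify but gives no feedback until the goal is reached, whereas a dense reward guides exploration at the cost of more careful shaping.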
To address the challenges of realism and computational efficiency, a dual development approach has been adopted:
1. Stonefish-Based Simulation (High Realism)
We are extending the highly realistic Stonefish underwater simulator to support reinforcement learning workflows. This includes adapting the simulator for training and evaluating learning-based control policies (see https://github.com/narcispr/stonefish_rl).
2. MuJoCo-Based Simulation (High Performance)
In parallel, we are leveraging the widely used MuJoCo simulator, known for its speed and efficiency in machine learning applications. A preliminary underwater model inspired by the Girona 500 I-AUV has already been implemented. Using this framework, we have successfully trained policies for a free-floating dual manipulation control task, demonstrating the feasibility of learning complex behaviors in simulation.
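One way to keep learning code independent of the simulator is to place both backends behind the same reset/step interface. The sketch below follows the Gymnasium return convention (`reset` returns an observation and an info dictionary; `step` returns observation, reward, terminated, truncated, info). It does not reflect the actual API of stonefish_rl or of the MuJoCo model: `SimBackend`, `ToyBackend`, and `ReachEnv` are hypothetical names, and the toy backend is a 1-D stand-in for a real simulator.

```python
from typing import Protocol, Sequence


class SimBackend(Protocol):
    """Hypothetical minimal interface that a simulator adapter would expose."""

    def reset(self) -> Sequence[float]: ...

    def advance(self, action: Sequence[float]) -> Sequence[float]: ...


class ToyBackend:
    """Stand-in for a real simulator: a 1-D point displaced by the action."""

    def __init__(self) -> None:
        self.x = 0.0

    def reset(self):
        self.x = 0.0
        return [self.x]

    def advance(self, action):
        self.x += action[0]
        return [self.x]


class ReachEnv:
    """Gymnasium-style wrapper exposing reset() and step() over any SimBackend."""

    def __init__(self, backend, goal=1.0, tolerance=0.05, max_steps=50):
        self.backend = backend
        self.goal = goal
        self.tolerance = tolerance
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return list(self.backend.reset()), {}

    def step(self, action):
        obs = list(self.backend.advance(action))
        self.steps += 1
        error = abs(obs[0] - self.goal)
        terminated = error <= self.tolerance  # goal reached
        truncated = (not terminated) and self.steps >= self.max_steps  # time limit
        return obs, -error, terminated, truncated, {}
```

With this structure, swapping `ToyBackend` for an adapter around either simulator would leave the training loop unchanged, which is the property a dual-simulator setup benefits from.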
Future work within T3.3 will focus on:
- Developing a more accurate dynamic model of the Girona 500 I-AUV to reduce the Sim2Real gap
- Enhancing simulation fidelity while maintaining computational efficiency
- Transitioning from standard MuJoCo to MuJoCo XLA (MJX) to significantly improve training speed and scalability
This dual simulation strategy strengthens the LEMURS pipeline by combining realism and performance, enabling faster iteration of learning algorithms while ensuring their applicability in real-world underwater robotic systems.