Publications

You can also find me on Google Scholar.

Conference and Journal Papers

Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
ICLR, 2024.
Behind the scenes

At the beginning of summer 2023, a wave of research works applied Large Language Models to sequential decision-making. This caused both excitement and confusion in me, which I wrote about in a blog post that was pivotal for me. Seriously, I wanted to know whether there was something really interesting behind the hype. When Martin joined Meta as an intern, during many days of intense brainstorming, we enumerated the possible ways to use LLMs for decision-making. I became pretty convinced that extracting a reward function from them was the most promising of all. In that particularly long, rainy and confusing summer in Montreal, we pushed ourselves out of our comfort zone and witnessed the potential of LLMs for creating AI agents.
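If you're curious about the core mechanism, here is a minimal sketch of the general recipe of training a reward model from preference labels. Everything in it is an illustrative assumption rather than the paper's implementation: the RewardModel architecture, the embedding dimension, and especially the random labels, which merely stand in for the preferences an LLM would express when comparing pairs of event captions.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scores an (assumed) observation-caption embedding with a scalar reward."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def preference_loss(r1, r2, pref):
    # Bradley-Terry-style cross-entropy: pref = 1 if the first element
    # of the pair was preferred by the annotator, else 0.
    logits = r1 - r2
    return nn.functional.binary_cross_entropy_with_logits(logits, pref)

# Toy training loop on synthetic "LLM preferences".
torch.manual_seed(0)
model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    x1, x2 = torch.randn(32, 128), torch.randn(32, 128)
    # In the real pipeline, the label would come from an LLM comparing
    # two captions; here it is a random placeholder.
    pref = torch.randint(0, 2, (32,)).float()
    loss = preference_loss(model(x1), model(x2), pref)
    opt.zero_grad(); loss.backward(); opt.step()
```

Once trained this way, such a model can score any observation, turning LLM preferences into a dense intrinsic reward for a standard RL agent.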

The Curse of Diversity in Ensemble-Based Exploration
Zhixuan Lin, Pierluca D'Oro, Evgenii Nikishin, Aaron Courville
ICLR, 2024.
Behind the scenes

Zhixuan was initially curious to explore the interaction between the resetting mechanisms we had been leveraging in our previous work and ensembling methods for deep reinforcement learning. In the end, in a piece of work on the empirical science of neural networks for reinforcement learning, we discovered surprising phenomena in the interaction among ensembles, data collection, and representation learning.

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare
NeurIPS, 2023.
Behind the scenes

Motivated by new discoveries in the empirical science of deep reinforcement learning, we started discussing techniques for uncovering other phenomena and advancing our understanding of neural network-based agents. After some attempts, and after building an appropriate experimental framework, we came to the conclusion that the lens of the return landscape, the mapping from a policy's parameters to its return, was a good one for our goal. Funnily enough, we found some interesting new phenomena right when we stopped searching for them. I learned a lot about how to do understanding-oriented science.
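For a rough idea of what that lens means in practice, here is a hedged sketch of probing the return in a small neighborhood of a trained policy's parameters; evaluate_return and the quadratic toy objective are hypothetical stand-ins for a real policy evaluation routine, not the methodology of the paper.

```python
import numpy as np

def probe_landscape(theta, evaluate_return, n_points=20, radius=0.05, seed=0):
    """Evaluate the return at random points around parameters theta."""
    rng = np.random.default_rng(seed)
    returns = []
    for _ in range(n_points):
        direction = rng.normal(size=theta.shape)
        direction /= np.linalg.norm(direction)
        returns.append(evaluate_return(theta + radius * direction))
    return np.array(returns)

# Toy usage: a quadratic stand-in for the true return. A large spread
# in the probed returns signals a "noisy neighborhood" around theta0.
theta0 = np.zeros(8)
rets = probe_landscape(theta0, lambda th: -np.sum(th ** 2))
print(rets.mean(), rets.std())
```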

Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro*, Max Schwarzer*, Evgenii Nikishin, Pierre-Luc Bacon, Marc G. Bellemare, Aaron Courville
ICLR (oral presentation, notable top 5%), 2023.
Behind the scenes

As the paradigm of increasing performance by scaling the amount of computation was being established in the rest of the machine learning community, we were looking for a way to generalize this to reinforcement learning. Guided by some preliminary evidence we had shown in the primacy bias paper, we thought a way to do it was to increase the number of updates per environment interaction. I had fun doing research in which the main goal was not really to develop a totally new method or to show good performance, but to go deep with analyses and to try to empirically understand the implications of different aspects of a complex system.
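To make that knob concrete, here is a self-contained toy sketch of an off-policy loop with a replay ratio larger than one. The environment, network, and hyperparameters are illustrative assumptions, not the agents used in the paper; the one idea it shows is performing several gradient updates per collected transition, with periodic resets keeping that regime stable.

```python
import random
import torch
import torch.nn as nn

# Toy 1-D chain environment; a stand-in for a real task, since only
# the structure of the training loop matters for this illustration.
class ToyEnv:
    def reset(self):
        self.pos = 0.0
        return torch.tensor([self.pos])

    def step(self, action):  # action in {0, 1}: move left or right
        self.pos += 0.1 if action == 1 else -0.1
        reward = 1.0 if self.pos >= 1.0 else 0.0
        done = abs(self.pos) >= 1.0
        return torch.tensor([self.pos]), reward, done

REPLAY_RATIO = 8      # gradient updates per environment step; 1 is the classic choice
RESET_INTERVAL = 500  # periodic resets are what make high ratios workable

qnet = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
buffer = []

env = ToyEnv()
obs = env.reset()
for env_step in range(1, 2001):
    with torch.no_grad():
        greedy = qnet(obs).argmax().item()
    action = random.randrange(2) if random.random() < 0.1 else greedy
    next_obs, reward, done = env.step(action)
    buffer.append((obs, action, reward, next_obs, done))
    obs = env.reset() if done else next_obs

    # The replay-ratio knob: many updates per collected transition.
    for _ in range(REPLAY_RATIO):
        o, a, r, no, d = random.choice(buffer)
        with torch.no_grad():
            target = r + 0.99 * (1.0 - float(d)) * qnet(no).max()
        loss = (qnet(o)[a] - target) ** 2
        opt.zero_grad(); loss.backward(); opt.step()

    # Periodically reset parameters (the replay buffer is kept).
    if env_step % RESET_INTERVAL == 0:
        for layer in qnet:
            if hasattr(layer, "reset_parameters"):
                layer.reset_parameters()
```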

The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin*, Max Schwarzer*, Pierluca D'Oro*, Pierre-Luc Bacon, Aaron Courville
ICML, 2022.
Behind the scenes

Evgenii had shared some interesting results showing that resetting the parameters of neural networks sometimes gave unexpected performance gains. In a fun scientific sprint, we tried to understand how general the improvements provided by resets were and where they came from. Through this project, I understood the huge power of deeply collaborative research and of intuition-guided empirical science.
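As a small illustration of the mechanism, here is a hedged sketch of one of the variants this line of work considers: re-initializing only the last layers of an agent's network, in place, while keeping everything else, in particular the replay buffer, intact. The network and layer count below are illustrative assumptions.

```python
import torch.nn as nn

def reset_final_layers(network: nn.Sequential, n_last: int = 1) -> None:
    """Re-initialize the last n_last parameterized layers in place."""
    parameterized = [m for m in network if hasattr(m, "reset_parameters")]
    for layer in parameterized[-n_last:]:
        layer.reset_parameters()

# Example: reset only the head of a small Q-network, keeping the
# earlier feature layers (and, elsewhere, the replay buffer) intact.
qnet = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
reset_final_layers(qnet, n_last=1)
```

Because the buffer of past experience survives the reset, the re-initialized network can quickly relearn, often ending up better than if it had never been reset.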

Policy Optimization as Online Learning with Mediator Feedback
Alberto Maria Metelli*, Matteo Papini*, Pierluca D'Oro, Marcello Restelli
AAAI, 2021.
Behind the scenes

I moved to Milan in March 2020, one week before the very first Covid lockdown started. I didn't leave my apartment for several weeks, and helped out as an intern with a theory-oriented project. Enduring the lockdown and the pandemic was a life-changing challenge, for me as for almost everybody else.

How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization
Pierluca D'Oro, Wojciech Jaśkowski
NeurIPS, 2020.
Behind the scenes

While living with a daily commute between the wonderful lake city of Como and Switzerland, I tried a bunch of mostly theoretical ideas during my time at NNAISENSE. At some point, my host Wojciech and I found out that the ideas I had been exploring for my master's thesis were actually generalizable to actor-critic methods: this implied a simple theory-backed deep reinforcement learning algorithm that yielded good results out-of-the-box. On that occasion, synthesizing my previous experience with what I learned from Wojciech, I established the core of what would become my research taste in subsequent years.

SMfinder: Small Molecules Finder for Metabolomics and Lipidomics analysis
Giuseppe Martano, Michele Leone, Pierluca D'Oro, Vittoria Matafora, Angela Cattaneo, Marco Masseroli, Angela Bachi
Analytical Chemistry, 2020.
Behind the scenes

Michele, a flatmate at the time, told me that he was collaborating with a chemist on creating a platform for the analysis of experimental data. I decided to help, with the goal of learning about cross-disciplinary collaborations. Michele and I spent several evenings building software together in the student residence we were living in. We learned a lot and celebrated small successes with cheap grocery store cake slices.

Gradient-Aware Model-based Policy Search
Pierluca D'Oro*, Alberto Maria Metelli*, Andrea Tirinzoni, Matteo Papini, Marcello Restelli
AAAI, 2020.
Behind the scenes

At the beginning of my Master's research work, I really had to learn the hard way how to precisely formalize problems and think in math, since I had realized my drawings of boxes on a whiteboard were no longer enough to express my scientific self. We had in mind the general goal of learning a model of the dynamics tailored to its use in reinforcement learning. We pondered what that meant exactly, and got inspired by the ideas of Amir-massoud Farahmand on decision-aware model learning. I spent several months staring at a whiteboard and thinking about math for most of my time.
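To give a rough flavor of the decision-aware idea, here is a minimal sketch of fitting a dynamics model with a weighted loss, so that transitions that matter more for the control objective contribute more to the fit. The data and the weights below are random placeholders; the actual weights in the paper are derived from the policy gradient, which this sketch does not attempt.

```python
import torch
import torch.nn as nn

# Hypothetical dynamics model: (state, action) -> next state.
model = nn.Linear(4 + 1, 4)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder batch of transitions and per-transition weights.
states, actions = torch.randn(256, 4), torch.randn(256, 1)
next_states = torch.randn(256, 4)
weights = torch.rand(256)  # stand-in for decision-aware weights

# Weighted model-learning loss: rather than fitting all transitions
# uniformly, errors on transitions that matter more for the control
# objective contribute more to the fit.
pred = model(torch.cat([states, actions], dim=-1))
per_transition = ((pred - next_states) ** 2).mean(dim=-1)
loss = (weights * per_transition).mean()
opt.zero_grad(); loss.backward(); opt.step()
```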

Adversarial Framework for Unsupervised Learning of Motion Dynamics in Videos
Concetto Spampinato, Simone Palazzo, Pierluca D'Oro, Daniela Giordano, Mubarak Shah
International Journal of Computer Vision, 2019.
Behind the scenes

I wanted to have a first experience with scientific research, to learn what it was and to see whether it was fun for me. Concetto told me they were working on video generation and I was very excited to help. It was probably the first time I realized that you can actually do science as a job, for real, if you put enough effort into it. I brainstormed about research ideas, learned how to draw big boxes on a whiteboard, and trained neural networks for the first time. I guess I liked it enough to decide to continue on that path.

Workshop Papers

Unleashing The Potential of Data Sharing in Ensemble Deep Reinforcement Learning
Zhixuan Lin, Pierluca D'Oro, Evgenii Nikishin, Aaron Courville
NeurIPS Deep Reinforcement Learning Workshop, 2022.

Long-Term Credit Assignment via Model-based Temporal Shortcuts
Michel Ma, Pierluca D'Oro, Yoshua Bengio, Pierre-Luc Bacon
NeurIPS Deep Reinforcement Learning Workshop, 2021.

Meta Dynamic Programming
Pierluca D'Oro, Pierre-Luc Bacon
NeurIPS Workshop on Metacognition in the Age of AI: Challenges and Opportunities, 2021.

Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs
Giorgio Giannone, Asha Anoosheh, Alessio Quaglino, Pierluca D'Oro, Marco Gallieri, Jonathan Masci
NeurIPS Workshop on Interpretable Inductive Biases and Physically Structured Learning, 2020.

Group Anomaly Detection via Graph Autoencoders
Pierluca D'Oro, Ennio Nasca, Jonathan Masci, Matteo Matteucci
NeurIPS Graph Representation Learning Workshop, 2019.

Generating Synthetic Video Sequences by Explicitly Modeling Object Motion
Simone Palazzo, Concetto Spampinato, Pierluca D'Oro, Daniela Giordano, Mubarak Shah
ECCV Workshop on Generating Realistic Visual Data of Human Behavior, 2018.