research interests

I am interested in how meaning works in natural languages. I work with computational and statistical methods, and I am particularly interested in the potential of continuous models (distributional semantics and neural networks, aka deep learning) to account for meaning. More concretely, my current research focuses on the following topics:
  • Modeling the concept-to-reference anchoring process. Collaborations with Marco Baroni, Abhijeet Gupta, Aurelie Herbelot, Louise McNally, and Sebastian Padó.
  • Understanding what kind of semantic theory distributional semantics is. Collaborations with Aurelie Herbelot, Katrin Erk, and Louise McNally. 
  • Grounding word meaning in perception, in particular visual information extracted from images. Collaborations with Marco Baroni, Elia Bruni, Nam Khanh Tran, and Sebastian Padó.
  • More generally, finding out more about language through the use of quantitative data and computational modeling. That includes a pretty crazy collaboration on Zipf's law and the dynamics of word repetition distance with Álvaro Corral and Ramon Ferrer-i-Cancho.

My research involves tackling large-scale language data. Therefore, I am and have been actively involved in creating linguistic resources (corpora, taggers, parsers, lexica, datasets), especially for Catalan and Spanish, but also for English. See this page for more information about these resources.

For more details and results, see publications.

past research interests

  • Accounting for compositional processes, or how the meaning of a complex expression (say, red wine) is built from the meaning of its parts (red and wine). Collaborations with Boban Arsenijevic, Marco Baroni, Islam Beltagy, Katrin Erk, Dan Garrette, Berit Gehrke, Stefan Evert, Louise McNally, Ray Mooney, and Eva Maria Vecchi. 
  • Building computational models of regular polysemy and other analogical processes in word meaning. Collaborations with Toni Badia, Sebastian Padó, Sabine Schulte im Walde, and Jason Utt.
  • Other aspects of lexical semantics in the nominal domain (semantic classes for adjectives, semantics of relational adjectives, semantics of nominalizations). Collaborations with Boban Arsenijevic, Toni Badia, Berit Gehrke, Louise McNally, Aina Peris, Horacio Rodríguez, Roser Sanromà, Mariona Taulé, and Sabine Schulte im Walde.
  • Measuring the reliability of linguistic data obtained from human subjects. With great help from Stefan Evert.