research interests

I am interested in how meaning works in natural languages. I work with computational and statistical methods, and I am particularly interested in the potential of continuous models (distributional semantics and neural networks, also known as deep learning) to account for meaning. More concretely, my current research focuses on the following topics:
  • Modeling the interplay between conceptual and referential aspects of language ("dog" is a concept or category, used to refer to a specific instance in a given utterance: "that cute little dog"). Collaborations with Abhijeet Gupta, Aurélie Herbelot, Louise McNally, Sebastian Padó, and the AMORE team.
  • Grounding language in perception, in particular visual information extracted from images, and more generally in the extralinguistic context. Currently with the AMORE team.
  • Accounting for compositional processes, or how the meaning of a complex expression (say, "red wine") is built from the meaning of its parts ("red" and "wine"). Currently with the AMORE team.
  • Understanding what kind of semantic theory distributional semantics is. Collaborations with Aurélie Herbelot, Katrin Erk, and the AMORE team. 
  • More generally, finding out more about language through the use of quantitative methods and computational modeling. That includes a pretty crazy collaboration on Zipf's law and the dynamics of word repetition distance with Álvaro Corral and Ramon Ferrer-i-Cancho.

My research involves tackling large-scale language data. Therefore, I am and have been actively involved in creating linguistic resources (corpora, taggers, parsers, lexica, datasets), especially for Catalan and Spanish, but also for English. See this page for more information about these resources.

For more details and results, see publications.

past research interests

  • Building computational models of regular polysemy and other analogical processes in word meaning. Collaborations with Toni Badia, Sebastian Padó, Sabine Schulte im Walde, and Jason Utt.
  • Other aspects of lexical semantics in the nominal domain (semantic classes for adjectives, semantics of relational adjectives, semantics of nominalizations). Collaborations with Boban Arsenijevic, Toni Badia, Berit Gehrke, Louise McNally, Aina Peris, Horacio Rodríguez, Roser Sanromà, Mariona Taulé, and Sabine Schulte im Walde.
  • Measuring the reliability of linguistic data obtained from human subjects. With great help from Stefan Evert.