publications and talks

Also see my Google Scholar profile.

2017 / forthcoming

G. Boleda, S. Padó, N. The Pham, M. Baroni. 2017. Living a discrete life in a continuous world: Reference with distributed representations. In Proceedings of IWCS 2017, Montpellier, France. To appear. 

G. Boleda, A. Gupta, S. Padó. 2017. Instances and concepts in distributional space. In Proceedings of EACL 2017 (15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers), 79-85, Valencia, Spain. Association for Computational Linguistics. [bib]

A. Gupta, G. Boleda, S. Padó. 2017. Distributed Prediction of Relations for Entities: The Easy, The Difficult, and The Impossible. In Proceedings of STARSEM 2017, 104-109, Vancouver, BC. Association for Computational Linguistics. [bib] DOI: 10.18653/v1/S17-1012.

McNally, L., G. Boleda. 2017. Conceptual vs. Referential Affordance in Concept Composition. To appear in Yoad Winter & James Hampton (eds.) Concept Composition and Experimental Semantics/Pragmatics, Springer. (Note: final, pre-print version.)

M. Baroni, G. Boleda, S. Padó. 2017. Show me the cup: Reference with continuous representations. In Proceedings of CICLing (International Conference on Computational Linguistics and Intelligent Text Processing), to appear.

2016

Boleda, G. 2016. Remarks on the CommAI-env. Talk at MAIN@NIPS (MAchine INtelligence workshop, NIPS 2016), Barcelona, December 9.

Boleda, G. and A. Herbelot (eds.). 2016. Special Issue on Formal Distributional Semantics. Computational Linguistics 42:4.

Boleda, G. and A. Herbelot. 2016. Formal Distributional Semantics: Introduction to the Special Issue. Computational Linguistics 42:4, 619-635. 

Pham, N., G. Kruszewski, G. Boleda. 2016. Convolutional Neural Network Language Models. Proceedings of EMNLP 2016 (Conference on Empirical Methods in Natural Language Processing), 1153-1162, Austin, US, November. Association for Computational Linguistics. [bib]

Paperno, D., G. Kruszewski, A. Lazaridou, Q. Ngoc, R. Bernardi, S. Pezzelle, M. Baroni, G. Boleda, R. Fernandez. 2016. The LAMBADA dataset: Word prediction requiring a broad discourse context. Proceedings of ACL 2016 (54th Annual Meeting of the Association for Computational Linguistics), 1525-1534, Berlin, Germany, August. Association for Computational Linguistics. [bib] Download the LAMBADA dataset.

Sorodoc, I., S. Pezzelle, A. Lazaridou, A. Herbelot, G. Boleda, R. Bernardi. 2016. Look, some green circles! Learning to quantify from images. Proceedings of the 5th Workshop on Vision and Language at ACL 2016, 75-79. Association for Computational Linguistics.

2015

Gupta, A., G. Boleda, M. Baroni, S. Padó. 2015. Distributional vectors encode referential attributes. Proceedings of EMNLP 2015 (Conference on Empirical Methods in Natural Language Processing), 12-2. Lisbon, Portugal, September. Association for Computational Linguistics. [bib]

Bernardi, R., G. Boleda, R. Fernandez, D. Paperno. 2015. Distributional semantics in use. Proceedings of EMNLP 2015 Workshop LSDSem 2015: Linking Models of Lexical, Sentential and Discourse-level Semantics, 95-101. Lisbon, Portugal, September. Association for Computational Linguistics. [bib]

Corral, Á., G. Boleda, R. Ferrer-i-Cancho. 2015. Zipf's Law for Word Frequencies: Word Forms versus Lemmas in Long Texts. PLoS ONE 10(7). doi:10.1371/journal.pone.0129031

Palmer, M., G. Boleda, P. Rosso (eds.). 2015. Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics. Denver, Colorado: Association for Computational Linguistics.

Boleda, G., K. Erk. 2015. Distributional Semantic Features as Semantic Primitives – or Not. Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches. Papers from the AAAI Spring Symposium, 2-5. Palo Alto, California: AAAI Press. ISBN 978-1-57735-707-0. [slides]

Gupta, A., G. Boleda, M. Baroni, S. Padó. 2015. Mapping conceptual features to referential properties. Talk at the 3rd International ESSENCE Workshop: Algorithms for Processing Meaning, Barcelona (Spain), May 2015.

Boleda, G. 2015. Distributional semantics for lexical semantics. Invited talk at Catalonia-Israel Symposium on Lexical Semantics and Grammatical Structure in Event Conceptualization, The Hebrew University of Jerusalem, February 16-18 2015.

2014

G. Boleda, K. Erk. 2014. Distributional Semantic Features as Semantic Primitives – or Not. Poster at NIPS 2014 workshop on Learning Semantics, Montreal, Canada, December 12, 2014.

Beltagy, I., S. Roller, G. Boleda, K. Erk, R. Mooney. 2014. UTexas: Natural Language Semantics using Distributional Semantics and Probabilistic Logic. Proceedings of SemEval 2014, pp. 796-801, Dublin, Ireland, August 23-24 2014. [paper]

Roller, S., K. Erk, G. Boleda. 2014. Inclusive yet Selective: Supervised Distributional Hypernymy Detection. Proceedings of CoLing 2014, Dublin, Ireland, pp. 1025-1036. [paper]

Arsenijevic, B., B. Gehrke, G. Boleda, L. McNally. 2014. Ethnic adjectives are proper adjectives. In R. Baglini, T. Grinsell, J. Keane, A. R. Singerman, J. Thomas (eds.), CLS 46-I The Main Session: Proc. of 46th Annual Meeting of the Chicago Linguistic Society, Chicago, IL, USA, pp. 17-30. [PDF (preprint)]

2013

Font-Clos, F., G. Boleda, A. Corral. 2013. A scaling law beyond Zipf's law and its relation to Heaps' law. New Journal of Physics 15:9, pp. 093033. [paper, bib, article page]

McNally, L., G. Boleda, M. Baroni. 2013. Conceptual vs. Referential Affordance in Concept Composition. Talk at the Workshop Concept Composition & Experimental Semantics/Pragmatics (CC&ESP 2013), Utrecht, The Netherlands, 2-3 September. [abstract]

Beltagy, I., C. K. Cuong, G. Boleda, D. Garrette, K. Erk, R. Mooney. 2013. Montague meets Markov: Deep semantics with probabilistic logical form. Proceedings of *SEM 2013. Atlanta, US. [paper]

Herbelot, A., R. Zamparelli, G. Boleda (eds.). 2013. Proceedings of the IWCS 2013 Workshop Towards a Formal Distributional Semantics. Association for Computational Linguistics. [book, bib]

Boleda, G., M. Baroni, N. The Pham, L. McNally. 2013. Intensionality was only alleged: On adjective-noun composition in distributional semantics. Proceedings of IWCS 2013, Potsdam, Germany, pp. 35-46. [paper, slides, data, bib]

2012

Boleda, G., S. Schulte im Walde, T. Badia. 2012. Modeling regular polysemy: A study of the semantic classification of Catalan adjectives. Computational Linguistics 38:3, pp. 575-616. [paper, journal page, data, bib]

Boleda, G., E. M. Vecchi, M. Cornudella, L. McNally. 2012. First-order vs. higher-order modification in distributional semantics. Proceedings of EMNLP-CoNLL 2012, pp. 1223-1233, Jeju Island, Korea. [paper, slides, data, bib]

Bruni, E., G. Boleda, M. Baroni, N. K. Tran. 2012. Distributional semantics in technicolor. Proceedings of ACL 2012, pp. 136-145, Jeju Island, Korea. [paper, slides, data, bib]

Boleda, G., S. Padó, J. Utt. 2012. Regular polysemy: a distributional model. First Joint Conference on Lexical and Computational Semantics (*SEM), pp. 151–160, Montréal, Canada. [paper, slides, data, bib]

Boleda, G., S. Evert, B. Gehrke, L. McNally. 2012. Adjectives as saturators vs. modifiers: Statistical evidence. In Maria Aloni, Vadim Kimmelman, Floris Roelofsen, Galit Weidman Sassoon, Katrin Schulz, Matthijs Westera (Eds.): Logic, Language and Meaning - 18th Amsterdam Colloquium, Amsterdam, The Netherlands, December 19-21, 2011, Revised Selected Papers. Lecture Notes in Computer Science 7218, pp. 112-121. Springer. ISBN 978-3-642-31481-0. [paper, bib]

Boleda, G., S. Evert, B. Gehrke, L. McNally. 2012. A Logistic Regression Model for adjectival modification vs. modification by prepositional phrases. Poster at Linguistic Evidence. Tübingen, Germany, February 9-11 2012.

Sánchez Marco, C., G. Boleda, J. M. Fontana. 2012. Propuesta de codificación de la información paleográfica y lingüística para textos diacrónicos del español. Uso del estándar TEI. In Torrens Álvarez, María Jesús y Sánchez-Prieto Borja, Pedro (eds.), Nuevas perspectivas para la edición y el estudio de documentos antiguos, pp. 447-463. Fondo Hispánico de Lingüística y Filología. Vol. 12. Bern (etc.): Peter Lang. ISSN 1663-2648. [paper]

2011

Baroni, M., E. Bruni, G. B. Tran, A. Anderson, G. Boleda, M. Ciaramita, A. Lenci. 2011. Multimodal distributional semantics. Poster at V&L Net Workshop on Vision and Language. Brighton, UK, 15 September. [PDF]

Sánchez Marco, C., G. Boleda, L. Padró. 2011. Extending the tool, or how to annotate historical language varieties. In Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, Portland, Oregon, USA, June 2011, pp. 1-9. [paper, bib]

Berndt, D., G. Boleda, B. Gehrke, L. McNally. 2011. Semantic factors in the choice between ethnic adjectives and PP counterparts: Quantitative evidence. Talk at 4th Conference on Quantitative Investigations in Theoretical Linguistics (QITL-4), Berlin, Germany, 28-31 March. [slides]

2010

Arsenijevic, B., G. Boleda, B. Gehrke, L. McNally. 2010. Unifying the semantics for "thematic" and "classificatory" uses of ethnic adjectives. Talk at 8èmes Journées Sémantique et Modélisation, LORIA-INRIA, Nancy, France, 25-26 March.

Melero, M., G. Boleda, M. Cuadros, C. España-Bonet, L. Padró, M. Quixal, C. Rodríguez. 2010. Language technology challenges of a 'small' language (Catalan). In Proceedings of LREC 2010, Valletta, Malta. ISBN 2-9517408-6-7. [PDF]

Peris, A., M. Taulé, G. Boleda, Horacio Rodríguez. 2010. ADN-classifier: Automatically assigning denotation types to nominalizations. In Proceedings of LREC 2010, Valletta, Malta. ISBN 2-9517408-6-7. [PDF]

Samuel Reese, G. Boleda, Montse Cuadros, Lluís Padró, German Rigau. 2010. Wikicorpus: A word-sense disambiguated multilingual Wikipedia corpus. In Proceedings of LREC 2010, Valletta, Malta. ISBN 2-9517408-6-7. [PDF]

Cristina Sánchez-Marco, G. Boleda, J. M. Fontana, J. Domingo. 2010. Annotation and representation of a diachronic corpus of Spanish. In Proceedings of LREC 2010, Valletta, Malta. ISBN 2-9517408-6-7. [PDF]

Sanromà, R., G. Boleda. 2010. The Database of Catalan Adjectives. In Proceedings of LREC 2010, Valletta, Malta. ISBN 2-9517408-6-7. [PDF]

2009

Corral, A., R. Ferrer i Cancho, G. Boleda, A. Diaz-Guilera. 2009. Universal Complex Structures in Written Language. Available at http://arxiv.org/abs/0901.2924.

Boleda, G., M. Cuadros, C. España-Bonet, M. Melero, L. Padró, M. Quixal, C. Rodríguez. 2009. Primera Jornada del Procesamiento Computacional del Catalán. Revista de Procesamiento del Lenguaje Natural 43: 387-388. ISSN: 1135-5948. [PDF]

Boleda, G., M. Cuadros, C. España-Bonet, M. Melero, L. Padró, M. Quixal, C. Rodríguez. 2009. El català i les tecnologies de la llengua. Llengua, Societat i Comunicació 7: 20-26. ISSN: 1697-5928. [PDF]

Boleda, G., M. Cuadros, C. España-Bonet, M. Melero, L. Padró, M. Quixal, C. Rodríguez. 2009. Sobre la I Jornada del Processament Computacional del Català. Llengua i ús 45: 23-32. ISSN: 1134-7724. [PDF]

Boleda, G., A. Corral, R. Ferrer i Cancho, A. Diaz-Guilera. 2009. From word recurrence patterns to cognitive mechanisms. Poster at 15th Annual Conference on Architectures and Mechanisms for Language Processing (AMLAP). Barcelona, 7-9 September.

Berndt, D., G. Boleda, B. Gehrke, L. McNally. 2009. Nominalizations and nationality expressions: A corpus analysis. Talk at Corpus Linguistics 2009. Liverpool, UK, 20-23 July.

Boleda, G., A. Corral, R. Ferrer i Cancho, A. Diaz-Guilera. 2009. Word distance distribution in literary texts. Talk at Corpus Linguistics 2009. Liverpool, UK, 20-23 July.

Boleda, G. 2009. Uso de PLN en otras disciplinas. Talk at III Jornadas PLN-TIMM: Modelos y técnicas para el acceso a la información multilingüe y multimodal en la web. Colmenarejo, Spain, 5-6 February.

2008

Corral, A., R. Ferrer i Cancho, G. Boleda, A. Diaz-Guilera. 2008. Universality classes and community structure in word recurrence. Talk at BCNet Workshop - trends and perspectives in complex networks. Barcelona, Spain, 10-12 December.

Artstein, R., G. Boleda, F. Keller, S. Schulte im Walde (eds). 2008. Proceedings of the COLING Workshop on Human Judgements in Computational Linguistics. Manchester, UK: Coling 2008 Organizing Committee. ISBN 978-1-905593-49-1. [URL]

Boleda, G., S. Schulte im Walde, T. Badia. 2008. An Analysis of Human Judgements on Semantic Classification of Catalan Adjectives. Research on Language and Computation 6(3): 247-271, Special Issue on Ambiguity and Semantic Judgements. ISSN 1570-7075. [doi, PDF] (preprint).

Boleda, G. 2008. Emulant els infants: induint propietats lingüístiques a partir de dades empíriques. Revista de Catalunya, 235, pp. 33-40. ISSN 0213-5876. [PDF] (preprint)

Vandeghinste, V., P. Dirix, I. Schuurman, S. Markantonatou, S. Sofianopoulos, M. Vassiliou, O. Yannoutsou, T. Badia, M. Melero, G. Boleda, M. Carl, P. Schmidt. 2008. Evaluation of a Machine Translation System for Low Resource Languages: METIS-II. In Proceedings of LREC 2008, Marrakech, Morocco. ISBN 2-9517408-4-0.

Boleda, G., T. Badia. 2008. CUCWEB: un corpus de la llengua catalana construït a partir de la web. Estudis romànics 30:291-293. Institut d'Estudis Catalans.

2007

Boleda, G., S. Schulte im Walde, T. Badia. 2007. Modelling Polysemy in Adjective Classes by Multi-Label Classification. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 171-180. [PDF, BIB]

Boleda, G. 2007. Automatic acquisition of semantic classes for adjectives. Ph.D. thesis, Pompeu Fabra University. [PDF]

2006

Boleda, G., S. Bott, C. Castillo, R. Meza, T. Badia, V. López. 2006. CUCWeb: a Catalan corpus built from the Web. In Proceedings of the Second Workshop on the Web as a Corpus at EACL'06, pp. 19-26, Trento, Italy, April 2006. [BIB, PDF]

2005

Badia, T., G. Boleda, M. Melero, A. Oliver. 2005. El proyecto METIS-II. Revista de Procesamiento del Lenguaje Natural. ISSN 1135-5948, 35, pp. 443-444. [PDF]

Oliver, A., T. Badia, G. Boleda, M. Melero. 2005. Traducción automática estadística basada en n-gramas. Revista de Procesamiento del Lenguaje Natural. ISSN 1135-5948, 35, pp. 77-84. [PDF]

Badia, T., G. Boleda, M. Melero, A. Oliver. 2005. An n-gram approach to exploiting a monolingual corpus for Machine Translation. In Proceedings of the Second Workshop on Example-based Machine Translation, MT Summit X, Phuket, Thailand, 16 September 2005. [PDF]

Boleda, G., T. Badia, S. Schulte im Walde. 2005. Morphology vs. Syntax in Adjective Class Acquisition. In Proceedings of the ACL-SIGLEX 2005 Workshop on Deep Lexical Acquisition, June 30, Ann Arbor, USA. [PDF]

Mayol, L., G. Boleda, T. Badia. 2005. Automatic acquisition of syntactic verb classes with basic resources. Language Resources and Evaluation, 39(4):295-312. [PDF, PS, BIB (preprint)]

Boleda, G., S. Bott, R. Meza, C. Castillo, T. Badia, V. López. 2005. Usant la web per estudiar el català. Talk at III Jornades sobre el català a les noves tecnologies, Barcelona, Spain, April 14-16. [PDF]

Mayol, L., G. Boleda, T. Badia. 2005. Automatic learning of syntactic verb classes. In Proceedings of the Interdisciplinary Workshop on the Identification and Representation of Verb Features and Verb Classes, pp. 92-97, Feb 28th-March 1st, Saarbrücken, Germany. [PS]

2004

Boleda, G., S. Bott, B. Poblete, C. Castillo, M.E. Fuenmayor, T. Badia, V. López. 2004. CuCWeb: un corpus del català construït a partir de la web. II Congrés Online de l'Observatori per a la Cibersocietat, Barcelona, Spain. [HTML]

McNally, L. and G. Boleda. 2004. Relational adjectives as properties of kinds. In Olivier Bonami and Patricia Cabredo Hofherr (eds.) Empirical Issues in Syntax and Semantics 5, pp. 179-196 [Abstract, PDF, BIB]

Boleda, G., T. Badia and Eloi Batlle. 2004. Acquisition of Semantic Classes for Adjectives from Distributional Evidence. In Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), pp. 1119-1125, Geneva, Switzerland. ISBN: 1-932432-48-5 [PDF, PPT, BIB]

Padó, S., G. Boleda. 2004. The Influence of Argument Structure on Semantic Role Assignment. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP-04), pp. 103-110, July 25-26, Barcelona, Spain. ISBN: 1-932432-36-1

Colominas, Carme and G. Boleda. 2004. The extraction of translationally relevant information from small ad-hoc corpora. Talk at Third International Conference on Corpus Use and Learning to Translate (CULT-BCN), Barcelona, Spain, 22-24 January.

Aguilar, L., À. Alsina, A. Avilés, T. Badia, S. Balari, G. Boleda, S. Bott, J. Brumme, C. Colominas, A. Espunya, J. Fontana, J. Fontseca, À. Gil, C. Hernández, L. Mayol, L. McNally, C. de la Mota, M. Quixal, Y. Rodríguez, O. Valentín, E. Vallduví, T. Vallverdú. 2004. PrADo: Preparación Automatizada de Documentos (TIC 2000-1681). Technical report. [PDF] (2.5 MB)

2003

Padó, S., G. Boleda. 2003. Towards Better Understanding of Automatic Semantic Role Assignment. Proceedings of Prospects and Advances in the Syntax/Semantics Interface, Nancy, France. [PDF, BIB]

Boleda, G., L. Alonso. 2003. Clustering Adjectives for Class Acquisition. Proceedings of the 10th Conference of The European Chapter of the Association for Computational Linguistics (EACL 2003) Student Research Workshop, pp. 9-16, Budapest, Hungary. ISBN: 1-932432-01-9 [PDF, BIB]

2002

Alsina, À., T. Badia, G. Boleda, S. Bott, À. Gil, M. Quixal, O. Valentín. 2002. CATCG: a general purpose parsing tool applied. Proceedings of Third International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas, Spain, Vol. III, pp. 1130-1135. ISBN: 2-9517408-0-8 [PDF, BIB]

Alsina, À., T. Badia, G. Boleda, S. Bott, À. Gil, M. Quixal, O. Valentín. 2002. CATCG: un sistema de análisis morfosintáctico para el catalán. Revista de Procesamiento del Lenguaje Natural, 29, Sept. 2002, pp. 309-310. ISSN: 1135-5948 [RTF]

Badia, T., G. Boleda, J. Brumme, C. Colominas, M. Garmendia, M. Quixal. 2002. BancTrad: un banco de corpus anotados con interficie web. Procesamiento del Lenguaje Natural, 29, septiembre 2002, pp. 293-294. ISSN: 1135-5948 [RTF]

Badia, T., G. Boleda, C. Colominas, M. Garmendia, A. González, M. Quixal. 2002. BancTrad: a web interface for integrated access to parallel annotated corpora. Proceedings of the First International Workshop On Language Resources For Translation Work And Research held during the 3rd LREC Conference (LREC 2002), Las Palmas, Spain, 28 May 2002. [PDF, BIB]

Badia, T., G. Boleda, C. Colominas, M. Garmendia, A. González, M. Quixal. 2002. Eines de lingüística computacional per a la traducció: corpus paral·lels anotats. Proceedings of the 2nd International Conference on Specialized Translation, Barcelona, Spain, pp. 129-137. ISBN: 84-477-0820-9 [DOC]

Alonso, L., G. Boleda. 2002. An Approach to Catalan Adjective Lexical Classes by Clustering. Talk at Workshop on Quantitative Investigations for Theoretical Linguistics, Osnabrück, Germany, 3-5 October.

2001

Badia, T., G. Boleda, M. Quixal. 2001. Curso sobre Tecnologías de la lengua (segunda edición). QUARK, Ciencia, Medicina, Comunicación y Cultura, 21, Jul - Sept. 2001. pp. 14-16. ISSN: 1135-8521

Badia, T., G. Boleda, M. Quixal, Eva Bofias. 2001. A modular architecture for the processing of free text. Proceedings of the Workshop on Modular Programming applied to Natural Language Processing at EUROLAN 2001, Iasi, Romania, August 2001. pp. 11-18 [PDF, BIB]