The EMLex Wiktionary Hackfest Experience in Training Future Lexicographers
Carlos Valcárcel
iLingua / University of Vigo
Imagine for a moment a dictionary that is never complete, that grows every day and is built by thousands of people around the world. It's not science fiction: it's Wiktionary, a project that is revolutionising not only how we access linguistic knowledge, but also how we train the experts of the future. This is precisely why I was invited to speak at the Porto Meeting in 2025, which brought together representatives of minority language communities, Wikimedians, researchers, students, and promoters of local development in Porto and Miranda do Douro, with a common goal: to strengthen minority languages through Wikimedia projects.
The Dictionary That Never Sleeps
Launched in 2002 and now hosted by the Wikimedia Foundation, Wiktionary is much more than a traditional dictionary. It is a collaborative project that brings together more than 30 million entries in almost 200 languages, including definitions, etymologies, pronunciations, translations, and usage examples. Unlike conventional dictionaries, which are created by closed teams of experts, Wiktionary is built by an open community of volunteer editors. The English edition, the largest of all, has more than 7.5 million entries, followed by French with 4.7 million. But what makes this project truly fascinating is its ability to give a voice to minority languages and linguistic varieties that rarely find a place in commercial dictionaries.
The Case of Galizionario
An inspiring example is Galizionario, the Galician edition of Wiktionary, created in 2004. With a small but extraordinarily dedicated community—only two administrators manage the entire project—this edition has more than 88,000 entries and an impressive average of 7.02 edits per page, demonstrating the care and attention to detail of its community. Galizionario is not just a translation or an adaptation of other editions. It develops content specific to Galician culture, includes local toponymy, and establishes links with other projects like Galipedia. It is a perfect example of how communities can take ownership of digital tools to preserve and promote their linguistic heritage.
Where Theory Meets Practice: The EMLex Wiktionary Hackfest
It is precisely in this context that an innovative initiative in the field of lexicography training emerges. The Erasmus Mundus Master's programme EMLex (European Master in Lexicography) is an international programme that trains specialists in lexicography through a consortium of eight European universities, from Santiago de Compostela to Budapest, via Rome and Lorraine. This programme, which combines theory and practice in an interdisciplinary way, has found an exceptional teaching tool in Wiktionary. After all, where better to train future lexicographers than in an environment where they can see and participate in the collaborative construction of real dictionaries?
Since 2018, the CREA Campus of the University of Vigo, in Pontevedra, has hosted the EMLex Wiktionary Hackfest, an annual event that exemplifies the educational potential of collaborative lexicography. This lexicographical "hackathon" brings together students from the EMLex Master's, professors, and experienced Wiktionary editors for an intensive day of creating and improving dictionary content. The format is simple but effective: after introductory talks on how Wiktionary works, participants organise themselves into working groups and dive into creating and editing entries. They create new articles, add pronunciations and etymologies, establish links between languages, and enrich existing entries with examples and phraseology. Over five editions, from "Edita 'pa diante!" ("Edit Forwards!") in 2018 to "Encontrármonos nos dicionários" ("Finding Ourselves in Dictionaries") in 2024, the event has drawn a growing number of participants, demonstrating the increasing interest in this pedagogical approach.
More Than Just Technique: A New Way of Thinking About Lexicography
What makes this experience truly valuable is not just the technical skills students develop. It is the deep understanding that modern lexicography is, fundamentally, a collaborative and intercultural activity. Participants learn that a dictionary is not a static monument to linguistic knowledge, but rather a living organism that is constantly evolving. They discover that each language has its own peculiarities and that linguistic diversity is a treasure that deserves to be preserved and celebrated. For the students from the Faculty of Education and Sports Science of Pontevedra, who participate alongside the EMLex students, the hackfest represents a window into the world of linguistic research and an opportunity to connect with colleagues from other cultures and countries.
The Fruits of Collaboration
The benefits of this approach are manifold. For Galizionario, each hackfest represents a significant boost in its growth, both in terms of content quantity and technical quality. For the students, it is a unique opportunity to apply theoretical knowledge in a real-world context and to understand the dynamics of volunteer editor communities. But perhaps most important of all is the change in perspective that this experience provides. Future lexicographers learn that their work is not done in the isolation of an office, but in constant dialogue with communities of speakers, editors, and other specialists.
Challenges and Horizons
Like any innovative project, the hackfest faces challenges. Long-term sustainability depends on continued institutional support and the ability to engage new generations of students and researchers. There is also the challenge of expanding the initiative to other universities in the EMLex consortium, creating a true network for training in collaborative lexicography. Another ambitious goal is to strengthen ties with schools and language standardisation centres, building bridges between academic research and the real needs of linguistic communities.
A New Paradigm
The EMLex Wiktionary Hackfest experience represents more than a pedagogical innovation. It is an example of how digital technologies can democratise access to knowledge and transform the way we think about higher education. In an increasingly interconnected world, where the boundaries between the digital and the physical are blurring, initiatives like this show us that the future of lexicography, and perhaps of research in general, lies in open collaboration, interdisciplinarity, and dialogue between different knowledge communities.
When the next generation of lexicographers sits down to create dictionaries, they will bring with them not only technical knowledge but also a deep understanding that words, like people, live better in a community. And perhaps that is the most valuable lesson of all.