iRead4Skills - Basic Lexicons per Complexity Level

  1. Wilkens, Rodrigo 1
  2. Pintard, Alice 1
  3. François, Thomas 1
  4. Barbosa, Sílvia 2, 3
  5. Reis, Maria Leonor 2, 3
  6. Amaro, Raquel 2, 3
  7. Ribeiro, Eugénio 4
  8. Mamede, Nuno 4
  9. Baptista, Jorge 4
  10. Blanco, Xavier 5
  11. Catena, Angels 5
  12. Gauchola, Roser 5
  13. Mu, Keran 5
  1. 1 Université Catholique de Louvain
    Louvain-la-Neuve, Belgium

    ROR https://ror.org/02495e989

  2. 2 CLUNL
  3. 3 Universidade Nova de Lisboa
    Lisbon, Portugal

    ROR https://ror.org/02xankh89

  4. 4 Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento
    Lisbon, Portugal

    ROR https://ror.org/04mqy3p58

  5. 5 Universitat Autònoma de Barcelona
    Barcelona, Spain

    ROR https://ror.org/052g8jq94

Publisher: Zenodo

Year of publication: 2024

Type: Dataset

License: CC BY-NC-ND 4.0

Abstract

The iRead4Skills Basic Lexicons per Complexity Level dataset comprises three basic lexicons, one each for French, Portuguese, and Spanish, organized by complexity level and provided in .xlsx format. The lexicons were compiled within the project iRead4Skills – Intelligent Reading Improvement System for Fundamental and Transversal Skills Development, funded by the European Commission (grant number: 1010094837). The project aims to improve reading skills in the adult population by creating an intelligent system that assesses text complexity and recommends suitable reading materials to adults with low literacy skills, helping to reduce skills gaps and facilitate access to information and culture (https://iread4skills.com/).

Each lexicon covers the three complexity levels deemed relevant for the project: Very Easy (approximately A1), Easy (approximately A2), and Plain (approximately B1). The lexicons will feed the complexity analysis systems for the three project languages. The data files are accompanied by a description of the data. The baselines for each lexicon definition can be consulted in iRead4Skills - Baselines for complexity lexicons definition (https://doi.org/10.5281/zenodo.10069793).

French lexicon: 10 103 entries
Portuguese lexicon: 2 729 entries
Spanish lexicon: 3 033 entries