CesCa: El Català Escolar Escrit a Catalunya
The aim of this project is to provide the educational community with a fundamental tool for understanding pupils’ linguistic usage. It is a reference corpus of written school Catalan in Catalonia, and it also provides data derived from processing that corpus.
It contains 2,426 processed texts written by children between the last year of early childhood education (P5) and the last year of compulsory education (4th ESO). The texts were collected from 31 schools in different Catalan regions.
The corpus contains vocabulary produced for five lexical fields:
- Food names
- Natural phenomena
- Free-time activities
- Personality features
You will find organized information about:
- Word frequency of use: forms and lemmas
- Relationships between forms and lemmas
- Distribution of lemmas by school level, by how long the informants have been speaking Catalan, and by their mother tongue