INTRODUCTION
The SULEC Corpus is a project managed by a group of
researchers from the Department of English Philology at the
The subsequent comprehensive analysis of these data
will allow us to carry out investigations at several levels:
Phonological level: main difficulties these students encounter when
learning pronunciation (segmental and suprasegmental features), linguistic
interference, preferences for a specific model or linguistic variety.
Morphosyntactic level: word order, concord problems, length and syntactic
structures, acquisition of given constructions (negative forms, relative
clauses, existential constructions), empty categories.
Lexical level: type and number of words used, frequencies of use, lexical
collocations, "false friends".
Discourse level: organisation of the information, use of cohesive devices,
communicative strategies.
In addition, we will explore the pedagogical
applications derived from our corpus, incorporating this information into the
materials used for English language teaching (dictionaries, glossaries,
grammars and other reference books). Likewise, we believe that the results of
our analysis may have important implications for the fields of Translation and
so-called Computer Assisted Language Learning (CALL).
The aim of the project is to compile a large,
robust corpus of authentic language, both spoken and written, produced by
Spanish learners of English. At present, no corpus with all these features
exists, and ours would open the way to a great deal of further work in the
various areas related to the acquisition and teaching of English, such as
Translation and Contrastive Linguistics.
Although some influential linguists, such as
Chomsky, have been sceptical of corpus-based research, it has
been used with great success in the study of English, leading to the creation
of corpora such as the British National Corpus (BNC) and the International
Corpus of English (ICE). This has had a great influence on the creation of
corpora for studying second language performance, and researchers have
accordingly put together data collections such as the International Corpus of
Learner English (ICLE), the Taiwanese Learner Corpus of English (TLCE) and the
Japanese EFL Learner Corpus (JEFLL). We believe in the importance of basing our
research on a corpus. By looking at real second language performance, we do not
base our research on theories and hypotheses alone. We therefore expect
that this project will provide interesting data on the performance of
Spanish speakers of English, and that it
will be successfully applied to many different research purposes.