Working With German Corpora

The essays in this volume, written by Germanists from Britain, Ireland, the USA and Australia, illustrate the enormous potential which corpus-based work has for German Studies as a whole and the rich diversity of work currently being undertaken. A detailed introduction explains basic concepts, methods, and applications of corpus-based work.
The book specifies a corpus architecture, including annotation and querying techniques, and its implementation. The architecture is developed for empirical studies of translations and, beyond those, for the study of texts which are inter-lingually comparable, particularly texts of similar registers. The compiled corpus, CroCo, is a resource for research and is, with some copyright restrictions, accessible to other research projects. Most of the research was undertaken as part of a DFG project on linguistic properties of translations. Fundamentally, this research project was a corpus-based investigation into the language pair English-German. The long-term goal is a contribution to the study of translation as a contact variety and, beyond this, to language comparison and language contact more generally, with English and German as the object languages. This goal implies a thorough interest in possible specific properties of translations and, beyond this, in an empirical translation theory. The methodology developed is not restricted to the traditional, exclusively system-based comparison of earlier days, where real-text excerpts or constructed examples serve as mere illustrations of assumptions and claims. Instead, it implements an empirical research strategy involving structured data (the sub-corpora and their relationships to each other, annotated and aligned on various theoretically motivated levels of representation), the formation of hypotheses and their operationalizations, statistics on the data, critical examination of their significance, and interpretation against the background of system-based comparisons and other independent sources of explanation for the phenomena observed. Further applications of the resource in computational linguistics are outlined and evaluated.
The main focus of this book is the investigation of linguistic variation in Spanish, considering spoken and written, specialised and non-specialised registers from a corpus linguistics approach and employing up-to-date computational tools. The ten chapters represent a range of research on Spanish using a number of different corpora drawn from, amongst others, research articles, student writing, formal conversation and technical reports. A variety of methodologies are brought to bear upon these corpora, including multi-dimensional and multi-register analysis, latent semantics and lexical bundles. This in-depth analysis of Spanish corpora will be of interest to researchers in corpus linguistics or the Spanish language.
In contrastive linguistics of English and German, there is a tradition of accounting for contrasts with respect to grammar and, to a lesser extent, for lexis and phonetics. Moving on to discourse and text, there is a sizeable body of literature on cohesive patterns in English and German respectively - but very little in terms of a comparison. The latter, though, is of particular interest for language learners, translators and, of course, linguists and researchers in language technology. This book attempts to close this gap, based on a number of years of corpus-based study into variation and cohesion in the two languages. While there is an overall focus on language contrasts, it also investigates variation between different registers language-internally, and between written and spoken mode in particular. For each of the five major types of cohesion (co-reference, substitution, ellipsis, conjunctive relations and lexical cohesion), overviews are given of contrasts in the system and of contrastive frequencies in texts. Results and methods presented in this book are thus relevant for language teaching, translation, language technology and corpus-based work on English and German generally.
Corpora are well established as a resource for language research; they are now also increasingly being used for teaching purposes. This book is the first of its kind to deal explicitly and in a wide-ranging way with the use of corpora in teaching. It contains an extensive collection of articles by corpus linguists and practising teachers, covering not only the use of data to inform and create teaching materials but also the direct exploitation of corpora by students, both in the study of linguistics in general and in the acquisition of proficiency in individual languages, including English, Welsh, German, French and Italian. In addition, the book offers practical information on the sources of corpora and concordances, including those suitable for work on non-Roman scripts such as Greek and Cyrillic.
Investigating the history of a language depends on fragmentary sources, but electronic corpora offer the possibility of alleviating the problem of 'bad data'. But they cannot overcome it totally, and questions arise of the optimal architecture for a corpus and its representativeness of actual language use, and how a historical corpus can best be annotated to maximize its usefulness. Immense strides have been made in recent years in addressing these questions, with exciting new methods and technological advances. The papers in this volume, which were presented at a conference on New Methods in Historical Corpora (Manchester 2011), exemplify the wide range of these recent developments.
This volume assesses the state of the art of parallel corpus research as a whole, reporting on advances both in the recent development of parallel corpora (with some particular references to comparable corpora as well) and in ways of exploiting them for a variety of purposes. The first part of the book is devoted to new roles that parallel corpora can and should assume in translation studies and in contrastive linguistics, to the usefulness and usability of parallel corpora, and to advances in parallel corpus alignment, annotation and retrieval. There follows an up-to-date presentation of a number of parallel corpus projects currently being carried out in Europe, some of them multimodal, with certain chapters illustrating case studies developed on the basis of the corpora at hand. In most of these chapters, attention is paid to specific technical issues of corpus building. The third part of the book reflects on specific applications and on the creation of bilingual resources from parallel corpora. This volume will be welcomed by scholars, postgraduate and PhD students in the fields of contrastive linguistics, translation studies, lexicography, language teaching and learning, machine translation, and natural language processing.
Although Portuguese is one of the main world languages and researchers have been working on Portuguese electronic text collections for decades (e.g. Kelly, 1970; Biderman, 1978; Bacelar do Nascimento et al., 1984; see Berber Sardinha, 2005), this is the first volume in English that encapsulates the exciting and cutting-edge corpus linguistic work being done with Portuguese language corpora on different continents. The book includes chapters by leading corpus linguists dealing with Portuguese corpora across the world, and their contributions explore various methods and how they are applicable to a wide range of language issues. The book is divided into six sections, each covering a key issue in Corpus Linguistics: lexis and grammar, lexicography, language teaching and terminology, translation, corpus building and sharing, and parsing and annotation. Together these sections present the reader with a broad picture of the field.
The Routledge Handbook of Second Language Acquisition and Corpora is a state-of-the-art collection of cutting-edge scholarship at the intersection of second language acquisition and learner corpus research. It draws on data-driven, statistical analysis to outline the background, methods, and outcomes of language learning, with a range of global experts providing detailed guidelines and findings. The volume is organized into five sections: (1) Methodological and theoretical contributions to the study of learner language using corpora: setting the scene; (2) Key aspects in corpus design, annotation, and analysis for SLA; (3) Corpora in SLA theory and practice; (4) SLA constructs and corpora; (5) Future directions. This is a ground-breaking collection of essays offering incisive and essential reading for anyone with an interest in second language acquisition, learner corpus research, and applied linguistics.