Papers by Antoinette Renouf

The many electronic text corpora available nowadays present ever fewer obstacles to a wide range of corpus linguistic study. However, corpora are expensive resources to create and to update, and there remain problems for linguists if they seek access to very large, very recent, or changing language. The World Wide Web, whilst intended as an information source, is an obvious resource for the retrieval of linguistic information, being the largest store of texts in existence, freely available, covering a range of domains, and constantly added to and updated. Individual linguistic researchers have been trying to retrieve instances of rare or neologistic language use from the web by manipulating existing web search engines. Whilst this strategy is possible, in particular via Google, the output is rather haphazard and not linguist-friendly. The Research and Development Unit for English Studies has been seeking to remedy the situation through the creation of 'WebCorp', a tool designed to search the Internet and provide on-line tailored access to linguists. A demonstration tool is available at
Sticking to the text: a corpus linguist's view of language
Aslib Proceedings, 1993
The original design for the Cobuild project envisaged the first stages of activity as being the building up, over a period of years, of a computerised collection of text, and a comprehensive database. From the latter would be drawn a dictionary in the first instance, but this would by no means be the end of the project. The same resource could be used to create an extensive family of publications.
Contextual Clues to Word-Meaning
International Journal of Corpus Linguistics, 2001
A Corpus-Based Study of Compounding in English
Journal of English Linguistics, 2001
It is long established that corpus-based studies force the linguist-analyst to come face-to-face with a number of phenomena that might easily be overlooked in an armchair-type study. In this article, we demonstrate the validity of this truism once again in a study of English ...
WebCorp: Applying the Web to Linguistics and Linguistics to the Web
WebCorp is an ongoing project, the aim of which is to produce a search tool designed to present examples of word usage from the Web in a form suitable for linguistic analysis. We illustrate how WebCorp adds a layer of refinement to standard Web search by allowing extended wildcard search and by providing tailored output in a customisable format. Publication Name: World Wide Web 2002 Conference, Honolulu, Hawaii.
Linguistic Research with XML/RDF-aware WebCorp Tool
In this paper, we report on our XML/RDF-aware WebCorp application, a specialist search tool designed to treat the Web as a large text corpus. We review the current state of annotation in Web resources and report on our attempts to reconcile this with our application development and linguistic research as a whole. Publication Name: World Wide Web 2003 Conference, Budapest.

The web has unique potential among corpora to yield large-volume data on up-to-date language use, obvious shortcomings notwithstanding. Since 1998, we have been developing a tool, WebCorp, to allow corpus linguists to retrieve raw and analysed linguistic output from the web. Based on internal trials and user feedback gleaned from our site (http://www.webcorp.org.uk/), we have established a working system which supports thousands of regular users world-wide. Many of the problems associated with the nature of web text have been accommodated, but problems remain, some due to the non-implementation of standards on the Internet, and others to reliance on commercial search engines, whose mediation slows average WebCorp response time and places constraints on linguistic search. To improve WebCorp performance, we are in the process of creating a tailored search engine, an infrastructure in which WebCorp will play an integral and enhanced role.