English-Corpora: COCA At English-Corpora.org, we're introducing a new way to interact with corpus data. Using Large Language Models (LLMs) like GPT, Gemini, and Claude, users will soon be able to have collocates, phrases, and frequency data clustered, categorized, and explained automatically.
The COCA corpus (new version released March 2020) - English Corpora A unique feature of COCA, which makes it very useful for language learners and teachers, is the ability to browse through a list of the top 60,000 words (lemmas) in the corpus, and then to see an extremely wide range of information on each of these words
English-Corpora.org: a guided tour (see video) As COCA (the one billion word Corpus of Contemporary American English) shows, this word is used much more in formal genres than in informal genres, and its use is sharply declining over time. (Note: in the case of seldom and all other searches in this file, click on the blue link to run the search. Depending on your
PDF with images - English Corpora The following is a short tour of the COCA corpus, including new features in March 2020. You can click on any of the links below to carry out sample searches, and then return to this page (for more searches) by clicking on TOUR at the top of the page.
English Corpora: most widely used online corpora. Billions of words of . . . For example, for COCA: "the Corpus of Contemporary American English" with the appropriate citation to the references section of the paper, e.g. (Davies 2008-). After that reference, feel free to use something shorter, like "COCA" (for example: "and as seen in COCA, there are").
ANALYZING TEXTS (see video) - English Corpora information about the words in the texts, as well as select phrases in your text and then finding similar phrases in COCA. To access this feature, just click on the “text” icon at the top of the corpus (indicated in red below) and then paste in your text.
Compare: Corpus of Contemporary American English (COCA) and the British . . . COCA and the BNC complement each other nicely, and they are the only large, well-balanced corpora of English that are publicly available. The BNC has better coverage of informal, everyday conversation, while COCA is much larger and more recent, which has important implications for the quantity and quality of the data overall.
English Corpora: most widely used online corpora. Billions of words of . . . Download nearly one billion words of data from COCA, in any of three different formats. Once downloaded, you can process this offline data in any way you want. Word Frequency: Download lists of the top 60,000 lemmas in COCA, including the frequency by the eight main genres and nearly 100 sub-genres.