jaj > corpus | BibSonomy

bookmarks (38)

The Blog Authorship Corpus
Consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words, or approximately 35 posts and 7,250 words per person.
12 years ago by @jaj
tags: blogs, corpus, data
Open Language Archives Community
An international partnership of institutions and individuals who are creating a worldwide virtual library of language resources. Includes a search across text archives.
12 years ago by @jaj
tags: corpus, linguistics, data
Linguist List - Open Language Archives Community
Dedicated to collecting information about language resources and making it available from a single search.
12 years ago by @jaj
tags: corpus, linguistics, data
The Petabyte Age: Because More Isn't Just More — More Is Different
Wired Magazine, issue 16.07. Data deluge: crop predictions, Quark, data mining, tracking news, watching the skies, scanning skeletons, airfares, voting, epidemics, Google events, terrorism, visualizing big data.
12 years ago by @jaj
tags: data, datavisualization, corpus, textmining
WaCKy: Web-as-Corpus Kool Yinitiative
We are a community of linguists and information technology specialists who got together to develop a set of tools (and interfaces to existing tools) that will allow linguists to crawl a section of the web, process the data, index and search them. We also
12 years ago by @jaj
tags: corpus
Web as Corpus
English-language corpora compiled from the Web in 2006 and 2007, and more
12 years ago by @jaj
tags: corpus, concordances
Phrases in English
PIE incorporates a database derived from the second or World Edition of the British National Corpus (BNC 2000). It aims to provide a simple yet powerful interface for studying words and phrases up to eight words long, suitable for both experienced researchers and novice users.
12 years ago by @jaj
tags: corpus, tools, linguistics
British National Corpus [bnc]
The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.
12 years ago by @jaj
tags: corpus, data, reference, linguistics
WebAsCorpus.org - find Web Concordances
Search the web for words and phrases. Get results with hits marked. Download all pages for further research.
12 years ago by @jaj
tags: corpus, searchengine, research, linguistics, textmining
UCI Knowledge Discovery in Databases (KDD) Archive
Online repository of large data sets for researchers in knowledge discovery and data mining. Includes Discrete Sequence Data, Image Data, Multivariate Data, Relational Data, Spatio-Temporal Data, Text (corpora), Time Series, and Web Data (web pages and log files).
12 years ago by @jaj
tags: data_archive, datasets, datamining, big_data, corpus
