jaj > data corpus | BibSonomy

bookmarks (7)

The Blog Authorship Corpus
consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words - or approximately 35 posts and 7250 words per person.
12 years ago by @jaj
Tags: blogs, corpus, data
Open Language Archives Community
An international partnership of institutions and individuals who are creating a worldwide virtual library of language resources. Includes a search across text archives.
12 years ago by @jaj
Tags: corpus, linguistics, data
Linguist List - Open Language Archives Community
Dedicated to collecting information about language resources and making it available from a single search.
12 years ago by @jaj
Tags: corpus, linguistics, data
The Petabyte Age: Because More Isn't Just More — More Is Different
Wired Magazine, issue 16.07: Data Deluge. Topics: crop predictions, quark, data mining, tracking news, watching the skies, scanning skeletons, airfares, voting, epidemics, Google events, terrorism, visualizing big data.
12 years ago by @jaj
Tags: data, datavisualization, corpus, textmining
British National Corpus [bnc]
The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.
12 years ago by @jaj
Tags: corpus, data, reference, linguistics
Home Page for 20 Newsgroups Data Set
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. The collection has become a popular data set for experiments in text applications of machine learning techniques, such as text classification and text clustering.
12 years ago by @jaj
Tags: data, corpus, datasets, socialnetworking
UCI Machine Learning Repository
Data sets maintained as a service to the machine learning community.
12 years ago by @jaj
Tags: reference, data, corpus, datasets, datamining, machine-learning
