tag :: corpus data

bookmarks (19)

Trec Spam Corpus
http://plg.uwaterloo.ca/~gvcormac/treccorpus/
18 years ago by @hotho
tags: set dataset corpus data trec spam
Linguist List - Open Language Archives Community
Dedicated to collecting information about language resources and making it available from a single search.
12 years ago by @jaj
tags: linguistics corpus data
Open Language Archives Community
An international partnership of institutions and individuals who are creating a worldwide virtual library of language resources. Includes a search across text archives.
12 years ago by @jaj
tags: linguistics corpus data
Home Page for 20 Newsgroups Data Set
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. The collection has become a popular data set for experiments in text applications of machine learning techniques, such as text classification and text clustering.
12 years ago by @jaj
tags: socialnetworking corpus data datasets
UCI Machine Learning Repository
Data sets provided as a service to the machine learning community.
12 years ago by @jaj
tags: reference machine-learning corpus data datamining datasets
British National Corpus [bnc]
The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.
12 years ago by @jaj
tags: reference linguistics corpus data
The ACL Anthology Network (All About NLP)
http://clair.eecs.umich.edu/aan/index.php
6 years ago by @becker
tags: set dataset detection nationality paper gender corpus profiling data author papers
Open Research Corpus: Public Datasets of Scholarly Research Papers
http://labs.semanticscholar.org/corpus/
6 years ago by @schwemmlein
tags: set dataset corpus data publications scholar research
The Petabyte Age: Because More Isn't Just More — More Is Different
Wired Magazine, issue 16.07. Topics: data deluge, crop predictions, Quark, data mining, tracking news, watching the skies, scanning skeletons, airfares, voting, epidemics, Google events, terrorism, visualizing big data.
12 years ago by @jaj
tags: datavisualization corpus data textmining
Das Voynich-Blog » Download-Seite
http://voynich.tamagothi.de/download-seite/
12 years ago by @sac
tags: corpus data voynich download facsimiles
The Blog Authorship Corpus
Consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words, or approximately 35 posts and 7,250 words per person.
12 years ago by @jaj
tags: corpus data blogs
CommonCrawl
http://commoncrawl.org/
8 years ago by @bshanks
tags: nlp web corpus data open crawl
http://emolex.eu
The Lexis of Emotion in Five European Languages: Semantics, Syntax and Discourse
11 years ago by @jaeschke
tags: emotion dataset language corpus data emolex europe
Home: AAN
http://tangra.cs.yale.edu/newaan/
6 years ago by @becker
tags: set dataset detection nationality paper gender corpus data profiling author papers
Datasets for Data Mining, Analytics and Knowledge Discovery
Datasets for testing Data Mining, Analytics, and Knowledge Discovery algorithms
14 years ago by @dbenz
tags: overview corpus data copora datasets kdd
Natural Language Corpus Data: Beautiful Data
This directory contains code and data to accompany the chapter Natural Language Corpus Data from the book Beautiful Data (Segaran and Hammerbacher, 2009). If you like this you may also like: How to Write a Spelling Corrector.
10 years ago by @lysander07
tags: corpus data textmining
Time-Aware Entity Recommendation
Annotated Web documents
7 years ago by @jaeschke
tags: news newspaper dataset corpus data
Edition et analyse de corpus XML
http://www.arizona-software.ch/applications/xs/en/
17 years ago by @mortimer_m8
tags: linguistics osx york corpus data xml phd aiml
TextGrid: Digital Library
http://www.textgrid.de/en/digitale-bibliothek.html
12 years ago by @sac
tags: large corpus tei data xml text

publications (3)

The Wikipedia XML Corpus
L. Denoyer and P. Gallinari. SIGIR Forum (2006)
18 years ago by @hotho
tags: mining corpus data xml dm wikipedia ml
Between Abundance and Austerity: Academic Work in Times of Data Mining?
J. Hofmann. (Apr 8, 2014)
7 years ago by @jaeschke
tags: newspaper mining corpus dnb data hiig
The Annotated Beethoven Corpus (ABC): A Dataset of Harmonic Analyses of All Beethoven String Quartets
M. Neuwirth, D. Harasim, F. Moss, and M. Rohrmeier. Frontiers in Digital Humanities, 5 (July): 1-5 (2018)
2 years ago by @fabian-moss
tags: ground,symbolic Musicology,ground research,digital corpus Data truth,harmony,music,Music,music; beethoven,Beethoven,corpus digital data,Symbolic music musicology,Digital research,Corpus Music musicology; research; truth,Ground


BibSonomy
