schwemmlein > dataset

Bookmarks (42)

HetRec 2011 Data Sets | GroupLens Research
http://www.grouplens.org/node/462#attachments
12 years ago by @schwemmlein
Tags: dataset, recommendation, systems
The Last.fm Dataset | Million Song Dataset
http://labrosa.ee.columbia.edu/millionsong/lastfm#numbers
12 years ago by @schwemmlein
Tags: 2011, dataset, last.fm
BibSonomy Dataset: dumps for research purposes
http://www.kde.cs.uni-kassel.de/bibsonomy/dumps
12 years ago by @schwemmlein
Tags: bibsonomy, dataset
Leipzig Corpora Collection - Wortschatz
http://corpora.informatik.uni-leipzig.de/
12 years ago by @schwemmlein
Tags: collection, corpora, dataset, leipzig, wortschatz
42 data
http://42-data.org/home
11 years ago by @schwemmlein
Tags: dataset, market, research
CoPhIR - How to get
http://cophir.isti.cnr.it/get.html
10 years ago by @schwemmlein
Tags: 2009, dataset, flickr
Jstor - Data for research
http://about.jstor.org/service/data-for-research
10 years ago by @schwemmlein
Tags: dataset, jstor, literature
ECCO-TCP: Eighteenth Century Collections Online - Text Creation Partnership
http://www.textcreationpartnership.org/tcp-ecco/
10 years ago by @schwemmlein
Tags: book, dataset, ecco, text
WWP - women writers project
http://www.wwp.northeastern.edu/
10 years ago by @schwemmlein
Tags: dataset, text
Free ebooks - Project Gutenberg
Download free ebooks for Kindle, Android, iPad, Nook, EPUB or read online. No registration required.
10 years ago by @schwemmlein
Tags: dataset, gutenberg, text
Annotated Books Online | A digital archive of early modern annotated books
Handwritten annotations in books are an important key to understanding how historical readers used their books. ABO aims to bring these books together. It is a digital library that reveals the variety of traces that readers left in their books. These examples were previously dispersed over many different libraries in the world. Yet it is also a digital laboratory, where visitors can work together: ABO has tools to enrich the early modern annotations with transcriptions and translations. ABO seeks to encourage collaboration.
10 years ago by @schwemmlein
Tags: annotated, books, dataset, digital_humanities, text
The Open Utopia
The Open Utopia is a complete edition of Thomas More’s Utopia that honors the primary precept of Utopia itself: that all property is common property. But Utopia is more than the story of a far-off land with no private property. It’s a text that instructs us how to approach texts, be they literary or political, in an open manner: open to criticism, open to participation, and open to re-creation.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, open, text, utopia
The Reading Experience Database 1450-1945 (RED)
The UK Reading Experience Database (UK RED) is an open access database and research project housed in the English Department of the Open University. It is the largest resource recording the experiences of readers of its kind anywhere. UK RED has amassed over 30,000 records of reading experiences of British subjects, both at home and abroad, and of visitors to the British Isles, between 1450 and 1945. These include both famous and anonymous readers. It is both an open access resource and open to unsolicited public contributions.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, experience, reading, text
Incunabula Short Title Catalogue
The Incunabula Short Title Catalogue is the international database of 15th-century European printing created by the British Library with contributions from institutions worldwide.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, incunabula, text
Leipzig Corpora Collection (LCC) - the Datahub
Deutscher Wortschatz contains data generated from newspapers and web resources that are publicly available. The data were collected per language and encompass statistics about co-occurrences of words in randomly selected sentences.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, german, lcc, text
English Short Title Catalogue - Welcome
The English Short Title Catalogue (ESTC) lists over 460,000 items published between 1473 and 1800, mainly but not exclusively in English, and mainly in the British Isles and North America, from the collections of the British Library and over 2,000 other libraries.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, english, text
GITenberg.github.io by GITenberg
The project aims to put some Project Gutenberg ebooks into GitHub so people can fix problems in the files, and to use GitHub to open up the PG corpus to maintenance and use by libraries and librarians. The result will include MARC records, covers, OPDS feeds and ebook files to facilitate library use. A version-controlled fork-and-merge workflow, combined with a change-triggered back-end build environment, will allow scalable, distributed maintenance of the greatest works of our literary heritage. 43,000 books and their metadata have been moved to the Git version control software.
10 years ago by @schwemmlein
Tags: dataset, github, gutenberg, text
Fabian Society | LSE Digital Library
The Fabian Society collection includes: pamphlets published as part of the Fabian Tracts series, 1884-2000; minutes of Executive Committee meetings and other key committee meetings, 1884-1954; and pamphlets published as part of the Young Fabian pamphlet series, 1961-2009. The London School of Economics and Political Science
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, fabian_society, lse, text
Networked Digital Library of Theses and Dissertations (NDLTD)
Welcome to the Networked Digital Library of Theses and Dissertations (NDLTD), an international organization dedicated to promoting the adoption, creation, use, dissemination, and preservation of electronic theses and dissertations (ETDs). We support electronic publishing and open access to scholarship in order to enhance the sharing of knowledge worldwide. Our website includes resources for university administrators, librarians, faculty, students, and the general public.
10 years ago by @schwemmlein
Tags: dataset, dissertation, text, thesis
Early English Books Online - EEBO
From the first book printed in English by William Caxton, through the age of Spenser and Shakespeare and the tumult of the English Civil War, Early English Books Online (EEBO) will contain over 125,000 titles listed in Pollard and Redgrave's Short-Title Catalogue (1475-1640), Wing's Short-Title Catalogue (1641-1700), the Thomason Tracts (1640-1661), and the Early English Tract Supplement - all in full digital facsimile from the Early English Books microfilm collection.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, english, text
Perseus Digital Library
Our flagship collection, under development since 1987, covers the history, literature and culture of the Greco-Roman world. We are applying what we have learned from Classics to other subjects within the humanities and beyond.
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, perseus, text
NLTK Data
http://www.nltk.org/nltk_data/
10 years ago by @schwemmlein
Tags: dataset, digital_humanities, nltk, text
AoTM playlist data stats
Art of the Mix playlist data
10 years ago by @schwemmlein
Tags: dataset, music, playlist
Opinosis Dataset - Topic related review sentences | Kavita Ganesan
http://kavita-ganesan.com/opinosis-opinion-dataset
9 years ago by @schwemmlein
Tags: code, dataset, summarization, text, text_mining
Document Understanding Conferences
http://duc.nist.gov/
9 years ago by @schwemmlein
Tags: dataset, document, summarization, task, text
SentiMerge/data at master · guyemerson/SentiMerge · GitHub
SentiMerge - Merging Sentiment Lexicons in a Bayesian Framework. Produced in the TrendMiner Project.
9 years ago by @schwemmlein
Tags: analysis, dataset, novels, romane, sentiment, sentimerge
Resourceful Reading datasets | Katherine Bode
These datasets represent months of work collecting and collating the information in AustLit on Australian novels.
9 years ago by @schwemmlein
Tags: australian, dataset, novels
A dataset for distant-reading literature in English, 1700-1922. | The Stone and the Shell
To reduce barriers to entry, I’ve collaborated with HathiTrust Research Center to create an easier place to start with English-language literature.
9 years ago by @schwemmlein
Tags: dataset, digital_humanities, english, genre, literature
HTRC Portal - Word Frequencies in English-Language Literature, 1700-1922
The project combines two sources of information. The word counts themselves come from the HathiTrust Research Center (HTRC), which has tabulated them at the page level in 4.8 million public-domain volumes. Information about genre comes from a parallel project led by Ted Underwood, and supported by the National Endowment for the Humanities and the American Council of Learned Societies.
9 years ago by @schwemmlein
Tags: dataset, digital_humanities, english, genre, literature
novels-project · GitHub
The dataset genres.json contains (sub)genre classifications for novels published between 1770 and 1915. The genres covered are gothic novels, "silver fork" novels, national tale novels
9 years ago by @schwemmlein
Tags: dataset, digital_humanities, english, genre, literature, novels
The Blog Authorship Corpus
The Blog Authorship Corpus consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words - or approximately 35 posts and 7,250 words per person.
9 years ago by @schwemmlein
Tags: author, authorship, blog, dataset, text_mining
CLUTO - Software for Clustering High-Dimensional Datasets | Karypis Lab
CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. CLUTO is well-suited for clustering data sets arising in many diverse application areas including information retrieval, customer purchasing transactions, web, GIS, science, and biology.
7 years ago by @schwemmlein
Tags: IR, clustering, dataset, software
Original data of NYT10
NYT10 was originally released with the paper "Modeling relations and their mentions without labeled text" by Sebastian Riedel, Limin Yao, and Andrew McCallum.
7 years ago by @schwemmlein
Tags: dataset, extraction, relation
NLP datasets
Data sets I have developed and used in my research.
7 years ago by @schwemmlein
Tags: analysis, data, dataset, extraction, hyponym, relation, semantic, sentiment, twitter
Open Research Corpus: Public Datasets of Scholarly Research Papers
http://labs.semanticscholar.org/corpus/
6 years ago by @schwemmlein
Tags: corpus, data, dataset, publications, research, scholar, set
GLUE Benchmark
https://gluebenchmark.com/leaderboard
6 years ago by @schwemmlein
Tags: benchmark, dataset, leader, nlp, scorer, sota, task
PAN
PAN is a series of scientific events and shared tasks on digital text forensics and stylometry.
6 years ago by @schwemmlein
Tags: classification, dataset, stylometry, task, text
Tracking Progress in Natural Language Processing | NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
6 years ago by @schwemmlein
Tags: dataset, nlp, progress, sota, task
GitHub - RaRe-Technologies/gensim-data: Data repository for pretrained NLP models and NLP corpora.
6 years ago by @schwemmlein
Tags: dataset, embeddings, gensim, glove, models, nlp, w2v, word
StereoSet measures racism, sexism, and other forms of bias in AI language models | VentureBeat
https://venturebeat.com/2020/04/22/stereoset-measures-racism-sexism-and-other-forms-of-bias-in-ai-language-models/
4 years ago by @schwemmlein
Tags: bias, dataset, language, lm, models, nlp
Snorkel
Snorkel is a system for programmatically building and managing training datasets without manual labeling. In Snorkel, users can develop large training datasets in hours or days rather than hand-labeling them over weeks or months.
3 years ago by @schwemmlein
Tags: automatically, data, dataset, labeling, system, training
Resources for Abstractive Summarization of Long Documents
Resources for the NAACL 2018 paper "A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents" - armancohan/long-summarization
3 years ago by @schwemmlein
Tags: code, data, dataset, discourse, document, naacl, nlp, resources, summarization

Publications (3)

One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
C. Chelba, T. Mikolov, M. Schuster, Q. Ge, T. Brants, P. Koehn, and T. Robinson. (2013)
6 years ago by @schwemmlein
Tags: benchmark, billion, dataset, language, model, news, nlp
Evaluation of post-hoc XAI approaches through synthetic tabular data
J. Tritscher, M. Ring, D. Schlör, L. Hettinger, and A. Hotho. (2020). Accepted but not published.
5 years ago by @schwemmlein
Tags: XAI, dataset, evaluation, explainable, myown
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
N. Nangia, C. Vania, R. Bhalerao, and S. Bowman. EMNLP (2020). arXiv:2010.00133.
4 years ago by @schwemmlein
Tags: bias, dataset, language, lm, models
