tag :: dataset | BibSonomy

bookmarks (740)

All Our N-gram are Belong to You |:| Google Research Blog
Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usu
18 years ago by @avivagabriel
tags: dataset, linguistics, machine+translation, models, n-gram
All Our N-gram are Belong to You |:| Google Research Blog
Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usu
18 years ago by @avivamagnolia
tags: machine+translation, models, dataset, linguistics, n-gram
SIMILE | RDF Data Collection
http://simile.mit.edu/repository/datasets/index.html
18 years ago by @schmitz
tags: semanticweb, dataset
CLUTO - Family of Data Clustering Software Tools | Karypis Lab
http://glaros.dtc.umn.edu/gkhome/views/cluto
18 years ago by @hotho
tags: clustering, tools, dataset, dm, ml
Learning Question Classifiers
http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/
18 years ago by @hotho
tags: qa, classification, dataset
Enron Email Dataset
http://www.cs.cmu.edu/~enron/
18 years ago by @apo
tags: dataset, mail, enron
AOL search data mirrors
This collection consists of ~20M web queries collected from ~650k users over three months. The data is sorted by anonymous user ID and sequentially arranged.
18 years ago by @hotho
tags: search, dataset
Netflix Prize: Home
http://www.netflixprize.com/
18 years ago by @hotho
tags: recommender, movie, dataset, preis
Pajek / How to: Convert text file datasets into Pajek format
http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/text2pajek.htm
18 years ago by @schmitz
tags: converter, csv, dataset, graphtheory, pajek
AOL search data mirrors
This collection consists of ~20M web queries collected from ~650k users over three months. The data is sorted by anonymous user ID and sequentially arranged.
18 years ago by @schmitz
tags: search, dataset, aol
Bibliography
Imbalance Problem
18 years ago by @hotho
tags: data, dataset, paper, imbalance
foafPub dataset
http://ebiquity.umbc.edu/resource/html/id/82/
18 years ago by @jaeschke
tags: graph, dataset, foaf, network
Trec Spam Corpus
http://plg.uwaterloo.ca/~gvcormac/treccorpus/
18 years ago by @hotho
tags: trec, spam, set, data, dataset, corpus
Where's George? ® 2.2
http://www.wheresgeorge.com/
18 years ago by @hotho
tags: dollar, dataset
Epidemic forecasting: Researchers discover the law of travel (Seuchen-Prognose: Forscher finden das Gesetz des Reisens) - Wissenschaft - SPIEGEL ONLINE - Nachrichten
http://www.spiegel.de/wissenschaft/mensch/0,1518,397303,00.html
18 years ago by @hotho
tags: bewegung, dollar, dataset, reise, vorhersagen
UCI Machine Learning Repository
http://www.ics.uci.edu/~mlearn/MLRepository.html
18 years ago by @schmitz
tags: dataset, repository, machinelearning, uci
Benchmark Data Sets used in [RaeOnoMue01] and [MikRaeWesSchMue99]
http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm
18 years ago by @sb3000
tags: datamining, dataset
Miscellaneous MATLAB Software, Data, Tricks and Demonstrations
Gunnar Raetsch's Benchmark Datasets
18 years ago by @hotho
tags: benchmark, dataset, dm, matlab, ml, kernel
Algorithms for Large Data Sets: Lecture Notes & Slides
http://www.ee.technion.ac.il/courses/049011/index_files/Page337.html
18 years ago by @hotho
tags: folien, ir, large, dataset
Benchmark Data Sets used in [RaeOnoMue01] and [MikRaeWesSchMue99]
http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm
18 years ago by @hotho
tags: dataset, dm, ida, ml
Datasets
http://www.niaad.liacc.up.pt/old/statlog/datasets.html
18 years ago by @hotho
tags: statlog, dataset, dm, ml
UCI Machine Learning Repository
http://www.ics.uci.edu/~mlearn/MLRepository.html
18 years ago by @hotho
tags: learning, data, dataset, dm, mining, machine, ml, uci
Delve Datasets
http://www.cs.toronto.edu/~delve/data/datasets.html
18 years ago by @hotho
tags: learning, data, delve, dataset, dm, mining, machine, ml
Martin Hepp
http://www.heppnetz.de/eclassowl/
18 years ago by @hotho
tags: ontology, dataset
Omega Ontology: Home
http://omega.isi.edu/
18 years ago by @hotho
tags: ontology, omega, dataset, nlp
Welcome to the UCR Time Series Classification/Clustering Page
18 years ago by @hotho
tags: dataset
HepCorpus - Sinai
http://sinai.ujaen.es/wiki/index.php/HepCorpus#English_version
18 years ago by @hotho
tags: text, dataset, corpus
Manuel Barbera, Corpus based computational linguistic resources. General: E-Texts (§ 2.3).
Electronic Literary Text Archives.
18 years ago by @hotho
tags: text, dataset, corpus
dataset
http://www.informatics.bangor.ac.uk/~kuncheva/activities/artificial_data.htm
18 years ago by @hotho
tags: clustering, dataset
Fundamental Clustering Problem Suite | Databionics
Fundamental Clustering Problem Suite
18 years ago by @hotho
tags: clustering, dataset
Miscellaneous Datasets Repository
http://www-db.stanford.edu/~glenj/datasets/repository.html
18 years ago by @schmitz
tags: dataset
Andrew McCallum's Code and Data
Cora Citation Matching [reference matching, object correspondence] Text of citations hand-clustered into groups referring to the same paper.
18 years ago by @hotho
tags: ie, dataset, bibliographic, references, cora
Lost Boy: SPARQLing the BBC Programme Catalogue
http://www.ldodds.com/blog/archives/000272.html
18 years ago by @hotho
tags: data, dataset, rdf
System One - Wikipedia3
http://labs.systemone.at/wikipedia3
18 years ago by @schmitz
tags: wiki, dataset, wikipedia, rdf
GroupLens Home Page
http://www.grouplens.org/
18 years ago by @sb3000
tags: recommender, dataset
much.more
A number of resources have been compiled within the context of the MuchMore project. These include: a bilingual, parallel medical corpus; corresponding queries and relevance assessments; evaluation sets of disambiguated terms for GermaNet and UMLS; an evaluation list for morphological decomposition of medical terms.
18 years ago by @hotho
tags: dataset, corpus
SourceForge.net: Files
New text datasets (donated by George Forman) are available for download on Sourceforge:
18 years ago by @hotho
tags: weka, text, dataset
Obtaining corpora and text collections for biomedical natural language processing
http://compbio.uchsc.edu/corpora/obtaining.shtml
19 years ago by @hotho
tags: dataset, nlp, bio
Tagged datasets for named entity recognition tasks
http://www.cs.technion.ac.il/~gabr/resources/data/ne_datasets.html
19 years ago by @hotho
tags: named, dataset, entity, nlp
Document Understanding Conferences
http://www-nlpir.nist.gov/projects/duc/index.html
19 years ago by @schmitz
tags: document, dataset, summarization

publications (404)

LFM-2b: A Dataset of Enriched Music Listening Events for Recommender Systems Research and Fairness Analysis
M. Schedl, S. Brandl, O. Lesota, E. Parada-Cabaleiro, D. Penz, and N. Rekabsaz. ACM SIGIR Conference on Human Information Interaction and Retrieval, ACM, (March 2022)
a month ago by @sop2-ffzg
tags: listening, in, Collaborative, approaches, analysis, Fairness, classification, Content-based, Demographic, dataset, Music, Genre, information, User, records, data, Metadata, from:kamber, systems, Recommender, biases, Algorithmic, Last.fm, Style, recommender, Lyrics
Exploration of music collections with audio embeddings
P. Tovstogan. (2022)
a month ago by @sop2-ffzg
tags: Content-based, retrieval, Collaborative, similarity, Dataset, Auto-tagging, Deep, Information, Personal, from:kamber, Recommendation
LFM-2b: A Dataset of Enriched Music Listening Events for Recommender Systems Research and Fairness Analysis
M. Schedl, S. Brandl, O. Lesota, E. Parada-Cabaleiro, D. Penz, and N. Rekabsaz. ACM SIGIR Conference on Human Information Interaction and Retrieval, ACM, (March 2022)
a month ago by @kamber
tags: Algorithmic, Collaborative, Content-based, Demographic, Fairness, Genre, Last.fm, Lyrics, Metadata, Music, Recommender, Style, User, analysis, approaches, biases, classification, data, dataset, in, information, listening, recommender, records, systems
Exploration of music collections with audio embeddings
P. Tovstogan. (2022)
a month ago by @kamber
tags: Auto-tagging, Collaborative, Content-based, Dataset, Deep, Information, Personal, Recommendation
Annotated Vossian Antonomasia Dataset
M. Schwab, R. Jäschke, and F. Fischer. (2023)
a month ago by @jaeschke
tags: dataset, myown, vossanto
A note on predator-prey dynamics in radiocarbon datasets
N. Marom, and U. Wolkowski. bioRxiv, (2024)
2 months ago by @tabularii
tags: archaeology, dataset, evolutionary_dynamics, predator-prey, radiocarbon
Finding Bipartite Components in Hypergraphs
P. Macgregor, and H. Sun. (2022)
2 months ago by @tobias.koopmann
tags: Dataset, Hypergraph
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
A. Power, Y. Burda, H. Edwards, I. Babuschkin, and V. Misra. (2022). cite arxiv:2201.02177. Comment: Correspondence to alethea@openai.com. Code available at: https://github.com/openai/grok.
3 months ago by @tabularii
tags: algorithm, dataset, machine_learning, overfitting
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning
M. Takamoto, T. Praditia, R. Leiteritz, D. MacKinlay, F. Alesiani, D. Pflüger, and M. Niepert. (2023)
5 months ago by @annakrause
tags: benchmark, dataset, neuralpde, pdebench
An Extensible Benchmark Suite for Learning to Simulate Physical Systems
K. Otness, A. Gjoka, J. Bruna, D. Panozzo, B. Peherstorfer, T. Schneider, and D. Zorin. (2021). cite arxiv:2108.07799. Comment: Accepted to NeurIPS 2021 track on datasets and benchmarks.
5 months ago by @annakrause
tags: benchmark, dataset, neuralpde, todo:read
A Large-Scale Benchmark for the Incompressible Navier-Stokes Equations
Z. Huang, T. Schneider, M. Li, C. Jiang, D. Zorin, and D. Panozzo. (2021). cite arxiv:2112.05309.
5 months ago by @annakrause
tags: benchmark, dataset, navierstokes, neuralpde, todo:read
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning
M. Takamoto, T. Praditia, R. Leiteritz, D. MacKinlay, F. Alesiani, D. Pflüger, and M. Niepert. (2022). cite arxiv:2210.07182. Comment: 16 pages (main body) + 34 pages (supplemental material), accepted for publication in NeurIPS 2022 Track Datasets and Benchmarks.
5 months ago by @annakrause
tags: benchmark, dataset, neuralpde, pde
PERFORMANCE EVALUATION OF MACHINE LEARNING ALGORITHMS FOR INTRUSION DETECTION SYSTEM
S. Tripathy, and B. Behera. Journal of Biomechanical Science and Engineering, (July 2023)
7 months ago by @sudha000
tags: (IDS), CUP-99, False, Intrusion, KDD, ML, alarm, classifiers, dataset, detection, rate, system
DynaBench: A Benchmark Dataset for Learning Dynamical Systems from Low-Resolution Data
A. Dulny, A. Hotho, and A. Krause. Machine Learning and Knowledge Discovery in Databases: Research Track, page 438--455. Cham, Springer Nature Switzerland, (2023)
7 months ago by @hotho
tags: 2023, benchmark, dataset, deep-learning, dynamical-systems, from:adulny, from:martinr, myown, neural-ode
DynaBench: A Benchmark Dataset for Learning Dynamical Systems from Low-Resolution Data
A. Dulny, A. Hotho, and A. Krause. Machine Learning and Knowledge Discovery in Databases: Research Track, page 438--455. Cham, Springer Nature Switzerland, (2023)
7 months ago by @dmir
tags: app_eco, app_physics, author:dulny, author:hotho, author:krause, benchmark, dataset, deep-learning, dynamical-systems, from:adulny, from:martinr, myown, neural-ode, research_knowledge, selected
DynaBench: A Benchmark Dataset for Learning Dynamical Systems from Low-Resolution Data
A. Dulny, A. Hotho, and A. Krause. Machine Learning and Knowledge Discovery in Databases: Research Track, page 438--455. Cham, Springer Nature Switzerland, (2023)
7 months ago by @adulny
tags: benchmark, dataset, deep-learning, dynamical-systems, from:adulny, myown, neural-ode
ActiveGLAE: A Benchmark for Deep Active Learning with Transformers
L. Rauch, M. Aßenmacher, D. Huseljic, M. Wirth, B. Bischl, and B. Sick. (June 2023)
10 months ago by @ghagerer
tags: active-learning, benchmarks, dataset, pre-trained
Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
P. Villalobos, J. Sevilla, L. Heim, T. Besiroglu, M. Hobbhahn, and A. Ho. (2022). cite arxiv:2211.04325.
a year ago by @jaeschke
tags: data, dataset, learning, machine, ml, training
EMAKG: An Enhanced Version Of The Microsoft Academic Knowledge Graph
L. Pollacci. (2022). cite arxiv:2203.09159.
a year ago by @parismic
tags: MAKG, data, dataset, graph, quality
Extensible Motion-based Identification of XR Users with Non-Specific Motion
C. Rack, K. Kobs, T. Fernando, A. Hotho, and M. Latoschik. (February 2023)
a year ago by @cschell
tags: biometricdata, dataset, myown, virtualreality
