tag :: dataset web

bookmarks (45)

Unknown Data | Mining and consolidating research dataset metadata on the Web
https://unknowndataproject.github.io/
a year ago by @astrupp
tags: crawl, data, dataset, datasets, web
GERiT: German Research Institutions | DFG
GERiT is an information portal on German research institutions. GERiT is aimed at students and researchers from Germany and abroad.
4 years ago by @jaeschke
tags: academic, dataset, german, institution, research, university, web
German Academic Web | SoBigData.eu
http://www.sobigdata.eu/dataset/german-academic-web
8 years ago by @jaeschke
tags: academic, dataset, gaw, german, myown, sobigdata, web
Home · springernature/scigraph Wiki · GitHub
https://github.com/springernature/scigraph/wiki
8 years ago by @hotho
tags: bibliographic, dataset, lod, owl, research, science, scigraph, semantic, web
Datasets | CNetS
WebScience'14 data challenge datasets
8 years ago by @rwoz
tags: dataset, drawing, graph, visualisation, web
Web Data Commons
http://webdatacommons.org/
8 years ago by @hotho
tags: common, crawl, data, dataset, rdf, relations, semantic, web
RecSys Challenge 2015 - Challenge
http://2015.recsyschallenge.com/challenge.html
8 years ago by @hotho
tags: 2015, challenge, dataset, recsys, session, web
Net Data Directory
The Net Data Directory collects and shares information on different sources of data about the Internet. For more about the project, see our about page. To get started, use the search box below, or check out our quick start guide.
8 years ago by @jaeschke
tags: data, dataset, directory, internet, monitor, net, web
Web Data Commons
http://webdatacommons.org/
10 years ago by @jaeschke
tags: commoncrawl, crawl, data, dataset, linked, lod, microformat, open, rdf, semantic, web
Click Dataset | Center for Complex Networks and Systems Research
http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset/
10 years ago by @hotho
tags: click, dataset, indiana, stream, traffic, web
Host Link Graph JISC UK Web Domain Dataset (1996-2010)
UK Web Archive Open Data
10 years ago by @jaeschke
tags: archive, data, dataset, graph, host, jisc, link, uk, web
JISC UK Web Domain Dataset (1996-2013)
UK Web Archive Open Data
10 years ago by @jaeschke
tags: archive, data, dataset, domain, jisc, open, uk, web
WDC - Hyperlink Graphs
This page provides two large hyperlink graphs for public download. The graphs have been extracted from the 2012 and 2014 versions of the Common Crawl web corpora. The 2012 graph covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, the graph is the largest hyperlink graph that is available to the public outside companies such as Google, Yahoo, and Microsoft. The 2014 graph covers 1.7 billion web pages connected by 64 billion hyperlinks. Below we provide instructions on how to download the graphs as well as basic statistics about their topology.
10 years ago by @jaeschke
tags: dataset, graph, link, web
Webscope from Yahoo! Labs
http://webscope.sandbox.yahoo.com/catalog.php
11 years ago by @thoni
tags: dataset, labs, web, yahoo
WDC - Hyperlink Graph
http://webdatacommons.org/hyperlinkgraph/
11 years ago by @jil
tags: dataset, graph, hyperlink, ir, link, page, web
WDC - Hyperlink Graph
This page provides a large hyperlink graph for public download. The graph has been extracted from the Common Crawl 2012 web corpus and covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, this graph is the largest hyperlink graph that is available to the public outside companies such as Google, Yahoo, and Microsoft. Below we provide instructions on how to download the graph as well as basic statistics about its topology.
11 years ago by @hotho
tags: dataset, graph, hyperlink, web
A Linked-Data-driven and Semantically-enabled Journal Portal for Scientometrics | www.semantic-web-journal.net
http://www.semantic-web-journal.net/blog/linked-data-driven-and-semantically-enabled-journal-portal-scientometrics
11 years ago by @hotho
tags: data, dataset, journal, linked, paper, semantic, statistics, web
Web Observatory Wiki
http://wow.west.webobservatory.org/index.php/Main_Page
12 years ago by @jaeschke
tags: dataset, observatory, science, semantic, web, wiki
blekko donates search data to Common Crawl | blekko
Blekko Blog | get the Latest Updates On SEO, Search Engines, SEO Tools, SEO Tutorials, SEO techniques, SEO APIs and much more
12 years ago by @jaeschke
tags: blekko, crawl, dataset, search, web
ICWSM Datasets
http://icwsm.cs.mcgill.ca/
12 years ago by @folke
tags: dataset, facebook, icwsm, social, twitter, web
The ClueWeb09 Dataset
http://www.lemurproject.org/clueweb09.php/
12 years ago by @jaeschke
tags: big, data, dataset, web
CommonCrawl
http://commoncrawl.org/
12 years ago by @jaeschke
tags: crawling, data, dataset, web
WebBase Project
http://dbpubs.stanford.edu:8091/~testbed/doc2/WebBase/
12 years ago by @jaeschke
tags: data, dataset, stanford, web, webbase
Public Data Sets : Amazon Web Services
https://aws.amazon.com/datasets
12 years ago by @jaeschke
tags: amazon, data, dataset, web
ICWSM Datasets
http://icwsm.cs.mcgill.ca/
12 years ago by @hotho
tags: dataset, social, web
ICWSM Datasets
http://icwsm.cs.mcgill.ca/
12 years ago by @jaeschke
tags: dataset, icwsm, social, twitter, web
WebBase Project
http://dbpubs.stanford.edu:8091/~testbed/doc2/WebBase/
12 years ago by @dbenz
tags: dataset, focussed, topic, web, webBase
Public Data Sets on Amazon Web Services (AWS)
Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications.
13 years ago by @sac
tags: amazon, aws, dataset, web
Accessing the Data | CommonCrawl
http://www.commoncrawl.org/data/accessing-the-data/
13 years ago by @hotho
tags: accessing, commoncrawl, data, web, dataset
Army of 'socialbots' steal gigabytes of Facebook user data
http://www.theregister.co.uk/2011/11/01/facebook_infiltration_bots/
13 years ago by @hotho
tags: data, dataset, facebook, science, social, web
Webscope from Yahoo! Labs
http://webscope.sandbox.yahoo.com/catalog.php
13 years ago by @hotho
tags: language, search, web, dataset
Query Representation and Understanding Set - Microsoft Research
The Query Representation and Understanding (QRU) data set contains a set of similar queries that can be used in web research such as query transformation and relevance ranking. QRU contains similar queries that are related to existing benchmark data sets, such as TREC query sets. The QRU data set was created by extracting 100 TREC queries, training a query-generation model and a commercial search engine, generating similar queries from TREC queries with the model, and removing mistakenly generated queries.
13 years ago by @sac
tags: dataset, query, search, web
Is there a disconnect between Big Data and the Web of Data? | Paul Miller - The Cloud of Data
http://cloudofdata.com/2010/11/is-there-a-disconnect-between-big-data-and-the-web-of-data/
13 years ago by @zazi
tags: Big_Data, Big_Data_vs_Linked_Data, Data_Silos, Dataset, Linked_Data, Web
d8taplex
d8taplex helps you discover, visualize and explore data found on the web including time series data
14 years ago by @hotho
tags: data, dataset, discovery, exploration, visualization, web
The ClueWeb09 Dataset
http://boston.lti.cs.cmu.edu/Data/clueweb09/
14 years ago by @dbenz
tags: clueweb, dataset, research, web
Home - CKAN
http://ckan.net/
14 years ago by @hotho
tags: dataset, lod, register, semantic, web
The Linking Open Data cloud diagram
http://richard.cyganiak.de/2007/10/lod/
14 years ago by @hotho
tags: cloud, dataset, linked, open, semantic, web
Billion Triple Challenge 2010 Dataset
http://km.aifb.kit.edu/projects/btc-2010/
14 years ago by @hotho
tags: 2010, billion, challenge, dataset, semantic, triple, web
Webscope from Yahoo! Labs
The Yahoo! Webscope™ Program is a reference library of interesting and scientifically useful datasets for non-commercial use by academics and other scientists. All datasets have been reviewed to conform to Yahoo!'s data protection standards, including strict controls on privacy. We have a number of datasets that we are excited to share with you. Learn how to get involved.
15 years ago by @jaeschke
tags: dataset, web, yahoo
The ClueWeb09 Dataset
http://boston.lti.cs.cmu.edu/Data/clueweb09/
15 years ago by @hotho
tags: clueweb09, dataset, web
ICWSM 2009 - International AAAI Conference on Weblogs and Social Media
http://www.icwsm.org/2009/data/
16 years ago by @hotho
tags: 2009, blog, challenge, conference, data, dataset, social, web
Web Community Dataset
http://affsys.com/experiments/HT2008/
16 years ago by @hotho
tags: community, dataset, ht08, hypertext08, web
The QWS Dataset
http://www.uoguelph.ca/~qmahmoud/qws/
17 years ago by @hotho
tags: answer, dataset, question, semantic, service, web
Stanford Computer Science
http://cs.stanford.edu/research/project.php?id=121
17 years ago by @hotho
tags: crawl, dataset, web
Web Information Retrieval / Natural Language Processing Group (WING) - NLP/IR resource page on aye
http://wing.comp.nus.edu.sg/portal/RPNLPIR/
18 years ago by @hotho
tags: dataset, information, ir, nlp, resource, retrieval, web

publications (9)

DSDD: Domain-Specific Dataset Discovery on the Web
H. Zhang, A. Santos, and J. Freire. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, ACM, (October 2021)
2 years ago by @jaeschke
tags: crawling, data, dataset, discovery, unknowndata, web
Dataset or Not? A Study on the Veracity of Semantic Markup for Dataset Pages
T. Alrashed, D. Paparas, O. Benjelloun, Y. Sheng, and N. Noy. The Semantic Web - ISWC 2021, pages 338-356. Cham, Springer International Publishing, (2021)
2 years ago by @jaeschke
tags: dataset, extraction, markup, semantics, semanticweb, unknowndata, web
Where are the Datasets? A case study on the German Academic Web Archive
Y. Younes, S. Tiesler, R. Jäschke, and B. Mathiak. Proceedings of the Web Archiving and Digital Libraries Workshop at JCDL 2022, (2022)
2 years ago by @jaeschke
tags: 2022, academic, crawl, dataset, gaw, german, myown, unknowndata, web
Web archives as a data resource for digital scholars
E. Vlassenroot, S. Chambers, E. Di Pretoro, F. Geeraert, G. Haesendonck, A. Michel, and P. Mechant. International Journal of Digital Humanities, 1 (1): 85-111 (Apr 1, 2019)
4 years ago by @parismic
tags: dataset, digital, resource, social, warc, web
PURE: A Dataset of Public Requirements Documents
A. Ferrari, G. Spagnolo, and S. Gnesi. 2017 IEEE 25th International Requirements Engineering Conference (RE), pages 502-505. (September 2017)
4 years ago by @parismic
tags: dataset, web
The Web as a Knowledge-base for Answering Complex Questions
A. Talmor and J. Berant. (2018). arXiv:1803.06643. Comment: accepted as a long paper at NAACL 2018.
4 years ago by @parismic
tags: bert_performance, dataset, instituteclustering, knwoledge, web
Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics
P. Nakov. (2019). arXiv:1912.01113. Comment: noun compounds, paraphrasing verbs, semantic interpretation, syntax, multi-word expressions, MWEs, noun compound interpretation, noun compound bracketing, prepositional phrase attachment, noun phrase coordination, machine translation.
4 years ago by @parismic
tags: dataset, semantic, training, web
Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
M. Russell. O'Reilly Media, Sebastopol, CA, 1. edition, (2011)
13 years ago by @clemensbaier
tags: 2011, KDD, KDE, Twitter, analysis, book, datamining, dataset, development, socialmedia, visualisation, web
Web Text Corpus for Natural Language Processing
V. Liu and J. Curran. EACL, The Association for Computer Linguistics, (2006)
14 years ago by @dbenz
tags: corpus, dataset, web, synonym_detection, nlp
