group :: 20dc13 | BibSonomy

bookmarks (36)

GERiT: German Research Institutions | DFG
GERiT is an information portal on German research institutions. It is aimed at students and researchers from Germany and abroad.
4 years ago by @jaeschke
tags: german web dataset university academic institution research

German Academic Web | SoBigData.eu
http://www.sobigdata.eu/dataset/german-academic-web
8 years ago by @jaeschke
tags: myown german web dataset sobigdata academic gaw

Home · springernature/scigraph Wiki · GitHub
https://github.com/springernature/scigraph/wiki
8 years ago by @hotho
tags: science lod semantic owl web dataset bibliographic scigraph research

Web Data Commons
http://webdatacommons.org/
8 years ago by @hotho
tags: semantic rdf web dataset common data relations crawl

RecSys Challenge 2015 - Challenge
http://2015.recsyschallenge.com/challenge.html
8 years ago by @hotho
tags: web dataset session 2015 challenge recsys

Net Data Directory
The Net Data Directory collects and shares information on different sources of data about the Internet. For more about the project, see our about page. To get started, use the search box below, or check out our quick start guide.
8 years ago by @jaeschke
tags: web dataset data monitor directory net internet

Web Data Commons
http://webdatacommons.org/
10 years ago by @jaeschke
tags: lod semantic rdf web dataset commoncrawl data microformat open crawl linked

Click Dataset | Center for Complex Networks and Systems Research
http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset/
10 years ago by @hotho
tags: web click dataset stream indiana traffic

Host Link Graph JISC UK Web Domain Dataset (1996-2010)
UK Web Archive Open Data
10 years ago by @jaeschke
tags: web dataset uk data host archive link graph jisc

JISC UK Web Domain Dataset (1996-2013)
UK Web Archive Open Data
10 years ago by @jaeschke
tags: domain web dataset uk data archive open jisc

WDC - Hyperlink Graphs
This page provides two large hyperlink graphs for public download. The graphs have been extracted from the 2012 and 2014 versions of the Common Crawl web corpora. The 2012 graph covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, the graph is the largest hyperlink graph that is available to the public outside companies such as Google, Yahoo, and Microsoft. The 2014 graph covers 1.7 billion web pages connected by 64 billion hyperlinks. Below we provide instructions on how to download the graphs as well as basic statistics about their topology.
10 years ago by @jaeschke
tags: web dataset link graph

WDC - Hyperlink Graph
This page provides a large hyperlink graph for public download. The graph has been extracted from the Common Crawl 2012 web corpus and covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, this graph is the largest hyperlink graph that is available to the public outside companies such as Google, Yahoo, and Microsoft. Below we provide instructions on how to download the graph as well as basic statistics about its topology.
11 years ago by @hotho
tags: hyperlink web dataset graph

A Linked-Data-driven and Semantically-enabled Journal Portal for Scientometrics | www.semantic-web-journal.net
http://www.semantic-web-journal.net/blog/linked-data-driven-and-semantically-enabled-journal-portal-scientometrics
11 years ago by @hotho
tags: semantic journal web dataset paper data linked statistics

Web Observatory Wiki
http://wow.west.webobservatory.org/index.php/Main_Page
12 years ago by @jaeschke
tags: science semantic web dataset observatory wiki

blekko donates search data to Common Crawl | blekko
Blekko Blog | get the Latest Updates On SEO, Search Engines, SEO Tools, SEO Tutorials, SEO techniques, SEO APIs and much more
12 years ago by @jaeschke
tags: web dataset search crawl blekko

ICWSM Datasets
http://icwsm.cs.mcgill.ca/
12 years ago by @folke
tags: icwsm web dataset twitter social facebook

The ClueWeb09 Dataset
http://www.lemurproject.org/clueweb09.php/
12 years ago by @jaeschke
tags: web dataset data big

CommonCrawl
http://commoncrawl.org/
12 years ago by @jaeschke
tags: web dataset data crawling

WebBase Project
http://dbpubs.stanford.edu:8091/~testbed/doc2/WebBase/
12 years ago by @jaeschke
tags: web dataset stanford webbase data

Public Data Sets : Amazon Web Services
https://aws.amazon.com/datasets
12 years ago by @jaeschke
tags: web dataset amazon data


publications (3)

DSDD: Domain-Specific Dataset Discovery on the Web
H. Zhang, A. Santos, and J. Freire. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, ACM, (October 2021)
2 years ago by @jaeschke
tags: unknowndata web dataset data discovery crawling

Dataset or Not? A Study on the Veracity of Semantic Markup for Dataset Pages
T. Alrashed, D. Paparas, O. Benjelloun, Y. Sheng, and N. Noy. The Semantic Web – ISWC 2021, pages 338–356. Cham, Springer International Publishing, (2021)
2 years ago by @jaeschke
tags: unknowndata web dataset semantics markup semanticweb extraction

Where are the Datasets? A case study on the German Academic Web Archive
Y. Younes, S. Tiesler, R. Jäschke, and B. Mathiak. Proceedings of the Web Archiving and Digital Libraries Workshop at JCDL 2022, (2022)
2 years ago by @jaeschke
tags: myown german unknowndata web dataset academic 2022 gaw crawl

