This is a repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
The Semantic Web isn't just about putting data on the web. It
is about making links, so that a person or machine can explore the web
of data. With linked data, when you have some of it, you can
find other, related, data.
Workshop Topics
Possible topics of the workshop include (but are not limited to):
* Social network analysis
* Bibliometrics
* Community discovery
* Personalization for search and for social interaction
* Recommender systems
* Web mining algorithms
* Applications of social network analysis
* Mining (collaborative) tagging systems (blogs, wikis, etc.)
* Mining social data for multimedia information retrieval
* Opinion mining
Query log data for ad targeting
A WWW2006 paper out of Microsoft Research, "Finding Advertising Keywords on Web Pages" (PDF), claims that query log data is particularly useful for ad targeting.
Specifically, the researchers extracted from MSN query logs the keywords some people used to find a given page. They tested using that as one of many features for ad targeting. In their results, it was one of the most effective features.
Very interesting. It has always been harder to target ads to content than to search results because intent is much less clear.
By using the query log data in this way, the researchers were effectively using the intent of the searchers that arrived at the page as a proxy for the intent of everyone who arrived at the page.
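The idea of treating searcher queries as a proxy for page intent can be illustrated with a minimal sketch. This is not the feature computation from the paper, which is not described here; it only assumes a log of (query, clicked URL) pairs and scores a candidate keyword by how often it appears among the query terms that led to a page. The function names, the sample log and the URLs are all hypothetical.

```python
from collections import Counter, defaultdict


def build_query_features(log_entries):
    """Map each page URL to a Counter of the query terms that led to it.

    `log_entries` is an iterable of (query, clicked_url) pairs; this
    log format is an assumption made for illustration.
    """
    terms_by_page = defaultdict(Counter)
    for query, clicked_url in log_entries:
        for term in query.lower().split():
            terms_by_page[clicked_url][term] += 1
    return terms_by_page


def query_log_score(terms_by_page, url, candidate):
    """Return the share of logged query terms for `url` that equal `candidate`."""
    counts = terms_by_page.get(url)
    if not counts:
        return 0.0  # no searcher ever arrived at this page
    return counts[candidate.lower()] / sum(counts.values())


# Hypothetical log entries, not real data.
sample_log = [
    ("cheap flights paris", "example.com/travel"),
    ("paris hotels", "example.com/travel"),
    ("python tutorial", "example.com/code"),
]

features = build_query_features(sample_log)
score = query_log_score(features, "example.com/travel", "paris")
```

A score like this would be only one input among many; the post notes that the researchers tested the query log signal alongside other features.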
Web search engines have changed our lives, enabling instant access to information about subjects ranging from those deeply important to us to passing whims. The search engines that answer our queries also log them in order to improve their algorithms. Academic research has shown that search queries can provide valuable information on diverse topics, including word and phrase similarity and topical seasonality, may even have potential for sociology, and can serve as a barometer of the popularity of many subjects. At the same time, individuals are rightly concerned about what accidental leaking or deliberate sharing of this information may mean for their privacy. In this talk I will cover the applications that have benefited from mining query logs, the risk that sharing query logs breaches privacy, and current algorithms for mining logs in a way that prevents privacy breaches.
Data Mining, Analytics, and Databases
Databases are the workhorse of the enterprise today. Searching through databases and finding useful information has become a big computational challenge. Researchers from academia and Microsoft, Oracle, SAP, and many other corporations are looking to CUDA-enabled GPUs to find a scalable solution.
Community Maps is a mapping site for community groups that allows users to add their own information to the map. This can include local events, organisations, planning applications, history and local shops.
Platform for sharing and evaluation of intelligent algorithms. Data mining data, experiments, datasets, performance analysis, data repository, challenges. Research and applications, prediction. Data mining and machine learning
G. Gottlob, C. Koch, R. Baumgartner, M. Herzog, and S. Flesca. Proceedings of the Twenty-third ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 14-16, 2004, Paris, France, pages 1-12. ACM, (2004)
S. Simoff. Proceedings of the MDKM/KDD2000 Workshop on Multimedia Data Mining, pages 104-109. www.cs.ualberta.ca/~zaiane/mdm_kdd2000/mdm00-15.pdf, (2000)
A. Arasu and H. Garcia-Molina. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, California, USA, June 9-12, 2003, pages 337-348. ACM, (2003)