<?xml version="1.0" ?>
<!-- This file was exported from BibSonomy, http://www.bibsonomy.org -->

<bibliography>

<biblioentry xreflabel="bernerslee1998uri" id="bernerslee1998uri">
   <authorgroup>
       <author><firstname>Tim</firstname><surname>Berners&#45;Lee</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Cool URIs don&#39;t change</citetitle>

   <publisher>
      <publishername>World Wide Web Consortium</publishername>
   </publisher>



   <pubdate>1998</pubdate>  

</biblioentry>
<biblioentry xreflabel="citeulike:348222" id="citeulike:348222">
   <authorgroup>
       <author><firstname>Allan</firstname><surname>Borodin</surname></author>
       <author><firstname>Gareth</firstname><othername role="mi">O.</othername><surname>Roberts</surname></author>
       <author><firstname>Jeffrey</firstname><othername role="mi">S.</othername><surname>Rosenthal</surname></author>
       <author><firstname>Panayiotis</firstname><surname>Tsaparas</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Link Analysis Ranking Algorithms Theory And Experiments</citetitle>





   <pubdate>2005</pubdate>  
   <abstract>
      <para>The explosive growth and the widespread accessibility of the Web have led to a surge of research activity in the area of information retrieval on the World Wide Web. The seminal papers of Kleinberg [31]&#44; and Brin and Page [9] introduced Link Analysis Ranking&#44; where hyperlink structures are used to determine the relative authority of a Web page&#44; and produce improved algorithms for the ranking of Web search results. In this paper we work within the hubs and authorities framework defined by...
      </para>
   </abstract>
</biblioentry>
<biblioentry xreflabel="citeulike:525472" id="citeulike:525472">
   <authorgroup>
       <author><firstname>A.</firstname><surname>Capocci</surname></author>
       <author><firstname>V.</firstname><othername role="mi">D. P.</othername><surname>Servedio</surname></author>
       <author><firstname>F.</firstname><surname>Colaiori</surname></author>
       <author><firstname>L.</firstname><othername role="mi">S.</othername><surname>Buriol</surname></author>
       <author><firstname>D.</firstname><surname>Donato</surname></author>
       <author><firstname>S.</firstname><surname>Leonardi</surname></author>
       <author><firstname>G.</firstname><surname>Caldarelli</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Preferential attachment in the growth of social networks: the case of Wikipedia</citetitle>





   <pubdate>2006</pubdate>  
   <abstract>
      <para>We present an analysis of the statistical properties and growth of the free on&#45;line encyclopedia Wikipedia. By describing topics by vertices and hyperlinks between them as edges&#44; we can represent this encyclopedia as a directed graph. The topological properties of this graph are in close analogy with those of the World Wide Web&#44; despite the very different growth mechanism. In particular we measure a scale&#45;invariant distribution of the in&#45; and out&#45;degree and we are able to reproduce these features by means of a simple statistical model. As a major consequence&#44; Wikipedia growth can be described by local rules such as the preferential attachment mechanism&#44; though users can act globally on the network.
      </para>
   </abstract>
</biblioentry>
<biblioentry xreflabel="citeulike:111664" id="citeulike:111664">
   <authorgroup>
       <author><firstname>Soumen</firstname><surname>Chakrabarti</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Mining the Web: Analysis of Hypertext and Semi Structured Data</citetitle>

   <publisher>
      <publishername>Morgan Kaufmann</publishername>
   </publisher>



   <pubdate>2002</pubdate>  
   <abstract>
      <para>Mining the Web: Discovering Knowledge from Hypertext Data is the first book devoted entirely to techniques for producing knowledge from the vast body of unstructured Web data. Building on an initial survey of infrastructural issues&#44; including Web crawling and indexing&#44; Chakrabarti examines low&#45;level machine learning techniques as they relate specifically to the challenges of Web mining. He then devotes the final part of the book to applications that unite infrastructure and analysis to bring machine learning to bear on systematically acquired and stored data. Here the focus is on results: the strengths and weaknesses of these applications&#44; along with their potential as foundations for further progress. From Chakrabarti&#39;s work&#44; which is painstaking&#44; critical&#44; and forward&#45;looking&#44; readers will gain the theoretical and practical understanding they need to contribute to the Web mining effort. The book offers a comprehensive&#44; critical exploration of statistics&#45;based attempts to make sense of Web mining; details the special challenges associated with analyzing unstructured and semi&#45;structured data; looks at how classical Information Retrieval techniques have been modified for use with Web data; focuses on today&#39;s dominant learning methods (clustering and classification&#44; hyperlink analysis&#44; and supervised and semi&#45;supervised learning); analyzes current applications for resource discovery and social network analysis; and is an excellent way to introduce students to especially vital applications of data mining and machine learning technology.
      </para>
   </abstract>
</biblioentry>
<biblioentry xreflabel="citeulike:631058" id="citeulike:631058">
   <authorgroup>
       <author><firstname>S.</firstname><surname>Chakrabarti</surname></author>
       <author><firstname>B.</firstname><othername role="mi">E.</othername><surname>Dom</surname></author>
       <author><firstname>S.</firstname><othername role="mi">R.</othername><surname>Kumar</surname></author>
       <author><firstname>P.</firstname><surname>Raghavan</surname></author>
       <author><firstname>S.</firstname><surname>Rajagopalan</surname></author>
       <author><firstname>A.</firstname><surname>Tomkins</surname></author>
       <author><firstname>D.</firstname><surname>Gibson</surname></author>
       <author><firstname>J.</firstname><surname>Kleinberg</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Mining the Web&#39;s link structure</citetitle>
   <citetitle pubwork="journal">Computer</citetitle>

   <volumenum>32</volumenum> 

   <artpagenums>60&#x2013;67</artpagenums> 
   <pubdate>1999</pubdate>  
   <abstract>
      <para>The Web is a hypertext body of approximately 300 million pages that continues to grow at roughly a million pages per day. Page variation is more prodigious than the data&#39;s raw scale: taken as a whole&#44; the set of Web pages lacks a unifying structure and shows far more authoring style and content variation than that seen in traditional text document collections. This level of complexity makes an &#x201C;off&#45;the&#45;shelf&#x201D; database management and information retrieval solution impossible. To date&#44; index&#45;based search engines for the Web have been the primary tool by which users search for information. Such engines can build giant indices that let you quickly retrieve the set of all Web pages containing a given word or string. Experienced users can make effective use of such engines for tasks that can be solved by searching for tightly constrained key words and phrases. These search engines are&#44; however&#44; unsuited for a wide range of equally important tasks. In particular&#44; a topic of any breadth will typically contain several thousand or million relevant Web pages. How then&#44; from this sea of pages&#44; should a search engine select the correct ones&#44; those of most value to the user&#63; Clever is a search engine that analyzes hyperlinks to uncover two types of pages: authorities&#44; which provide the best source of information on a given topic; and hubs&#44; which provide collections of links to authorities. We outline the thinking that went into Clever&#39;s design&#44; report briefly on a study that compared Clever&#39;s performance to that of Yahoo and AltaVista&#44; and examine how our system is being extended and updated.
      </para>
   </abstract>
</biblioentry>
<biblioentry xreflabel="citeulike:542510" id="citeulike:542510">
   <authorgroup>
       <author><firstname>Soumen</firstname><surname>Chakrabarti</surname></author>
       <author><firstname>Mukul</firstname><othername role="mi">M.</othername><surname>Joshi</surname></author>
       <author><firstname>Kunal</firstname><surname>Punera</surname></author>
       <author><firstname>David</firstname><othername role="mi">M.</othername><surname>Pennock</surname></author> 
   </authorgroup>
<citetitle pubwork="article">The structure of broad topics on the web</citetitle>

   <publisher>
      <publishername>ACM Press</publishername>
   </publisher>


   <artpagenums>251&#x2013;262</artpagenums> 
   <pubdate>2002</pubdate>  

</biblioentry>
<biblioentry xreflabel="craswell:esf" id="craswell:esf">
   <authorgroup>
       <author><firstname>N.</firstname><surname>Craswell</surname></author>
       <author><firstname>D.</firstname><surname>Hawking</surname></author>
       <author><firstname>S.</firstname><surname>Robertson</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Effective Site Finding using Link Anchor Information</citetitle>





   <pubdate>2001</pubdate>  

</biblioentry>
<biblioentry xreflabel="citeulike:609165" id="citeulike:609165">
   <authorgroup>
       <author><firstname>Lise</firstname><surname>Getoor</surname></author>
       <author><firstname>Christopher</firstname><othername role="mi">P.</othername><surname>Diehl</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Link mining: a survey</citetitle>
   <citetitle pubwork="journal">SIGKDD Explor. Newsl.</citetitle>
   <publisher>
      <publishername>ACM Press</publishername>
   </publisher>
   <volumenum>7</volumenum> 

   <artpagenums>3&#x2013;12</artpagenums> 
   <pubdate>2005</pubdate>  

</biblioentry>
<biblioentry xreflabel="gleim:wcm" id="gleim:wcm">
   <authorgroup>
       <author><firstname>R.</firstname><surname>Gleim</surname></author>
       <author><firstname>A.</firstname><surname>Mehler</surname></author>
       <author><firstname>M.</firstname><surname>Dehmer</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Web Corpus Mining by instance of Wikipedia</citetitle>
   <citetitle pubwork="journal">Web as Corpus</citetitle>




   <pubdate>2006</pubdate>  

</biblioentry>
<biblioentry xreflabel="citeulike:348187" id="citeulike:348187">
   <authorgroup>
       <author><firstname>Eric</firstname><othername role="mi">J.</othername><surname>Glover</surname></author>
       <author><firstname>Kostas</firstname><surname>Tsioutsiouliklis</surname></author>
       <author><firstname>Steve</firstname><surname>Lawrence</surname></author>
       <author><firstname>David</firstname><othername role="mi">M.</othername><surname>Pennock</surname></author>
       <author><firstname>Gary</firstname><othername role="mi">W.</othername><surname>Flake</surname></author> 
   </authorgroup>
<citetitle pubwork="article">Using Web structure for classifying and describing Web pages</citetitle>





   <pubdate>2002</pubdate>  
   <abstract>
      <para>The structure of the web is increasingly being used to improve organization&#44; search&#44; and analysis of information on the web. For example&#44; Google uses the text in citing documents (documents that link to the target document) for search. We analyze the relative utility of document text&#44; and the text in citing documents near the citation&#44; for classification and description. Results show that the text in citing documents&#44; when available&#44; often has greater discriminative and descriptive power than...
      </para>
   </abstract>
</biblioentry>
</bibliography>
