The CLEVER search engine incorporates several algorithms that exploit the Web's hyperlink structure to discover high-quality information. Locating resources on the World Wide Web that are both high-quality and relevant to a user's information needs can be exceedingly difficult: traditional automated search methods are easily overwhelmed by low-quality and unrelated content. Second-generation search engines therefore need effective methods for focusing on the most authoritative documents. The rich structure implicit in the hyperlinks among Web documents offers a simple and effective means of addressing many of these problems.
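The mutual-reinforcement idea behind this kind of link analysis — good hubs point to good authorities, and good authorities are pointed to by good hubs — is the core of Kleinberg's HITS algorithm, on which CLEVER is known to build. A minimal sketch of the iteration follows; the toy graph and normalization details are illustrative only, not CLEVER's actual implementation (which adds heuristics such as anchor-text weighting):

```python
# Sketch of HITS-style hub/authority scoring on a toy link graph.
# The graph below is invented for illustration.

def hits(links, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking in.
        auth = {p: sum(hub[q] for q in links if p in links[q]) for p in pages}
        # Hub score: sum of authority scores of pages linked to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded across iterations.
        na = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        nh = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / na for p, v in auth.items()}
        hub = {p: v / nh for p, v in hub.items()}
    return hub, auth

toy_links = {"a": ["c"], "b": ["c"], "c": []}
hub, auth = hits(toy_links)
# "c" emerges as the strongest authority: both other pages point to it.
```

The point of the iteration is that low-quality pages with few incoming endorsements fade out, which is exactly the focusing-on-authoritative-documents behavior described above.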
Here’s a visualization concept I came up with a while back to look at how search engines and word-of-mouth affect hit frequency in the iBiblio web-traffic log. iBiblio consists of around 420 sites, and each of the circles you see represents one of them. The size of each pie slice inside a circle grows with the number of hits from an individual search engine (see the legend for which is which), while the size of the circle itself grows with the overall number of hits from visitors other than search engines. Hits are counted as the number of unique incoming IP addresses per day. Links are drawn between cliques of websites where more than a quarter of the unique IP addresses are the same on that day, meaning, more or less, that those sites often share traffic. The total amount of data was around 10 TB, and the visualization took about a day to process into a static animation. The original is meant to run on a wall-sized (16′×9′) display or on our specialized visualization dome.
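The edge-drawing rule above (connect two sites on a given day when more than a quarter of their unique visitor IPs coincide) can be sketched as follows. The site names and IP sets are invented for illustration, the original description doesn't say which site's IP count the fraction is taken against (the sketch assumes the smaller of the two), and the real pipeline of course streams the full 10 TB log rather than holding sets in memory:

```python
# Sketch of the traffic-sharing rule: link two sites on a given day when
# more than 1/4 of their unique visitor IPs overlap. Toy data, not the
# actual iBiblio log; the 1/4 denominator (smaller site) is an assumption.

def shared_traffic_edges(daily_ips, threshold=0.25):
    """daily_ips: dict site -> set of unique visitor IPs for one day."""
    edges = []
    sites = sorted(daily_ips)
    for i, a in enumerate(sites):
        for b in sites[i + 1:]:
            shared = daily_ips[a] & daily_ips[b]
            smaller = min(len(daily_ips[a]), len(daily_ips[b]))
            if smaller and len(shared) / smaller > threshold:
                edges.append((a, b))
    return edges

day = {
    "site_a": {"1.1.1.1", "2.2.2.2", "3.3.3.3", "4.4.4.4"},
    "site_b": {"2.2.2.2", "3.3.3.3", "5.5.5.5", "6.6.6.6"},
    "site_c": {"9.9.9.9"},
}
# site_a and site_b share 2 of 4 IPs (50% > 25%), so only they get an edge.
```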
Data Platform Development. The Microsoft Data Platform provides developers with a comprehensive programming framework for creating data-centric solutions that target mobile devices, desktops, Web servers, and enterprise servers. Getting started with: ADO.NET, ADO.NET Data Services, ADO.NET Entity Framework, LINQ, MDAC/WDAC, Microsoft Project Code Named "Velocity", SQL Server Driver for PHP, SQL Server JDBC Driver, SQL Server Native Client, and XML.
The first chapter introduces the problem space in terms of making sense of very large, complex datasets and outlines the vision for visual analytics.
Gephi is open-source software for visualizing and analyzing large network graphs. It uses a 3D rendering engine to display graphs in real time and speed up exploration. Use Gephi to explore, analyze, spatialize, filter, cluster, manipulate, and export all types of graphs.
The announcement this week that Google released a beta version of a robust cloud computing platform called Google App Engine, which lets anyone build apps on Google's renowned and highly scalable infrastructure, underscored a key trend in the software industry today: software platforms are moving from their traditional centricity around individually owned and managed computing resources up into the cloud of the Internet. Google's entry into a space that has so far been dominated largely by Amazon and its Elastic Compute Cloud, along with a few smaller players like Bungee and Heroku, has turned Internet cloud computing into a fully fledged industry virtually overnight. What makes these offerings so interesting is their promise to turn enormous amounts of operational competency and accumulated economies of scale (which are vast in Amazon's and Google's cases) into a highly competitive new software platform, akin to Windows or Linux, except hosted entirely off-premises, on the Internet.
Web Schema is a set of extensible schemas that enables webmasters to embed structured data on their web pages for use by search engines and other applications.
A platform for sharing and evaluating intelligent algorithms: data mining experiments, datasets, performance analysis, a data repository, and challenges, spanning research and applications in prediction, data mining, and machine learning.
The major outage in the Amazon cloud is only the tip of the iceberg of problems that cloud computing brings with it, Internet pioneer Dave Farber told ORF.at. On Monday, yet another cloud provider abandoned the business.
Store, share & discover real-time sensor, energy and environment data from objects, devices & buildings around the world. Pachube is a convenient, secure & scalable platform that helps you connect to & build the 'internet of things'.
Apache Pig is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating those programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
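What makes Pig programs parallelizable is that each statement in its language (Pig Latin) describes a whole-relation transformation — load, filter, group, aggregate — that the runtime can partition across machines. As a rough sketch of the canonical word-count dataflow, written here in plain Python for concreteness (a real Pig script would express the same steps as LOAD, FOREACH...GENERATE, GROUP BY, and COUNT statements over HDFS files):

```python
# Sketch of the word-count dataflow that Pig parallelizes: tokenize,
# group by word, count each group. Each stage corresponds to a
# whole-relation transformation in Pig Latin.
from collections import Counter

def word_count(lines):
    words = (w for line in lines for w in line.split())  # ~ FOREACH ... TOKENIZE
    return Counter(words)                                # ~ GROUP BY + COUNT

counts = word_count(["big data", "big pig"])
# counts["big"] == 2
```

Because grouping and counting commute with partitioning the input, the same dataflow runs unchanged whether the input is two lines or two terabytes — that is the "amenable to substantial parallelization" property in practice.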
Community Maps is a mapping site for community groups that allows users to add their own information to the map. This can include local events, organisations, planning applications, history and local shops.
Rapidant is a high-speed software data transfer platform based on parallel TCP.
Its purpose is to transfer massive amounts of data rapidly by consuming as much available bandwidth as possible.
This project provides an implementation of the Rapidant protocol in Java.
Developers Blog: http://www.facebook.com/rapidant
Key Features
* Fast data transfer based on parallel TCP
* Efficient data transfer using real-time compression of data
* Server-client architecture, in which the server supports multiple clients
* Available either as an independent application or as a library for other applications
* Pure Java implementation that runs on various platforms
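The combination of parallel streams and real-time compression listed above can be sketched roughly as follows. This is not Rapidant's actual wire protocol (which is Java and socket-based); the stream count, chunk size, and framing are invented for the example, and plain lists stand in for TCP connections:

```python
# Rough sketch of parallel chunked transfer with real-time compression:
# compress the payload, round-robin the chunks across N "streams"
# (lists standing in for TCP connections), then reorder and decompress
# on the receiving side. All framing details here are invented.
import zlib

def split_for_streams(data: bytes, n_streams: int, chunk_size: int = 64):
    compressed = zlib.compress(data)  # compress before sending
    chunks = [compressed[i:i + chunk_size]
              for i in range(0, len(compressed), chunk_size)]
    # Tag each chunk with a sequence number so the receiver can reorder.
    streams = [[] for _ in range(n_streams)]
    for seq, chunk in enumerate(chunks):
        streams[seq % n_streams].append((seq, chunk))
    return streams

def reassemble(streams):
    tagged = sorted(chunk for stream in streams for chunk in stream)
    return zlib.decompress(b"".join(chunk for _, chunk in tagged))

payload = b"massive data " * 100
assert reassemble(split_for_streams(payload, n_streams=4)) == payload
```

The design intuition is that several concurrent TCP connections together fill more of the available bandwidth than one, since each connection's congestion window grows independently.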
The world we live in is full of data, and there is ever more of it. A current inventory shows how much humanity as a whole can store, transmit, and compute. Since 1986, storage capacity has grown by 23 percent per year, transmission rates by 28 percent, and computing power by as much as 58 percent.
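Compounded over the roughly two decades such inventories cover (the well-known Hilbert and López study spans 1986–2007; the exact 21-year window here is an assumption), those annual rates imply very different cumulative multiples:

```python
# Cumulative growth implied by the quoted annual rates, assuming compound
# growth over 1986-2007 (21 years; the window is an assumption).
years = 21
factors = {name: (1 + rate) ** years
           for name, rate in [("storage", 0.23),
                              ("transmission", 0.28),
                              ("compute", 0.58)]}
for name, factor in factors.items():
    print(f"{name}: roughly {factor:,.0f}x over {years} years")
```

A few points of annual growth separate a factor in the tens (storage) from a factor in the tens of thousands (computing power) — which is why the 58 percent figure deserves the "sogar" in the original.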