I liked the SODA 2009 paper Sorting and Selection in Posets since it makes you look at ordinary sorting in a new way. What does it mean to sort a list? We usually think (correctly) that we take a list of objects from a totally ordered set and put them into an ordered list. How does this extend to partial orders? We need to re-examine total orders. Say that making a comparison between elements of the ordered set is HARD. Then you want to make as few comparisons as possible, but you also want, when you are done, a data structure that makes comparisons easy. Hence we view sorting as follows: given a set A of n elements from a totally ordered set, come up with a data structure of size O(n) such that the query "given x, y ∈ A, which one is bigger?" can be answered in O(1) time. While setting up the data structure you would like to do this with as few comparisons as possible. We will assume that comparing two elements of {1,...,n} is easy.
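For the total-order case, this view of sorting is easy to make concrete: spend O(n log n) comparisons once, record each element's rank, and every later query is a rank lookup that makes no new comparisons. A minimal sketch (function names are mine, not from the paper):

```python
# Sorting as preprocessing: O(n log n) comparisons up front,
# then O(1) per "which is bigger?" query via precomputed ranks.

def build_rank_index(items):
    """Sort once, then map each element to its rank."""
    return {x: i for i, x in enumerate(sorted(items))}

def bigger(rank, x, y):
    """Answer a comparison query using only the rank table."""
    return x if rank[x] > rank[y] else y

rank = build_rank_index([17, 3, 42, 8])
print(bigger(rank, 3, 42))  # 42
```

The interesting question the paper asks is what the analogous preprocessing and query structure look like when the order is only partial.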
The PageRank algorithm is a great way of using collective intelligence to determine the importance of a webpage. There’s a big problem, though, which is that PageRank is difficult to apply to the web as a whole, simply because the web contains so many webpages. While just a few lines of code can be used to implement PageRank on collections of a few thousand webpages, it’s trickier to compute PageRank for larger sets of pages. The underlying problem is that the most direct way to compute the PageRank of n webpages involves inverting an n × n matrix. Even when n is just a few thousand, this means inverting a matrix containing millions or tens of millions of floating point numbers. This is possible on a typical personal computer, but it’s hard to go much further. In this post, I describe how to compute PageRank for collections containing millions of webpages. My little laptop easily coped with two million pages, using about 650 megabytes of RAM and a few hours of computation.
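The standard way to avoid the matrix inversion is power iteration: repeatedly apply the PageRank update to a rank vector, touching only the links that actually exist, so memory and time scale with the number of links rather than n². A minimal sketch (variable names are mine, not from the post):

```python
# Power-iteration PageRank over a sparse link structure.
# links: dict mapping page index -> list of page indices it links to.

def pagerank(links, n, damping=0.85, iters=50):
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += share
            else:  # dangling page: spread its rank uniformly
                for j in range(n):
                    new[j] += damping * rank[page] / n
        rank = new
    return rank

# Tiny 3-page cycle: 0 -> 1 -> 2 -> 0; each page ends up with rank 1/3.
r = pagerank({0: [1], 1: [2], 2: [0]}, 3)
```

Each iteration costs O(links) instead of the O(n³) of a direct inversion, which is what makes millions of pages feasible on a laptop.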
EXTENSIBLE PARSING & TRANSFORMATION We present the metafront tool for specifying flexible, safe, and efficient syntactic transformations between languages defined by context-free grammars. The transformations are guaranteed to terminate and to map grammatically legal input to grammatically legal output. We rely on a novel parser algorithm, specificity parsing, that is designed to support gradual extensions of a grammar by allowing productions to remain in a natural style and by statically reporting ambiguities and errors in terms of individual productions as they are being added. Our tool may be used as a parser generator in which the resulting parser automatically supports a flexible, safe, and efficient macro processor, or as an extensible lightweight compiler generator for domain-specific languages. We show substantial examples of both kinds.
Following the success of last season, my thoughts obviously turned to whether it can be repeated. Or, more accurately, to what can be expected from the algorithmic betting model in future seasons, both in terms of return and the variability of those returns. “Monte Carlo” is the name given to simulations that use computer-generated random numbers to identify the range of outputs a model may produce in the ‘real world’. Each random number generates an input drawn from a user-defined probability distribution, which is run through the user’s model to produce a simulated outcome on each run. Run this simulation thousands of times and you generate a probability distribution for the output of your model. Assigning a probability distribution to football match results
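The loop described above is simple to sketch. The post doesn't specify its match model, so purely for illustration I assume Poisson-distributed goal counts, a common simplification for football results; every name and rate below is hypothetical:

```python
import random

# Hedged Monte Carlo sketch: sample many simulated seasons to get a
# distribution of outcomes. The Poisson goal model is an assumption
# for illustration, not the post's actual model.

def poisson(lam):
    """Sample a Poisson variate (Knuth's multiplication method)."""
    limit, k, p = pow(2.718281828459045, -lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def simulate_match(home_rate=1.5, away_rate=1.1):
    """One simulated match; return points from the home side's view."""
    h, a = poisson(home_rate), poisson(away_rate)
    return 3 if h > a else (0 if h < a else 1)

def simulate_season(n_matches=38, runs=10000):
    """Each run is one simulated season; the list is the output distribution."""
    return [sum(simulate_match() for _ in range(n_matches))
            for _ in range(runs)]

random.seed(1)
points = simulate_season(runs=1000)
```

Plotting a histogram of `points` gives exactly the "probability distribution for the output of your model" the text describes.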
Select is a detailed description of an implementation of a worst-case linear-time algorithm that finds the kth smallest value in an array. It was written primarily as a demonstration of literate programming using the tool noweb. The generated "final result" files are: * An HTML version of the full document. * A PDF version of the full document, suitable for printing. * A compilable C++ source file that implements the algorithm. All of these were generated from the single noweb file select.nw.
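The classic worst-case linear-time selection algorithm is median of medians, and a short sketch shows the shape of what select.nw implements in C++ (this Python version is illustrative only, not the document's code):

```python
# Median-of-medians selection: worst-case O(n) because the pivot is
# guaranteed to discard a constant fraction of the array each round.

def select(a, k):
    """Return the k-th smallest element of a (k is 0-based)."""
    if len(a) <= 5:
        return sorted(a)[k]
    # Median of each group of 5, then recursively the median of those.
    medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
               for i in range(0, len(a), 5)]
    pivot = select(medians, len(medians) // 2)
    lows = [x for x in a if x < pivot]
    pivots = [x for x in a if x == pivot]
    highs = [x for x in a if x > pivot]
    if k < len(lows):
        return select(lows, k)
    if k < len(lows) + len(pivots):
        return pivot
    return select(highs, k - len(lows) - len(pivots))

print(select([9, 1, 8, 2, 7, 3, 6, 4, 5], 4))  # 5 (the median)
```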
The Little Book of Semaphores is a free (in both senses of the word) textbook that introduces the principles of synchronization for concurrent programming. In most computer science curricula, synchronization is a module in an Operating Systems class. OS textbooks present a standard set of problems with a standard set of solutions, but most students don't come away with a good understanding of the material or the ability to solve similar problems. The approach of this book is to identify patterns that are useful for a variety of synchronization problems and then show how they can be assembled into solutions. After each problem, the book offers a hint before showing a solution, giving students a better chance of discovering solutions on their own. The book covers the classical problems, including "Readers-Writers", "Producer-Consumer", and "Dining Philosophers". In addition, it collects a number of not-so-classical problems.
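In the spirit of the book's patterns, here is a minimal Producer-Consumer sketch: two counting semaphores track empty and full buffer slots, and a binary semaphore guards the shared buffer (names and buffer size are mine, not the book's):

```python
import threading
from collections import deque

buffer = deque()
mutex = threading.Semaphore(1)   # guards the buffer
empty = threading.Semaphore(5)   # 5 free slots initially
full = threading.Semaphore(0)    # 0 items initially
results = []

def producer():
    for item in range(10):
        empty.acquire()          # wait for a free slot
        with mutex:
            buffer.append(item)
        full.release()           # signal: one more item available

def consumer():
    for _ in range(10):
        full.acquire()           # wait for an item
        with mutex:
            results.append(buffer.popleft())
        empty.release()          # signal: one more free slot

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start(); t1.join(); t2.join()
print(results)  # [0, 1, ..., 9]
```

The pattern — a semaphore per resource count plus a mutex for the shared structure — is exactly the kind of reusable building block the book assembles into solutions for the harder problems.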
HDR for video? The work presents a system for automatically producing a wide variety of video enhancements and visual effects. Unlike traditional visual effects software (e.g., After Effects, Shake, Boujou), the system is completely automatic and requires no manual labor from the user. The major limitation of the work is that it can currently handle only videos of static scenes (i.e., videos shot with a moving camera but containing no moving objects). Efforts are being made to lift this restriction in future work. Applications of the system include: high-resolution/definition video, high dynamic range video, removing objects from a video, creating painterly (NPR) videos, video stabilization, and easy video editing. Project website: grail.cs.washington.edu/projects/videoenhancement/videoEnhancement.htm