Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
Hama (Korean for "hippopotamus") is a parallel matrix computation package currently in incubation at Apache. It is a library of matrix operations for large-scale processing and development environments, as well as a Map/Reduce framework for large-scale numerical analysis and data mining tasks that need the intensive computation power of matrix inversion, e.g., linear regression, PCA, and SVM. It will be useful for many scientific applications, e.g., physics computations, linear algebra, computational fluid dynamics, statistics, graphics rendering, and many more.
Opticks is an expandable remote sensing and imagery analysis software platform that is free and open source. If you've used commercial tools such as ERDAS IMAGINE, RemoteView, ENVI, or SOCET GXP, then you need to give Opticks a try. Unlike competing tools, you can add capability to Opticks by creating an extension. Opticks provides the most advanced extension capability of any remote sensing tool on the market.
This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you’ll learn how to clean data and draw plots—and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You’ll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You’ll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualising, and exploring data.
Streamlining the incident post-mortem process is key to helping teams get the most from their post-mortem time investment and learn from previous issues. Read on to learn why you should conduct post-mortems, best practices to follow, and what blameless post-mortems are all about.
K. Wolff. Proceedings of the 12th International Conference on Conceptual Structures (ICCS 2004), volume 3127 of Lecture Notes in Computer Science, pages 126-141. Springer, (2004)