An archive of all O'Reilly data ebooks is available below for free download. Dive deep into the latest in data science and big data, compiled by O'Reilly editors, authors, and Strata speakers.
What is Cross-Validation? In Machine Learning, Cross-validation is a resampling method used for model evaluation to avoid testing a model on the same dataset on which it was trained. This is a common mistake, especially that a separate testing dataset is not always available. However, this usually leads to inaccurate performance measures (as the model…
JASP is an open-source statistics program that is free, friendly, and flexible. Armed with an easy-to-use GUI, JASP allows both classical and Bayesian analyses.
Email marketing & newsletter analytics. Track more than just opening & click rates. Beautiful statistics & reports. Easy to setup. Works with all email service providers.
Training material for Apache Flink™, including environment setup implementation of programs using the Flink APIs, and packaging, execution and monitoring of Flink programs.
Publisher: CRC Press (Taylor and Francis Group)Cat. #: 2880
ISBN: 9780849328800
ISBN 10: 0849328802
Publication Date: March 27, 2009
Number of Pages: 395
Map-Reduce is on its way out. But we shouldn’t measure its importance in the number of bytes it crunches, but the fundamental shift in data processing architectures it helped popularise.
PSPP is a program for statistical analysis of sampled data. It is a Free replacement for the proprietary program SPSS, and appears very similar to it with a few exceptions.
Grace is a WYSIWYG 2D plotting tool for the X Window System and M*tif. Grace runs on practically any version of Unix-like OS. As well, it has been successfully ported to VMS, OS/2, and Win9*/NT/2000/XP (some minor functionality may be missing, though).
Twitter wird sein frisch eingekauftes Echtzeit-DV-System Storm als Open Source veröffentlichen. Damit wird die Technik für die Parallelisierung von Datenbankabfragen für alle verfügbar.
The CLEVER search engine incorporates several algorithms that make use of the Web's hyperlink structure for discovering high-quality information. It can be exceedingly difficult to locate resources on the World Wide Web that are both high-quality and relevant to a user's informational needs. Traditional automated search methods for locating information on the Web are easily overwhelmed by low-quality and unrelated content. Second generation search engines have to have effective methods for focusing on the most authoritative documents. The rich structure implicit in hyperlinks among Web documents offers a simple, and effective, means to deal with many of these problems. Additional Information: Publications:
Gephi is an open-source software for visualizing and analyzing large networks graphs. Gephi uses a 3D render engine to display graphs in real-time and speed up the exploration. Use Gephi to explore, analyse, spatialise, filter, cluterize, manipulate and export all types of graphs.
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.
SNAPP is a software tool that allows users to visualize the network of interactions resulting from discussion forum posts and replies. The network visualisations of forum interactions provide an opportunity for teachers to rapidly identify patterns of user behaviour – at any stage of course progression. SNAPP has been developed to extract all user interactions from various commercial and open source learning management systems (LMS) such as BlackBoard (including the former WebCT), and Moodle. SNAPP is compatible for both Mac and PC users and operates in Internet Explorer, Firefox and Safari.
J. Choi, A. Khlif, and E. Epure. Proceedings of the 1st Workshop on NLP for Music and Audio (NLP4MusA), page 23--27. Online, Association for Computational Linguistics, (2020)
J. Choi, A. Khlif, and E. Epure. Proceedings of the 1st Workshop on NLP for Music and Audio (NLP4MusA), page 23--27. Online, Association for Computational Linguistics, (2020)