Zanran helps you find ‘semi-structured’ data on the web: the numerical data that people have presented as graphs, tables and charts. For example, the data could be a graph in a PDF report, a table in an Excel spreadsheet, or a bar chart shown as an image in an HTML page. Put more simply: Zanran is Google for data. At present, we extract tables and images from HTML, PDF and Excel files, and we will be processing PowerPoint and Word documents in the near future.
M. Sereno (2015). arXiv:1509.05778. Comment: 13 pages; LIRA package available from https://cran.r-project.org/web/packages/lira/index.html; further material at http://pico.bo.astro.it/~sereno/; v02: 14 pages, typos corrected, added references to change point analysis. In press in MNRAS.