Abstract
Data visualizations can reveal trends and patterns that are not otherwise
obvious from the raw data or summary statistics. While visualizing
low-dimensional data is relatively straightforward (for example, plotting the
change in a variable over time as (x,y) coordinates on a graph), it is not
always obvious how to visualize high-dimensional datasets in a similarly
intuitive way. Here we present HyperTools, a Python toolbox for visualizing and
manipulating large, high-dimensional datasets. Our primary approach is to use
dimensionality reduction techniques (Pearson, 1901; Tipping & Bishop, 1999) to
embed high-dimensional datasets in a lower-dimensional space, and plot the data
using a simple (yet powerful) API with many options for data manipulation [e.g.
hyperalignment (Haxby et al., 2011), clustering, normalizing, etc.] and plot
styling. The toolbox is designed around the notion of data trajectories and
point clouds. Just as the position of an object moving through space can be
visualized as a 3D trajectory, HyperTools uses dimensionality reduction
algorithms to create similar 2D and 3D trajectories for time series of
high-dimensional observations. The trajectories may be plotted as interactive
static plots or visualized as animations. These same dimensionality reduction
and alignment algorithms can also reveal structure in static datasets (e.g.
collections of observations or attributes). We present several examples
showcasing how using our toolbox to explore data through trajectories and
low-dimensional embeddings can reveal deep insights into datasets across a wide
variety of domains.
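
As a rough illustration of the core idea (embedding a high-dimensional time series as a 3D trajectory), the following minimal sketch applies PCA with plain NumPy. It does not use the HyperTools API; the function name pca_embed and the synthetic spiral dataset are assumptions made up for this example.

```python
import numpy as np


def pca_embed(data, ndims=3):
    """Project rows of `data` (observations x features) onto the
    top `ndims` principal components (hypothetical helper)."""
    centered = data - data.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes,
    # ordered by decreasing variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:ndims].T


# Synthetic example (an assumption, not data from the paper):
# a 3D spiral mixed linearly into 50 noisy features.
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
latent = np.column_stack([np.cos(t), np.sin(t), t / t.max()])
mixing = rng.normal(size=(3, 50))
observations = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

# Each row of `trajectory` is one time point in the 3D embedding.
trajectory = pca_embed(observations, ndims=3)
```

Plotting successive rows of trajectory as a connected line gives the kind of 3D trajectory described above; the spiral structure hidden in the 50 features becomes visible again in the embedding.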