| Authors: |
I K Fodor
|
| URL: |
http://www.osti.gov/energycitations/product.biblio.jsp?osti_id=15002155 |
| Tags: |
imported
|
| Abstract: |
Advances in data collection and storage capabilities during the past
decades have led to an information overload in most sciences. Researchers
working in domains as diverse as engineering, astronomy, biology,
remote sensing, economics, and consumer transactions face larger
and larger observations and simulations on a daily basis. Such datasets,
in contrast with smaller, more traditional datasets that have been
studied extensively in the past, present new challenges in data
analysis. Traditional statistical methods break down partly because
of the increase in the number of observations, but mostly because
of the increase in the number of variables associated with each
observation. The dimension of the data is the number of variables
that are measured on each observation. High-dimensional datasets
present many mathematical challenges as well as some opportunities,
and are bound to give rise to new theoretical developments. One
of the problems with high-dimensional datasets is that, in many
cases, not all the measured variables are "important" for understanding
the underlying phenomena of interest. While certain computationally
expensive novel methods can construct predictive models with high
accuracy from high-dimensional data, it is still of interest in
many applications to reduce the dimension of the original data prior
to any modeling of the data. In this paper, we describe several
dimension reduction methods.
Publication Date: 2002 May 09
OSTI Identifier: OSTI ID: 15002155
Report Number(s): UCRL-ID-148494
DOE Contract Number: W-7405-ENG-48
Other Number(s): TRN: US200408%%150
Resource Type: Technical Report
Resource Relation: PBD: 9 May 2002
Research Org: Lawrence Livermore National Lab., CA (US)
Sponsoring Org: US Department of Energy (US)
Subject: 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION
SCIENCE; ACCURACY; ASTRONOMY; BIOLOGY; DATA ANALYSIS; DIMENSIONS;
ECONOMICS; REMOTE SENSING; SIMULATION; STORAGE
Country of Publication: United States
Language: English
Format: PDF-FILE: 27 ; SIZE: 1.3 MBYTES pages ; PDFN
System Entry Date: 2004 Mar 01 |
@techreport{Fodor02DRSurvey,
title = {A Survey of Dimension Reduction Techniques},
author = {I K Fodor},
institution = {Lawrence Livermore National Lab., CA (US)},
url = {http://www.osti.gov/energycitations/product.biblio.jsp?osti_id=15002155},
year = {2002},
abstract = {Advances in data collection and storage capabilities during the past
decades have led to an information overload in most sciences. Researchers
working in domains as diverse as engineering, astronomy, biology,
remote sensing, economics, and consumer transactions face larger
and larger observations and simulations on a daily basis. Such datasets,
in contrast with smaller, more traditional datasets that have been
studied extensively in the past, present new challenges in data
analysis. Traditional statistical methods break down partly because
of the increase in the number of observations, but mostly because
of the increase in the number of variables associated with each
observation. The dimension of the data is the number of variables
that are measured on each observation. High-dimensional datasets
present many mathematical challenges as well as some opportunities,
and are bound to give rise to new theoretical developments. One
of the problems with high-dimensional datasets is that, in many
cases, not all the measured variables are ``important'' for understanding
the underlying phenomena of interest. While certain computationally
expensive novel methods can construct predictive models with high
accuracy from high-dimensional data, it is still of interest in
many applications to reduce the dimension of the original data prior
to any modeling of the data. In this paper, we describe several
dimension reduction methods.},
number = {UCRL-ID-148494},
month = may,
note = {Publication date: 2002 May 09. OSTI ID: 15002155. DOE Contract
Number: W-7405-ENG-48. Other Number(s): TRN: US200408\%\%150. Resource
Type: Technical Report. Resource Relation: PBD: 9 May 2002. Sponsoring
Org: US Department of Energy (US). Subject: 99 GENERAL AND
MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ACCURACY;
ASTRONOMY; BIOLOGY; DATA ANALYSIS; DIMENSIONS; ECONOMICS; REMOTE SENSING;
SIMULATION; STORAGE. Country of Publication: United States. Language:
English. Format: PDF-FILE: 27 ; SIZE: 1.3 MBYTES pages ; PDFN. System
Entry Date: 2004 Mar 01},
owner = {mgrani},
timestamp = {2006.05.11},
keywords = {imported}
}