Stellar Spectral Classification
An analysis of spectral data
and fundamental parameters of stars.
- Centro de Investigaciones de Astronomia, Venezuela.
For details see:
J. Stock and M. J. Stock. Quantitative Stellar Spectral Classification.
Revista Mexicana de Astronomia y Astrofisica, 34, 143-156, 1999.
Linear analysis of the spectral data space by PCA
Principal component analysis (PCA) is a linear technique that finds
orthogonal directions of highest variance in the data space.
The plot shows the two-dimensional subspace spanned by the first two PCA features.
The first feature is the direction of highest variance in the spectral space.
The second feature is the direction of highest variance in the subspace orthogonal
to the first feature.
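As a rough illustration of the projection described above, the following sketch computes the first two PCA features with NumPy. The random 19-dimensional data set and the helper name `pca` are assumptions made for this example only; they stand in for the real spectral measurements, which are not reproduced here.

```python
import numpy as np

def pca(data, n_components=2):
    """Return (scores, components, variances) for samples stored row-wise."""
    # PCA directions are defined relative to the mean of the data
    centered = data - data.mean(axis=0)
    # Rows of vt are orthonormal directions, sorted by decreasing variance
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Coordinates of every sample in the subspace spanned by the components
    scores = centered @ components.T
    variances = singular_values[:n_components] ** 2 / (len(data) - 1)
    return scores, components, variances

# Synthetic stand-in for the spectra: 200 "stars", 19 values each (assumption)
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * np.array([5.0, 2.0])
mixing = rng.normal(size=(2, 19))
spectra = latent @ mixing + 0.1 * rng.normal(size=(200, 19))

scores, components, variances = pca(spectra, n_components=2)
print("variance along first two features:", variances)
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give a two-dimensional view of the kind described here.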
The colors illustrate different spectral type groups:
cyan - very early
red - early
magenta - stars which cannot be classified exactly because of the blue and red spectrum
blue - late
green - very late
black - stars with very low temperature
Fundamental parameters to first feature of PCA
The first PCA feature is plotted against the fundamental parameters.
The most interesting diagram is the one on the left,
where the first feature is plotted against the absolute magnitude.
It shows some similarities to a Hertzsprung-Russell diagram:
there is something like a 'main sequence' and, below it, a cluster
that may correspond to the 'giants'.
Nonlinear analysis of spectral data space by NLPCA
Nonlinear principal component analysis (NLPCA) is a nonlinear feature extraction technique
similar to PCA, except that the features can be nonlinear.
In the middle, the data is plotted in the space spanned by the
first three features of linear PCA.
On the left, the data is plotted again, and in addition the first three PCA features
are drawn as grids. In each grid two features are varied and the third is set to zero.
The grids represent the new coordinate system after the transformation.
Linear PCA does not describe the characteristics of the data very well. On the right,
the first three features of NLPCA are plotted. NLPCA describes the data much better.
Artificial neural networks are used for the NLPCA transformation.
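To give a feel for how a neural network can extract a nonlinear feature, here is a minimal sketch of a generic bottleneck (autoassociative) network trained to reproduce its input. It is not necessarily the network used for the figures on this page: the layer sizes, learning rate, toy curve and all function names are assumptions chosen for illustration.

```python
import numpy as np

def init_layers(sizes, rng):
    """Small random weights and zero biases for each pair of adjacent layers."""
    return [(rng.normal(scale=0.5, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    """Return the activations of all layers, starting with the input itself."""
    activations = [x]
    for i, (w, b) in enumerate(layers):
        z = activations[-1] @ w + b
        # Layers 0 and 2 (the hidden layers) apply tanh;
        # layers 1 (bottleneck) and 3 (output) stay linear
        activations.append(np.tanh(z) if i % 2 == 0 else z)
    return activations

def reconstruction_error(layers, x):
    """Mean squared distance between the input and its reconstruction."""
    return float(np.mean(np.sum((forward(layers, x)[-1] - x) ** 2, axis=1)))

def train(layers, x, epochs=3000, lr=0.02):
    """Full-batch gradient descent on the reconstruction error."""
    for _ in range(epochs):
        acts = forward(layers, x)
        delta = 2.0 * (acts[-1] - x) / x.shape[0]  # gradient at the output
        for i in reversed(range(len(layers))):
            w, b = layers[i]
            if i % 2 == 0:
                delta = delta * (1.0 - acts[i + 1] ** 2)  # tanh derivative
            grad_w = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            delta = delta @ w.T  # propagate with the weights before the update
            layers[i] = (w - lr * grad_w, b - lr * grad_b)
    return layers

# Toy data: a noisy curve in three dimensions (assumption, not the spectra)
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=200)
curve = np.column_stack([t, t ** 2, np.sin(2.0 * t)])
curve = curve + 0.02 * rng.normal(size=curve.shape)
curve = (curve - curve.mean(axis=0)) / curve.std(axis=0)

layers = init_layers([3, 6, 1, 6, 3], rng)  # one unit in the bottleneck
initial_error = reconstruction_error(layers, curve)
layers = train(layers, curve)
final_error = reconstruction_error(layers, curve)
features = forward(layers, curve)[2]  # bottleneck output = nonlinear feature
print("reconstruction error before/after:", initial_error, final_error)
```

The value of the single bottleneck unit plays the role of a nonlinear feature: the second half of the network maps it back onto a curve in data space, whereas a linear PCA feature can only describe a straight line.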
Prediction of fundamental parameters
The aim is to predict fundamental parameters from measured spectral data. This means
that, for a given 19-dimensional spectral data vector, the respective value of a fundamental
parameter is to be predicted by a suitable method. Since such a method is to be used
for new, unknown stars, an estimate of the prediction error is important. For this, a
special case of the more general cross-validation, known as leave-one-out,
is used. In this method a single sample is omitted from the data set. The prediction
method is fitted to the remaining data and tested on the omitted sample. The result
is a test error on a sample unknown to the prediction method. This process is repeated
for each sample. The estimated prediction error is the mean of all individual squared test errors.
Below, the predictions for the individual test samples are plotted against their known values.
Such a plot is called a scatter plot. The optimal result is for the data to lie on a straight
line, implying that the predicted values coincide with the known values.
First row: a linear
method is used for estimating the fundamental parameters. Second row: a nonlinear neural
network is used.
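The error estimate described above, in which each star is left out once and predicted by a model fitted to all the others, can be sketched as follows for the linear case. The synthetic data, the least-squares model and the function names are assumptions for this example; the actual linear method and data behind the plots are not specified here.

```python
import numpy as np

def fit_linear(x, y):
    """Least-squares fit of y to the columns of x plus a constant term."""
    design = np.column_stack([x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, x):
    """Apply fitted coefficients to new samples stored row-wise."""
    return np.column_stack([x, np.ones(len(x))]) @ coef

def leave_one_out(x, y):
    """Predict every sample with a model fitted to all other samples."""
    predictions = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # omit sample i
        coef = fit_linear(x[keep], y[keep])    # fit to the remaining data
        predictions[i] = predict_linear(coef, x[i:i + 1])[0]
    # Estimated prediction error: mean of the squared test errors
    return predictions, float(np.mean((predictions - y) ** 2))

# Synthetic stand-in: 100 "stars" with 19 spectral values each (assumption)
rng = np.random.default_rng(1)
spectra = rng.normal(size=(100, 19))
true_weights = rng.normal(size=19)
parameter = spectra @ true_weights + 0.1 * rng.normal(size=100)

predictions, loo_mse = leave_one_out(spectra, parameter)
print("leave-one-out mean squared error:", loo_mse)
```

Plotting `predictions` against `parameter` gives a scatter plot of the kind described above. A nonlinear method can be assessed in the same way by replacing `fit_linear` and `predict_linear` inside the loop.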
www.matthias-scholz.de, last modified: October 12, 2000