Choosing data vectors representing a huge data set. Kohonen's SOM applied to the Kefallinia erosion data.

Citation:

Bartkowiak, A., Vassilopoulos, A., & Evelpidou, N. (2003). Choosing data vectors representing a huge data set. Kohonen's SOM applied to the Kefallinia erosion data. In 1st International Conference on Environmental Research & Assessment.

Abstract:

We consider a large data set comprising N=3422 data vectors, each containing observations on p=3 variables, and seek representative data vectors for it using the methodology of Kohonen's self-organizing maps (SOM). The representative vectors obtained are called codebook vectors. In particular, we analyze two collections (assemblages) of codebook vectors containing m=275 and m=120 elements. The quality of the representation is measured by two errors: the quantization error q1 and the topological error q2. We show for our data that the magnitude of these errors depends on the way the original data were standardized. After a thorough graphical analysis of the results, we conclude that codebook vectors obtained from data standardized by range yield a slightly better representation than those obtained from data standardized by variance. Neither representation is satisfactory from our point of view.
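
The abstract does not define q1 and q2, so the following is only a minimal sketch under common textbook assumptions: q1 is taken as the mean Euclidean distance from each data vector to its nearest codebook vector, and q2 as the fraction of data vectors whose two nearest codebook vectors are not neighbours on the map grid. The function names, the rectangular grid with 8-neighbour adjacency, and the toy numbers are all hypothetical and are not taken from the paper; training of the map itself is not shown.

```python
import math

def standardize_by_range(data):
    """Rescale each variable to [0, 1] using its observed min and max."""
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(x - l) / (h - l) if h > l else 0.0
             for x, l, h in zip(row, lo, hi)] for row in data]

def standardize_by_variance(data):
    """Centre each variable and divide by its sample standard deviation."""
    n = len(data)
    cols = list(zip(*data))
    mean = [sum(c) / n for c in cols]
    sd = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
          for c, m in zip(cols, mean)]
    return [[(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, mean, sd)] for row in data]

def quantization_error(data, codebook):
    """Assumed q1: mean distance from each vector to its nearest codebook vector."""
    return sum(min(math.dist(x, w) for w in codebook) for x in data) / len(data)

def topographic_error(data, codebook, grid):
    """Assumed q2: share of vectors whose best and second-best units are not
    adjacent on the map. grid[k] is the (row, col) position of codebook[k];
    adjacency here means the 8-neighbourhood of a rectangular grid."""
    bad = 0
    for x in data:
        ranked = sorted(range(len(codebook)),
                        key=lambda k: math.dist(x, codebook[k]))
        (r1, c1), (r2, c2) = grid[ranked[0]], grid[ranked[1]]
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            bad += 1
    return bad / len(data)

# Made-up toy values (p=3), only to exercise the functions.
toy_data = [[0.1, 0.1, 0.1], [0.9, 0.9, 0.9]]
toy_codebook = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]
ordered_grid = [(0, 0), (0, 1), (0, 2)]  # map order follows the data order
folded_grid = [(0, 0), (0, 2), (0, 1)]   # middle unit placed at the far end

print(quantization_error(toy_data, toy_codebook))
print(topographic_error(toy_data, toy_codebook, ordered_grid))
print(topographic_error(toy_data, toy_codebook, folded_grid))
```

Under these assumptions, comparing the two standardizations amounts to standardizing the data each way, obtaining a codebook for each, and evaluating both errors on the correspondingly standardized data. Note that q1 values computed on differently scaled data are not directly comparable in magnitude, which is one reason a graphical analysis such as the one the abstract mentions is useful.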

Niki Evelpidou

Professor, Geology and Geoenvironment

National and Kapodistrian University of Athens