TY - JOUR

T1 - Discrete Eckart-Young theorem for integer matrices

AU - Lin, Matthew M.

PY - 2011/12/1

Y1 - 2011/12/1

N2 - The well-known Eckart-Young theorem asserts t hat the truncated singular value decomposition, obtained by discarding all but the first k largest singular values and their corresponding left and right singular vectors, is the best rank-k approximation in the sense of least squares to the original matrix. In other words, singular values alone serve well as unambiguous indicators of proximity to the data matrix. Unlike continuous data, the decomposition of a matrix with discrete data which is subject to the requirement that its approximations have the same type of data is a harder task and it is even harder when it comes to ranking these approximations. This work generalizes the notion of singular value decomposition via a sequence of variational formulations to discrete-type data. The process itself can guarantee neither the orthogonality, as is expected of discrete data, nor the ordering of best approximations. However, at the end of the undertaking, it is shown that a quantity analogous to the singular values and a truncated low rank factorization for discrete data analogous to the truncated singular value decomposition for continuous data are attainable. Our empirical study shows the applicability of our method to cluster analysis and pattern discovery using real-life data.

AB - The well-known Eckart-Young theorem asserts t hat the truncated singular value decomposition, obtained by discarding all but the first k largest singular values and their corresponding left and right singular vectors, is the best rank-k approximation in the sense of least squares to the original matrix. In other words, singular values alone serve well as unambiguous indicators of proximity to the data matrix. Unlike continuous data, the decomposition of a matrix with discrete data which is subject to the requirement that its approximations have the same type of data is a harder task and it is even harder when it comes to ranking these approximations. This work generalizes the notion of singular value decomposition via a sequence of variational formulations to discrete-type data. The process itself can guarantee neither the orthogonality, as is expected of discrete data, nor the ordering of best approximations. However, at the end of the undertaking, it is shown that a quantity analogous to the singular values and a truncated low rank factorization for discrete data analogous to the truncated singular value decomposition for continuous data are attainable. Our empirical study shows the applicability of our method to cluster analysis and pattern discovery using real-life data.

UR - http://www.scopus.com/inward/record.url?scp=84856252787&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84856252787&partnerID=8YFLogxK

U2 - 10.1137/10081099X

DO - 10.1137/10081099X

M3 - Article

AN - SCOPUS:84856252787

VL - 32

SP - 1367

EP - 1382

JO - SIAM Journal on Matrix Analysis and Applications

JF - SIAM Journal on Matrix Analysis and Applications

SN - 0895-4798

IS - 4

ER -