Abstract:Locally linear embedding is a kind of very competitive nonlinear dimensionality reduction with good representational capacity for a broader range of manifolds and high computational efficiency. However, they are based on the assumption that the whole data manifolds are evenly distributed so that they determine the neighborhood for all points with the same neighborhood size. Accordingly, they fail to nicely deal with most real problems that are unevenly distributed. This paper presents a new approach that takes the general conceptual framework of Hessian locally linear embedding so as to guarantee its correctness in the setting of local isometry to an open connected subset but dynamically determines the local neighborhood size for each point. This approach estimates the approximate geodesic distance between any two points by the shortest path in the local neighborhood graph, and then determines the neighborhood size for each point by using the relationship between its local estimated geodesic distance matrix and local Euclidean distance matrix. This approach has clear geometry intuition as well as the better performance and stability to deal with the sparsely sampled or noise contaminated data sets that are often unevenly distributed. The conducted experiments on benchmark data sets validate the proposed approach.