clustering - How to interpret the clusplot in R - Cross Validated: The clusplot uses PCA to draw the data, plotting it on the first two principal components. You can read more about it here: Making sense of principal component analysis, eigenvectors & eigenvalues. Principal components are the (orthogonal) axes along which the data has the most variability; if your data is 2D, the two principal components explain all of the variability.
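The claim that two principal components capture all of the variability in 2D data can be checked with a minimal sketch. The simulated data and variable names below are hypothetical, and base R's prcomp is used as a stand-in for whatever projection clusplot performs internally, which is an assumption.

```r
# Hypothetical 2-D data, simulated for illustration (not from the question)
set.seed(1)
x <- cbind(a = rnorm(50), b = rnorm(50))
x[, "b"] <- x[, "b"] + 0.8 * x[, "a"]  # make the two columns correlated

# PCA on centred, scaled data
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component, then cumulatively
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_explained <- cumsum(var_explained)
print(round(cum_explained, 3))

# The component axes are orthogonal: t(rotation) %*% rotation is the identity
orthogonality <- crossprod(pca$rotation)

# Scores on the first two components: the coordinates a
# clusplot-style display would draw
scores <- pca$x[, 1:2]
```

With more than two variables the cumulative share after two components is generally below 1, which is why the plot reports how much of the point variability the two components explain.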
Choosing the right linkage method for hierarchical clustering: I am performing hierarchical clustering on data I've gathered and processed from the reddit data dump on Google BigQuery. My process is the following: Get the latest 1000 posts in r/politics. Gat
How to interpret dendrogram height for clustering by correlation: clus <- hcluster(df, method = 'corr') And this is the plot of clus: This df is actually one of 69 cases I'm doing cluster analysis on. To come up with a cutoff point, I have looked at several dendrograms and played around with the h parameter in cutree until I was satisfied with a result that made sense for most cases. That number was k = 5
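The difference between cutting a tree by height (h) and by number of clusters (k) can be sketched as follows. This is an illustration under stated assumptions, not the poster's code: the data are simulated, base R's hclust on a 1 - Pearson correlation dissimilarity stands in for hcluster(df, method = 'corr'), and average linkage and the cutoff h = 0.5 are arbitrary choices.

```r
# Hypothetical data: 20 observations on 6 variables (simulated)
set.seed(42)
df <- as.data.frame(matrix(rnorm(120), nrow = 20))

# Correlation-based dissimilarity between rows: 1 - Pearson correlation,
# so values lie between 0 (perfectly correlated) and 2 (perfectly anti-correlated)
d <- as.dist(1 - cor(t(df)))

# Assumed linkage; the original question does not state one
clus <- hclust(d, method = "average")

# Cut at a fixed dendrogram height: the number of clusters depends on the data
by_height <- cutree(clus, h = 0.5)

# Cut to a fixed number of clusters: the implied height depends on the data
by_count <- cutree(clus, k = 5)

n_by_height <- length(unique(by_height))
n_by_count <- length(unique(by_count))
```

Under this dissimilarity and linkage, the height of a merge is the average 1 - correlation between the members of the two groups being joined, so a cutoff such as h = 0.5 keeps together groups whose members have an average correlation of at least 0.5. A fixed h is therefore comparable across cases, whereas a fixed k forces the same number of groups regardless of how similar the members are.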
Resources for learning about multiple-target techniques? I am looking for resources (books, lecture notes, etc.) about techniques that can handle data with multiple targets (e.g., three dependent variables: 2 discrete and 1 continuous). Does anyone h