Does KMeans Normalize Features Automatically in sklearn?
Solution 1:
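No. KMeans in scikit-learn clusters the data exactly as you pass it in and does not scale features for you. Data preprocessing (normalization, binning, weighting, etc.) is a separate step from applying a machine learning algorithm. Use sklearn.preprocessing for data preprocessing. Preprocessors can also be chained, so the data passes through several of them in sequence.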
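As a minimal sketch of chaining preprocessors, the snippet below uses make_pipeline to apply two transformers from sklearn.preprocessing in sequence. The log transform and the toy array are illustrative assumptions, not something the question prescribes.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Toy data (assumed for illustration): the second column spans several
# orders of magnitude.
X = np.array([
    [1.0, 10.0],
    [2.0, 100.0],
    [3.0, 1000.0],
    [4.0, 10000.0],
])

# Step 1 compresses the heavy-tailed column with log1p,
# step 2 standardizes every column to zero mean and unit variance.
preprocess = make_pipeline(
    FunctionTransformer(np.log1p),
    StandardScaler(),
)

X_ready = preprocess.fit_transform(X)
print(X_ready.mean(axis=0), X_ready.std(axis=0))
```

Each step receives the output of the previous one, so the order of the steps in the pipeline matters.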
As for K-means, centering the mean alone is often not enough. You should also equalize the variance across features, because K-means is sensitive to the variance in the data, and features with larger variance have more influence on the result. So for K-means, I would recommend using StandardScaler for data preprocessing.
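To illustrate why the variance matters, here is a small sketch on synthetic data. The data is an assumption made up for this example: one small-scale feature that separates two groups, and one large-scale feature that is pure noise. It compares KMeans on the raw features with KMeans placed after StandardScaler in a pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200

# Feature 0 separates the two groups but lives on a small scale.
informative = np.concatenate([rng.normal(0.0, 0.05, n), rng.normal(1.0, 0.05, n)])
# Feature 1 carries no group information but has a huge variance.
noise = rng.normal(0.0, 1000.0, 2 * n)

X = np.column_stack([informative, noise])
true_labels = np.repeat([0, 1], n)

# KMeans on raw features: distances are dominated by the noisy column.
raw_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# KMeans after standardization: both columns contribute on the same scale.
scaled_model = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
scaled_labels = scaled_model.fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true groups,
# values near 0 mean the clustering is no better than chance.
raw_score = adjusted_rand_score(true_labels, raw_labels)
scaled_score = adjusted_rand_score(true_labels, scaled_labels)
print(f"raw: {raw_score:.2f}, scaled: {scaled_score:.2f}")
```

Putting the scaler and KMeans in one pipeline also means new data passed to predict is scaled with the statistics learned during fit.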
Also keep in mind that k-means results can depend on the order of the observations. It is worth running the algorithm several times, shuffling the data between runs, averaging the resulting cluster centers, and then running a final evaluation with those averaged centers as the starting points.
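The shuffle-and-average procedure could be sketched roughly as follows. One detail the answer does not spell out is that cluster numbering is arbitrary from run to run, so the centers have to be matched before they can be averaged. Matching them to the first run with scipy.optimize.linear_sum_assignment is an assumption of this sketch, as are the synthetic blobs and the number of runs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic, well-separated blobs (assumed for illustration), then standardized.
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5.0, -5.0], [0.0, 5.0], [5.0, -5.0]],
    cluster_std=0.5,
    random_state=0,
)
X = StandardScaler().fit_transform(X)

rng = np.random.default_rng(0)
n_clusters, n_runs = 3, 5

reference = None   # centers of the first run, used to align later runs
aligned = []       # one (n_clusters, n_features) array per run

for run in range(n_runs):
    # Shuffle the observations before each run.
    X_shuffled = X[rng.permutation(len(X))]
    km = KMeans(n_clusters=n_clusters, n_init=1, random_state=run).fit(X_shuffled)
    centers = km.cluster_centers_

    if reference is None:
        reference = centers
    else:
        # cost[i, j] = distance from reference center i to this run's center j.
        cost = np.linalg.norm(reference[:, None, :] - centers[None, :, :], axis=2)
        _, col = linear_sum_assignment(cost)
        # Reorder so that row i corresponds to reference center i.
        centers = centers[col]

    aligned.append(centers)

# Average the aligned centers and use them as the starting points
# for a final run on the full data.
mean_centers = np.mean(aligned, axis=0)
final = KMeans(n_clusters=n_clusters, init=mean_centers, n_init=1).fit(X)
print(final.cluster_centers_)
```

Passing n_init=1 together with an explicit init array makes the final fit start exactly once from the averaged centers.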