t-SNE: How the Algorithm Works and a Python Implementation

Published: 2024-03-08 22:27:10  Author: compiled by a site user

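t-distributed stochastic neighbor embedding (t-SNE) is an unsupervised machine learning algorithm used for visualization. It is a nonlinear dimensionality reduction technique: it expresses the similarities between data points as conditional probabilities in both the high-dimensional and the low-dimensional space, then tries to minimize the difference between the two sets of probabilities so that the low-dimensional points represent the original data as faithfully as possible.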
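For reference, the probabilities and the objective described above are usually written as follows. The symbols are introduced here for illustration and do not appear in the original article: x_i are the high-dimensional points, y_i their low-dimensional counterparts, n the number of points, and sigma_i a per-point Gaussian bandwidth.

```latex
% Similarity of x_j to x_i in the high-dimensional space (Gaussian)
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

% Similarity of y_i and y_j in the low-dimensional space (Student-t, one degree of freedom)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Objective minimized over the y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```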

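As a result, t-SNE is well suited to embedding high-dimensional data in a two- or three-dimensional space for visualization. Note that t-SNE computes the similarity between two points in the low-dimensional space with a heavy-tailed distribution (a Student-t distribution) rather than a Gaussian, which helps with both the crowding problem and the optimization. The embedding is also relatively insensitive to outliers.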
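To make the "heavy-tailed" point concrete, the sketch below compares an unnormalized Gaussian kernel with an unnormalized Student-t kernel (one degree of freedom) at a few distances. The function names and the unit bandwidth are assumptions chosen for illustration, not part of the original article.

```python
import numpy as np

def gaussian_kernel(d):
    """Unnormalized Gaussian similarity exp(-d^2 / 2); unit variance is assumed."""
    return np.exp(-d ** 2 / 2.0)

def student_t_kernel(d):
    """Unnormalized Student-t similarity with one degree of freedom: 1 / (1 + d^2)."""
    return 1.0 / (1.0 + d ** 2)

# A few illustrative distances between two points in the low-dimensional map
distances = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
gauss = gaussian_kernel(distances)
student = student_t_kernel(distances)

for d, g, s in zip(distances, gauss, student):
    print(f"distance={d:4.1f}  gaussian={g:.2e}  student_t={s:.2e}")
```

The Gaussian decays exponentially with squared distance while the Student-t kernel decays only polynomially, so at larger distances it still assigns a non-negligible similarity. That lets moderately dissimilar points sit further apart in the map, which is what relieves crowding.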

Steps of the t-SNE algorithm

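1. Compute the pairwise similarities between neighboring points in the high-dimensional space.

2. Based on those pairwise similarities, map each point in the high-dimensional space to a point in the low-dimensional map.

3. Use gradient descent on the Kullback-Leibler divergence (KL divergence) to find the low-dimensional representation that minimizes the mismatch between the two conditional probability distributions.

4. Throughout the optimization, compute the similarity between two points in the low-dimensional space with a Student-t distribution.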
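The four steps above can be sketched in plain NumPy. This is a simplified illustration, not the scikit-learn implementation: it assumes one fixed Gaussian bandwidth `sigma` for every point (real t-SNE tunes a bandwidth per point from the perplexity), and it uses plain gradient descent without momentum or early exaggeration. All function names and the toy data are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

def high_dim_affinities(X, sigma=1.0):
    """Step 1: symmetric joint probabilities P from Gaussian conditionals.

    ASSUMPTION: a single fixed sigma instead of a per-point, perplexity-tuned one.
    """
    logits = -pairwise_sq_dists(X) / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # a point is not its own neighbor
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    cond = np.exp(logits)
    cond /= cond.sum(axis=1, keepdims=True)      # row i holds p_{j|i}
    return (cond + cond.T) / (2.0 * X.shape[0])

def low_dim_affinities(Y):
    """Step 4: joint probabilities Q from a Student-t kernel (one degree of freedom)."""
    W = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(), W

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q), summed over the entries where P is non-zero."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

def tsne_sketch(X, n_components=2, n_iter=300, learning_rate=5.0, seed=0):
    rng = np.random.default_rng(seed)
    P = high_dim_affinities(X)
    # Step 2: start from a small random low-dimensional map
    Y = rng.normal(scale=1e-2, size=(X.shape[0], n_components))
    history = []
    for _ in range(n_iter):
        Q, W = low_dim_affinities(Y)
        history.append(kl_divergence(P, Q))
        # Step 3: gradient of KL(P || Q) with respect to every y_i:
        # 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)
        coeff = (P - Q) * W
        grad = 4.0 * (np.diag(coeff.sum(axis=1)) - coeff) @ Y
        Y -= learning_rate * grad
    return Y, history

# Toy data: two well-separated blobs in 10 dimensions
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 10)),
               rng.normal(3.0, 0.3, size=(30, 10))])
Y, history = tsne_sketch(X)
print(f"KL at start: {history[0]:.3f}, KL at end: {history[-1]:.3f}")
```

The learning rate and iteration count were picked for this 60-point toy set only. A production implementation adds the per-point bandwidth search, momentum, and early exaggeration that this sketch leaves out.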

Python code for t-SNE on the MNIST dataset

Importing the modules

# Import the necessary modules.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn  # used by the plotting step at the end
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


Reading the data

# Read the data with pandas
df = pd.read_csv('mnist_train.csv')

# Print the first five rows of df
print(df.head())

# Save the labels in a variable named labels
labels = df['label']

# Drop the label column and store the pixel data in data
data = df.drop("label", axis=1)


Data preprocessing

# Data preprocessing: standardize every pixel column
# (StandardScaler was already imported above)
standardized_data = StandardScaler().fit_transform(data)
print(standardized_data.shape)
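As a rough illustration of what the standardization step computes, the NumPy sketch below applies a column-wise z-score to a tiny made-up array. It assumes StandardScaler's default settings. The helper `standardize` and the array `toy` are hypothetical names, and the handling of constant columns (leave them at zero instead of dividing by zero) matters for MNIST because many border pixels never change.

```python
import numpy as np

def standardize(X):
    """Column-wise z-score: subtract the mean, divide by the standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)      # population standard deviation (ddof=0)
    std[std == 0.0] = 1.0    # constant columns: avoid division by zero
    return (X - mean) / std

# Toy data: the first column is constant, like an always-black pixel
toy = np.array([[0.0, 10.0, 5.0],
                [0.0, 20.0, 7.0],
                [0.0, 30.0, 9.0]])
toy_std = standardize(toy)
print(toy_std)
```

After this transform every non-constant column has zero mean and unit variance, so no single pixel dominates the pairwise distances that t-SNE works from.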


Running t-SNE and plotting the output

# t-SNE
# Use only the first 1000 points, because t-SNE
# takes a long time on 15K points
data_1000 = standardized_data[0:1000, :]
labels_1000 = labels[0:1000]

# Configure the model:
# number of components = 2
# perplexity left at its default of 30
# learning rate left at its default (200 in older
# scikit-learn releases, 'auto' in recent ones)
# maximum number of optimization iterations left
# at its default of 1000
model = TSNE(n_components=2, random_state=0)

tsne_data = model.fit_transform(data_1000)

# Build a DataFrame holding the two embedding
# dimensions and the digit label, for plotting
tsne_df = pd.DataFrame(tsne_data, columns=["Dim_1", "Dim_2"])
tsne_df["label"] = np.asarray(labels_1000)

# Plot the t-SNE result, one color per digit
# (`height` replaced the older `size` argument in seaborn 0.9)
sn.FacetGrid(tsne_df, hue="label", height=6).map(
    plt.scatter, "Dim_1", "Dim_2").add_legend()

plt.show()
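The perplexity setting mentioned in the code comments can be read as an effective number of neighbors: for each point, t-SNE picks the Gaussian bandwidth so that the conditional distribution over the other points has the requested perplexity. The sketch below only computes the perplexity of a given distribution. The helper name and the two example distributions are assumptions for illustration.

```python
import numpy as np

def perplexity(p):
    """Perplexity 2**H(p) of a discrete distribution, with H the Shannon entropy in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                              # 0 * log(0) is treated as 0
    entropy_bits = -np.sum(p * np.log2(p))
    return float(2.0 ** entropy_bits)

# A point whose similarity is spread evenly over 30 neighbors
uniform_30 = np.full(30, 1.0 / 30.0)

# A point that puts 90% of its similarity on a single neighbor
peaked = np.array([0.9] + [0.1 / 29.0] * 29)

print(perplexity(uniform_30))
print(perplexity(peaked))
```

A distribution spread evenly over k neighbors has perplexity k, while a sharply peaked one has a much lower value. Raising the perplexity therefore makes each point attend to a broader neighborhood.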
