How to Reduce Dimensionality - CDA Q&A Community

啊啊啊啊啊吖

2018-12-06 Views: 750

How to reduce dimensionality

Sometimes the "actual" (or useful) dimensions of the data do not correspond to the dimensions we have.

Now, given a de-meaned matrix X (every column has mean 0), we can ask: which direction captures the greatest variance in the data?
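The post takes the de-meaned matrix as given. As a minimal sketch, assuming X is a list of equal-length rows, de-meaning could look like the following; the name de_mean is illustrative and does not appear in the post.

```python
def de_mean(X):
    """Return a copy of X in which every column has mean 0.

    Hypothetical helper: the post assumes X has already been de-meaned.
    """
    num_rows = len(X)
    num_cols = len(X[0])
    # Mean of each column
    col_means = [sum(row[j] for row in X) / num_rows for j in range(num_cols)]
    # Subtract the column mean from every entry in that column
    return [[row[j] - col_means[j] for j in range(num_cols)] for row in X]


X = de_mean([[1.0, 10.0], [3.0, 14.0], [5.0, 12.0]])
```

Here the column means are 3 and 12, so each column of the result sums to zero.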

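Specifically, given a direction d (a vector of magnitude 1), each row x in the matrix extends dot(x, d) in the d direction. And every nonzero vector w determines a direction once we rescale it to have magnitude 1: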

def direction(w):
    # magnitude(w) is assumed to return the Euclidean length of w
    mag = magnitude(w)
    return [w_i / mag for w_i in w]
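The helpers dot and magnitude are used but not defined in this post. A minimal sketch of how they might be written, assuming vectors are plain Python lists of floats:

```python
import math


def dot(v, w):
    """Sum of the componentwise products of v and w (assumed definition)."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def magnitude(v):
    """Euclidean length of v (assumed definition)."""
    return math.sqrt(dot(v, v))


def direction(w):
    """Rescale the nonzero vector w to have magnitude 1."""
    mag = magnitude(w)
    return [w_i / mag for w_i in w]


d = direction([3.0, 4.0])
```

With these definitions, d points the same way as [3, 4] but has length 1.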

Therefore, given a nonzero vector w, we can compute the variance of the data in the direction determined by w:

def directional_variance_i(x_i, w):
    """the variance of the row x_i in the direction determined by w"""
    return dot(x_i, direction(w)) ** 2

def directional_variance(X, w):
    """the variance of the data in the direction determined by w"""
    return sum(directional_variance_i(x_i, w)
               for x_i in X)
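The function first_principal_component is used further down but is not defined in this post. One possible sketch, assuming gradient ascent on the directional variance with renormalization after every step; the gradient helper, the starting guess, the step count, and the step size are all illustrative choices, not taken from the post.

```python
import math


def dot(v, w):
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def direction(w):
    mag = math.sqrt(dot(w, w))
    return [w_i / mag for w_i in w]


def directional_variance_gradient(X, w):
    """Gradient of sum((x_i . d) ** 2) with respect to d, at d = direction(w)."""
    d = direction(w)
    grad = [0.0] * len(w)
    for x_i in X:
        projection_length = dot(x_i, d)
        for j, x_ij in enumerate(x_i):
            grad[j] += 2 * projection_length * x_ij
    return grad


def first_principal_component(X, steps=200, step_size=0.1):
    """Hypothetical sketch: move uphill, then rescale back to length 1."""
    w = direction([1.0] * len(X[0]))  # arbitrary nonzero starting guess
    for _ in range(steps):
        grad = directional_variance_gradient(X, w)
        w = direction([w_j + step_size * g_j for w_j, g_j in zip(w, grad)])
    return w


# De-meaned points lying on the line through the origin and (2, 1)
X = [[-2.0, -1.0], [0.0, 0.0], [2.0, 1.0]]
component = first_principal_component(X)
```

For this toy data the result should point along (2, 1), rescaled to length 1. A component and its negation describe the same direction, so the sign depends on the starting guess.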

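Once the direction that maximizes this variance (the first principal component) has been found, we can find further components by repeating the process on the result of remove_projection, which subtracts from each row its projection onto that component.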
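remove_projection is likewise not defined in this post. A minimal sketch, assuming the component w already has magnitude 1; the helper names project and remove_projection_from_vector are illustrative.

```python
def dot(v, w):
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def project(v, w):
    """Projection of v onto the direction w (w is assumed to have length 1)."""
    projection_length = dot(v, w)
    return [projection_length * w_i for w_i in w]


def remove_projection_from_vector(v, w):
    """Subtract from v its projection onto w."""
    return [v_i - p_i for v_i, p_i in zip(v, project(v, w))]


def remove_projection(X, w):
    """Remove the w-direction component from every row of X."""
    return [remove_projection_from_vector(x_i, w) for x_i in X]


X = [[3.0, 4.0], [1.0, 0.0]]
residual = remove_projection(X, [1.0, 0.0])
```

After removing the projection onto the first axis, only the second coordinate of each row survives, so the residual has no variance left in the removed direction.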

On a higher-dimensional data set, we can iteratively find as many principal components as we want:

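def principal_component_analysis(X, num_components):
    components = []
    for _ in range(num_components):
        # find the dominant direction in what is left of the data
        component = first_principal_component(X)
        components.append(component)
        # remove that direction so the next pass finds a new one
        X = remove_projection(X, component)
    return components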

We can then transform the original data into points in the lower-dimensional space spanned by the principal components:

def transform_vector(v, components):
    return [dot(v, w) for w in components]

def transform(X, components):
    return [transform_vector(x_i, components) for x_i in X]
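As a small usage sketch, assuming a single hand-picked unit-length component rather than one computed from data, the transform maps each 2-dimensional row to one coordinate along that component. The sample data and the definition of dot are illustrative.

```python
def dot(v, w):
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def transform_vector(v, components):
    # One coordinate per component: how far v extends in that direction
    return [dot(v, w) for w in components]


def transform(X, components):
    return [transform_vector(x_i, components) for x_i in X]


components = [[0.6, 0.8]]  # one unit-length direction, chosen by hand
X = [[3.0, 4.0], [-3.0, -4.0], [4.0, -3.0]]
reduced = transform(X, components)
```

The first two rows lie along the component and map to roughly 5 and -5, while the third row is perpendicular to it and maps to roughly 0, so its information is lost in the reduced representation.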
