  • From the column 繁依Fanyi 的专栏

    Distance Metrics: Standardized Euclidean Distance

    1. Overview: We previously introduced the Euclidean distance; the standardized Euclidean distance is an improvement on it. Standardized Euclidean distance modifies each variable x by converting it into a standardized variable.
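
The idea can be sketched in a few lines of Python (the toy data below is made up for illustration; SciPy's `scipy.spatial.distance.seuclidean` implements the same metric):

```python
import numpy as np

def standardized_euclidean(x, y, variances):
    """Euclidean distance after rescaling each dimension by its standard deviation."""
    x, y, variances = (np.asarray(a, dtype=float) for a in (x, y, variances))
    return float(np.sqrt(np.sum((x - y) ** 2 / variances)))

# Toy data: the second dimension has a much larger spread,
# so it would dominate a plain Euclidean distance.
data = np.array([[0.0, 0.0], [1.0, 10.0], [2.0, 20.0]])
variances = data.var(axis=0, ddof=1)  # per-dimension sample variance: [1.0, 100.0]
print(standardized_euclidean(data[0], data[1], variances))  # ~1.414, i.e. sqrt(2)
```

Dividing each squared difference by the dimension's variance puts all features on a comparable scale before distances are summed.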

    2K10 · Edited on 2023-05-07
  • From the column 深度应用

    [Machine Learning from Zero to One with sklearn] 0.1.1: Linear Fitting

    standardized_X_train = X_scaler.transform(X_train)
    standardized_y_train = y_scaler.transform(y_train.values.reshape(-1, 1)).ravel()
    standardized_X_test = X_scaler.transform(X_test)
    standardized_y_test = y_scaler.transform(y_test.values.reshape(-1, 1)).ravel()
    print("mean:", np.mean(standardized_X_train, axis=0), np.mean(standardized_y_train, axis=0))  # mean should be ~0
    print("std:", np.std(standardized_X_train, axis=0), np.std(standardized_y_train, axis=0))  # std should be ~1
    lm.fit(X=standardized_X_train, y=standardized_y_train)
    pred_train = (lm.predict(standardized_X_train) * np.sqrt(y_scaler.var_)) + y_scaler.mean_

    64730 · Published on 2019-06-27
  • From the column 书山有路勤为径

    Day and Night Image Data Standardization

    for item in image_list:
        image = item[0]
        label = item[1]
        # Standardize the image and append it, together with its one-hot encoded label,
        # to the full, processed list of image data
        standard_list.append((standardized_im, binary_label))
    return standard_list
    # Standardize all training images
    STANDARDIZED_LIST = standardize(IMAGE_LIST)
    # Visualize the standardized data: select an image by index
    image_num = 0
    selected_image = STANDARDIZED_LIST[image_num][0]
    selected_label = STANDARDIZED_LIST[image_num][1]
    # Display image

    55020 · Published on 2018-08-28
  • From the column 书山有路勤为径

    Image Representation & Classification (图像表示与分类)

    test_im = STANDARDIZED_LIST[image_num][0]
    test_label = STANDARDIZED_LIST[image_num][1]
    # Convert to HSV
    hsv = cv2.cvtColor(test_im, cv2.COLOR_RGB2HSV)
    STANDARDIZED_LIST = helpers.standardize(IMAGE_LIST)
    # Visualize the standardized data
    # Display a standardized image and its label: select an image by index
    image_num = 0
    selected_image = STANDARDIZED_LIST[image_num][0]
    selected_label = STANDARDIZED_LIST[image_num][1]
    # Display image and data about it
    plt.imshow(selected_image)
    STANDARDIZED_TEST_LIST = helpers.standardize(TEST_IMAGE_LIST)
    # Shuffle the standardized test data
    random.shuffle(STANDARDIZED_TEST_LIST)

    64520 · Published on 2018-08-27
  • From the column 红色石头的机器学习之路

    [Final Installment] Column | A Jupyter-Based Feature Engineering Handbook: Feature Dimensionality Reduction

    standardized_train = model.transform(train_set)
    standardized_test = model.transform(test_set)
    # Compress the features
    compressor = PCA(n_components=5)
    compressor.fit(standardized_train)  # fit on the training set
    transformed_trainset = compressor.transform(standardized_train)  # transform the training set: (20000, 5)
    # i.e. we kept the first 5 of the 8 principal components, and these 5 retain 90% of the variance of the original features
    transformed_testset = compressor.transform(standardized_test)
    compressor = LDA()
    compressor.fit(standardized_train, train_y)  # fit on the training set
    transformed_trainset = compressor.transform(standardized_train)  # transform the training set: (20000, 2)
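
The PCA compression step in this excerpt can be sketched end to end; the random data and shapes below are illustrative stand-ins, not the column's actual 20000-row dataset:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
train_set = rng.normal(size=(200, 8))  # stand-in for an 8-feature training set
standardized_train = StandardScaler().fit_transform(train_set)

# Passing a float keeps the smallest number of principal components
# that retains at least 90% of the variance.
compressor = PCA(n_components=0.90)
compressor.fit(standardized_train)
transformed_trainset = compressor.transform(standardized_train)
print(transformed_trainset.shape, round(compressor.explained_variance_ratio_.sum(), 3))
```

The float form of `n_components` lets sklearn pick the component count from the variance target rather than hard-coding it.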

    55110 · Edited on 2022-01-16
  • From the column 润风拂过存甘霖

    conceptnet-numberbatch: Word Vectors Combined with Commonsense Knowledge: Overview and Usage

    To look up the corresponding word vectors, we cannot query with raw words directly; we must first convert them with the text_to_uri.py script provided by the original repository. Download that file (or the whole repository) and call it from the same directory:

    from text_to_uri import standardized_uri
    standardized_uri('en', 'a test phrase')  # '/c/en/test_phrase'
    standardized_uri('en', '100 yuan')       # '/c/en/###_yuan'
    standardized_uri('zh', '你好')           # '/c/zh/你好'

    If we have a passage of text and need the vector for every word, how do we proceed?

    import numpy as np
    from text_to_uri import standardized_uri
    from nltk import word_tokenize
    sentence = "hello world"
    words = word_tokenize(sentence)
    for wd in words:
        concept = standardized_uri("en", wd)

    1.5K20 · Published on 2020-12-01
  • From the column DotNet NB && CloudNative

    AI Applications in Practice, Study Notes (4): Medical Data Visualization

    # Take the first 10 features
    selected_features = X.columns[:10]
    # Standardize X
    scaler = StandardScaler()
    X_standardized = scaler.fit_transform(X)
    # Extract the standardized values of the selected features
    X_selected_standardized = X_standardized[:, :10]

    Then draw the boxplot:

    # Plot the boxplot for the standardized values of the selected features
    import seaborn as sns  # import Seaborn
    # Draw a violin plot
    plt.figure(figsize=(12, 8))
    sns.violinplot(data=pd.DataFrame(X_selected_standardized))

    Draw the correlation heatmap, again with Seaborn:

    # Draw the correlation heatmap
    correlation_matrix = pd.DataFrame(X_selected_standardized, columns=selected_features).corr()

    42910 · Edited on 2025-02-27
  • From the column PPV课数据科学社区

    Applying the KNN Algorithm to Precision Marketing in Insurance

    > standardized.X = scale(Caravan[, -86])
    > mean(standardized.X[, sample(1:85, 1)])
    [1] -2.047306e-18
    > var(standardized.X[, sample(1:85, 1)])
    [1] 1
    > mean(standardized.X[, sample(1:85, 1)])
    [1] 1.182732e-17
    > var(standardized.X[, sample(1:85, 1)])
    [1] 1
    > mean(standardized.X[, sample(1:85, 1)])
    [1] -3.331466e-17
    > var(standardized.X[, sample(1:85, 1)])
    [1] 1

    As the output shows, a randomly chosen standardized variable has a mean of roughly 0 and a standard deviation of 1.

    > # Use the first 1000 observations as the test set and the rest as the training set
    > test <- 1:1000
    > train.X <- standardized.X[-test, ]
    > test.X <- standardized.X[test, ]

    1.6K60 · Published on 2018-04-23
  • From the column Y-StarryDreamer

    [NLP] Applications of NLP in Language Standardization: From Principles to Practice

    # Synonym replacement
    standardized_text = synonym_replacement(original_text)
    print("Original Text:", original_text)
    print("Standardized Text:", standardized_text)

    3.2 Grammar Normalization and Text Correction: NLP techniques can help normalize grammatical structure and improve the quality of writing.

    # Text correction
    standardized_text = correct_text(original_text)
    print("Original Text:", original_text)
    print("Standardized Text:", standardized_text)

    3.3 Handling Cultural Sensitivity: in language standardization, NLP can also deal with cultural differences, choosing the most appropriate expression for the context so that information is conveyed accurately and appropriately across cultures.

    977100 · Edited on 2023-11-23
  • From the column Python编程爱好者

    A Complete Summary of Normalization!!

    X_train_standardized = scaler.fit_transform(X_train)
    X_test_standardized = scaler.transform(X_test)
    # SVM model on standardized data
    model_standardized = SVC(kernel='linear')
    model_standardized.fit(X_train_standardized, y_train)
    y_pred_standardized = model_standardized.predict(X_test_standardized)
    accuracy_standardized = accuracy_score(y_test, y_pred_standardized)
    # Show the results
    print(f'Accuracy with non-standardized data: {accuracy_non_standardized}')
    print(f'Accuracy with standardized data: {accuracy_standardized}')
    plt.scatter(X_train_standardized[:, 0], X_train_standardized[:, 1], c=y_train)
    plt.title("Distribution of Standardized Data")
    plt.xlabel(...)
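
The comparison in this excerpt can be made self-contained; the breast-cancer dataset below is an illustrative stand-in, since the column does not show which data it used:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# SVM on the raw features
model_raw = SVC(kernel='linear').fit(X_train, y_train)
accuracy_raw = accuracy_score(y_test, model_raw.predict(X_test))

# SVM on standardized features (fit the scaler on the training split only)
scaler = StandardScaler().fit(X_train)
model_std = SVC(kernel='linear').fit(scaler.transform(X_train), y_train)
accuracy_std = accuracy_score(y_test, model_std.predict(scaler.transform(X_test)))

print(f'Accuracy with non-standardized data: {accuracy_raw:.3f}')
print(f'Accuracy with standardized data: {accuracy_std:.3f}')
```

Fitting the scaler on the training split only avoids leaking test-set statistics into the model.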

    2.2K20 · Edited on 2024-06-27
  • From the column PaddlePaddle

    Reproducing the SRGAN Model with PaddlePaddle for Image Super-Resolution Reconstruction

    sample_imgs_standardized_384 = standardized(sample_imgs_384)  # input
    sample_imgs_96 = im_resize(sample_imgs_384, 96, 96)
    sample_imgs_standardized_96 = standardized(sample_imgs_96)
    # vgg19
    sample_imgs_224 = im_resize(sample_imgs_384, 224, 224)
    sample_imgs_standardized_224 = standardized(sample_imgs_224)
    # loss

    1.1K20 · Published on 2020-11-06
  • From the column 一些有趣的Python案例

    「Super Useful Content」Examples of Twelve Classic Machine Learning Models

    plt.ylabel('petal width [standardized]')
    plt.legend(loc='upper left')
    plt.tight_layout()
    # plt.savefig

    1.1K30 · Published on 2021-02-02
  • From the column 归海刀刀

    [Pinnacle 21] SD1212: --STRESN does not equal --STRESC

    SD1212 (FDAB031, Consistency): PPSTRESN does not equal PPSTRESC. The value of the Standardized Result in Numeric Format (--STRESN) variable should equal the value of the Standardized Result in Character Format (--STRESC) variable when the --STRESC value is numeric.
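
Outside Pinnacle 21, a rough version of this consistency check could look like the following (the records and the PP-domain column names below are hypothetical examples, not Pinnacle 21's implementation):

```python
def is_numeric(value):
    """Return True if the character result parses as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False

def check_sd1212(records):
    """Return indices of records where PPSTRESC is numeric but PPSTRESN disagrees."""
    issues = []
    for i, rec in enumerate(records):
        stresc = rec.get("PPSTRESC")
        stresn = rec.get("PPSTRESN")
        if is_numeric(stresc) and (stresn is None or float(stresc) != stresn):
            issues.append(i)
    return issues

records = [
    {"PPSTRESC": "1.5", "PPSTRESN": 1.5},   # consistent: not flagged
    {"PPSTRESC": "2.0", "PPSTRESN": 3.0},   # numeric but inconsistent: flagged
    {"PPSTRESC": "BLQ", "PPSTRESN": None},  # non-numeric character result: not flagged
]
print(check_sd1212(records))  # [1]
```

Non-numeric character results (e.g. "BLQ") are deliberately skipped, matching the rule's "when --STRESC is numeric" condition.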

    47810 · Edited on 2023-11-26
  • From the column Listenlii的生物信息笔记

    Two Metagenome Databases: TerrestrialMetagenomeDB and HumanMetagenomeDB

    Ref: TerrestrialMetagenomeDB: a public repository of curated and standardized metadata for terrestrial metagenomes, D626–D632, https://doi.org/10.1093/nar/gkz994; HumanMetagenomeDB: a public repository of curated and standardized metadata for human metagenomes.

    1.1K20 · Edited on 2022-03-31
  • From the column Datawhale专栏

    A Basic sklearn Tutorial for Machine Learning!

    plt.ylabel('petal width [standardized]')
    plt.legend(loc='upper left')
    plt.show()

    85310 · Published on 2020-08-17
  • From the column 红色石头的机器学习之路

    Column | A Jupyter-Based Feature Engineering Handbook: Feature Selection (5)

    # Otherwise the magnitudes of the regression coefficients would not be comparable
    from sklearn.preprocessing import StandardScaler
    model = StandardScaler()
    model.fit(train_set)
    standardized_train = model.transform(train_set)
    standardized_test = model.transform(test_set)
    clf = LogisticRegression(...)

    clf = LinearSVR(C=0.0001, random_state=123)  # C controls the regularization strength: the larger C, the weaker the regularization
    clf.fit(standardized_train, train_y)
    np.round(clf.coef_, ...)

    57410 · Edited on 2022-01-16
  • From the column 芋道源码1024

    Sharing 15 Handy and Practical Chrome Extensions

    5. Standardized Screenshot 6. Clear Cache 7. 翻译侠 8. 图流 9. 阅读模式 10. Octotree 11. Enhanced Github 12. ... Standardized Screenshot: a very handy screenshot extension that automatically adds a macOS-style title bar and drop shadow; combined with a Weibo image host for one-click uploads, you never need to save the screenshot locally. Link: Standardized Screenshot

    86130 · Published on 2019-06-21
  • From the column 拓端tecdat

    Multidimensional Confirmatory Factor Analysis (CFA) in R Using RStan

    "... x5 ~~ f * x5\nx6 ~~ f * x6\nx7 ~~ f * x7\nx8 ~~ f * x8\nx9 ~~ f * x9", dat, std.lv = TRUE), standardized
    0.672 0.003 0.059 0.552 0.673 0.786  556 1.007
    # For comparison, the lavaan loadings: parameterEstimates(cfa.lav.fit, standardized
    item_vars[9]  0.575 0.004 0.085 0.410 0.575 0.739  532 1.008
    # lavaan: parameterEstimates(cfa.lav.fit, standardized
    betas[7]  4.185 0.002 0.066 4.054 4.186 4.315 1791 1.001

    1K30 · Published on 2020-11-11
  • From the column 磐创AI技术团队的专栏

    Building an Image Classifier with No Machine Learning at All

    standard_list = []
    for item in image_list:
        image = item[0]
        label = item[1]
        standard_list.append((standardized_im, binary_label))
    return standard_list

    # Standardize all training images
    STANDARDIZED_LIST = preprocess(IMAGE_LIST)

    The code is self-explanatory:

    # Load the test data
    TEST_IMAGE_LIST = load_dataset("images/test/")
    # Standardize the test data
    STANDARDIZED_TEST_LIST = preprocess(TEST_IMAGE_LIST)
    # Shuffle the data
    random.shuffle(STANDARDIZED_TEST_LIST)
    # Find all misclassified images in the test set
    MISCLASSIFIED = get_misclassified_images(STANDARDIZED_TEST_LIST, threshold=99)
    # Compute the accuracy
    total = len(STANDARDIZED_TEST_LIST)
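
The no-ML classification idea behind this article can be sketched with a simple brightness threshold (the images and the cutoff below are made up for illustration; the original code works on real day/night photos):

```python
import numpy as np

def avg_brightness(rgb_image):
    """Average pixel intensity, used as a crude day-vs-night signal."""
    return float(np.mean(rgb_image))

def classify(rgb_image, threshold=99):
    """Label an image 'day' when its average brightness exceeds the threshold."""
    return "day" if avg_brightness(rgb_image) > threshold else "night"

bright = np.full((32, 32, 3), 200, dtype=np.uint8)  # stand-in daytime image
dark = np.full((32, 32, 3), 30, dtype=np.uint8)     # stand-in nighttime image
print(classify(bright), classify(dark))  # day night
```

A hand-tuned threshold on one summary statistic replaces the trained model entirely, which is why the pipeline above only needs standardization and a misclassification count.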

    85220 · Published on 2021-04-21
  • From the column 磐创AI技术团队的专栏

    CFXplorer: A Python Package for Generating Counterfactual Explanations

    """
    - The first DataFrame contains the standardized features of the training data.
    - The second DataFrame contains the standardized features of the test data.
    """
    scaled_x_train = scaler.transform(x_train)
    scaled_x_test = scaler.transform(x_test)
    # Create new DataFrames with the standardized features
    standardized_train = pd.DataFrame(scaled_x_train)
    standardized_test = pd.DataFrame(scaled_x_test)
    return standardized_train, standardized_test

    Now train the decision tree model.

    50210 · Edited on 2024-06-07