  • From the column NewBeeNLP

    KDD 2021 | Huawei AutoDis: An Embedding Learning Framework for Continuous Features

    Existing approaches, because of their hard discretization, typically suffer from low model capacity. 1.3 Discretization: discretization converts continuous features into discrete ones and is the method most commonly used in industry. 2) LD (Logarithm Discretization), computed by the formula below. 3) TD (Tree-based Discretization): discretization based on tree models, e.g. GBDT+LR. 2.2 Automatic Discretization: the Automatic Discretization module discretizes continuous features automatically, making the discretization step trainable end to end. The expression above can be viewed as soft discretization: as the temperature coefficient approaches 0, the bucket probability distribution approaches one-hot; as it approaches infinity, the distribution approaches uniform.
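The temperature behavior described in the snippet can be sketched as a temperature-scaled softmax over bucket logits (a minimal stand-in with illustrative names, not AutoDis's actual API):

```python
import math

def soft_bucket_probs(logits, tau):
    """Softmax over bucket logits with temperature tau, as in soft discretization.
    Small tau -> near one-hot bucket assignment; large tau -> near uniform."""
    scaled = [l / tau for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]
```

With `tau = 0.01` the first bucket takes essentially all the probability mass, while `tau = 100` yields a nearly uniform distribution over the buckets.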

    Published on 2021-12-02
  • From the column 秋枫学习笔记

    KDD'21 (Huawei): An Embedding Method for Numerical Features

    Discretization converts continuous features into discrete ones (e.g. by bucketing): features in a domain are discretized and then transformed. Three discretization functions are common: EDD/EFD (Equal Distance/Frequency Discretization), i.e. equal-width or equal-frequency discretization; LD (Logarithm Discretization), another widely used method, with the formula $\widehat{x}_{j} = d_{j}^{LD}(x_{j}) = \operatorname{floor}\left(\log(x_{j})^{2}\right)$; and TD (Tree-based Discretization). There is also Automatic Discretization. The methods above can be called hard discretization: the conditions are fully fixed in advance, and each value is assigned to exactly one region.
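The LD formula quoted above translates directly into code (a minimal sketch; the natural logarithm is assumed, and the input must be positive):

```python
import math

def logarithm_discretization(x):
    """LD bucket index per the quoted formula: floor(log(x)^2).
    Assumes x > 0 and natural log."""
    return math.floor(math.log(x) ** 2)
```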

    Edited on 2022-09-19
  • From the column CreateAMind

    Sampling and Estimation on Manifolds Using Langevin Diffusion

    Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion… Imposing no restrictions beyond a nominal level of smoothness on ϕ, first-order error bounds in the discretization…
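For intuition, the flat-space (non-manifold) analogue of such a discretization is the unadjusted Langevin algorithm, the Euler-Maruyama scheme for the Langevin SDE. A minimal sketch, not the article's intrinsic manifold construction:

```python
import math
import random

def ula_samples(grad_log_p, step, n_steps, x0=0.0, seed=0):
    """Unadjusted Langevin Algorithm on the real line:
    x_{k+1} = x_k + step * grad_log_p(x_k) + sqrt(2 * step) * N(0, 1).
    Returns the whole trajectory."""
    rng = random.Random(seed)
    x, xs = x0, []
    for _ in range(n_steps):
        x = x + step * grad_log_p(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs
```

For a standard normal target (`grad_log_p = lambda x: -x`), the chain's long-run mean is near 0 and its variance near 1, with an O(step) discretization bias of the kind the article bounds.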

    Edited on 2024-03-25
  • From the column 深度学习框架

    Using side features: feature preprocessing

    While there are many ways in which we can do this, discretization and standardization are two common ones. Another common transformation is to turn a continuous feature into a number of categorical features. To do this, we first need to establish the boundaries of the buckets we will use for discretization. Embeddings: timestamp_embedding_model = tf.keras.Sequential([tf.keras.layers.experimental.preprocessing.Discretization … ]); self.timestamp_embedding = tf.keras.Sequential([tf.keras.layers.experimental.preprocessing.Discretization … ])
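The bucketing that the `Discretization` preprocessing layer performs with fixed bin boundaries can be mimicked without TensorFlow (a pure-stdlib sketch; the exact boundary semantics of the Keras layer may differ at the edges):

```python
import bisect

def bucketize(values, boundaries):
    """Map each continuous value to a bucket index given sorted bin boundaries:
    values below the first boundary get 0, then 1, and so on."""
    return [bisect.bisect_right(boundaries, v) for v in values]
```

For example, with boundaries `[1.0, 2.0]` the values `0.5`, `1.5`, and `3.0` fall into buckets 0, 1, and 2.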

    Published on 2021-07-30
  • From the column hsdoifh biuwedsy

    Mutual information

    Normalised Mutual Information (NMI): range [0, 1]; large = high correlation, small = low correlation. Understand the role of data discretization in computing (normalised) mutual information. Variable discretization via domain knowledge: assign thresholds.
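Once variables are discretized, NMI can be computed from joint and marginal counts. A minimal sketch using the common sqrt normalization, NMI = I(X;Y) / sqrt(H(X) H(Y)) (other normalizations exist; this assumes neither variable is constant):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a discrete label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(x, y):
    """Normalised mutual information of two equally long discretized variables."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    # I(X;Y) = sum p(a,b) * log( p(a,b) / (p(a) p(b)) )
    mi = sum((c / n) * math.log(c * n / (px[a] * py[b]))
             for (a, b), c in pxy.items())
    return mi / math.sqrt(entropy(x) * entropy(y))
```

Identical variables give NMI = 1; independent ones give NMI = 0.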

    Published on 2021-05-19
  • From the column AI异构

    Neural Architecture Search: Differentiable Search (DAAS)

    Motivated by the severe performance loss DARTS suffers after the searched architecture is discretized, this work proposes discretization-aware architecture search, which adds a loss term (Discretization Loss) to mitigate the accuracy drop caused by discretization. Paper: Discretization-Aware Architecture Search. Code: https://github.com/sunsmarterjie/DAAS.
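One common way to realize such a loss term is an entropy penalty on the softmaxed architecture weights, which pushes them toward one-hot and so shrinks the gap closed at discretization time. This is an illustrative stand-in only; the DAAS paper's exact loss may differ:

```python
import math

def discretization_loss(alphas, tau=1.0):
    """Entropy of the softmax of architecture weights `alphas`.
    Minimizing it drives the distribution toward one-hot, so the
    discretized (argmax) architecture loses less accuracy."""
    scaled = [a / tau for a in alphas]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)
```

Near-one-hot weights yield a loss near 0, while uniform weights yield the maximum, log(k) for k candidate operations.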

    Published on 2020-09-14
  • From the column 生信小驿站

    Clustering Transformation of Gene Expression Data in One Article

    col = col_name[i]; Data = f(col); aprioriData = pd.concat([aprioriData, Data], axis=1) … discretization_d = pd.concat([aprioriData, data['Class']], axis=1); discretization_d.head(); data.head()

    Published on 2019-12-02
  • From the column 后台技术底层理解

    Fenwick Tree (Binary Indexed Tree) Explained

    (int[] nums) { List<Integer> resultList = new ArrayList<Integer>(); // dedupe and sort: discretization … pos -= lowBit(pos); } return ret; } // dedupe and sort private void discretization …
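The "dedupe and sort" discretization step that precedes the Fenwick-tree queries is coordinate compression: replace each value by its rank among the sorted distinct values. A minimal Python sketch of the same idea (the article's code is Java):

```python
import bisect

def discretize(nums):
    """Coordinate compression: map each value to its 1-based rank among
    the sorted distinct values, so a Fenwick tree can index by rank."""
    sorted_unique = sorted(set(nums))
    return [bisect.bisect_left(sorted_unique, v) + 1 for v in nums]
```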

    Published on 2020-09-10
  • From the column 图像处理与模式识别研究所

    Euler Equations: A Coupled Dynamics-Fluid System Solving the Shock-Tube Problem

    etpfix = 0.90; % Harten's sonic entropy fix value; {0} = no entropy fix %% Space Discretization … = cutfunc(x,xa,xb); % physical cut function … %% Discretization of the Velocity Space % microscopic velocity discretization (using the Discrete Ordinate Method) …

    Edited on 2022-05-28
  • From the column 图像处理与模式识别研究所

    Linear One-Dimensional Advection Equation: A Program Solving the PDE Below

    Parameters: cfl = 0.8; % CFL = a*dt/dx; tend = 0.2; % end time; a = 0.5; % scalar wave speed %% Domain Discretization … a_m = min(0,a); a_p = max(0,a); dx = 0.01; cfl = 0.9; dt = cfl*dx/abs(a); t_end = 0.4; %% Discretization … dx = 1/200; % using 200 points; cfl = 0.9; t_end = 0.5; %% Discretization
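The space discretization these parameter blocks set up is typically the first-order upwind scheme for u_t + a*u_x = 0, stable when cfl = a*dt/dx <= 1. A Python sketch with periodic boundaries (the article's code is MATLAB; this is an illustrative analogue, not its exact scheme):

```python
def advect_upwind(u, a, dx, dt, n_steps):
    """First-order upwind discretization of u_t + a*u_x = 0 with periodic
    boundaries. Each update is a convex combination of neighbours when
    0 <= a*dt/dx <= 1, so the scheme is monotone and conservative."""
    u = list(u)
    c = a * dt / dx
    n = len(u)
    for _ in range(n_steps):
        if a >= 0:
            u = [u[i] - c * (u[i] - u[i - 1]) for i in range(n)]
        else:
            u = [u[i] - c * (u[(i + 1) % n] - u[i]) for i in range(n)]
    return u
```

Because each step is a convex combination, a step profile stays within its initial bounds and the total mass is conserved.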

    Edited on 2022-05-28
  • From the column 计算机视觉战队

    A New Quantization Method | 6x Model Compression, No Retraining Needed

    Discretization of a matrix in quadratic functional binary optimization. Dokl. 3. Description of the discretization procedure. Linear partition: in this case, we divide the range of the distribution into identical intervals. 4. Discretization results for random numbers: the discretization procedure was tested on two distributions, Gaussian and Laplace.
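The "linear partition" described above is plain uniform quantization: split [min, max] into equal intervals and snap each weight to its interval's midpoint. A minimal sketch (illustrative, not the paper's full procedure; assumes the values are not all equal):

```python
def linear_quantize(values, n_levels):
    """Linear partition: divide [min, max] into n_levels equal intervals and
    map each value to the midpoint of its interval. The quantization error
    is at most half an interval width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    out = []
    for v in values:
        idx = min(int((v - lo) / width), n_levels - 1)  # clamp the top edge
        out.append(lo + (idx + 0.5) * width)
    return out
```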

    Published on 2021-03-13
  • From the column 数据结构与算法

    SPOJ COT2 - Count on a tree II (Mo's algorithm on trees)

    return belong[l] < belong[rhs.l]; } } q[MAXN]; vector<int> v[MAXN]; int a[MAXN], date[MAXN]; void Discretization … N; i++) a[i] = date[i] = read(); for (int i = 1; i <= N * 2; i++) belong[i] = i / block + 1; Discretization …

    Published on 2018-07-04
  • From the column C++

    [Algorithm Advanced Series] (7) Weight Segment Tree + Discretization: Value Range Blowing Up? This Move Handles It!

    1e5 + 10; int a[N], tmp[N]; // a is the original array, tmp is an auxiliary array int n; // number of elements int cnt; // number of distinct values after discretization // core discretization function void discretization … struct node { int l, r; LL cnt; } tr[N << 2]; int a[N], tmp[N]; int n, discnt; // discnt: number of values after discretization … query(p << 1, x, y); if (y > mid) res += query(p << 1 | 1, x, y); return res; } // discretization function void discretization … int main() { // read input cin >> n; for (int i = 1; i <= n; i++) cin >> a[i]; // discretize discretization …
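The combination the article describes, counting by value over a compressed (discretized) value range, can be sketched compactly in Python with a Fenwick tree standing in for the weight segment tree (the article's implementation is C++; names here are illustrative):

```python
import bisect

class WeightFenwick:
    """Count-by-value structure over discretized values: insert values,
    then query how many inserted values are <= x. The value domain is
    compressed to ranks 1..k before indexing, as in the article."""
    def __init__(self, values):
        self.sorted_unique = sorted(set(values))
        self.tree = [0] * (len(self.sorted_unique) + 1)

    def insert(self, v):
        i = bisect.bisect_left(self.sorted_unique, v) + 1  # 1-based rank
        while i < len(self.tree):
            self.tree[i] += 1
            i += i & (-i)

    def count_le(self, v):
        i, s = bisect.bisect_right(self.sorted_unique, v), 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)
        return s
```

Both `insert` and `count_le` run in O(log k) for k distinct values, which is the whole point of discretizing an otherwise huge value range.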

    Edited on 2026-02-25
  • From the column kalifaの日々

    Decision-Tree Classification of the Iris Dataset: A Python Implementation

    informationEntropy = getInformationEntropy(num, length) # print(informationEntropy) # In[105]: # discretize the values of feature one def discretization … getRazors(): a = [] for i in range(len(iris.feature_names)): print(i) a.append(discretization …
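The entropy-driven discretization the fragments hint at, pick the threshold on a continuous feature that minimizes the weighted entropy of the two sides, can be sketched as follows (illustrative; the article's `discretization` function may differ in detail):

```python
import math
from collections import Counter

def information_entropy(labels):
    """Shannon entropy (base 2) of a label list."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(feature, labels):
    """Entropy-based discretization of one continuous feature: return the
    midpoint split that minimizes the weighted entropy of the two sides."""
    pairs = sorted(zip(feature, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        w = (len(left) * information_entropy(left) +
             len(right) * information_entropy(right)) / len(pairs)
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        if w < best[0]:
            best = (w, thr)
    return best[1]
```

For a feature that cleanly separates two classes, the chosen threshold falls between the two clusters and the weighted entropy drops to zero.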

    Published on 2019-04-01
  • From the column 机器学习AI算法工程

    An SSD Object-Detection Model in Keras

    SSD outputs a series of discretized bounding boxes, generated on feature maps at different layers and with different aspect ratios…

    Published on 2019-10-28
  • From the column 数据结构与算法

    Mo's Algorithm on Trees

    return belong[l] < belong[rhs.l]; } } q[MAXN]; vector<int> v[MAXN]; int a[MAXN], date[MAXN]; void Discretization … N; i++) a[i] = date[i] = read(); for (int i = 1; i <= N * 2; i++) belong[i] = i / block + 1; Discretization …

    Published on 2018-07-04
  • From the column 图灵技术域

    Feature Discretization and Selection with the EPSO Algorithm, Explained

    References — paper: "A New Representation in PSO for Discretization-Based Feature Selection"; authors: Binh Tran, Student …

    Published on 2021-05-21
  • From the column hsdoifh biuwedsy

    Classification and regression techniques: decision tree and kNN

    Use as many partitions as distinct values. Discretization …

    Published on 2021-05-19
  • From the column wywwzjj 的技术博客

    Common Small Tricks for ACM

    cout << *std::max_element(a, a + n) << endl; // [a, a+n) cout << *std::min_element(a, a + n) << endl; // discretization …

    Edited on 2023-05-09
  • From the column 深度学习|机器学习|歌声合成|语音合成

    R in Action: A Cardiovascular Disease Analysis Example

    install.packages("maxstat") # install.packages("survminer") # install.packages("survival") install.packages("discretization") … library(ggplot2) library(mice) library(pROC) library(maxstat) library(survminer) library(survival) library(discretization)

    Published on 2021-01-14