
<blockquote>
 The goal of LM is to learn a probability distribution over sequences
 of symbols pertaining to a language.
</blockquote>

That is, to learn $P(w_1,...,w_N)$ (<a href="https://arxiv.org/abs/1602.02410" rel="nofollow noreferrer">resource</a>).
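By the chain rule of probability, this joint distribution factors exactly into a product of next-word conditionals, so a model that predicts the next word well defines a distribution over whole sequences (for $i = 1$ the condition is empty, i.e. the factor is just $P(w_1)$):

$$P(w_1,...,w_N) = \prod_{i=1}^{N} P(w_i | w_1,...,w_{i-1})$$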

This modeling can be accomplished by:

<ol>
<li>Predicting the next word given the previous words: $P(w_i | w_1,...,w_{i-1})$, or</li>
<li>Predicting the neighbor words given the center word (<a href="https://arxiv.org/abs/1310.4546" rel="nofollow noreferrer">Skip-gram</a>): $P(w_{i+k}| w_i), k \in \{-2, -1, 1, 2\}$, or </li>
<li>Predicting the center word given the neighbor words (<a href="https://arxiv.org/abs/1301.3781" rel="nofollow noreferrer">CBOW</a> or Continuous Bag-of-Words): $P(w_i| w_{i-2},w_{i-1},w_{i+1}, w_{i+2})$, or other designs.</li>
</ol>
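To make the three objectives concrete, here is a minimal sketch of how training examples could be extracted from a token sequence for each of them. The function names (`next_word_pairs`, `skipgram_pairs`, `cbow_pairs`) and the `window` parameter are illustrative assumptions, not taken from any of the linked papers or from tensor2tensor.

```python
def next_word_pairs(tokens):
    """(prefix, next_word) examples for P(w_i | w_1, ..., w_{i-1})."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def skipgram_pairs(tokens, window=2):
    """(center, neighbor) examples for P(w_{i+k} | w_i), 0 < |k| <= window."""
    pairs = []
    for i, center in enumerate(tokens):
        for k in range(-window, window + 1):
            j = i + k
            # Skip the center itself and positions outside the sequence.
            if k != 0 and 0 <= j < len(tokens):
                pairs.append((center, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=2):
    """(neighbors, center) examples for P(w_i | neighbors of w_i)."""
    pairs = []
    for i, center in enumerate(tokens):
        context = [
            tokens[i + k]
            for k in range(-window, window + 1)
            if k != 0 and 0 <= i + k < len(tokens)
        ]
        pairs.append((context, center))
    return pairs


tokens = "we can learn a language model".split()
for name, fn in [("next-word", next_word_pairs),
                 ("skip-gram", skipgram_pairs),
                 ("CBOW", cbow_pairs)]:
    print(name, fn(tokens)[:3])
```

The only difference between the three is which part of the window is treated as input and which as the prediction target; the model that consumes these examples is a separate design choice.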

<blockquote>
 Does the deep model need an encoder? From the ptb code of
 tensor2tensor, I find that the deep model does not contain an encoder.
</blockquote>

Yes. Modern LM solutions (all of them deep) try to find an encoding (embedding) of words that helps them predict the next, neighbor, or center word as accurately as possible. Once learned, a word encoding can also be used as a constant input to other models. The <a href="https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/ptb.py" rel="nofollow noreferrer">ptb.py code</a> calls <code>text_encoder.TokenTextEncoder</code> to set up the token encoding (a vocabulary that maps each word to an integer id), which the model then turns into embeddings.
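As a rough illustration of what "encoding" usually involves, the sketch below splits it into two steps: a vocabulary that maps tokens to integer ids, and an embedding table that maps ids to vectors. `ToyTokenEncoder` and `make_embedding_table` are hypothetical stand-ins written for this example; they are not the tensor2tensor API, and the table here is filled with random numbers where a real model would learn the values during training.

```python
import random


class ToyTokenEncoder:
    """Hypothetical vocabulary-based encoder: token -> integer id."""

    def __init__(self, vocab):
        self.token_to_id = {tok: i for i, tok in enumerate(vocab)}

    def encode(self, text):
        # Whitespace tokenization is an assumption made for brevity.
        return [self.token_to_id[tok] for tok in text.split()]


def make_embedding_table(vocab_size, dim, seed=0):
    """One vector per id; random here, learned in a real model."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)]


vocab = ["we", "can", "learn"]
encoder = ToyTokenEncoder(vocab)
table = make_embedding_table(len(vocab), dim=4)

ids = encoder.encode("we can")        # token ids
vectors = [table[i] for i in ids]     # embedding lookup
print(ids, len(vectors), len(vectors[0]))
```

Under this split, the vectors in `table` are the part that can be frozen and reused as input to another model.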

<blockquote>
 Can the LM task be done both with and without an encoder?
</blockquote>

The LM task can be tackled without encoders too. For example, we can use frequency tables of adjacent words to build a model (<a href="https://en.wikipedia.org/wiki/N-gram#n-gram_models" rel="nofollow noreferrer">n-gram modeling</a>); e.g. if all pairs (We, ?) appear 10K times and the pair (We, can) appears 100 times, then $P(can | We) = 100 / 10000 = 0.01$. However, the encoder is the core of modern LM solutions.
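The frequency-table idea can be sketched in a few lines. This is a minimal bigram model with no smoothing, so unseen pairs get probability 0; the function names `train_bigram` and `bigram_prob` and the toy corpus are assumptions made for illustration.

```python
from collections import Counter


def train_bigram(tokens):
    """Count adjacent pairs and how often each word starts a pair."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    first_counts = Counter(tokens[:-1])
    return pair_counts, first_counts


def bigram_prob(pair_counts, first_counts, prev, word):
    """Estimate P(word | prev) = count(prev, word) / count(prev, ?)."""
    total = first_counts[prev]
    return pair_counts[(prev, word)] / total if total else 0.0


tokens = "we can we will we can".split()
pair_counts, first_counts = train_bigram(tokens)

# "we" starts 3 pairs, 2 of which are ("we", "can").
print(bigram_prob(pair_counts, first_counts, "we", "can"))
```

No embedding or encoder appears anywhere: the model is just a lookup in two count tables, which is also why it cannot generalize to word pairs it has never seen.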

Language modeling (LM) is the task of predicting the next word.

Does the deep model need an encoder? From the ptb code of tensor2tensor, I find that the deep model does not contain an encoder.

Or can the LM task be done both with and without an encoder?

The principle of deep LM models

