Dynamic topic models (English-language paper)
TOPIC MODELS
DAVID M. BLEI
PRINCETON UNIVERSITY
JOHN D. LAFFERTY
CARNEGIE MELLON UNIVERSITY
1. INTRODUCTION
Scientists need new tools to explore and browse large collections of scholarly
literature. Thanks to organizations such as JSTOR, which scan and
index the original bound archives of many journals, modern scientists can
search digital libraries spanning hundreds of years. A scientist, suddenly
faced with access to millions of articles in her field, is not satisfied with
simple search. Effectively using such collections requires interacting with
them in a more structured way: finding articles similar to those of interest,
and exploring the collection through the underlying topics that run through
it.
The central problem is that this structure—the index of ideas contained
in the articles and which other articles are about the same kinds of ideas—is
not readily available in most modern collections, and the size and growth
rate of these collections preclude us from building it by hand. To develop
the necessary tools for exploring and browsing modern digital libraries, we
require automated methods of organizing, managing, and delivering their
contents.
In this chapter, we describe topic models, probabilistic models for uncovering
the underlying semantic structure of a document collection based on a
hierarchical Bayesian analysis of the original texts (Blei et al., 2003;
Griffiths and Steyvers, 2004; Buntine and Jakulin, 2004; Hofmann, 1999;
Deerwester et al., 1990). Topic models have been applied to many kinds
of documents, including email (?), scientific abstracts (Griffiths and Steyvers,
2004; Blei et al., 2003), and newspaper archives (Wei and Croft, 2006).
By discovering patterns of word use and connecting documents that e