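I'm trying to get the coordinates of the documents that model.visualize_topics() places on the plot for my BERTopic topic analysis project. Is there a way to see the function's source code and save the coordinates for more advanced analysis?
I found the code below, which is the source of the visualize_topics() function, but it contains no information about the coordinates.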
def visualize_topics(self,
                     topics: List[int] = None,
                     top_n_topics: int = None,
                     width: int = 650,
                     height: int = 650) -> go.Figure:
    """ Visualize topics, their sizes, and their corresponding words

    This visualization is highly inspired by LDAvis, a great visualization
    technique typically reserved for LDA.

    Arguments:
        topics: A selection of topics to visualize
        top_n_topics: Only select the top n most frequent topics
        width: The width of the figure.
        height: The height of the figure.

    Examples:

    To visualize the topics simply run:

    ```python
    topic_model.visualize_topics()
    ```

    Or if you want to save the resulting figure:

    ```python
    fig = topic_model.visualize_topics()
    fig.write_html("path/to/file.html")
    ```
    """
    check_is_fitted(self)
    return plotting.visualize_topics(self,
                                     topics=topics,
                                     top_n_topics=top_n_topics,
                                     width=width,
                                     height=height)

Posted on 2022-12-01 07:46:48
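To keep BERTopic modular, its main functionality is split across modules. Everything related to plotting lives in BERTopic's plotting module, and that is where the code for plotting.visualize_topics can be found. The method you found is only a thin wrapper that delegates to it.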
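That said, the part of that function that creates the coordinates is the following:
embeddings = topic_model.c_tf_idf_.toarray()[indices]
embeddings = MinMaxScaler().fit_transform(embeddings)
embeddings = UMAP(n_neighbors=2, n_components=2, metric='hellinger').fit_transform(embeddings)
Here, it takes the c-TF-IDF representations of the selected topics, scales them, and finally reduces them to a 2-dimensional space using UMAP with the Hellinger distance. One thing to note, however, is that random_state is not set in UMAP, so the process is stochastic and the coordinates can differ from one run to the next.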
https://stackoverflow.com/questions/74566683