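Are there any annotated biomedical datasets available? I am learning how to annotate biomedical text, specifically for disambiguation, but I am open to annotations made for other purposes.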
Posted on 2019-07-05 13:45:56
Here are some corpora for you:
| Entity | Corpus | Type | Size (sentences) |
|------------------|-----------------------------|------------|------------------|
| Gene and Protein | GENETAG [7] | Sentences | 20000 |
| | JNLPBA [6] (from GENIA [8]) | Abstracts | 22402 |
| | FSUPRGE [9] | Abstracts | ≈29447* |
| | PennBioIE [10] | Abstracts | ≈22877* |
| Species | OrganismTagger Corpus [11] | Full texts | 9863 |
| | Linnaeus Corpus [12] | Full texts | 19491 |
| Disorders | SCAI Disease [13] | Abstracts | ≈3640* |
| | EBI Disease [14] | Sentences | 600 |
| | Arizona Disease (AZDC) [15] | Sentences | 2500 |
| | BioText [16] | Abstracts | 3655 |
| Chemical | SCAI IUPAC [17] | Sentences | 20300 |
| | SCAI General [18] | Sentences | 914 |
| Anatomy | AnEM1 | Sentences | 4700 |
| Miscellaneous | CellFinder2 | Full texts | 2100 |

https://stackoverflow.com/questions/56896882
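
If you want to get a feel for what such annotations look like in practice, here is a minimal sketch for reading a token-per-line IOB2 file, which is a common way NER corpora are distributed. I believe JNLPBA ships in a format like this, but that is an assumption on my part, so check the documentation of whichever corpus you download. The function names `read_iob` and `extract_entities` and the sample text are made up for illustration and do not come from any of the corpora above.

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]


def read_iob(text: str) -> List[Sentence]:
    """Parse token-per-line IOB text into sentences of (token, tag) pairs.

    Assumes one "token<whitespace>tag" pair per line and a blank line
    between sentences (an assumption; real corpora may differ).
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.rsplit(None, 1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) spans from one IOB2-tagged sentence."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    etype = None
    for token, tag in sentence:
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; flush any open one first.
            if tokens:
                entities.append((" ".join(tokens), etype))
            tokens, etype = [token], tag[2:]
        elif tag.startswith("I-") and tokens and tag[2:] == etype:
            # An I- tag of the same type continues the open entity.
            tokens.append(token)
        else:
            # An O tag (or an inconsistent I- tag) closes the open entity.
            if tokens:
                entities.append((" ".join(tokens), etype))
            tokens, etype = [], None
    if tokens:
        entities.append((" ".join(tokens), etype))
    return entities


# Hypothetical sample, written by hand for this sketch.
SAMPLE = """\
IL-2\tB-DNA
gene\tI-DNA
expression\tO
requires\tO
NF-kappa\tB-protein
B\tI-protein
.\tO
"""

if __name__ == "__main__":
    for sent in read_iob(SAMPLE):
        print(extract_entities(sent))
```

For disambiguation work you would typically go one step further and link each extracted span to an identifier in some database or ontology, but which one depends on the corpus and on your task.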