Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation
| Item Type: | Dataset |
|---|---|
| Title: | Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation |
| Date: | 2017 |
| Creator: | Panchenko, Alexander ; Ruppert, Eugen ; Faralli, Stefano ORCID: 0000-0003-3684-8815 ; Ponzetto, Simone Paolo ORCID: 0000-0001-7484-2049 ; Biemann, Chris |
| Divisions: | School of Business Informatics and Mathematics > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-) |
| DDC Classification: | 004 Computer science, internet |
| Abstract: | This dataset contains the models for interpretable Word Sense Disambiguation (WSD) that were employed in Panchenko et al. (2017; the paper can be accessed at https://www.lt.informatik.tu-darmstadt.de/fileadmin/user_upload/Group_LangTech/publications/EACL_Interpretability___FINAL__1_.pdf). The files were computed on a 2015 dump of the English Wikipedia. Contents: (1) Induced Sense Inventories: wp_stanford_sense_inventories.tar.gz. This file contains 3 inventories (coarse, medium, fine). (2) Language Model (3-gram): wiki_text.3.arpa.gz. This file contains all n-grams up to n=3 and can be loaded into an index. (3) Weighted Dependency Features: wp_stanford_lemma_LMI_s0.0_w2_f2_wf2_wpfmax1000_wpfmin2_p1000.gz. This file contains weighted word--context-feature combinations, including their counts and an LMI significance score. (4) Distributional Thesaurus (DT) of Dependency Features: wp_stanford_lemma_BIM_LMI_s0.0_w2_f2_wf2_wpfmax1000_wpfmin2_p1000_simsortlimit200_feature expansion.gz. This file contains a DT of context features; the context feature similarities can be used for context expansion. For further information, consult the paper and the companion page: http://jobimtext.org/wsd/ (English) |
| External Identifier for Data: | https://zenodo.org/records/485151 |
| URL: | https://madata.bib.uni-mannheim.de/602/ |
| Access (Controlled): | Only Metadata |
| License (Controlled): | Creative Commons: CC-BY | Attribution 4.0 (recommended) |
| Related Publication(s) in MADOC: | Panchenko, Alexander and Ruppert, Eugen and Faralli, Stefano and Ponzetto, Simone Paolo and Biemann, Chris (2017), Unsupervised does not mean uninterpretable: the case for word sense induction and disambiguation |
Full text not available from this repository.
| Date Deposited: | 30 Mar 2026 15:08 |
|---|---|
| Last Modified: | 30 Mar 2026 15:08 |
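
The abstract notes that the 3-gram language model is distributed as an ARPA file that can be loaded into an index. As a minimal sketch of what such loading might look like, the Python below parses ARPA-format text into an in-memory dictionary. The function name `read_arpa`, the dictionary layout, and the tiny `SAMPLE` model are illustrative assumptions, not part of the dataset or of the authors' tooling; the sketch relies only on the standard ARPA layout (a `\data\` header, `\N-grams:` sections, and a closing `\end\`).

```python
import io
from collections import defaultdict


def read_arpa(lines):
    """Parse ARPA-format lines into {order: {ngram_tuple: (log10_prob, backoff)}}.

    Hypothetical helper for illustration; backoff is None when the line has none.
    """
    ngrams = defaultdict(dict)
    order = None  # current n-gram order, set by a "\N-grams:" section header
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and the "\data\" header with its "ngram N=count" lines.
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:line.index("-")])
            continue
        if order is None:
            continue
        # Entry layout: log10 probability, N words, optional backoff weight.
        parts = line.split()
        log_prob = float(parts[0])
        words = tuple(parts[1:1 + order])
        backoff = float(parts[1 + order]) if len(parts) > 1 + order else None
        ngrams[order][words] = (log_prob, backoff)
    return ngrams


# Tiny made-up model, used only to exercise the parser.
SAMPLE = """\\data\\
ngram 1=2
ngram 2=1

\\1-grams:
-1.0\tword\t-0.5
-1.2\tsense\t-0.3

\\2-grams:
-0.4\tword\tsense

\\end\\
"""

model = read_arpa(io.StringIO(SAMPLE))
print(model[2][("word", "sense")])
```

For the actual file, the same function could presumably be fed from `gzip.open("wiki_text.3.arpa.gz", "rt", encoding="utf-8")`, assuming UTF-8 text. A model built from a full Wikipedia dump is likely too large for a plain dictionary, so a disk-backed index or a dedicated language-model library would be the more realistic choice.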