
Mannheim Research Data


Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation

Item Type:	Dataset
Title:	Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation
Date:	2017
Creator:	Panchenko, Alexander ; Ruppert, Eugen ; Faralli, Stefano ORCID: 0000-0003-3684-8815 ; Ponzetto, Simone Paolo ORCID: 0000-0001-7484-2049 ; Biemann, Chris
Divisions:	School of Business Informatics and Mathematics > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-)

DDC Classification:	004 Computer science, internet
Abstract:	This dataset contains the models for interpretable Word Sense Disambiguation (WSD) that were employed in Panchenko et al. (2017); the paper can be accessed at https://www.lt.informatik.tu-darmstadt.de/fileadmin/user_upload/Group_LangTech/publications/EACL_Interpretability___FINAL__1_.pdf. The files were computed on a 2015 dump of the English Wikipedia. Their contents:

Induced Sense Inventories: wp_stanford_sense_inventories.tar.gz
This file contains three inventories (coarse, medium, fine).

Language Model (3-gram): wiki_text.3.arpa.gz
This file contains all n-grams up to n=3 and can be loaded into an index.

Weighted Dependency Features: wp_stanford_lemma_LMI_s0.0_w2_f2_wf2_wpfmax1000_wpfmin2_p1000.gz
This file contains weighted word--context-feature combinations and includes their count and an LMI significance score.

Distributional Thesaurus (DT) of Dependency Features: wp_stanford_lemma_BIM_LMI_s0.0_w2_f2_wf2_wpfmax1000_wpfmin2_p1000_simsortlimit200_feature expansion.gz
This file contains a DT of context features. The context feature similarities can be used for context expansion.

For further information, consult the paper and the companion page: http://jobimtext.org/wsd/ (English)
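The abstract notes that the 3-gram language model can be loaded into an index. The following is a minimal sketch of one way to do that, assuming wiki_text.3.arpa.gz follows the standard ARPA text layout (section markers such as \1-grams:, then one line per n-gram with a log10 probability, the tokens, and an optional backoff weight). The function name load_arpa is hypothetical and is not part of the dataset or of JoBimText. A plain in-memory dictionary may be too small for the full file, so this is meant as an illustration of the format rather than a recommended loader.

```python
import gzip
from typing import Dict, Iterable, Optional, Tuple

NGram = Tuple[str, ...]


def load_arpa(lines: Iterable[str]) -> Dict[NGram, Tuple[float, Optional[float]]]:
    """Map each n-gram to (log10 probability, backoff weight or None).

    Hypothetical helper; assumes the standard ARPA text layout.
    """
    index: Dict[NGram, Tuple[float, Optional[float]]] = {}
    order = 0  # stays 0 until a "\N-grams:" section header is seen
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Section markers: \data\, \1-grams:, \2-grams:, ..., \end\
            if line.endswith("-grams:"):
                order = int(line[1:line.index("-")])
            else:
                order = 0
            continue
        if order == 0:
            continue  # header counts such as "ngram 1=..."
        fields = line.split()
        logprob = float(fields[0])
        tokens = tuple(fields[1:1 + order])
        # The backoff weight is optional and follows the tokens when present.
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else None
        index[tokens] = (logprob, backoff)
    return index


# Usage with the dataset file (assumes it has been downloaded locally):
# with gzip.open("wiki_text.3.arpa.gz", "rt", encoding="utf-8") as fh:
#     index = load_arpa(fh)
#     print(index.get(("the",)))
```

Keying the index by token tuples keeps unigram, bigram, and trigram lookups uniform; a disk-backed store would be a reasonable substitute when the model does not fit in memory.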
External Identifier for Data:	https://zenodo.org/records/485151

URL:	https://madata.bib.uni-mannheim.de/602/
Access (Controlled):	Only Metadata
License (Controlled):	Creative Commons: CC-BY | Attribution 4.0 (recommended)
Related Publication(s) in MADOC:	Panchenko, Alexander and Ruppert, Eugen and Faralli, Stefano and Ponzetto, Simone Paolo and Biemann, Chris (2017), Unsupervised does not mean uninterpretable: the case for word sense induction and disambiguation

Full text not available from this repository.

Date Deposited:	30 Mar 2026 15:08
Last Modified:	30 Mar 2026 15:08
