Data for paper: "Evaluating Resource-Lean Cross-Lingual Embedding Models in Unsupervised Retrieval"
Item Type: | Dataset |
---|---|
Title: | Data for paper: "Evaluating Resource-Lean Cross-Lingual Embedding Models in Unsupervised Retrieval" |
Date: | 21 January 2021 |
Creator: | Litschko, Robert and Glavaš, Goran |
DDC Classification: | 004 Computer science, internet |
Abstract: | Cross-lingual embeddings (CLE) allow for cross-lingual natural language processing and information retrieval. Recently, a wide variety of resource-lean projection-based models for inducing CLEs have appeared, requiring limited or no bilingual supervision. Despite their potential usefulness in downstream IR and NLP tasks, these CLE models have almost exclusively been evaluated on word translation tasks. In this work, we provide a comprehensive comparative evaluation of projection-based CLE models for both sentence-level and document-level Cross-lingual Information Retrieval (CLIR). We hope our work serves as a guideline for CLIR practitioners choosing the right model. |
URL: | https://madata.bib.uni-mannheim.de/360/ |
DOI: | https://doi.org/10.7801/360 |
Availability (Controlled): | Download |

File | Filename / Infos | Link |
---|---|---|
Archive | Filename: europarl.tar.gz | Download (35 MB) |
Archive | Filename: muse.tar.gz | Download (3 GB) |
Archive | Filename: proc.tar.gz | Download (4 GB) |
Archive | Filename: cca.tar.gz | Download (4 GB) |
Archive | Filename: icp.tar.gz | Download (4 GB) |
Archive | Filename: rcsls.tar.gz | Download (3 GB) |
Archive | Filename: procb.tar.gz | Download (4 GB) |
Archive | Filename: vecmap.tar.gz | Download (4 GB) |

Depositing User: | Robert Litschko |
---|---|
Date Deposited: | 22 Jan 2021 15:37 |
Last Modified: | 29 Feb 2024 20:38 |
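
The files listed above are gzip-compressed tar archives, several of them multiple gigabytes in size, so it may help to inspect an archive before unpacking it. The following is a minimal sketch using only the Python standard library. It assumes an archive has already been downloaded to a local path; the helper names `list_archive` and `extract_archive` are hypothetical, and nothing here reflects the actual directory layout inside the archives, which this record does not describe.

```python
import tarfile
from pathlib import Path


def list_archive(path):
    """Return (name, size_in_bytes) for each regular file in a .tar.gz archive."""
    with tarfile.open(path, mode="r:gz") as archive:
        return [(m.name, m.size) for m in archive.getmembers() if m.isfile()]


def extract_archive(path, target_dir):
    """Unpack the archive into target_dir.

    Member names that would resolve outside target_dir are rejected
    before anything is written to disk.
    """
    target = Path(target_dir).resolve()
    with tarfile.open(path, mode="r:gz") as archive:
        for member in archive.getmembers():
            destination = (target / member.name).resolve()
            if destination != target and target not in destination.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(target)
    return target
```

For example, `list_archive("europarl.tar.gz")` would return the file names and sizes in the smallest archive, assuming it was saved under that name in the working directory, and `extract_archive("europarl.tar.gz", "data")` would unpack it into a hypothetical `data` folder.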