Web Data Commons - Product Data Corpus
| Item Type: | Dataset |
|---|---|
| Title: | Web Data Commons - Product Data Corpus |
| Alternative Title: | Product Data Corpus for product matching and product feature extraction |
| Date: | 2015 |
| Creator: | Bizer, Christian ; Petrovski, Petar ; Meusel, Robert ; Primpeli, Anna |
| Divisions: | School of Business Informatics and Mathematics > Information Systems V: Web-based Systems (Bizer 2012-) |
| DDC Classification: | 004 Computer science, internet |
| Keywords: | product corpus |
| Abstract: | A product data corpus containing over 5.6 million product records retrieved from the 32 most-visited shopping websites, as ranked by Alexa. The corpus revolves around three product categories: mobile phones, headphones, and televisions. |
| URL: | https://madata.bib.uni-mannheim.de/216/ |
| DOI: | https://doi.org/10.7801/216 |
| Access (Controlled): | Download |
| Access: | Data is available as NQuads and WARC files. Download source: http://data.dws.informatik.uni-mannheim.de/productcrawl/crawl-data-general/ |
| Related Publication(s) in MADOC: | Meusel, Robert ; Primpeli, Anna ; Meilicke, Christian ; Paulheim, Heiko ; Bizer, Christian (2015), Exploiting microdata annotations to consistently categorize product offers at web scale |
| External URL for Other Related Materials: | http://webdatacommons.org/productcorpus/index.html |
| Project: | Project Title: Web Data Commons - Product Corpus. Project Description: Creation of input data to support and evaluate product matching and product feature extraction methods. |
Full text not available from this repository.
| Date Deposited: | 19 May 2017 11:28 |
|---|---|
| Last Modified: | 16 Jun 2017 10:46 |
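
Because the corpus is distributed as N-Quads, each statement occupies one line holding a subject, a predicate, an object, and an optional graph label, terminated by a period. The following is a minimal sketch of splitting such a line into its terms, not the project's own tooling. The function name `parse_nquad` and the example IRIs are hypothetical, and the regular expression is a simplification that does not cover every corner of the N-Quads grammar; a full RDF library may be preferable for production use.

```python
import re

# Simplified pattern for one RDF term: an IRI in angle brackets, a blank node,
# or a quoted literal with an optional datatype or language tag.
_TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'

# Subject, predicate, object, optional graph label, then the closing period.
_QUAD = re.compile(
    r'^\s*' + _TERM + r'\s+' + _TERM + r'\s+' + _TERM
    + r'(?:\s+' + _TERM + r')?\s*\.\s*$'
)


def parse_nquad(line):
    """Return (subject, predicate, object, graph), or None if the line is
    blank, a comment, or does not match the simplified pattern.
    The graph element is None when the line carries no graph label."""
    stripped = line.strip()
    if not stripped or stripped.startswith('#'):
        return None
    match = _QUAD.match(stripped)
    return match.groups() if match else None


if __name__ == "__main__":
    # Hypothetical example line; the IRIs are placeholders, not corpus data.
    sample = ('<http://example.com/p1> <http://schema.org/name> '
              '"Phone X"@en <http://example.com/page> .')
    print(parse_nquad(sample))
```

In N-Quads output derived from web pages, the fourth term commonly identifies the page a statement was extracted from, so grouping parsed quads by that element is one way to collect the records per source page.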
