Authors:
Nery Riquelme-Granada, Khuong An Nguyen and Zhiyuan Luo
Affiliation:
Department of Computer Science, Royal Holloway University of London, Egham, Surrey, TW20 0EX, U.K.
Keyword(s):
Coresets, Data Summaries, Logistic Regression, Large-data, Computing Time.
Abstract:
In the era of datasets of unprecedented sizes, data compression techniques are an attractive approach for speeding up machine learning algorithms. One of the most successful paradigms for achieving good-quality compression is that of coresets: small summaries of data that act as proxies for the original input data. Even though coresets have proved extremely useful for accelerating unsupervised learning problems, applying them to supervised learning problems can introduce unexpected computational bottlenecks. We show that this is the case for Logistic Regression classification, and hence propose two methods for accelerating the computation of coresets for this problem. When coresets are computed using our methods on three public datasets, computing the coreset and learning from it is, in the worst case, 11 times faster than learning directly from the full input data, and 34 times faster in the best case. Furthermore, our results indicate that our accelerating approaches do not degrade the empirical performance of coresets.
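To illustrate the coreset idea the abstract describes, the following is a minimal sketch of importance sampling followed by weighted Logistic Regression on the sampled subset. It is not the paper's method: the data is synthetic, and the per-point importance score (a simple norm-based proxy) stands in for the more refined sensitivity scores that coreset constructions actually use.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for a large input set.
n, d = 5000, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(int)

# Importance scores: a crude norm-based proxy (an assumption for this
# sketch; real coreset constructions compute sensitivity bounds).
scores = np.linalg.norm(X, axis=1) + 1e-9
probs = scores / scores.sum()

# Draw a small coreset and attach inverse-probability weights so the
# weighted subset approximates the full-data loss in expectation.
m = 250
idx = rng.choice(n, size=m, replace=True, p=probs)
weights = 1.0 / (m * probs[idx])

# Learn from the coreset only, using the weights in the loss.
coreset_model = LogisticRegression(max_iter=1000)
coreset_model.fit(X[idx], y[idx], sample_weight=weights)

# For comparison: the model trained on the full input data.
full_model = LogisticRegression(max_iter=1000).fit(X, y)
agreement = (coreset_model.predict(X) == full_model.predict(X)).mean()
```

The speed-ups reported in the abstract come from training on `m` weighted points instead of all `n`; on this toy data the two models should agree on most predictions despite the coreset being a small fraction of the input.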