Evolving Simple and Accurate Symbolic Regression Models via Asynchronous Parallel Computing

Sambo, Aliyu Sani and Azad, R. Muhammad Atif and Kovalchuk, Yevgeniya and Indramohan, Vivek and Shah, H. (2021) Evolving Simple and Accurate Symbolic Regression Models via Asynchronous Parallel Computing. Applied Soft Computing. ISSN 1568-4946

Preview

Text
APGP_Journal_Elsevier.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Download (2MB)

Official URL: https://www.sciencedirect.com/science/article/abs/...

Abstract

In machine learning, reducing the complexity of a model can help to improve its computational efficiency and avoid overfitting. In genetic programming (GP), the model complexity reduction is often achieved by reducing the size of evolved expressions. However, previous studies have demonstrated that the expression size reduction does not necessarily prevent model overfitting.

Therefore, this paper uses the evaluation time --- the computational time required to evaluate a GP model on data --- as the estimate of model complexity.
The evaluation time depends not only on the size of evolved expressions but also their composition, thus acting as a more nuanced measure of model complexity than the expression size alone.
To discourage complexity, this study employs a novel method called asynchronous parallel GP (APGP) that introduces a race condition in the evolutionary process of GP; the race offers an evolutionary advantage to the simple solutions when their accuracy is competitive.

To evaluate the proposed method, it is compared to the standard GP (GP) and GP with bloat control (GP+BC) methods on six challenging symbolic regression problems. APGP produced models that are significantly more accurate (on 6/6 problems) than those produced by both GP and GP+BC. In terms of complexity control, APGP prevailed over GP but not over GP+BC; however, GP+BC produced simpler solutions at the cost of test-set accuracy. Moreover, APGP took a significantly lower number of evaluations than both GP and GP+BC to meet a target training fitness in all tests.

Our analysis of the proposed APGP also involved: (1) an ablation study that separated the proposed measure of complexity from the race condition in APGP and (2) the study of an initialization scheme that encourages functional diversity in the initial population that improved the results for all the GP methods.

These results question the overall benefits of bloat control and endorse the employment of both the evaluation time as an estimate of model complexity and the proposed APGP method for controlling it.

Item Type:

Article

Identification Number:

https://doi.org/10.1016/j.asoc.2021.107198

Dates:

Date	Event
15 February 2021	Accepted
23 February 2021	Published Online

Uncontrolled Keywords:

Genetic programming Model complexity Parallel computing Evaluation time

Subjects:

CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
CAH11 - computing > CAH11-01 - computing > CAH11-01-05 - artificial intelligence

Divisions:

Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Faculty of Health, Education and Life Sciences > School of Health Sciences

Depositing User:

Atif Azad

Date Deposited:

17 Feb 2021 09:24

Last Modified:

22 Mar 2023 12:00

URI:

https://www.open-access.bcu.ac.uk/id/eprint/10988

Actions (login required)

View Item

Tools

CORE (COnnecting REpositories)

Research

In this section...

Birmingham City University

Level of study...

Browse Undergraduate Courses

Browse Postgraduate Taught Courses

Browse Postgraduate Research Courses

Browse Foundation Courses

Browse Short Courses

Browse Part-time Courses

Browse Jan/Feb Starts

Browse Online Learning Courses

Search Undergraduate Courses...

More in this section...

How to
Graduate help and advice

More in this section...

More in this section

Prospective students

Making an application

More in this section...

More in this section...

Get Money

Get Knowledge

Get Connected

More in this section...

More in this section...

Evolving Simple and Accurate Symbolic Regression Models via Asynchronous Parallel Computing

Abstract

Actions (login required)

Research