Eureqa Desktop General Reference

Eureqa Error Metrics

Error Metrics specify what type of error to measure when comparing and optimizing solutions. For example, you may wish to minimize "Squared Error" if your data has normally distributed noise, or "Logarithmic Error" if it contains many outliers.
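To make the outlier remark concrete, the following minimal sketch compares how much of the total error a single outlier row contributes under squared error and under logarithmic error. It is an illustration only: the function names and the data are hypothetical, and the normalization that Eureqa applies to its metrics is not reproduced here.

```python
import math

def squared_errors(targets, predictions):
    """Per-row squared residuals (unnormalized)."""
    return [(t - p) ** 2 for t, p in zip(targets, predictions)]

def log_errors(targets, predictions):
    """Per-row squashed residuals, log(1 + |error|) (unnormalized)."""
    return [math.log(1 + abs(t - p)) for t, p in zip(targets, predictions)]

def outlier_share(per_row_errors, index):
    """Fraction of the total error contributed by the row at `index`."""
    return per_row_errors[index] / sum(per_row_errors)

# Hypothetical data: the last target is an outlier that the model misses.
targets = [1.0, 2.0, 3.0, 4.0, 100.0]
predictions = [1.1, 1.9, 3.2, 3.8, 5.0]

sq_share = outlier_share(squared_errors(targets, predictions), 4)
log_share = outlier_share(log_errors(targets, predictions), 4)

print(f"outlier share of squared error:     {sq_share:.4f}")
print(f"outlier share of logarithmic error: {log_share:.4f}")
```

With this data the outlier accounts for almost all of the squared error, while under the logarithmic error the remaining rows still carry a noticeable part of the total, so a search guided by it is less likely to be steered by that one row.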

The table below describes some of the error metrics available in Formulize. All of them are normalized based on the target values in the data set.

| Error Metric | Calculation | Description and Comments |
|---|---|---|
| Mean Absolute Error | mean(abs(error)) | Minimizes the mean of the absolute value of the residual errors. Assumes noise follows a double exponential distribution. |
| Mean Squared Error | | Minimizes the mean of the squared residual errors. Assumes noise follows a normal distribution. |
| R^2 Goodness of Fit | R^2 = 1 - SSres/SStot, where SStot is proportional to the total variance and SSres is the residual sum of squares (proportional to the unexplained variance). | Maximizes the R^2 explained variance. Similar to the squared error but normalized by the scale of the output values. |
| Correlation Coefficient | | Maximizes the correlation coefficient (normalized covariance). Scale and offset invariant; models the "shape" of the data. |
| Maximum Error | | Minimizes the single highest error of the residuals. Use to minimize the worst-case error or to force the algorithm to model a small residual feature. |
| Logarithm Error | log(1 + abs(error)) | Minimizes the squashed error. |
| Median Error | | Minimizes the median error value. |
| Interquartile Absolute Error | | Similar to median error; minimizes the mean absolute error of the middle 50% of error values. |
| Signed Difference | | Minimizes the left-hand side minus the right-hand side, including the sign, toward negative infinity. |
| Hybrid Correlation/Error | | Special combination of correlation and absolute error (experimental). |
| Area Under ROC Error [AUC] | | Maximizes the area under the ROC curve. Use only for classification problems where the target variable is always equal to 0 or 1. |
| Log Loss Error | | Penalizes being too confident in a wrong prediction. Use for classification problems. |
| Hinge Loss Error | | Linearly penalizes wrong predictions. Use for classification problems. |
| Rank Correlation | | Spearman's rank correlation (the correlation of putting things in the same order). This is the same as the Pearson correlation of ranks. Use to build scoring functions when you don't care about the exact values, only the order. |
| Slope Absolute Error | | The mean absolute error of the deltas between rows for the target and the model. Use for time series where you are trying to predict the changes from one time period to the next. |
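Several of the metrics above can be sketched in plain Python to make their definitions concrete. This is a minimal sketch under stated assumptions, not Eureqa's implementation: the function names are hypothetical, the normalization by target values is omitted, and ties in the rank correlation are given average ranks, which the reference does not specify.

```python
import math

def mean_absolute_error(targets, predictions):
    """mean(abs(error)) over all rows."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def max_error(targets, predictions):
    """Single largest absolute residual."""
    return max(abs(t - p) for t, p in zip(targets, predictions))

def r_squared(targets, predictions):
    """R^2 = 1 - SSres / SStot."""
    mean_t = sum(targets) / len(targets)
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    return 1.0 - ss_res / ss_tot

def pearson_correlation(xs, ys):
    """Covariance normalized by both standard deviations."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def rank_correlation(targets, predictions):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_correlation(ranks(targets), ranks(predictions))

def slope_absolute_error(targets, predictions):
    """Mean absolute error between row-to-row deltas of target and model."""
    target_deltas = [b - a for a, b in zip(targets, targets[1:])]
    model_deltas = [b - a for a, b in zip(predictions, predictions[1:])]
    return sum(abs(a - b) for a, b in zip(target_deltas, model_deltas)) / len(target_deltas)

# Hypothetical data: one model is a scaled and shifted copy of the target,
# the other is a monotonic but nonlinear transform of it.
targets = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = [2.0 * t + 1.0 for t in targets]
cubed = [t ** 3 for t in targets]

print("MAE (scaled):", mean_absolute_error(targets, scaled))
print("Max error (scaled):", max_error(targets, scaled))
print("R^2 (scaled):", r_squared(targets, scaled))
print("Correlation (scaled):", pearson_correlation(targets, scaled))
print("Correlation (cubed):", pearson_correlation(targets, cubed))
print("Rank correlation (cubed):", rank_correlation(targets, cubed))
print("Slope absolute error (scaled):", slope_absolute_error(targets, scaled))
```

The scaled model illustrates the "scale and offset invariant" comment: its correlation with the target is perfect even though its R^2 and absolute errors are poor. The cubed model preserves only the order of the rows, which is what the rank correlation measures.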