Time Series with sporadic data (spare part)
• Hi Eureqa users,

I am quite new to data analysis and statistics.

I am struggling to use Eureqa to analyze the values of two time series. My theory is that there is a correlation between these two series.

The time series values are for a spare part, for which I would like to create a sales forecast.
One time series holds the consumption data, meaning that the part was used as a component of a new machine; the other holds its past sales as a spare part.
The 120 rows represent the months of the last 10 years.

I have run Eureqa for about 100 hours using all building blocks. I know that for time series the history building block is quite useful, but I wanted to use the full "power" of Eureqa.
I did not use the outlier correction, as the data is sporadic and has a lot of zero values, and I wanted to try the original data first.

I have also given more weight to the distant past for this spare part, because the longer ago a component was consumed, the more likely it is to be sold as a spare part.

Eureqa has found an equation of size 83 with an MAE of 0.228 for the target expression S = f(C).

My first question is, what is your advice for getting a better result? Is that the best approach?

My second question is, what would I have to do in order to use the final equation for a forecast?
Should the target expression be like this?  S = f(delay(S, 1), C)
With this expression I would be able to forecast one period of S only, correct?
I want to use the sales value (S) and the consumption value (C) for the forecast.

Please find the Eureqa data set attached.

Regards,
Lars

• Hi Lars,

I played around with your data a little. Here are some suggestions for getting better results:
1. Include past values of sales in the model, so your sales forecast has more to go on. This means using a target expression that looks like S = f(delay(S, 1), delay(C, 1))
2. Are there any more columns (variables) you could add other than past consumption?
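
To make suggestion 1 a bit more concrete outside of Eureqa: delay(x, 1) can be thought of as the same column shifted down by one row (one month). Here is a minimal Python sketch of that idea. The numbers are made up, and the `delay` helper is only an illustration, not Eureqa's implementation:

```python
def delay(series, k):
    """Shift a series k rows into the past; the first k rows have no history."""
    return [None] * k + list(series[:len(series) - k])

# Made-up monthly values (not the attached data set)
S = [0, 2, 0, 0, 1, 3]  # spare-part sales
C = [5, 0, 4, 0, 0, 2]  # consumption in new machines

S_lag1 = delay(S, 1)
C_lag1 = delay(C, 1)

# Each usable row pairs this month's S with last month's S and C.
# The first month is dropped because it has no previous value.
rows = [
    (s, s_prev, c_prev)
    for s, s_prev, c_prev in zip(S, S_lag1, C_lag1)
    if s_prev is not None and c_prev is not None
]

for row in rows:
    print(row)
```

With 120 monthly rows and a delay of 1, that would leave 119 rows to fit on.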

Should the target expression be like this?  S = f(delay(S, 1), C)

If you have enabled the delay building block, this target expression ensures that any use of S in a solution is delayed by at least one period (that is, taken from some time in the past).

So, for example:

S = delay(S,1) + C           -- possible
S = delay(S,2) + C           -- possible
S = delay(S,1) + delay(C,1)  -- possible
S = S + C                    -- NOT possible
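
On using an equation of this shape for a forecast: each prediction needs the previous value of S plus a value of C for the month being predicted. Here is a rough Python sketch of rolling such a model forward. The function `f` is a hypothetical stand-in with made-up coefficients, not an equation Eureqa found, and the future C values are assumed to be known or planned:

```python
def f(s_prev, c):
    # Hypothetical stand-in for an equation of the form S = f(delay(S, 1), C)
    return 0.5 * s_prev + 0.1 * c

def forecast(last_s, future_c):
    """Predict one month at a time, feeding each prediction back in as delay(S, 1)."""
    predictions = []
    s_prev = last_s
    for c in future_c:
        s_next = f(s_prev, c)
        predictions.append(s_next)
        s_prev = s_next  # the next step only has the predicted S to work with
    return predictions

# The last observed sales value and three assumed future consumption values
predictions = forecast(last_s=2.0, future_c=[4.0, 0.0, 3.0])
print(predictions)
```

Only the first step uses an observed S; later steps build on earlier predictions, so errors can add up the further out you go. If future values of C are not available, a model that uses only delayed inputs, such as delay(C, 1), can still give a one-month-ahead forecast from data you already have.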

Hope that helps,
Andrew
