Unfortunately, Eureqa does not allow users to create and use their own custom building blocks. Can you tell me more about your building block and the use cases you have in mind for it?
The building block I am looking for is power(). The one in Eureqa right now works this way: (variable)^variable. I want it to accept (variable)^constant.
Please consider this a very strong, but respectful request to add user-defined building blocks (including constants) to Eureqa. For the use case where potentially-complicated physical relationships are to be derived from data, the ability to inject domain expertise will be vital. I recognize that the "data science" angle is a key marketing message behind Eureqa these days, but if we go back to the idea of using Eureqa to help derive new physics-driven relationships ( http://creativemachines.cornell.edu/natural_laws ), it will be unlikely for Eureqa to go beyond non-physics-based curve fits in the majority of cases if we can't give it more pointers and constraints than it presently allows.
To illustrate, my area of research involves gasdynamics. Suppose I have a problem where isentropic flow parameter ratios, or the total pressure ratio across a shock, play a role in the outcome.
You can see in these equations that there are specific, non-trivial exponents, and algebraically complicated equations.
If I try Nutonian on an exact set of numbers generated from say, the equation for the normal shock pressure ratio as a function of Mach number, it will reliably find gorgeous curve fits over the span of my numbers, with negligible numerical error. However, the formulae are very different from the equation on that second link for pt1/pt0. Physical insight cannot be gained from that great curve fit.
On the other hand, if I can define the building block for Normal Shock Pressure Ratio, or a building block to specifically exponentiate to the power of (gamma + 1)/(2(gamma - 1)) where gamma is a constant (similar to what Jayashree asked for), there's a fighting chance the formulas Eureqa finds are actually tied to the physics rather than just being great curve fits.
If we want to understand what's really going on from the answers Eureqa finds, rather than just having numerical correlations, it's not apparent how we can get there without being able to "teach" Eureqa about some of the complicated relationships that may be embedded.
By the way, what Eureqa does, it does very impressively. Thank you very much for making it available to academics.
Great question - we've definitely heard this request before. One workaround that we've suggested in the past is to set your target expression with the existing relationship that you already know exists, then add on + f(...) at the end of the expression to model the remainder of the variance.
As an extremely simple example, say you already know that your data should roughly fit a quadratic function. You could set the target expression to y = x^2 + f1()*x + f2() + f3(x). That way, you begin by teaching Eureqa the relationship that you already know, but throw in an additional term to model the part that isn't captured by that quadratic function.
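To make the idea concrete outside of Eureqa, here is a rough Python sketch of the same "known structure plus remainder" split. It assumes f1() and f2() can be read as free constants (called a and b here), and it fits them in two stages with ordinary least squares, which is only an illustration and not how Eureqa searches. The function name fit_remainder is made up for this example.

```python
def fit_remainder(xs, ys):
    """Remove the known x**2 term, fit a*x + b to what is left,
    and return a, b plus the part that is still unexplained.

    Illustrative only: a and b stand in for f1() and f2(), and the
    returned list is what a term like f3(x) would have to model.
    """
    # Step 1: subtract the relationship we already trust.
    remainder = [y - x**2 for x, y in zip(xs, ys)]

    # Step 2: closed-form least-squares fit of a straight line.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(remainder) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, remainder))
    a = sxr / sxx
    b = mean_r - a * mean_x

    # Step 3: whatever is left over is the job of the extra term.
    unexplained = [r - (a * x + b) for x, r in zip(xs, remainder)]
    return a, b, unexplained


# Synthetic data that is exactly quadratic, so nothing should be left over.
xs = [i / 10 for i in range(-20, 21)]
ys = [x**2 + 3 * x + 2 for x in xs]
a, b, unexplained = fit_remainder(xs, ys)
```

If the data really were purely quadratic, the unexplained list would be essentially zero. Any structure that remains in it is what the extra f3(x) term is there to capture.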
Something else you may want to consider is creating new derived columns with the derived quantities (Normal Shock Pressure Ratio in your case) that are specific to your problem domain.
By doing so, you tell Eureqa that the derived quantity is important, and it can appear in your models as a single, relatively low-complexity variable.
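As a rough sketch of what preparing such a derived column could look like before importing the data, the Python below computes the total pressure ratio across a normal shock from the upstream Mach number. It uses the standard textbook relation for a calorically perfect gas, which may be written differently from the form in your links. The column names mach and shock_pt_ratio, the default gamma = 1.4, and both function names are assumptions made for this example.

```python
def normal_shock_total_pressure_ratio(mach, gamma=1.4):
    """Total pressure ratio (downstream / upstream) across a normal shock.

    Textbook relation for a calorically perfect gas; valid for mach >= 1.
    """
    if mach < 1.0:
        raise ValueError("a normal shock requires an upstream Mach number >= 1")
    m2 = mach**2
    term1 = ((gamma + 1) * m2 / ((gamma - 1) * m2 + 2)) ** (gamma / (gamma - 1))
    term2 = ((gamma + 1) / (2 * gamma * m2 - (gamma - 1))) ** (1 / (gamma - 1))
    return term1 * term2


def add_shock_ratio_column(rows, mach_key="mach", new_key="shock_pt_ratio", gamma=1.4):
    """Return copies of the rows with the derived quantity added as a new column."""
    return [
        dict(row, **{new_key: normal_shock_total_pressure_ratio(row[mach_key], gamma)})
        for row in rows
    ]


# Hypothetical data set: one dict per row, keyed by column name.
rows = [{"mach": 1.0}, {"mach": 1.5}, {"mach": 2.0}]
rows_with_ratio = add_shock_ratio_column(rows)
```

The same pattern works for a fixed-exponent term such as x raised to (gamma + 1)/(2(gamma - 1)): compute it as its own column, then let the search treat that column as an ordinary input variable.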