The IRIS AutoML Provider includes feature engineering to clean and remodel data sets. It would be beneficial to use this first part of the AutoML provider separately to transform datasets, for example for analytics without any machine learning afterwards. Instead of the SQL statement "TRAIN MODEL", a statement such as "TRANSFORM Example.Demo" could return a transformed dataset for further work.
Thank you for submitting the idea. The status has been changed to "Planned or In Progress".
This is not a commitment; plans are subject to change. Stay tuned!
@Felix Vetter, you have a comment on your idea. Please reply to help your idea get promoted.
Great idea. We have thought about exposing the feature engineering in different ways, including a plugin capability to register code that runs either before or after the feature engineering that IntegratedML does, so users could customize the process. For this, we may need new syntax, or we could have a USING flag such as "TRAIN MODEL dontTrain FROM Example.Demo USING {'featureExtraction': 'mySchema.newTableName'}", in which case we would ignore the model name and only do a "CREATE TABLE ... AS SELECT ...". One caveat is that feature engineering sometimes produces a training table with more columns than the 999-column limit for SQL tables in IRIS. Would we fail in that case? Or offer another USING option, {'featureExtractionFileOutput':'/path/to/dataframe/output'}... ?
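To make the column-limit question concrete, here is a rough Python sketch of how the two proposed USING keys could interact. It is only an illustration, not IntegratedML behavior: the function names are hypothetical, the fallback order (table first, then file) is an assumption, and it assumes for simplicity that the extra columns come from one-hot encoding categorical fields.

```python
# Hypothetical sketch: none of these function names exist in IntegratedML.
MAX_SQL_COLUMNS = 999  # column limit for IRIS SQL tables, as stated in the comment


def engineered_column_count(cardinalities):
    """Estimate the column count after one-hot encoding (an assumption).

    `cardinalities` maps each source column to its number of distinct
    categories, or to None for a numeric column that stays one column.
    """
    return sum(1 if n is None else n for n in cardinalities.values())


def choose_output(cardinalities, using):
    """Return ("table", name) or ("file", path); raise if neither is possible."""
    width = engineered_column_count(cardinalities)
    table = using.get("featureExtraction")
    path = using.get("featureExtractionFileOutput")

    # Prefer the SQL table when one was requested and the result fits.
    if table is not None and width <= MAX_SQL_COLUMNS:
        return ("table", table)
    # Otherwise fall back to the file output, if one was requested.
    if path is not None:
        return ("file", path)
    if table is None:
        raise ValueError("USING names neither a table nor a file output")
    raise ValueError(
        f"{width} engineered columns exceed the {MAX_SQL_COLUMNS}-column "
        "table limit and no file output was requested"
    )


if __name__ == "__main__":
    # One numeric column plus two categorical columns with many categories.
    demo = {"age": None, "country": 200, "product_code": 1500}
    using = {
        "featureExtraction": "mySchema.newTableName",
        "featureExtractionFileOutput": "/path/to/dataframe/output",
    }
    print(choose_output(demo, using))
```

With only 'featureExtraction' set, the same wide dataset would raise an error, which corresponds to the "fail" option raised in the comment.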