Hi Arrikto team, thank you in advance for your help. I hope I am explaining this adequately.
I ran into an issue that puzzles me:
I used the Kale setup on the GCP Marketplace deployment of MiniKF to compile a pipeline with two Python function-based stages: a simple data preprocessing stage and a typical parameterized training function.
The pipeline appeared to compile successfully, but after I submitted it to Katib, the Kale extension froze with the caption "gathering suggestions".
When I click on the Katib AutoML experiments tab and open the experiment that was created, I see the error message "Couldn't find information for the underlying Trials". It never shows a completed (or failed) trial, any logs, etc. The experiment is created, but then just stops.
Here is why I am a bit puzzled:
If I create a run under the KFP experiments tab (not the Katib AutoML experiments tab) from the same experiment and manually run it with a set of parameter values, the run completes in a few minutes and returns the metric I expected from the run.
When I tried to manually create a new run in Katib from the original experiment, using the evolutionary algorithm as the tuner (as a NAS), I was prompted for 3 parameters that are not among my pipeline's parameters but do appear to be NAS parameters. The unrecognized parameters are: learningRate (my pipeline names the learning rate learning_rate), numLayers (which is not a parameter of my task), and one other. No matter what I select for these parameters, Katib rejects the run.
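To illustrate the mismatch without sharing the real pipeline, here is a minimal sketch in plain Python. Everything in it is a hypothetical stand-in: the `train` function, its `batch_size` argument, and the `unknown_parameters` helper are made up for illustration, and it does not call Kale or Katib at all. It only compares the parameter names I was prompted for against the names a training function accepts.

```python
import inspect


def train(learning_rate: float = 0.01, batch_size: int = 32) -> float:
    """Hypothetical stand-in for the training step; returns a dummy metric."""
    return learning_rate * batch_size


def unknown_parameters(func, search_space_names):
    """Return the search-space names that the function does not accept."""
    accepted = set(inspect.signature(func).parameters)
    return sorted(name for name in search_space_names if name not in accepted)


# Two of the three names Katib prompted for; the third is not listed here.
prompted = ["learningRate", "numLayers"]
mismatched = unknown_parameters(train, prompted)
print(mismatched)
```

In this sketch both prompted names are reported as unknown, because the function only accepts learning_rate (snake case) and has no layer-count argument, which matches what I am seeing in the Katib form.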
The details of this pipeline's configuration were sent to the support email, but until the community edition of my project is open-sourced in about a month, I can't post them on the community support portal, as they contain trade secret information. They are available in the email history for Arrikto support staff to refer to.