wiki:ai:ml-pipeline-test [2025/06/06 13:46] (current) ddehamer
print("Files in output dir:", os.listdir(args.model_output))
</code>
  
  NOTE: The print statements at the end were for troubleshooting and should be removed for production runs.
ml_client.jobs.stream(pipeline_job.name)
</code>
  
  NOTE: This is run from the Notebook, not from a Python script. At least not without changes.
| Logistics | Delivery delay prediction, route optimization |
| Cybersecurity | Threat classification, anomaly detection |

===== Reusability =====

==== ✅ Reusable As-Is If: ====

You are solving **the same kind of problem** (e.g., binary classification using logistic regression) and the following stay consistent:

  * **Input data structure**: New datasets have the same column names:
    * ''feature1'', ''feature2'', ''label''
  * **Preprocessing logic**: You still just sum ''feature1 + feature2'' to create ''feature_sum''
  * **Model type**: You're still using a ''LogisticRegression'' model from scikit-learn
  * **Output format**: You expect the model to be saved as ''model.joblib''

=== In this case: ===

✅ You only need to change the **CSV file** and re-register it as a new version of ''sample-csv-data'', then update the pipeline call with the new version:
<code python>
input_data = Input(type=AssetTypes.URI_FILE, path="azureml:sample-csv-data:5")
</code>

----

==== 🔄 Requires Changes If: ====

Your pipeline needs to be adapted for a different data structure or task. Here's when you'd need to modify the scripts:

=== 🔁 If your data columns change: ===

  * You'll need to update:
    * ''prep.py'' to transform the new columns appropriately
    * ''train.py'' to use the correct feature and label columns
    * Possibly retrain on different targets (multi-class, regression, etc.)

=== 🔁 If your model type changes: ===

  * If you switch from ''LogisticRegression'' to ''XGBoost'', ''RandomForest'', or a neural network:
    * Update ''train.py'' to import and instantiate the new model
    * Possibly adjust hyperparameters and training logic

=== 🔁 If your pipeline steps change: ===

  * Want to add validation?
  * Want to split data into train/test?
  * Want to evaluate model metrics?
    * You'll need new component scripts that return more outputs (e.g., ''metrics.json'')
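As an illustration, a hypothetical evaluation component could look like the sketch below. The file name ''evaluate.py'', the accuracy metric, and the demo values are assumptions; only the ''metrics.json'' output comes from the text above.

```python
# evaluate.py -- sketch of an added evaluation step that scores predictions
# and writes metrics.json as an extra pipeline output (illustrative only).
import json

def evaluate(y_true, y_pred):
    """Compute a simple classification metric without extra dependencies."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true) if y_true else 0.0
    return {"accuracy": accuracy, "n_samples": len(y_true)}

def write_metrics(metrics, path="metrics.json"):
    """Persist metrics where the pipeline can surface them as an output."""
    with open(path, "w") as f:
        json.dump(metrics, f, indent=2)

metrics = evaluate([0, 1, 1, 0], [0, 1, 0, 0])
write_metrics(metrics)  # accuracy here is 0.75 (3 of 4 correct)
```

In a real pipeline this script would take the model and a held-out dataset as inputs and write ''metrics.json'' to a registered output folder.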

=== 🔁 If your deployment format changes: ===

  * If your consumers expect ONNX or a TensorFlow SavedModel instead of ''joblib'', you'll need to:
    * Serialize the model differently
    * Possibly update the pipeline to convert formats

----

==== 🧰 To Make It Highly Reusable: ====

You can make the pipeline truly production-grade and reusable by:

^ Feature ^ How to Do It ^
| Parameterize column names | Add ''--feature_cols'' and ''--label_col'' arguments |
| Generalize preprocessing | Add a preprocessing config file or flags |
| Model selector | Add a ''--model_type'' argument (''logistic'', ''xgb'', etc.) |
| Versioned output naming | Return ''model_output'' with model name + timestamp |
| Dynamic data input | Register new data via CLI, UI, or pipeline parameter |
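The first three rows of the table could be sketched with ''argparse'' as below. The argument names mirror the table; the defaults and the registry mapping are assumptions, not code from the original pipeline.

```python
# Hypothetical CLI for a generalized train.py: column names and model type
# become pipeline parameters instead of hard-coded values.
import argparse

# Illustrative mapping from --model_type values to model constructors.
MODEL_REGISTRY = {
    "logistic": "sklearn.linear_model.LogisticRegression",
    "xgb": "xgboost.XGBClassifier",
}

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--feature_cols", default="feature1,feature2",
                        help="comma-separated feature column names")
    parser.add_argument("--label_col", default="label")
    parser.add_argument("--model_type", choices=sorted(MODEL_REGISTRY),
                        default="logistic")
    args = parser.parse_args(argv)
    args.feature_cols = args.feature_cols.split(",")
    return args

args = parse_args(["--feature_cols", "f1,f2,f3", "--model_type", "xgb"])
# args.feature_cols == ["f1", "f2", "f3"]; args.model_type == "xgb"
```

Because ''choices'' is set, an unknown ''--model_type'' fails fast at parse time instead of deep inside training.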

----

==== ✅ Summary ====

^ Scenario ^ Reusable? ^ What to Change ^
| Same data structure and model type | ✅ | Just update the input dataset version |
| Same structure, different model | 🔁 | Modify ''train.py'' only |
| Different data columns or prediction target | 🔁 | Modify ''prep.py'' and ''train.py'' |
| More complex workflow (e.g., evaluation, deployment) | 🔁 | Add steps and new component scripts |

===== How to Deploy the Model =====

==== ✅ High-Level Overview ====

  - **Prepare Scoring Script (''score.py'')**
  - **Create Inference Environment**
  - **Register the Trained Model**
  - **Create an Online Endpoint**
  - **Deploy the Model to the Endpoint**
  - **Test the Deployed Service**

====== Errors Encountered During Session ======

===== 🔁 Environment Definition Issue =====

==== ❌ Problem: ====

The ''conda_file'' was passed as a multi-line string instead of a dictionary. Azure ML interpreted it as a file path, resulting in a ''FileNotFoundError''.

==== ✅ Solution: ====

The ''conda_file'' was rewritten as a Python dictionary inside the ''Environment()'' constructor, which Azure ML correctly interpreted and registered.
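For illustration, here is a minimal conda specification expressed as a dictionary; the channel, package list, and base image are assumptions, not the session's exact environment.

```python
# An inline conda spec: passing a dict like this as conda_file avoids the
# "interpreted as a file path" failure described above.
conda_spec = {
    "name": "ml-pipeline-env",
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.10",
        "pip",
        {"pip": ["scikit-learn", "pandas", "joblib"]},
    ],
}

# Hypothetical usage with the v2 SDK (not executed here):
# env = Environment(
#     name="ml-pipeline-env",
#     image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
#     conda_file=conda_spec,
# )
```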

----

===== 🔁 Dataset Reference Issue =====

==== ❌ Problem: ====

When submitting the pipeline, Azure ML failed to resolve the dataset because the dataset path was given as ''"sample-csv-data:4"'' without the required ''azureml:'' prefix. This caused a ''ValidationException'' about a missing asset version.

==== ✅ Solution: ====

The dataset path was updated to use the full Azure ML URI syntax: ''"azureml:sample-csv-data:4"'', resolving the issue.

----

===== 🔁 Output Not Persisted =====

==== ❌ Problem: ====

Even though the ''train.py'' script wrote a ''model.joblib'' file, Azure ML did not surface the output in the UI or download tools.

==== ✅ Root Cause: ====

The output directory was not explicitly registered in the pipeline job, so Azure ML silently discarded it.

==== ✅ Solution: ====

The pipeline job was updated to explicitly register ''model_output'' using ''pipeline_job.outputs[...]''. Additionally, a unique name was generated for the training component to avoid using cached versions that might not include the output.

----

===== 🔁 Missing Script Execution =====

==== ❌ Problem: ====

The ''train.py'' file executed but produced no logs or output.

==== ✅ Root Cause: ====

The wrong ''train.py'' file (outside of the ''/src'' folder) was being edited, so Azure ML was executing an outdated or incorrect version.

==== ✅ Solution: ====

The correct file (''/src/train.py'') was updated with ''print()'' statements to confirm execution. After correcting this, output logs began appearing as expected.

----

===== 🔁 Scoping Error in train.py =====

==== ❌ Problem: ====

Print statements accessing ''args.model_output'' were placed outside the ''main()'' function, resulting in a ''NameError''.

==== ✅ Solution: ====

The logging and ''print()'' statements were moved inside the ''main()'' function, ensuring access to the ''args'' object.
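The corrected structure looks roughly like this; the argument name follows the session, while the ''argv'' parameter is an added convenience so the function can be exercised without real command-line arguments.

```python
# Everything that touches `args` lives inside main(), so nothing raises a
# NameError at import time.
import argparse
import os

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_output", required=True)
    args = parser.parse_args(argv)
    # ... training and model saving would happen here ...
    print("Files in output dir:", os.listdir(args.model_output))
    return args

# In the real script, call it under a guard:
# if __name__ == "__main__":
#     main()
```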

----

===== 🔁 Model Download Error =====

==== ❌ Problem: ====

An attempt to use the ''overwrite=True'' parameter in ''ml_client.jobs.download()'' caused a ''TypeError'' because that parameter is unsupported in the Azure ML v2 SDK.

==== ✅ Solution: ====

The ''overwrite'' parameter was removed, and when needed, the local folder was deleted manually before calling ''download()'' again.
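The manual workaround can be sketched as below; the folder name is illustrative, and the commented download call assumes an authenticated ''MLClient'' named ''ml_client''.

```python
# Manual "overwrite": delete the target folder before re-downloading,
# since the v2 SDK's download() has no overwrite flag.
import os
import shutil

download_dir = "./downloaded_model"  # illustrative path
if os.path.exists(download_dir):
    shutil.rmtree(download_dir)

# Hypothetical re-download (not executed here):
# ml_client.jobs.download(name=pipeline_job.name,
#                         download_path=download_dir,
#                         output_name="model_output")
```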

----

===== 🔁 Silent Step Failure Due to Typo =====

==== ❌ Problem: ====

The dataset path was mistyped as ''"asureml:"'' instead of ''"azureml:"'', causing the ''prep_step'' to fail silently with no user-code execution.

==== ✅ Solution: ====

The typo was corrected, and the step executed normally once a valid dataset path was provided.
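A tiny guard like the following (purely illustrative, not part of the original scripts) would have caught the typo before submission instead of failing silently:

```python
# Fail fast if a dataset path lacks the required azureml: prefix.
def check_asset_uri(path):
    if not path.startswith("azureml:"):
        raise ValueError(f"dataset path must start with 'azureml:', got {path!r}")
    return path

check_asset_uri("azureml:sample-csv-data:4")    # passes
# check_asset_uri("asureml:sample-csv-data:4")  # would raise ValueError
```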

----

===== ✅ Final Outcome =====

After resolving these issues:

  * The pipeline executed end-to-end
  * The model output was persisted and downloadable
  * Logs confirmed proper script execution
  * The deployment strategy was outlined, ready for API-based use
  
[[ai_knowledge|AI Knowledge]]
  
  