This shows you the differences between two versions of the page.
| wiki:ai:cli-workspace-deployment-upload [2025/06/04 15:46] – ddehamer | wiki:ai:cli-workspace-deployment-upload [2025/06/04 15:51] (current) – ddehamer | ||
|---|---|---|---|
| Line 252: | Line 252: | ||
| In **Azure Machine Learning (Azure ML)**, datasets used for training can be in a variety of formats. The platform is flexible, but depending on how you plan to use the dataset (e.g., for AutoML, custom training, or pipelines), certain formats are more appropriate. | In **Azure Machine Learning (Azure ML)**, datasets used for training can be in a variety of formats. The platform is flexible, but depending on how you plan to use the dataset (e.g., for AutoML, custom training, or pipelines), certain formats are more appropriate. | ||
| - | ==== ✅ Commonly Supported Formats | + | === ✅ Commonly Supported Formats === |
| ^ Format ^ Description ^ Notes ^ | ^ Format ^ Description ^ Notes ^ | ||
| Line 277: | Line 277: | ||
| ---- | ---- | ||
| - | ==== ✅ Where to Store ==== | + | === ✅ Where to Store === |
| Datasets must be accessible to Azure ML compute. Typical storage options: | Datasets must be accessible to Azure ML compute. Typical storage options: | ||
| Line 314: | Line 314: | ||
| ---- | ---- | ||
| - | ==== ✅ Option 1: Export a Table to CSV Using SQL (INTO OUTFILE) | + | === ✅ Option 1: Export a Table to CSV Using SQL (INTO OUTFILE) === |
| <code -> | <code -> | ||
| Line 333: | Line 333: | ||
| ---- | ---- | ||
| - | ==== ✅ Option 2: Use mysql CLI and sed/awk (CSV-like output) | + | === ✅ Option 2: Use mysql CLI and sed/awk (CSV-like output) === |
| <code -> | <code -> | ||
| Line 344: | Line 344: | ||
| ---- | ---- | ||
| - | ==== ✅ Option 3: Python Script to Dump Entire Database to CSV Files (per Table) | + | === ✅ Option 3: Python Script to Dump Entire Database to CSV Files (per Table) === |
| If you want a full database export (one CSV per table): | If you want a full database export (one CSV per table): | ||
| Line 369: | Line 369: | ||
| ---- | ---- | ||
| - | ==== 🔐 Permissions Reminder | + | === 🔐 Permissions Reminder === |
| If you're getting errors with '' | If you're getting errors with '' | ||
| Line 392: | Line 392: | ||
| ---- | ---- | ||
| - | ===== ✅ 1. Via Azure ML Studio (Web UI) ===== | + | === ✅ 1. Via Azure ML Studio (Web UI) === |
| - | ==== 📍 Steps: | + | == 📍 Steps: == |
| - Go to Azure ML Studio | - Go to Azure ML Studio | ||
| Line 409: | Line 409: | ||
| ---- | ---- | ||
| - | ===== ✅ 2. Via Python SDK ===== | + | === ✅ 2. Via Python SDK === |
| - | ==== 🔧 Example: Upload a CSV as a tabular dataset | + | == 🔧 Example: Upload a CSV as a tabular dataset == |
| <code -> | <code -> | ||
| Line 442: | Line 442: | ||
| ---- | ---- | ||
| - | ===== ✅ 3. Via Azure CLI ===== | + | === ✅ 3. Via Azure CLI === |
| <code -> | <code -> | ||
| Line 456: | Line 456: | ||
| ---- | ---- | ||
| - | ===== ✅ 4. From Azure Blob Storage or External URL ===== | + | === ✅ 4. From Azure Blob Storage or External URL === |
| If your data is already in Azure Blob: | If your data is already in Azure Blob: | ||
| Line 478: | Line 478: | ||
| ---- | ---- | ||
| - | ===== 🔄 Upload Types ===== | + | === 🔄 Upload Types === |
| ^ Type ^ Use Case ^ | ^ Type ^ Use Case ^ | ||
| Line 508: | Line 508: | ||
| ---- | ---- | ||
| - | ===== ✅ Goal ===== | + | === ✅ Goal === |
| You want to **input a character name** and retrieve **associated details** from your dataset. This suggests: | You want to **input a character name** and retrieve **associated details** from your dataset. This suggests: | ||
| Line 517: | Line 517: | ||
| ---- | ---- | ||
| - | ===== ✅ Step-by-Step Plan (Using Azure ML + OpenAI or Embeddings) | + | === ✅ Step-by-Step Plan (Using Azure ML + OpenAI or Embeddings) === |
| - | ==== Step 1: Understand Your Data ==== | + | == Step 1: Understand Your Data == |
| I'll examine your dataset. | I'll examine your dataset. | ||
| Line 535: | Line 535: | ||
| ---- | ---- | ||
| - | ===== ✅ ML-Based Query System Options | + | === ✅ ML-Based Query System Options === |
| Since you're aiming to **query by character** and get full details, you have two main paths: | Since you're aiming to **query by character** and get full details, you have two main paths: | ||
| Line 541: | Line 541: | ||
| ---- | ---- | ||
| - | ==== 🔹 Option 1: Traditional Filter-Based Query ==== | + | == 🔹 Option 1: Traditional Filter-Based Query == |
| No ML needed — just build a UI or API to filter rows where '' | No ML needed — just build a UI or API to filter rows where '' | ||
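The filter-based option can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the column names (`character`, `item`, `price`) and the sample rows are hypothetical, since the real dataset's schema is not shown here. The lookup ignores case so that "Batman" and "batman" both match.

```python
import csv
import io

# Hypothetical sample data; the real dataset's columns are not shown in the source.
SAMPLE_CSV = """character,item,price
Batman,Action Figure,25.00
Superman,Poster,12.50
batman,Comic Book,8.75
"""


def find_by_character(rows, name):
    """Return every row whose 'character' field equals name, ignoring case."""
    wanted = name.strip().lower()
    return [row for row in rows if row["character"].strip().lower() == wanted]


# Parse the CSV text into a list of dicts, one per row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Look up all entries for one character and print their details.
matches = find_by_character(rows, "Batman")
for row in matches:
    print(row["item"], row["price"])
```

For a real file, the `io.StringIO` wrapper would be replaced by an opened CSV file; the filter itself would stay the same.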
| Line 549: | Line 549: | ||
| ---- | ---- | ||
| - | ==== 🔹 Option 2: AI-Powered Search (Semantic or Fuzzy Matching) | + | == 🔹 Option 2: AI-Powered Search (Semantic or Fuzzy Matching) == |
| Use **OpenAI embeddings** + **Azure ML** to let the user type: | Use **OpenAI embeddings** + **Azure ML** to let the user type: | ||
| Line 557: | Line 557: | ||
| **Steps:** | **Steps:** | ||
| - | === Step 1: Generate Embeddings for Each Row === | + | = Step 1: Generate Embeddings for Each Row = |
| Use OpenAI' | Use OpenAI' | ||
| Line 571: | Line 571: | ||
| ---- | ---- | ||
| - | ===== ✅ Recommendation | + | === ✅ Recommendation === |
| Would you prefer: | Would you prefer: | ||
| Line 581: | Line 581: | ||
| I can generate code or deploy scripts for either path. Which would you like? | I can generate code or deploy scripts for either path. Which would you like? | ||
| - | == You said: == | + | = You said: = |
| I would like to use ML for this. Do you have suggestions for examples that would be more suited to using ML with this dataset? | I would like to use ML for this. Do you have suggestions for examples that would be more suited to using ML with this dataset? | ||
| Line 593: | Line 593: | ||
| ---- | ---- | ||
| - | ===== ✅ 1. Price Estimation (Regression) | + | === ✅ 1. Price Estimation (Regression) === |
| - | ==== 📌 Problem: | + | == 📌 Problem: == |
| Predict '' | Predict '' | ||
| - | ==== ✅ Why it fits: ==== | + | == ✅ Why it fits: == |
| * You have labeled numeric data. | * You have labeled numeric data. | ||
| * It's useful for appraising unlisted items. | * It's useful for appraising unlisted items. | ||
| - | ==== 🔧 ML Type: ==== | + | == 🔧 ML Type: == |
| * **Regression** (e.g., Linear Regression, XGBoost, AutoML) | * **Regression** (e.g., Linear Regression, XGBoost, AutoML) | ||
| Line 610: | Line 610: | ||
| ---- | ---- | ||
| - | ===== ✅ 2. Category Prediction (Classification) | + | === ✅ 2. Category Prediction (Classification) === |
| - | ==== 📌 Problem: | + | == 📌 Problem: == |
| Predict the '' | Predict the '' | ||
| - | ==== ✅ Why it fits: ==== | + | == ✅ Why it fits: == |
| * Helps auto-label new entries. | * Helps auto-label new entries. | ||
| * Enables predictive organization/ | * Enables predictive organization/ | ||
| - | ==== 🔧 ML Type: ==== | + | == 🔧 ML Type: == |
| * **Classification** | * **Classification** | ||
| Line 627: | Line 627: | ||
| ---- | ---- | ||
| - | ===== ✅ 3. Clustering Characters by Similarity (Unsupervised) | + | === ✅ 3. Clustering Characters by Similarity (Unsupervised) === |
| - | ==== 📌 Problem: | + | == 📌 Problem: == |
| Group similar characters or figures automatically to find patterns or suggest groupings. | Group similar characters or figures automatically to find patterns or suggest groupings. | ||
| - | ==== ✅ Why it fits: ==== | + | == ✅ Why it fits: == |
| * Explores hidden structure. | * Explores hidden structure. | ||
| * Could be visualized or used to build recommendation systems. | * Could be visualized or used to build recommendation systems. | ||
| - | ==== 🔧 ML Type: ==== | + | == 🔧 ML Type: == |
| * **Clustering (e.g., KMeans)** using embeddings of text descriptions. | * **Clustering (e.g., KMeans)** using embeddings of text descriptions. | ||
| Line 644: | Line 644: | ||
| ---- | ---- | ||
| - | ===== ✅ 4. Anomaly Detection | + | === ✅ 4. Anomaly Detection === |
| - | ==== 📌 Problem: | + | == 📌 Problem: == |
| Find collectibles that are priced much higher/ | Find collectibles that are priced much higher/ | ||
| - | ==== ✅ Why it fits: ==== | + | == ✅ Why it fits: == |
| * Helps find mispriced entries or rare items. | * Helps find mispriced entries or rare items. | ||
| - | ==== 🔧 ML Type: ==== | + | == 🔧 ML Type: == |
| * **Anomaly Detection (Isolation Forest, One-Class SVM)** | * **Anomaly Detection (Isolation Forest, One-Class SVM)** | ||
| Line 660: | Line 660: | ||
| ---- | ---- | ||
| - | ===== ✅ 5. Embedding-Based Semantic Search (LLM-enhanced ML) ===== | + | === ✅ 5. Embedding-Based Semantic Search (LLM-enhanced ML) === |
| - | ==== 📌 Problem: | + | == 📌 Problem: == |
| Let users search using natural language and get intelligent matches. | Let users search using natural language and get intelligent matches. | ||
| - | ==== ✅ Why it fits: ==== | + | == ✅ Why it fits: == |
| * Transforms character/ | * Transforms character/ | ||
| * Allows vector similarity search over the dataset. | * Allows vector similarity search over the dataset. | ||
| - | ==== 🔧 Tools: | + | == 🔧 Tools: == |
| * Azure OpenAI (for embeddings) | * Azure OpenAI (for embeddings) | ||
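Vector similarity search can be illustrated without any external service. The sketch below assumes embeddings have already been produced elsewhere (for example by an embeddings API) and uses small made-up three-dimensional vectors in their place; the labels and numbers in `TOY_INDEX` are hypothetical. It ranks stored entries by cosine similarity to a query vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, indexed, k=2):
    """Return the labels of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), label) for label, vec in indexed.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]


# Toy stand-ins for real embeddings; real vectors typically have many more dimensions.
TOY_INDEX = {
    "caped hero figure": [0.9, 0.1, 0.0],
    "villain poster": [0.1, 0.9, 0.0],
    "hero comic book": [0.8, 0.2, 0.1],
}

results = top_matches([1.0, 0.0, 0.0], TOY_INDEX, k=2)
print(results)
```

A linear scan like this is fine for small datasets; larger ones usually call for a dedicated vector index.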
| Line 679: | Line 679: | ||
| ---- | ---- | ||
| - | ===== 🧠 Suggestion | + | === 🧠 Suggestion === |
| To get started with a real ML training task: | To get started with a real ML training task: | ||
| - | ==== ▶️ Start with Price Estimation using Regression: | + | == ▶️ Start with Price Estimation using Regression: == |
| * **Input**: '' | * **Input**: '' | ||
| Line 690: | Line 690: | ||
| Would you like me to generate a full training script (e.g., in Python with scikit-learn or in Azure ML SDK) for this? | Would you like me to generate a full training script (e.g., in Python with scikit-learn or in Azure ML SDK) for this? | ||
| - | == You said: == | + | = You said: = |
| Yes | Yes | ||
| Line 698: | Line 698: | ||
| Your regression model has been successfully trained to predict '' | Your regression model has been successfully trained to predict '' | ||
| - | ==== ✅ Root Mean Squared Error (RMSE): ~30.15 | + | == ✅ Root Mean Squared Error (RMSE): ~30.15 == |
| This means that, on average, the model' | This means that, on average, the model' | ||
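To make the metric concrete, here is a minimal sketch of how RMSE is computed, using made-up numbers rather than the values from the training run above. Each error is squared before averaging, so large misses count more heavily than they would in a plain average of absolute errors.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical true values and model predictions, for illustration only.
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]

score = rmse(actual, predicted)
print(round(score, 2))
```

The result is in the same units as the target column, which is what makes it easy to compare against typical values in the data.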
| Line 704: | Line 704: | ||
| ---- | ---- | ||
| - | ===== 🧪 Next Steps You Can Take ===== | + | === 🧪 Next Steps You Can Take === |
| - | ==== 🔧 Improvements | + | == 🔧 Improvements == |
| * Use more advanced models (e.g., **XGBoost**, | * Use more advanced models (e.g., **XGBoost**, | ||