# 🧑‍💻 Feature Scaling Made Simple

If you're new to machine learning, you'll often hear the term "feature scaling." Don't worry, it's not as scary as it sounds! Let's break it down step by step.

## 🌱 What is Feature Scaling?

👉 Feature scaling is the process of bringing all features (columns of data) to a similar range so that no one feature dominates unfairly.

## ⚙️ Why Do We Need It?

- Imagine you're comparing the height of people (in centimeters) and their income (in dollars).
- Heights might range from 150–200, while incomes could range from 10,000–100,000.
- Because the numbers are on very different scales, some algorithms might think income is way more important than height, just because the values are bigger.
- Many machine learning algorithms (like K‑Nearest Neighbors, Gradient Descent, Neural Networks) calculate distances or gradients. If features are on different scales, the results can be misleading.
- Scaling makes training faster and improves accuracy.
- It ensures a fair comparison between features.

## 📏 Common Methods of Feature Scaling

- Min‑Max Scaling: rescales values to a fixed range, usually 0–1.
- Standardization: rescales values to mean 0 and standard deviation 1.
- Robust Scaling: scales using statistics that tolerate outliers, such as the median and interquartile range.

## 🧩 Simple Example

Let's say we have two features:

- Height: [150, 160, 170, 180, 190]
- Income: [20,000, 50,000, 100,000, 200,000, 500,000]

After Min‑Max Scaling (0–1):

- Height → [0, 0.25, 0.5, 0.75, 1]
- Income → [0, 0.06, 0.17, 0.38, 1]

Now both features are on the same scale, making them easier to compare.
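If you want to check those numbers yourself, here is a minimal sketch in plain Python (not part of the original example) that applies the Min‑Max formula, (x − min) / (max − min), to each list:

```python
# Minimal sketch: Min-Max scaling by hand, rounded to two decimals
heights = [150, 160, 170, 180, 190]
incomes = [20_000, 50_000, 100_000, 200_000, 500_000]

def min_max(values):
    # (x - min) / (max - min) maps the smallest value to 0, the largest to 1
    lo, hi = min(values), max(values)
    return [round((x - lo) / (hi - lo), 2) for x in values]

print(min_max(heights))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(min_max(incomes))  # [0.0, 0.06, 0.17, 0.38, 1.0]
```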
## 📊 Feature Scaling in Python

Make sure you have scikit-learn installed:

```bash
pip install scikit-learn
```
```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Example data: Height (cm) and Income ($)
data = np.array([
    [150, 20000],
    [160, 50000],
    [170, 100000],
    [180, 200000],
    [190, 500000],
])

print("Original Data:\n", data)

# 🔹 Min-Max Scaling (0–1 range)
minmax_scaler = MinMaxScaler()
data_minmax = minmax_scaler.fit_transform(data)
print("\nMin-Max Scaled Data:\n", data_minmax)

# 🔹 Standardization (mean=0, std=1)
standard_scaler = StandardScaler()
data_standard = standard_scaler.fit_transform(data)
print("\nStandardized Data:\n", data_standard)
```

🧑‍💻 What this code does:

- Creates a small dataset with Height and Income.
- Applies Min‑Max Scaling → values between 0 and 1.
- Applies Standardization → values centered around 0 with unit variance.
- Prints the results so you can see the difference.
print("\nStandardized Data:\n", data_standard) - Imagine you’re comparing the height of people (in centimeters) and their income (in dollars).
- Heights might range from 150–200, while incomes could range from 10,000–100,000.
- Because the numbers are on very different scales, some algorithms might think income is way more important than height—just because the values are bigger. - Many machine learning algorithms (like K‑Nearest Neighbors, Gradient Descent, Neural Networks) calculate distances or gradients. If features are on different scales, results can be misleading.
- Scaling makes training faster and improves accuracy.
- It ensures fair comparison between features. - Height: [150, 160, 170, 180, 190]
- Income: [20,000, 50,000, 100,000, 200,000, 500,000] - Height → [0, 0.25, 0.5, 0.75, 1]
- Income → [0, 0.06, 0.16, 0.36, 1] - Creates a small dataset with Height and Income.
- Applies Min‑Max Scaling → values between 0 and 1.
- Applies Standardization → values centered around 0 with unit variance.
- Prints results so you can see the difference - Feature scaling = putting features on similar ranges.
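One more practical habit worth learning early: in a real project, fit the scaler on your training data only, then reuse the fitted scaler to transform new data, so new rows are scaled with the same min and max. A minimal sketch, using the same Height/Income data plus a made-up new row:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([
    [150, 20000],
    [160, 50000],
    [170, 100000],
    [180, 200000],
    [190, 500000],
])
new_person = np.array([[175, 150000]])  # hypothetical unseen data point

scaler = MinMaxScaler()
scaler.fit(train)  # learn min/max from the training data only

print(scaler.transform(train))       # same result as fit_transform(train)
print(scaler.transform(new_person))  # scaled with the training min/max
```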
## 🚀 Key Takeaways

- Feature scaling = putting features on similar ranges.
- It prevents one feature from dominating just because of larger numbers.
- Use Min‑Max for bounded data, Standardization for roughly normal distributions, and Robust Scaling for data with outliers (see the sketch below).
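The last takeaway mentions Robust Scaling, which the walkthrough above doesn't demonstrate. Here is a minimal sketch with scikit-learn's `RobustScaler` and a made-up income column: it centers on the median and scales by the interquartile range, so one extreme value doesn't squash the rest.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical income column with one extreme outlier at the end
incomes = np.array([[20000], [50000], [100000], [200000], [5000000]])

robust = RobustScaler()  # centers on the median, scales by the IQR
print(robust.fit_transform(incomes))
# The ordinary incomes land in a small range; only the outlier sits far away.
```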
## ✨ Final Note

Think of feature scaling like adjusting the volume levels in a song. If one instrument is too loud, you won't hear the others properly. Scaling balances the "volume" of your features so the algorithm hears them all clearly.

Tags: how-to, tutorial, guide, dev.to, ai, machine learning, neural network, network, python