Supervised learning is a machine learning technique where the model is trained on labeled data. This means that each training example is paired with an output label. One of the most common algorithms used in supervised learning is Linear Regression. Let's look at an example of how Linear Regression can be used to predict house prices.
Problem Statement
We want to predict the price of a house based on its size (in square feet). We have a dataset with historical data of house sizes and their corresponding prices.
Sample Data
Here's a small sample of the dataset:
House Size (sq ft) | Price ($) |
---|---|
850 | 200,000 |
900 | 220,000 |
1000 | 240,000 |
1200 | 280,000 |
1500 | 320,000 |
Step-by-Step Solution
- Load the Data: Import the necessary libraries and load the dataset.
- Visualize the Data: Plot the data to understand the relationship between house size and price.
- Train the Model: Use Linear Regression to train a model on the data.
- Make Predictions: Use the trained model to predict prices for new house sizes.
Code Example
Here is a Python code example using the scikit-learn
library:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Sample data
X = np.array([850, 900, 1000, 1200, 1500]).reshape(-1, 1)
y = np.array([200000, 220000, 240000, 280000, 320000])
# Create and train the model
model = LinearRegression()
model.fit(X, y)
# Make predictions
new_house_sizes = np.array([1100, 1300, 1600]).reshape(-1, 1)
predicted_prices = model.predict(new_house_sizes)
# Plot the data and the regression line
plt.scatter(X, y, color='blue', label='Actual Prices')
plt.plot(X, model.predict(X), color='red', label='Regression Line')
plt.scatter(new_house_sizes, predicted_prices, color='green', label='Predicted Prices')
plt.xlabel('House Size (sq ft)')
plt.ylabel('Price ($)')
plt.legend()
plt.show()
Explanation
- Load the Data: We create arrays for the house sizes (X) and their corresponding prices (y).
- Train the Model: We create an instance of
LinearRegression
and fit it to our data. - Make Predictions: We predict prices for new house sizes (1100, 1300, 1600 sq ft).
- Visualize the Data: We plot the original data points, the regression line, and the predicted prices.
Result
The plot generated by the above code will show:
- Blue dots representing the actual prices from the training data.
- A red line representing the linear regression line fit to the training data.
- Green dots representing the predicted prices for the new house sizes.
Conclusion
This simple example demonstrates how supervised learning, specifically Linear Regression, can be used to predict house prices based on the size of the house. By training the model on historical data, we can make predictions on new data, which is the core idea behind supervised learning.
0 Comments