Giving computers the ability to learn without being programmed
The three types of machine learning
Supervised Learning
Learning from labeled data so the model can make predictions on unseen or future data
Regression
- Predicted value: Predicting a continuous numeric value (an infinite number of outcomes possible)
- Example: Predicting house prices
Linear Regression
Classification
- Predicting the outcome (Y) based on the input (X)
- Predicted value: Discrete categorical values
- Limited number of outcomes possible
- Example: Breast cancer detection
Unsupervised Learning
- Finding interesting patterns in unlabeled data
- It’s not about finding the correct answer for every input, but rather about the algorithm discovering on its own what patterns or structure the data might contain
Clustering
Grouping similar data points together
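As a concrete illustration of clustering, the sketch below runs a minimal 1-D k-means (a common clustering algorithm) on hypothetical data; the function name, the data, and the naive initialization are all assumptions for illustration:

```python
# Minimal 1-D k-means sketch: alternate between assigning points to their
# nearest centroid and moving each centroid to the mean of its cluster.
# The data points below are hypothetical.

def kmeans_1d(points, k=2, iterations=10):
    centroids = points[:k]  # naive initialization: first k points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Two obvious groups (near 1 and near 9); the centroids settle close to them
print(kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5]))
```

Note that k-means never sees any labels: the grouping emerges from the data alone, which is what makes it unsupervised.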
The first key step in implementing linear regression:
- Define a Cost Function
\(\begin{align} J(w,b)=\frac{1}{2m}\displaystyle\sum_{i=1}^{m}(f_{w,b}(x^{(i)})-y^{(i)})^2 \end{align}\)
w and b are parameters of the model, adjusted as the model learns from the data. They’re also referred to as “coefficients” or “weights”
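The cost function above can be sketched in plain Python. Here `f_wb(x) = w*x + b` is the linear model, and the training data (sizes and prices) is hypothetical:

```python
# Squared-error cost for linear regression:
# J(w, b) = (1 / 2m) * sum_i (w*x_i + b - y_i)^2

def compute_cost(x, y, w, b):
    m = len(x)
    total = 0.0
    for x_i, y_i in zip(x, y):
        f_wb = w * x_i + b           # model prediction for this example
        total += (f_wb - y_i) ** 2   # squared error
    return total / (2 * m)

x_train = [1.0, 2.0, 3.0]        # e.g. house sizes (hypothetical)
y_train = [300.0, 500.0, 700.0]  # e.g. prices (hypothetical)

print(compute_cost(x_train, y_train, w=200.0, b=100.0))  # fits exactly → 0.0
```

A lower value of J(w, b) means the line fits the training data more closely; gradient descent (covered below) searches for the w and b that minimize it.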
How to measure the performance of a classification model
The performance of a classification model is typically measured using metrics such as accuracy, precision, recall, and F1 score
\(\begin{align} Accuracy = \frac{TP+TN}{Total\,number\,of\,predictions} \end{align}\)
The proportion of instances that are correctly predicted across all predictions
\(\begin{align} Precision = \frac{TP}{TP+FP} \end{align}\)
The proportion of true positives among the values predicted as True by the model
\(\begin{align} Recall = \frac{TP}{TP+FN} \end{align}\)
The proportion of actual positives that the model correctly identifies
\(\begin{align} F1\,Score = 2\cdot\frac{Precision \times Recall}{Precision + Recall} \end{align}\)
The harmonic mean of precision and recall
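All four metrics can be computed directly from the confusion-matrix counts (TP, TN, FP, FN). A minimal sketch with hypothetical counts, using the standard definition of F1 as the harmonic mean of precision and recall:

```python
# Classification metrics from raw confusion-matrix counts.

def classification_metrics(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total              # correct over all predictions
    precision = tp / (tp + fp)                # of predicted positives, how many are real
    recall = tp / (tp + fn)                   # of real positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(acc, prec, rec, f1)
```

Precision and recall often trade off against each other (e.g. in breast cancer detection, a high recall avoids missed cancers at the cost of more false alarms), which is why F1 combines the two into one number.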
Gradient descent
- A basic technique widely used in machine learning, including in advanced neural network models.
- An algorithm that finds the minimum of a cost function by repeatedly adjusting the weight and bias (parameters) until the cost reaches a minimum.
- If the cost function is not bowl-shaped or hammock-shaped, there may be more than one possible minimum.
- Finding the global or local minimum of the function.
- Depending on the starting point and the shape of the cost function, you may end up in different local minima, so the initial conditions matter
- For linear regression, parameters are often set to zero initially
Gradient descent algorithm
\(\begin{aligned} &w = w - \alpha \frac{\partial}{\partial w} J(w, b)\\\\ &b = b - \alpha \frac{\partial}{\partial b} J(w, b)\\\\ &\alpha: \text{Learning rate (step size; } \alpha \text{ always has a positive value})\\\\ &\frac{\partial}{\partial w} J(w, b): \text{Derivative (descent direction)}\\\\ \end{aligned}\)
Both parameters are updated simultaneously, and the update is repeated until convergence
The partial derivative of the cost function with respect to the weight gives the gradient, and the sign of this gradient determines how the weight is adjusted: if the gradient is positive, the weight is reduced to decrease the cost; if the gradient is negative, the weight is increased
Reference:
Coursera - Supervised Machine Learning: Regression and Classification
Raschka, S., & Mirjalili, V. (2017). Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow. Packt Publishing. http://202.62.95.70:8080/jspui/handle/123456789/12650
https://sumniya.tistory.com/26