Machine Learning Lecture Notes Overview
Deploying a machine learning model involves four steps: training the model, exporting it as a serialized file (e.g., a .pkl pickle file), wrapping it in an API with a framework such as Flask or FastAPI, and hosting that API on a platform such as AWS or Firebase for public access. Together, these steps make the model reachable by end-user applications.
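The export and API steps can be sketched with the standard library alone. This is a minimal illustration, not a production setup: the "model" is a hypothetical pair of line parameters, the pickle is kept in memory instead of being written to a .pkl file, and handle_predict_request is an assumed stand-in for the body of a Flask or FastAPI route.

```python
import json
import pickle

# Hypothetical "trained model": parameters of a line y = slope * x + intercept.
trained_model = {"slope": 2.0, "intercept": 1.0}

# Export: serialize the model (these bytes are what a .pkl file would contain).
model_bytes = pickle.dumps(trained_model)

# API start-up: load the serialized model once, before serving requests.
# Only unpickle data from a trusted source.
loaded_model = pickle.loads(model_bytes)


def handle_predict_request(request_body: str) -> str:
    """Stand-in for a web framework route: JSON in, JSON out."""
    payload = json.loads(request_body)
    x = float(payload["x"])
    prediction = loaded_model["slope"] * x + loaded_model["intercept"]
    return json.dumps({"prediction": prediction})


print(handle_predict_request('{"x": 3}'))
```

In a real deployment the handler body would sit inside a route of the chosen framework, and the hosting platform would run that application.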
Activation functions introduce non-linearity into a neural network, which allows it to learn complex patterns; without them, stacked layers would collapse into a single linear transformation. Each neuron applies its activation function to the weighted sum of its inputs to produce its output. Common choices include Sigmoid, ReLU, and Tanh, each of which affects training speed and model performance differently.
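As a small illustration, the three functions named above can be applied to the weighted sum of a single neuron. The weights, inputs, and bias here are made-up values chosen only for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Passes positive values through and clips negatives to 0."""
    return max(0.0, z)


def tanh(z: float) -> float:
    """Squashes z into the range (-1, 1)."""
    return math.tanh(z)


# Illustrative neuron: weighted sum of inputs plus a bias.
weights = [0.5, -1.0]
inputs = [2.0, 1.0]
bias = 0.5
z = sum(w * x for w, x in zip(weights, inputs)) + bias

for name, fn in [("sigmoid", sigmoid), ("relu", relu), ("tanh", tanh)]:
    print(name, fn(z))
```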
NLP tasks face challenges such as language ambiguity, dependence on context, and large volumes of unstructured text. These are mitigated with preprocessing techniques like tokenization, stopword removal, and stemming, with vector representations of text (e.g., embeddings from BERT), and with architectures tailored to specific language tasks, all of which improve accuracy and contextual understanding.
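The preprocessing steps can be sketched as a tiny pipeline. This is a simplified illustration: the stopword list is a made-up handful of words, stemming is omitted, and the vectorization is a plain bag-of-words count, which is far simpler than a contextual model such as BERT.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "on", "and"}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens: list) -> list:
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(tokens: list, vocabulary: list) -> list:
    """Count how often each vocabulary word occurs in the tokens."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


docs = ["The cat sat on the mat.", "The dog sat on the log and the mat."]
cleaned = [remove_stopwords(tokenize(d)) for d in docs]
vocabulary = sorted({t for doc in cleaned for t in doc})
vectors = [bag_of_words(doc, vocabulary) for doc in cleaned]

print(vocabulary)
print(vectors)
```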
PCA reduces dimensionality by transforming correlated features into a set of linearly uncorrelated variables called principal components, ordered by the amount of variance they capture. Keeping only the first few components retains most of the variance in the data while reducing computational cost and filtering out noise, which can improve the efficiency of the model.
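To make the idea concrete, here is a minimal sketch of PCA for two features only, using the closed-form eigen-decomposition of a 2x2 covariance matrix. The function name pca_2d and the sample points are assumptions made for illustration; in practice a numerical library would be used for higher dimensions.

```python
import math


def pca_2d(points):
    """PCA for 2-D data: returns (eigenvalues, first component, scores)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    mid = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    eigenvalues = (mid + radius, mid - radius)

    # Unit eigenvector for the largest eigenvalue (first principal component).
    if b != 0.0:
        v = (b, eigenvalues[0] - a)
    else:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    pc1 = (v[0] / norm, v[1] / norm)
    pc2 = (-pc1[1], pc1[0])  # orthogonal direction

    # Project the centered data onto both components.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1])
              for x, y in centered]
    return eigenvalues, pc1, scores


# Two strongly correlated features (made-up values).
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
eigenvalues, pc1, scores = pca_2d(points)
explained = eigenvalues[0] / (eigenvalues[0] + eigenvalues[1])
print("share of variance kept by the first component:", explained)
```

Keeping only the first score of each point reduces the data from two features to one while discarding only the small share of variance carried by the second component.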
Reinforcement learning differs from both supervised and unsupervised learning: an agent interacts with an environment and learns to maximize cumulative reward through trial and error. Supervised learning, by contrast, relies on labeled data, and unsupervised learning uncovers patterns without guidance. This makes reinforcement learning well suited to dynamic environments.
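The trial-and-error loop can be illustrated with a very small example: an epsilon-greedy agent choosing among three actions. The environment, its fixed rewards, and the function names are all hypothetical, and this is one simple strategy among many, not a general reinforcement learning algorithm.

```python
import random

# Hypothetical environment: three actions with fixed rewards unknown to the agent.
ARM_REWARDS = [1.0, 5.0, 2.0]


def pull(arm: int) -> float:
    """Environment step: return the reward for the chosen action."""
    return ARM_REWARDS[arm]


def run_epsilon_greedy(steps: int = 500, epsilon: float = 0.1, seed: int = 0):
    rng = random.Random(seed)
    estimates = [0.0] * len(ARM_REWARDS)  # agent's current value estimates
    counts = [0] * len(ARM_REWARDS)
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(ARM_REWARDS))  # explore: random action
        else:
            # exploit: action with the highest estimate so far
            arm = max(range(len(ARM_REWARDS)), key=lambda a: estimates[a])
        reward = pull(arm)
        counts[arm] += 1
        # Incremental average of the rewards observed for this action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward


estimates, total_reward = run_epsilon_greedy()
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("action the agent prefers:", best_arm)
```

No labels are given at any point; the agent learns which action is best only from the rewards it receives.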
The training/validation/test split is crucial for assessing how well a model generalizes. The training set is used to fit the model, the validation set is used to tune hyperparameters, and the test set provides an unbiased evaluation of the final model. Keeping the three sets separate helps detect overfitting and gives a realistic estimate of performance on new data.
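A split like this can be sketched in a few lines. The 70/15/15 proportions, the fixed seed, and the function name split_dataset are assumptions for the example; the notes do not prescribe specific ratios.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the items and cut them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything that remains
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before cutting avoids an ordered dataset placing all examples of one kind in a single subset.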
In linear regression, gradient descent minimizes the cost function, here the Mean Squared Error (MSE), by iteratively updating the model parameters (slope and intercept). Each update moves the parameters a small step, scaled by the learning rate, in the direction opposite to the gradient of the error, so the model converges towards the optimal values.
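The update rule can be written out for a line y = slope * x + intercept. This is a minimal sketch: the learning rate, number of epochs, and the noise-free sample data are illustrative choices, not values from the notes.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit slope and intercept by gradient descent on the MSE."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every data point.
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2).
        grad_slope = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2.0 / n) * sum(errors)
        # Step against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With a learning rate that is too large the updates overshoot and diverge; with one that is too small, many more epochs are needed.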
CNNs are predominantly used for image-related tasks, including face recognition, medical diagnostics such as X-ray classification, and image classification more broadly. This makes them valuable in the healthcare, security, and media management sectors.
Overfitting occurs when a model fits the training data too closely, including its noise and outliers, and therefore generalizes poorly to new data. It can be mitigated through techniques such as cross-validation, simpler models, regularization, and an adequate split between training, validation, and test datasets.
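Of the mitigation techniques listed, regularization is the easiest to show in a few lines. The sketch below uses an L2 (ridge) penalty on a one-parameter model without an intercept, a deliberately simplified setting chosen for illustration; the data and the penalty value are made up.

```python
def ridge_slope(xs, ys, penalty):
    """Slope w that minimizes sum((y - w*x)^2) + penalty * w^2.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + penalty).
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


# Made-up noisy data.
xs = [1.0, 2.0, 3.0]
ys = [2.5, 3.5, 6.5]

unregularized = ridge_slope(xs, ys, penalty=0.0)
regularized = ridge_slope(xs, ys, penalty=14.0)
print(unregularized, regularized)
```

The penalty shrinks the slope towards zero, trading a slightly worse fit on the training data for a model that reacts less to noise.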
Supervised learning requires labeled data, that is, inputs paired with known outputs, and focuses on predicting outcomes, as in regression or classification. In contrast, unsupervised learning works with input data only and aims to find patterns or structure, using techniques such as K-Means clustering and PCA.
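To illustrate the unsupervised side, here is a minimal K-Means sketch on one-dimensional data. Note that no labels are passed in. The function name kmeans_1d, the sample points, and the choice of the first two points as starting centroids are assumptions for the example.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Made-up unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = kmeans_1d(points, centroids=points[:2])
print(centroids)
print(clusters)
```

A supervised method would instead receive a target value for every point and learn to predict it.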