
Machine Learning Techniques and Training

  

            






Machine Learning is a broad field that we can split into three categories: Supervised Learning, Unsupervised Learning, and Reinforcement Learning. Each category is suited to different kinds of tasks.



Supervised Learning applies when the dataset contains class labels and we use those labels to build the model. In other words, the data we receive comes with labels that say what each record represents. In a previous example, we had a table with columns such as age or sex, along with a label recording whether the heart failed.


With Unsupervised Learning, we don't have class labels, so we must discover structure or groupings in unlabeled data. This could involve, for example, a deep learning model looking at unlabeled pictures. A common technique here is clustering, which groups similar data points together. Reinforcement Learning is a different category: it uses a reward function to reward good actions and penalize bad ones.

Breaking down Supervised Learning, we can split it into three categories: Regression, Classification, and Neural Networks. Regression models are built by looking at the relationship between the features x and the result y, where y is a continuous variable.


Essentially, Regression estimates continuous values. Neural Networks are models whose structure is loosely inspired by the human brain, and they can be used for both regression and classification. Classification, on the other hand, predicts discrete values: it assigns a discrete class label y based on many input features x. In a previous example, given a set of features x such as beats per minute, body mass index, age, and sex, the algorithm classifies the output y into one of two categories, True or False, predicting whether the heart will fail or not. Other Classification models can sort results into more than two categories.
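To make the difference between a continuous and a discrete output concrete, here is a minimal Python sketch. The data, the risk score, and the threshold are all made up for illustration, and the "classifier" simply thresholds the regression output, which is an assumption chosen for brevity rather than a recommended method.

```python
# Hypothetical toy data: beats per minute and a made-up continuous risk score.
bpm = [60.0, 70.0, 80.0, 90.0, 100.0]
risk_score = [0.1, 0.2, 0.3, 0.4, 0.5]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def predict_value(model, x):
    """Regression: the output is a continuous number."""
    slope, intercept = model
    return slope * x + intercept


def predict_label(model, x, threshold):
    """Classification: the output is a discrete label (True or False)."""
    return predict_value(model, x) >= threshold


model = fit_line(bpm, risk_score)
```

Calling `predict_value(model, 85.0)` returns a number on a continuous scale, while `predict_label(model, 85.0, 0.3)` returns only True or False.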


For example, a model might predict whether a recipe is for an Indian, Chinese, Japanese, or Thai dish. Common classification algorithms include decision trees, support vector machines, logistic regression, and random forests. With Classification, we extract features from the data. In the heart failure example, the features would be values such as beats per minute or age. Features are distinctive properties of the input that help determine the output category or class. In a table, each column is a feature and each row is a data point. Classification is the process of predicting the class of given data points, and the classifier uses training data to learn how the input variables relate to that class.


What exactly do we mean by training? Training refers to using a learning algorithm to determine the parameters of your model. There are many algorithms for this, but in layman's terms it works as follows. If you are training a model to predict whether the heart will fail or not (True or False), you show the algorithm some real-life data labeled True, then some data labeled False, and you keep repeating this process with labeled examples, where each label records whether the heart actually failed.
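As a rough illustration of this repeated show-and-adjust process, the following Python sketch trains a tiny logistic regression model with stochastic gradient descent. Logistic regression is just one possible choice of algorithm, and the feature values (beats per minute and age, both divided by 100), the labels, and the hyperparameters are hypothetical.

```python
import math

# Hypothetical, pre-scaled features: [beats per minute / 100, age / 100].
features = [
    [0.60, 0.30], [0.65, 0.40], [0.70, 0.35],  # labeled False (0)
    [1.10, 0.70], [1.20, 0.65], [1.15, 0.75],  # labeled True (1)
]
labels = [0, 0, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=2000, learning_rate=0.1):
    """Repeatedly show labeled examples and nudge the parameters after each one."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            predicted = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = predicted - y  # positive if the prediction was too high
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(model, x):
    """Return True if the model estimates a probability of at least 0.5."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5


model = train(features, labels)
```

The `weights` and `bias` are the model's parameters: they start at zero and are adjusted a little after every example the algorithm sees.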


The algorithm modifies its internal values until it has learned to tell data that indicates heart failure (True) from data that does not (False). With Machine Learning, we typically take a dataset and split it into three sets: Training, Validation, and Test. The Training set is the data used to train the algorithm. The Validation set is used to check our results and fine-tune the algorithm's parameters. The Test set is data the model has never seen before, and it is used to evaluate how good the model is. We can then describe how good the model is using metrics such as accuracy, precision, and recall.
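The split and the three metrics mentioned above can be sketched in a few lines of Python. The 60/20/20 proportions, the function names, and the fixed random seed are assumptions for illustration; in practice the proportions vary and a library would normally handle this.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the rows, then cut them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything left over
    return train, val, test


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, and recall for True/False labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Accuracy is the share of all predictions that were correct, precision is the share of predicted True cases that really were True, and recall is the share of actual True cases that the model found.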



Avinash C. Pillai

Technology Director

syniverse® 

The world’s most connected company™ 


