Having Fun with ML


Machine Learning Notes

This page collects my personal notes, suggestions, and small projects in machine learning. I record here anything I do in the ML area, such as useful notes, practice exercises, small projects, or publications, instead of keeping them in a local text file. If you (as a reader) find that the information is not useful, just skip the page. It is not meant to be a reference page.

Useful Links, Videos, and Resources for Quick Learning

  • If you are a complete beginner in this field, I suggest you start with this YouTube channel. In particular, you can start with the DeepLearning Simplified series of videos.
  • If you want to cover deep learning from beginner to advanced level, I suggest you start with the Deep Learning Specialization on Coursera.


Some quick notes

  • Deep learning is built around the neural network (NN).
  • An NN consists of structured layers: an input layer, an output layer, and hidden layers in between.
  • The activation of a node depends on the weights of its incoming edges and on its bias. Training the network means finding values of the weights (w) and biases that make the output as accurate as possible.
  • The process of improving the accuracy of a neural network is called training.
  • To train the NN, the output of forward propagation is compared to the output that is known to be correct. The cost is the difference between the two.
  • NNs are suitable when the pattern gets complex.
  • Pattern = Simple ==>> Support Vector Machine (SVM) classification is usually enough.
  • Pattern = Moderate ==>> Deep learning tends to outperform other methods.
  • Pattern = Complex ==>> A deep network is often the only practical choice.
  • Deep nets can break complex patterns down into simpler patterns.
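
To make the notes on forward propagation and cost concrete for myself, here is a minimal sketch in plain Python. It assumes a tiny network with two inputs, one hidden layer of two nodes, a sigmoid activation, and a squared-error cost. All weights, biases, and function names are made-up illustration values, not taken from any course or library.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward propagation through one hidden layer and a single output node."""
    # Each hidden node: weighted sum of the inputs plus its bias, then activation.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    # The output node does the same with the hidden activations.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

def cost(predicted, expected):
    """Squared-error cost: how far the prediction is from the known answer."""
    return (predicted - expected) ** 2

# Hypothetical starting values; training would adjust these.
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
hidden_biases = [0.1, -0.2]
output_weights = [1.2, -0.7]
output_bias = 0.05

prediction = forward([1.0, 0.0], hidden_weights, hidden_biases, output_weights, output_bias)
error = cost(prediction, expected=1.0)
print(f"prediction={prediction:.3f}, cost={error:.3f}")
```

Training would repeat this forward pass many times and nudge the weights and biases so that the cost goes down.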

How to choose a Deep Net?

If we have labeled data:

DBN = Deep belief net
RNTN = Recursive neural tensor network.

If we have unlabeled data:

RBM = Restricted Boltzmann machine

  • Backpropagation: the process used to train a neural network. During training, you may run into a problem called the vanishing gradient or the exploding gradient.
  • When you train a neural net, you continuously calculate a cost value. The cost is then used to tune the weights and biases.
  • Gradient: the rate at which the cost changes with respect to a weight or bias. When the gradient is large, the network trains quickly, and when the gradient is small, the network trains slowly.
  • DBN = needs only a small labeled dataset for training, which is very important for real-world applications.
  • CNN = Convolutional neural network.
  • ReLU = Rectified Linear Unit; it allows the network to be trained without a harmful slowdown in the crucial early layers.
  • A typical deep CNN has three types of layers, (1) convolution, (2) ReLU, and (3) pooling, and they are repeated several times.
  • RNN = Recurrent Neural Network = used when the patterns in the data change over time.
  • Feedforward Neural Network = signals are fed in one direction, from input to output.
  • In an RNN, the output of the layer is added to the next input and fed back into the same layer, which is typically the only layer in the network.
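
The vanishing gradient and the benefit of ReLU noted above can be illustrated with a small sketch. Backpropagation multiplies local derivatives layer by layer (the chain rule). The sketch below is a deliberate simplification: it assumes every weight is 1 and every layer sees the same pre-activation value, so only the activation derivative is multiplied. The function names are my own.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; it is never larger than 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def gradient_through_layers(local_grad, pre_activation, n_layers):
    """Multiply the same local derivative n_layers times (chain rule, weights fixed at 1)."""
    grad = 1.0
    for _ in range(n_layers):
        grad *= local_grad(pre_activation)
    return grad

for depth in (1, 5, 10):
    sig = gradient_through_layers(sigmoid_grad, 0.5, depth)
    rel = gradient_through_layers(relu_grad, 0.5, depth)
    print(f"depth={depth:2d}  sigmoid chain={sig:.2e}  relu chain={rel:.1f}")
```

With the sigmoid, the product shrinks rapidly as the depth grows, so the early layers receive a tiny gradient and train slowly. With ReLU and a positive input, the product stays at 1 in this simplified setting.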


  • Training an RNN becomes very hard as the network grows in size, and information decays through time.
  • Gating: a method to decide when to forget an input and when to remember it for future time steps.
  • GRU and LSTM are the most important gating methods nowadays.
  • Recursive Neural Tensor Network (RNTN): a neural net useful for natural-language processing. It has a tree structure with a neural net at each node. You can use recursive neural tensor networks for boundary segmentation, to determine which word groups are positive and which are negative. The same applies to sentences as a whole.
  • An RNTN has three basic components: a parent group, which is the root, and two child groups, which are the leaves. Each group is simply a collection of neurons.
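
As a rough illustration of the gating idea mentioned above, here is a sketch of one GRU-style update for a scalar input and a scalar hidden state. It follows one common formulation of the GRU equations. The parameter names and values are hypothetical, and a real GRU works on vectors and matrices rather than single numbers.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """One GRU-style update for a scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate: how much to overwrite
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate: how much old state to use
    candidate = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Blend the old state (remember) with the candidate state (update).
    return (1.0 - z) * h + z * candidate

# Hypothetical parameters; training would learn these.
params = {
    "wz": 0.5, "uz": 0.4, "bz": 0.0,
    "wr": 0.3, "ur": 0.6, "br": 0.0,
    "wh": 0.9, "uh": 0.7, "bh": 0.0,
}

h = 0.0
for x in [1.0, 0.5, -0.3, 0.0]:
    h = gru_step(x, h, params)
    print(f"x={x:+.1f}  h={h:+.4f}")
```

When the update gate is close to 0, the old state passes through almost unchanged, which is how the network can remember an input over many time steps.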

  • As an example of the ML use case, you can see or
  • Deep learning platforms, such as IBM Watson Studio, let you build deep learning applications on top of them.
  • There are two ways to use deep learning: a platform or a library.
  • With a platform, you don’t need to know low-level programming, but you are limited by the platform’s functionality.
  • A deep net library gives more flexibility and options, but you need to know how to code. There are different libraries: commercial-grade ones such as Deeplearning4j, Torch, and Caffe, and educational ones such as Theano.

Activation Functions

Three commonly used activation functions are:

  1. Sigmoid
  2. Tanh: if your data includes values in a negative range, Tanh may be better.
  3. ReLU: the most widely used activation function today, because it is computationally cheaper and the neural network tends to converge faster.
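
For reference, the three functions can be written in a few lines of plain Python. This is only a sketch using the standard math module; the function names are my own.

```python
import math

def sigmoid(z):
    """Output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Output in (-1, 1), so negative inputs give negative outputs."""
    return math.tanh(z)

def relu(z):
    """Zero for negative inputs, identity for positive inputs."""
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.3f}  tanh={tanh(z):+.3f}  relu={relu(z):.1f}")
```

ReLU needs only a comparison, while sigmoid and Tanh need an exponential, which is one reason it is cheaper to compute.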

  • So, it is better to start with ReLU for the input and hidden layers; for the output layer, the choice depends on the task. For linear regression, use a linear output unit, and for a classifier, use softmax or sigmoid.
  • In principle, an NN can fit almost any dataset, but some datasets contain a lot of noise, so a network that fits the training data too closely may also fit the noise.
  • We can divide the dataset into a training set and a test set, so the network is evaluated on data it has not seen during training.
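
Two of the points above can be sketched in plain Python: a softmax output for a classifier, and a simple shuffle-based split into training and test data. The 80/20 ratio, the fixed seed, and the function names are assumptions for illustration. Libraries such as scikit-learn provide their own split utilities.

```python
import math
import random

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

probabilities = softmax([2.0, 1.0, 0.1])
print("class probabilities:", [round(p, 3) for p in probabilities])

train, test = train_test_split(range(10))
print("train:", train)
print("test:", test)
```

The test samples are never used for training, so the accuracy measured on them gives a fairer picture of how the network handles new data.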


Practicing Machine Learning with Real-world Problems

On his website, Jason Brownlee provides useful guidelines and advice on applying ML to real-world problems. Any ML project for problem-solving has five essential steps:

  1. Define the problem
  2. Prepare data
  3. Evaluate algorithms
  4. Improve results
  5. Write up results