Machine Learning Diagnostics

Large/Small Neural Networks

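When a large amount of training data is available, larger networks usually perform better. With little data, a small network is often sufficient and less prone to overfitting.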

Applying Machine Learning

Below are the steps to apply machine learning to a business problem:

1-Collect DataSet

2-Prepare DataSet

3-Select & Design Model

4-Train Model

5-Evaluate Model

6-Test Model

Train/Dev/Test Sets

Training set: a set of examples used to train an ML model.

Dev (validation or cross-validation) set: a set of examples used to tune the hyper-parameters of an ML model.

Test set: a set of examples used to evaluate how well the model performs on data outside the training and dev sets.

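For deep learning algorithms, which are typically trained on very large datasets, we no longer need to follow the common rules of thumb of a (70%, 30%) ratio for Train and Test sets or a (70%, 15%, 15%) ratio for Train, Dev and Test sets. With millions of examples, even a small percentage contains enough data for a reliable evaluation, so splitting the data into 98% as training set and 2% as test set could be an acceptable option.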
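As a minimal sketch of such a split, the snippet below shuffles a dataset and cuts it into train, dev and test parts. The function name `split_dataset`, the 98% / 1% / 1% ratio, and the fixed seed are assumptions made for illustration; the post itself only mentions 98% / 2% as one acceptable option.

```python
import random


def split_dataset(examples, dev_fraction=0.01, test_fraction=0.01, seed=0):
    """Shuffle examples and split them into (train, dev, test) lists.

    The fractions are illustrative defaults, not a recommendation.
    """
    if dev_fraction < 0 or test_fraction < 0 or dev_fraction + test_fraction >= 1:
        raise ValueError("dev_fraction and test_fraction must be >= 0 and sum to < 1")

    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_dev = round(n * dev_fraction)
    n_test = round(n * test_fraction)
    n_train = n - n_dev - n_test

    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


# Hypothetical dataset of 100,000 examples, represented here by their indices.
train, dev, test = split_dataset(range(100_000))
print(len(train), len(dev), len(test))  # 98000 1000 1000
```

Shuffling before splitting assumes the examples are independent. For time-ordered data, or data grouped by user, a different splitting strategy would likely be needed.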

Error Analysis

Let us first define the following acronyms:

TP: True positive

FP: False positive

TN: True negative

FN: False negative


Accuracy

(TP + TN) / (TP + TN + FP + FN)

Precision

TP / (TP + FP)

Recall

TP / (TP + FN)

F1 score (often a better single error metric than accuracy, especially on skewed data)

2 * (Precision * Recall) / (Precision + Recall)

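Accuracy alone is a bad measure when data is skewed: a model that always predicts the majority class can reach a high accuracy while never detecting a single example of the minority class.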
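To make the formulas above concrete, here is a small sketch that computes accuracy, precision, recall and F1 score from the four counts. The function name `classification_metrics` and the example counts are hypothetical. Returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Any ratio with a zero denominator is reported as 0.0 (an assumed convention).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical skewed dataset: 990 negative examples and 10 positive ones.
# A classifier that always predicts "negative" gets every positive wrong.
always_negative = classification_metrics(tp=0, fp=0, tn=990, fn=10)
print(always_negative["accuracy"])  # 0.99
print(always_negative["recall"])    # 0.0
print(always_negative["f1"])        # 0.0
```

In this made-up case the accuracy is 99% even though the classifier finds none of the positive examples, which is what the recall and F1 score of 0 expose.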
