Matthews Correlation Coefficient Calculator

Calculate the Matthews Correlation Coefficient from a confusion matrix or binary label arrays, with a full suite of classification metrics.

Matthews Correlation Coefficient (MCC)

Confusion matrix input: enter the counts from your binary classifier's confusion matrix: True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN).

Raw labels input: enter the Actual Labels and the Predicted Labels as binary values separated by commas or spaces. Accepted values: 0/1, yes/no, true/false, positive/negative.

What is the Matthews Correlation Coefficient?

The Matthews Correlation Coefficient (MCC) is a measure of the quality of binary classification models. It was introduced by biochemist Brian Matthews in 1975 to evaluate predictions of protein secondary structure and has since been adopted as a gold-standard metric in machine learning, medical diagnostics, bioinformatics, and any domain where binary classification quality needs to be measured rigorously. The MCC ranges from -1 to +1: a value of +1 represents a perfect classifier, 0 represents a classifier no better than random guessing, and -1 represents a perfectly inverted classifier that always predicts the wrong class.

MCC is used across a wide range of real-world applications. In machine learning, it evaluates models for spam detection, fraud detection, disease diagnosis, and churn prediction. In medicine, it measures the quality of diagnostic tests against ground truth labels (positive/negative for a condition). In bioinformatics, it benchmarks gene expression classifiers and protein structure predictions. In software testing, it assesses defect prediction models. Unlike simpler metrics, MCC is especially valuable when class distributions are highly imbalanced, because it incorporates all four cells of the confusion matrix rather than focusing only on one class or one type of error.

A common misconception is that accuracy is sufficient to evaluate a binary classifier. On a dataset with 99% negative samples, a model that always predicts negative achieves 99% accuracy despite zero predictive ability. Its MCC, however, is 0 (by the zero-denominator convention, since the model never predicts positive), correctly signalling no predictive correlation. Another misconception is that the F1 score is equivalent to MCC. F1 ignores True Negatives entirely, which makes it blind to the model's performance on the negative class. MCC penalises poor performance on either class symmetrically, which generally makes it more informative than either accuracy or F1 for imbalanced problems.

This calculator accepts two input formats: a confusion matrix (TP, FP, FN, TN counts) for when you already have aggregated results, and raw binary label arrays for when you have lists of actual and predicted values. Both formats produce the same nine-metric output: MCC, accuracy, balanced accuracy, precision (PPV), recall (sensitivity), specificity (TNR), negative predictive value (NPV), F1 score, and Cohen's kappa.

MCC Formula

MCC  =  (TP × TN − FP × FN) ÷ √[(TP+FP)(TP+FN)(TN+FP)(TN+FN)]
TP = True Positives (correctly predicted positive cases)
TN = True Negatives (correctly predicted negative cases)
FP = False Positives (negative cases incorrectly predicted as positive)
FN = False Negatives (positive cases incorrectly predicted as negative)
Convention: If any factor in the denominator is 0, MCC = 0
Range: -1 (perfect inverse) to 0 (random) to +1 (perfect classifier)

The numerator TP × TN − FP × FN measures the difference between correct and incorrect predictions in a balanced way across both classes. The denominator normalises this difference by the square root of the product of the four marginal totals of the confusion matrix (predicted positives, actual positives, actual negatives, and predicted negatives), ensuring the result lies in [-1, +1] regardless of class balance or total sample count. The formula is equivalent to the Pearson product-moment correlation coefficient applied to two binary variables (actual and predicted labels coded as 0 and 1).
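The formula and its zero-denominator convention can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own source code; the function name `mcc` and the argument order are assumptions chosen for this example.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    # Product of the four marginal totals of the confusion matrix.
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        # Convention stated above: any zero factor gives MCC = 0.
        return 0.0
    return numerator / math.sqrt(product)


print(round(mcc(90, 10, 5, 95), 4))  # 0.8511
```

Checking the product of the integer marginals before taking the square root avoids a division by zero without any floating-point comparison.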

Related Metrics
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Balanced Accuracy = (Sensitivity + Specificity) / 2
Precision (PPV) = TP / (TP + FP)
Recall (Sensitivity) = TP / (TP + FN)
Specificity (TNR) = TN / (TN + FP)
F1 Score = 2 x Precision x Recall / (Precision + Recall)
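The related metrics listed above translate directly into code. The sketch below is illustrative only: the names `safe_div` and `related_metrics` are hypothetical, and returning 0.0 when a denominator is zero is an assumption made here, not a documented behaviour of this calculator.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def related_metrics(tp, fp, fn, tn):
    """Compute the related metrics listed above from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # sensitivity
    specificity = safe_div(tn, tn + fp)   # true negative rate
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Counts taken from the medical diagnostic example (Example 3) in this document.
for name, value in related_metrics(80, 20, 15, 385).items():
    print(f"{name}: {value:.4f}")
```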

How to Use This Calculator

Using Confusion Matrix Mode and Raw Labels Mode

1. Choose input mode: select Confusion Matrix if you already have TP, TN, FP, FN counts, or Raw Labels to paste actual and predicted binary arrays directly.
2. Enter your data: in Confusion Matrix mode, type the count for each cell. Values must be non-negative integers. In Raw Labels mode, enter labels separated by commas or spaces using 0/1, yes/no, true/false, or positive/negative.
3. Click Calculate MCC: the results panel shows MCC with its interpretation, plus 8 companion metrics: accuracy, balanced accuracy, precision, recall, specificity, NPV, F1, and Cohen's kappa.

Example Calculations

Example 1 -- Balanced Dataset, Good Classifier

TP=90, FP=10, FN=5, TN=95 (200 total samples; 95 actual positives and 105 actual negatives, so the classes are nearly balanced)

1. Numerator: TP × TN − FP × FN = 90 × 95 − 10 × 5 = 8,550 − 50 = 8,500
2. Denominator: sqrt((90+10)(90+5)(95+10)(95+5)) = sqrt(100 × 95 × 105 × 100) = sqrt(99,750,000) ≈ 9,987.5
3. MCC = 8,500 / 9,987.5 ≈ 0.8511 (Strong prediction). Accuracy = 185/200 = 92.5%
MCC = 0.8511 • Accuracy = 92.50% • F1 = 0.9231

Example 2 -- Imbalanced Dataset, Degenerate Classifier

TP=0, FP=0, FN=50, TN=950 (model always predicts negative; 95% of the 1,000 samples are negative)

1. Accuracy = (0 + 950) / 1000 = 95.0%, which appears excellent
2. MCC numerator: 0 × 950 − 0 × 50 = 0. The denominator includes the factor (TP + FP) = 0, so by convention MCC = 0
3. MCC = 0 correctly signals no predictive ability. Accuracy = 95% is misleading. F1 = 0 (also correct). Balanced Accuracy = 50.0%.
MCC = 0 • Accuracy = 95.00% (misleading) • Balanced Accuracy = 50.00%
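The contrast in this example can be reproduced with a short script. It is a sketch under the conventions stated in this document; the helper name `mcc` is an assumption for illustration.

```python
import math


def mcc(tp, fp, fn, tn):
    """MCC with the zero-denominator convention (returns 0.0)."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(product)


# Always-negative classifier on 1,000 samples, 50 of them positive.
tp, fp, fn, tn = 0, 0, 50, 950

accuracy = (tp + tn) / (tp + fp + fn + tn)
sensitivity = tp / (tp + fn)   # 0 of 50 positives found
specificity = tn / (tn + fp)   # all 950 negatives correct
balanced_accuracy = (sensitivity + specificity) / 2

print(f"accuracy          = {accuracy:.2%}")
print(f"balanced accuracy = {balanced_accuracy:.2%}")
print(f"MCC               = {mcc(tp, fp, fn, tn):.4f}")
```

Accuracy rewards the model for the 950 easy negatives, while balanced accuracy and MCC both expose that no positive case was ever detected.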

Example 3 -- Medical Diagnostic Test

Disease screening: TP=80, FP=20, FN=15, TN=385 (500 patients)

1. Numerator: 80 × 385 − 20 × 15 = 30,800 − 300 = 30,500
2. Denominator: sqrt((80+20)(80+15)(385+20)(385+15)) = sqrt(100 × 95 × 405 × 400) = sqrt(1,539,000,000) ≈ 39,230
3. MCC = 30,500 / 39,230 ≈ 0.7775. Sensitivity = 80/95 ≈ 84.2%, Specificity = 385/405 ≈ 95.1%
MCC = 0.7775 • Sensitivity = 84.21% • Specificity = 95.06%

Example 4 -- Negative MCC (Inverted Classifier)

TP=5, FP=90, FN=95, TN=10 (classifier is systematically wrong)

1. Numerator: TP × TN − FP × FN = 5 × 10 − 90 × 95 = 50 − 8,550 = -8,500
2. Denominator: sqrt((5+90)(5+95)(10+90)(10+95)) = sqrt(95 × 100 × 100 × 105) = sqrt(99,750,000) ≈ 9,987.5
3. MCC = -8,500 / 9,987.5 ≈ -0.8511. Accuracy = 15/200 = 7.5%. Flipping every prediction would give MCC = +0.8511.
MCC = -0.8511 (Strong inverse prediction; swap the predicted labels to fix)
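The claim that flipping the predictions reverses the sign of MCC can be checked directly. In the sketch below, `flip_predictions` is a hypothetical helper: inverting every prediction turns each TP into an FN, each FP into a TN, and vice versa.

```python
import math


def mcc(tp, fp, fn, tn):
    """MCC with the zero-denominator convention (returns 0.0)."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(product)


def flip_predictions(tp, fp, fn, tn):
    """Return (tp, fp, fn, tn) after inverting every predicted label."""
    return fn, tn, tp, fp


original = (5, 90, 95, 10)
flipped = flip_predictions(*original)

print(flipped)                   # (95, 10, 5, 90)
print(round(mcc(*original), 4))  # -0.8511
print(round(mcc(*flipped), 4))   # 0.8511
```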

Frequently Asked Questions

What is the Matthews Correlation Coefficient (MCC)?
The Matthews Correlation Coefficient is a single-number summary of a binary classifier's performance ranging from -1 to +1. It was introduced by Brian Matthews in 1975 for protein structure prediction and measures the correlation between actual and predicted binary outcomes across all four confusion matrix cells (TP, TN, FP, FN). A value of +1 means perfect prediction, 0 means no better than random guessing, and -1 means every prediction is wrong.
What is the MCC formula?
MCC = (TP x TN - FP x FN) divided by the square root of (TP + FP)(TP + FN)(TN + FP)(TN + FN). TP = true positives, TN = true negatives, FP = false positives, FN = false negatives. If the denominator equals zero (because any of its four factors is zero), MCC is defined as 0 by convention. This convention handles degenerate cases where the model predicts only one class.
Why is MCC considered better than accuracy for imbalanced datasets?
Accuracy counts all correct predictions equally, which makes it misleadingly high when one class dominates. A model that always predicts the majority class on a 99/1 dataset achieves 99% accuracy with zero predictive ability. Its MCC is 0, which correctly reflects no correlation. MCC uses all four confusion matrix cells and produces a balanced score regardless of how skewed the class distribution is, making it the recommended primary metric for fraud detection, disease screening, rare event prediction, and other imbalanced tasks.
What is a good MCC value for a machine learning model?
As a rule of thumb, MCC values above 0.7 indicate strong predictive performance and are considered good in most applications. Values between 0.5 and 0.7 indicate moderate performance, and values between 0.3 and 0.5 indicate a weak but potentially useful association. Values below 0.3 are generally considered poor, and values near 0 suggest the model has no useful predictive ability. These bands are conventions rather than formal cutoffs, and the acceptable threshold depends on the application: a medical diagnostic may call for an MCC above 0.8, while a fraud detection model might be accepted at 0.5 given the difficulty of the data.
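The bands described in this answer can be expressed as a small lookup. This is a sketch only: the function name `interpret_mcc`, the handling of exact boundary values, and the separate "inverse" label for negative values are assumptions, not the calculator's documented logic.

```python
def interpret_mcc(value):
    """Map an MCC value to the rule-of-thumb bands described above."""
    if value < 0:
        # Assumption: negative values are reported separately as inverse association.
        return "inverse"
    if value > 0.7:
        return "strong"
    if value >= 0.5:
        return "moderate"
    if value >= 0.3:
        return "weak"
    return "poor"


for score in (0.8511, 0.6, 0.4, 0.1, -0.8511):
    print(score, interpret_mcc(score))
```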
What is the difference between MCC and the F1 score?
F1 score is the harmonic mean of Precision and Recall and ignores True Negatives entirely. As a result, F1 can remain high even when the model misclassifies most negatives, particularly when the positive class is the majority. MCC includes all four confusion matrix cells including TN, so it penalises models that perform poorly on either class. On balanced datasets the two metrics often rank models similarly. On imbalanced datasets with a large negative class, MCC provides a more conservative and complete assessment of model quality.
Is MCC the same as the Phi Coefficient?
Yes. The MCC and the Phi Coefficient are mathematically identical. The Phi Coefficient is used in statistics to measure association between two binary variables in a 2x2 contingency table. MCC is the term preferred in machine learning and bioinformatics. Both formulas produce the same number when given the same TP, TN, FP, and FN values. The Phi Coefficient is also numerically equal to the Pearson product-moment correlation coefficient computed on binary (0/1) coded variables.
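The equivalence with the Pearson correlation on 0/1 coded labels can be verified numerically. The sketch below expands a confusion matrix into two label lists and compares a hand-written Pearson correlation with the MCC formula; all function names are hypothetical, and the counts are those of Example 3 in this document.

```python
import math


def mcc(tp, fp, fn, tn):
    """MCC with the zero-denominator convention (returns 0.0)."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(product)


def labels_from_counts(tp, fp, fn, tn):
    """Expand counts into (actual, predicted) lists of 0/1 labels."""
    actual = [1] * tp + [0] * fp + [1] * fn + [0] * tn
    predicted = [1] * tp + [1] * fp + [0] * fn + [0] * tn
    return actual, predicted


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


actual, predicted = labels_from_counts(80, 20, 15, 385)
print(round(pearson(actual, predicted), 4))  # 0.7775
print(round(mcc(80, 20, 15, 385), 4))        # 0.7775
```

Note that `pearson` as written fails with a division by zero when either list is constant, which is exactly the degenerate case the MCC convention maps to 0.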
What does a negative MCC value mean?
A negative MCC means the classifier's predictions disagree with the actual labels more often than chance alone would produce. An MCC of -0.5 means there is a moderate systematic tendency to predict incorrectly. An MCC of -1 means the classifier always predicts the wrong class (a perfect inverse classifier). In practice, a clearly negative MCC usually indicates inverted class labels (the positive and negative class definitions are swapped), a bug in the prediction pipeline, or a severely miscalibrated model.
What is Cohen's kappa and how does it relate to MCC?
Cohen's kappa measures agreement between two raters, corrected for the agreement expected by chance. For binary classification it measures how much better the classifier agrees with the true labels than a random classifier with the same marginal distributions would. Both kappa and MCC account for class imbalance, but they use different formulas and are not mathematically equivalent. Chicco, Warrens and Jurman (2021) argued that MCC is generally more informative than kappa for evaluating binary classifiers, particularly on severely imbalanced datasets.
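For comparison, Cohen's kappa can be computed from the same four counts using its standard definition, kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below is illustrative; the function names and the choice to return 0.0 when p_e equals 1 are assumptions.

```python
import math


def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    if expected == 1:
        return 0.0  # assumed convention for the degenerate case
    return (observed - expected) / (1 - expected)


def mcc(tp, fp, fn, tn):
    """MCC with the zero-denominator convention (returns 0.0)."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(product)


# Counts from Example 1 in this document.
print(round(cohens_kappa(90, 10, 5, 95), 4))  # 0.85
print(round(mcc(90, 10, 5, 95), 4))           # 0.8511
```

The two values are close here but not identical, which illustrates that the metrics are related without being equivalent.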
Can MCC be used for multi-class classification?
Yes, but not directly with this calculator. For problems with more than two classes, the MCC generalises to the multiclass MCC (sometimes called R_K or the multiclass correlation coefficient). It is computed from a K x K confusion matrix using a more complex formula. Several research papers (including Gorodkin 2004 and Jurman et al. 2012) have formalised the multiclass MCC. This calculator covers binary classification only (two-class problems with one positive and one negative class).
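One commonly used form of the multiclass MCC works from the diagonal, row sums, and column sums of the K x K matrix. The sketch below assumes rows are actual classes and columns are predicted classes, and returns 0.0 for a zero denominator; the function name is hypothetical, and this is not something the calculator itself computes.

```python
import math


def multiclass_mcc(matrix):
    """Multiclass MCC from a K x K confusion matrix (rows = actual, columns = predicted)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)        # all samples
    correct = sum(matrix[i][i] for i in range(k))  # diagonal entries
    actual = [sum(matrix[i]) for i in range(k)]    # samples truly in each class
    predicted = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # samples predicted per class
    numerator = correct * total - sum(p * t for p, t in zip(predicted, actual))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in predicted))
        * (total ** 2 - sum(t * t for t in actual))
    )
    return numerator / denominator if denominator else 0.0


# With two classes this reduces to the binary MCC (Example 1 counts:
# row 0 = actual negative [TN, FP], row 1 = actual positive [FN, TP]).
print(round(multiclass_mcc([[95, 10], [5, 90]]), 4))  # 0.8511
```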
What is balanced accuracy and when should I use it instead of MCC?
Balanced accuracy is the arithmetic mean of Sensitivity (recall on the positive class) and Specificity (recall on the negative class), equal to (Sensitivity + Specificity) / 2. It ranges from 0 to 1 and is easier to interpret than MCC, but because it is built only from the two per-class recall rates it does not reflect precision or NPV: a model with many false positives relative to true positives can still score well. MCC is high only when sensitivity, specificity, precision, and NPV are all high. MCC is generally preferred in machine learning research. Balanced accuracy is more common in clinical performance evaluation, where sensitivity and specificity have direct clinical interpretations.
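The two metrics can diverge sharply on rare-positive data. The counts below are hypothetical, chosen only to illustrate the effect (10 positives among 1,000 samples, with 100 false alarms); the helper name `mcc` is likewise an assumption.

```python
import math


def mcc(tp, fp, fn, tn):
    """MCC with the zero-denominator convention (returns 0.0)."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(product)


# Hypothetical counts, for illustration only.
tp, fp, fn, tn = 9, 100, 1, 890

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (sensitivity + specificity) / 2
precision = tp / (tp + fp)

print(f"balanced accuracy = {balanced_accuracy:.3f}")
print(f"precision         = {precision:.3f}")
print(f"MCC               = {mcc(tp, fp, fn, tn):.3f}")
```

Both per-class recall rates are near 0.9, so balanced accuracy looks healthy, yet fewer than one in ten positive predictions is correct and MCC is correspondingly low.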
How do I interpret the confusion matrix in this calculator?
The confusion matrix has four cells. True Positives (TP): actual positives correctly predicted as positive. True Negatives (TN): actual negatives correctly predicted as negative. False Positives (FP): actual negatives incorrectly predicted as positive (Type I error). False Negatives (FN): actual positives incorrectly predicted as negative (Type II error). For a medical test, TP = sick patients correctly diagnosed, TN = healthy patients correctly cleared, FP = healthy patients falsely alarmed, FN = sick patients missed.
How is MCC calculated from raw labels instead of a confusion matrix?
To calculate MCC from raw labels, compare each actual label to the corresponding predicted label and count the four outcomes: TP (both 1), TN (both 0), FP (actual 0, predicted 1), FN (actual 1, predicted 0). Once you have these counts, apply the MCC formula. This calculator's Raw Labels mode does this automatically when you paste comma-separated binary arrays. Labels can be entered as 0/1, yes/no, true/false, or positive/negative.
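The counting step described above might look like the following. This is a sketch rather than the calculator's actual parser: the names `parse_labels` and `confusion_counts` are hypothetical, and case-insensitive matching and the error messages are assumptions.

```python
import re

POSITIVE = {"1", "yes", "true", "positive"}
NEGATIVE = {"0", "no", "false", "negative"}


def parse_labels(text):
    """Parse comma- or space-separated binary labels into a list of 0/1."""
    labels = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        key = token.lower()  # assumption: matching is case-insensitive
        if key in POSITIVE:
            labels.append(1)
        elif key in NEGATIVE:
            labels.append(0)
        else:
            raise ValueError(f"Unrecognised label: {token!r}")
    return labels


def confusion_counts(actual, predicted):
    """Count (tp, fp, fn, tn) from paired 0/1 labels."""
    if len(actual) != len(predicted):
        raise ValueError("Actual and predicted lists must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


actual = parse_labels("1, 0, 1, 1, 0, 0")
predicted = parse_labels("yes no no yes yes no")
print(confusion_counts(actual, predicted))  # (2, 1, 1, 2)
```

The resulting counts can then be passed to the MCC formula exactly as if they had been entered as a confusion matrix.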