How is MCC calculated from a confusion matrix?

MCC = (TP x TN - FP x FN) divided by the square root of (TP + FP)(TP + FN)(TN + FP)(TN + FN). If any denominator factor equals zero, MCC is defined as 0 by convention. TP = true positives, TN = true negatives, FP = false positives, FN = false negatives.
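
The formula above can be sketched in a few lines of Python. The function name `mcc` and its keyword arguments are illustrative choices, not part of the calculator; the zero-denominator branch follows the convention stated above.

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denominator_sq == 0:
        # Convention from the text: a zero factor in the denominator gives MCC = 0.
        return 0.0
    return numerator / math.sqrt(denominator_sq)

print(round(mcc(tp=90, tn=95, fp=10, fn=5), 4))  # 0.8511
print(mcc(tp=0, tn=950, fp=0, fn=50))            # 0.0
```

Checking the product of the four factors before taking the square root avoids a division by zero without needing exception handling.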

Why is MCC better than accuracy for imbalanced datasets?

Accuracy counts all correct predictions equally, so a model that always predicts the majority class scores high accuracy even with zero predictive ability. For example, with 95% negative samples, a model that always predicts negative achieves 95% accuracy but MCC of 0. MCC accounts for all four confusion matrix cells and produces a meaningful score near 0 for such degenerate classifiers, making it far more reliable when class distribution is skewed.

What is the difference between MCC and F1 score?

F1 score is the harmonic mean of Precision and Recall and ignores True Negatives entirely. This means F1 can be high even when the model performs poorly on the negative class. MCC includes TN in its formula, so it penalises poor performance on either class. For balanced datasets, F1 and MCC tend to agree. For imbalanced datasets, MCC is the more conservative and arguably more honest metric.
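
As a rough illustration of that gap, the sketch below uses made-up counts in which positives dominate and only one of five negatives is recognised. The numbers are hypothetical and chosen only to make the contrast visible.

```python
import math

# Hypothetical counts: positives dominate and the negative class is handled badly.
tp, fn = 90, 5   # 95 actual positives
fp, tn = 4, 1    # 5 actual negatives, only 1 of them recognised

# 2TP / (2TP + FP + FN) is the harmonic mean of precision and recall.
f1 = 2 * tp / (2 * tp + fp + fn)
mcc = (tp * tn - fp * fn) / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(f"F1  = {f1:.3f}")   # F1  = 0.952
print(f"MCC = {mcc:.3f}")  # MCC = 0.135
```

F1 never sees the TN cell, so it stays high; MCC drops because four of the five negatives are misclassified.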

Is the Matthews Correlation Coefficient the same as the Phi Coefficient?

Yes. The MCC and the Phi Coefficient are mathematically identical. Both use the same formula applied to a 2x2 contingency table. The Phi Coefficient is the standard term in statistics for measuring association between two binary categorical variables. MCC is the term used in machine learning and bioinformatics. They produce exactly the same numerical result from the same TP, TN, FP, FN counts.

What does it mean when MCC is 0?

An MCC of exactly 0 means the model's predictions are uncorrelated with the true labels, so the model performs no better than random class assignment. This can happen either because the model genuinely has no predictive power, or because the denominator of the MCC formula equals zero. The denominator is zero when the model always predicts one class (TP + FP = 0 or TN + FN = 0) or when the data contains only one class (TP + FN = 0 or TN + FP = 0). By convention, MCC is defined as 0 in this degenerate case.

Can MCC be negative?

Yes. MCC ranges from -1 to +1. A negative MCC means the model is systematically predicting the wrong class more often than it would by chance. An MCC of -1 is a perfectly inverted classifier: every positive is predicted negative and every negative is predicted positive. In practice, a strongly negative MCC usually indicates that the class labels were accidentally inverted or the model was trained incorrectly.

Matthews Correlation Coefficient Calculator

What is the Matthews Correlation Coefficient?

The Matthews Correlation Coefficient (MCC) is a measure of the quality of a binary classification model. It was introduced by biochemist Brian Matthews in 1975 for evaluating protein structure predictions. MCC ranges from -1 to +1, where +1 is a perfect classifier, 0 is no better than random guessing, and -1 is a perfectly inverse classifier. It is derived from all four cells of the confusion matrix (TP, TN, FP, FN) and is widely regarded as one of the most informative single metrics for imbalanced classification problems.

What is a good MCC value?

MCC values above 0.7 indicate strong predictive performance. Values in the 0.5 to 0.7 range indicate moderate performance. Values between 0.3 and 0.5 indicate weak but statistically meaningful association. Values below 0.3 or near 0 suggest the model has little predictive ability beyond chance. Negative MCC values indicate the model consistently predicts the wrong class, which is worse than random.

Calculate the Matthews Correlation Coefficient from a confusion matrix or binary label arrays, with a full suite of classification metrics.

Enter the counts from your binary classifier's confusion matrix.

True Positives (TP)

False Positives (FP)

False Negatives (FN)

True Negatives (TN)

Enter binary labels separated by commas or spaces. Accepted values: 0/1, yes/no, true/false, positive/negative.

Actual Labels

Predicted Labels

🧮 What is the Matthews Correlation Coefficient?

The Matthews Correlation Coefficient (MCC) is a measure of the quality of binary classification models. It was introduced by biochemist Brian Matthews in 1975 to evaluate predictions of protein secondary structure and has since been adopted as a gold-standard metric in machine learning, medical diagnostics, bioinformatics, and any domain where binary classification quality needs to be measured rigorously. The MCC ranges from -1 to +1: a value of +1 represents a perfect classifier, 0 represents a classifier no better than random guessing, and -1 represents a perfectly inverted classifier that always predicts the wrong class.

MCC is used across a wide range of real-world applications. In machine learning, it evaluates models for spam detection, fraud detection, disease diagnosis, and churn prediction. In medicine, it measures the quality of diagnostic tests against ground truth labels (positive/negative for a condition). In bioinformatics, it benchmarks gene expression classifiers and protein structure predictions. In software testing, it assesses defect prediction models. Unlike simpler metrics, MCC is especially valuable when class distributions are highly imbalanced, because it incorporates all four cells of the confusion matrix rather than focusing only on one class or one type of error.

A common misconception is that accuracy is sufficient to evaluate a binary classifier. On a dataset with 99% negative samples, a model that always predicts negative achieves 99% accuracy despite zero predictive ability. Its MCC, however, equals 0, correctly signalling no predictive correlation. Another misconception is that the F1 score is equivalent to MCC. F1 ignores True Negatives entirely, which makes it blind to the model's performance on the negative class. MCC penalises poor performance on either class symmetrically, which generally makes it more informative than either accuracy or F1 for imbalanced problems.

This calculator accepts two input formats: a confusion matrix (TP, FP, FN, TN counts) for when you already have aggregated results, and raw binary label arrays for when you have lists of actual and predicted values. Both formats compute the same nine-metric output: MCC, accuracy, balanced accuracy, precision (PPV), recall (sensitivity), specificity (TNR), negative predictive value (NPV), F1 score, and Cohen's kappa.

📐 MCC Formula

MCC = (TP × TN − FP × FN) ÷ √[(TP+FP)(TP+FN)(TN+FP)(TN+FN)]

TP = True Positives (correctly predicted positive cases)

TN = True Negatives (correctly predicted negative cases)

FP = False Positives (negative cases incorrectly predicted as positive)

FN = False Negatives (positive cases incorrectly predicted as negative)

Convention: If any factor in the denominator is 0, MCC = 0

Range: -1 (perfect inverse) to 0 (random) to +1 (perfect classifier)

The numerator TP x TN - FP x FN is the product of the two correct cells minus the product of the two error cells, so it is positive when TP x TN exceeds FP x FN and negative otherwise. The denominator normalises this quantity by the square root of the product of the four marginal totals of the confusion matrix (predicted positives, actual positives, actual negatives, and predicted negatives), ensuring the result lies in [-1, +1] regardless of class balance or total sample count. The formula is equivalent to the Pearson product-moment correlation coefficient applied to two binary variables (actual and predicted labels coded as 0 and 1).
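
The equivalence with Pearson correlation can be checked numerically. The sketch below expands a set of counts into 0/1 label vectors and compares a hand-written Pearson correlation with the MCC formula. The helper `pearson` and the variable names are illustrative, and the counts are the ones used in the medical example further down the page.

```python
import math

def pearson(xs, ys):
    """Plain Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

tp, fp, fn, tn = 80, 20, 15, 385

# Expand the counts into 0/1 label vectors (1 = positive).
actual    = [1] * tp + [0] * fp + [1] * fn + [0] * tn
predicted = [1] * tp + [1] * fp + [0] * fn + [0] * tn

mcc_formula = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)
mcc_pearson = pearson(actual, predicted)

print(round(mcc_formula, 4), round(mcc_pearson, 4))  # 0.7775 0.7775
```

The two values agree up to floating-point rounding. Note that `pearson` as written divides by zero when either vector is constant, which is the same degenerate case the MCC convention handles by returning 0.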

Related Metrics

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Balanced Accuracy = (Sensitivity + Specificity) / 2

Precision (PPV) = TP / (TP + FP)

Recall (Sensitivity) = TP / (TP + FN)

Specificity (TNR) = TN / (TN + FP)

F1 Score = 2 x Precision x Recall / (Precision + Recall)
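
A minimal sketch of these related metrics follows. The names `safe_div` and `related_metrics` are hypothetical, and returning 0 for an undefined ratio (for example precision when TP + FP = 0) is an assumption made here, not something the page specifies.

```python
def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero (an assumed convention)."""
    return num / den if den else 0.0

def related_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # sensitivity
    specificity = safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }

# Counts from the medical diagnostic example on this page.
for name, value in related_metrics(tp=80, tn=385, fp=20, fn=15).items():
    print(f"{name:18s} {value:.4f}")
```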

📖 How to Use This Calculator

Using Confusion Matrix Mode and Raw Labels Mode

Choose input mode -- Select Confusion Matrix if you already have TP, TN, FP, FN counts. Select Raw Labels to paste actual and predicted binary arrays directly.

Enter your data -- In Confusion Matrix mode, type the count for each cell. Values must be zero or positive integers. In Raw Labels mode, enter labels separated by commas or spaces using 0/1, yes/no, true/false, or positive/negative.

Click Calculate MCC -- The results panel shows MCC with interpretation, plus 8 companion metrics: accuracy, balanced accuracy, precision, recall, specificity, NPV, F1, and Cohen's kappa.

💡 Example Calculations

Example 1 -- Balanced Dataset, Good Classifier

TP=90, FP=10, FN=5, TN=95 (200 total samples, roughly balanced classes: 95 actual positives, 105 actual negatives)

Numerator: TP x TN - FP x FN = 90 x 95 - 10 x 5 = 8,550 - 50 = 8,500

Denominator: sqrt((90+10)(90+5)(95+10)(95+5)) = sqrt(100 x 95 x 105 x 100) = sqrt(99,750,000) = 9,987.5

MCC = 8,500 / 9,987.5 = 0.8511 (Strong prediction). Accuracy = 185/200 = 92.5%

MCC = 0.8511 • Accuracy = 92.50% • F1 = 0.9231

Example 2 -- Imbalanced Dataset, Degenerate Classifier

TP=0, FP=0, FN=50, TN=950 (model always predicts negative; 95% class imbalance)

Accuracy = (0 + 950) / 1000 = 95.0% -- appears excellent

MCC numerator: 0 x 950 - 0 x 50 = 0. Denominator includes factor (TP + FP) = 0, so by convention MCC = 0

MCC = 0 correctly signals no predictive ability. Accuracy = 95% is misleading. F1 = 0 (also correct). Balanced Accuracy = 50.0%.

MCC = 0 • Accuracy = 95.00% (misleading) • Balanced Accuracy = 50.00%
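
To see how the gap widens with imbalance, the sketch below evaluates an always-negative classifier on 1,000 samples at several class ratios. The ratios are arbitrary illustrations; the 95% row corresponds to this example.

```python
import math

def mcc(tp, tn, fp, fn):
    den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return 0.0 if den_sq == 0 else (tp * tn - fp * fn) / math.sqrt(den_sq)

total = 1000
rows = []
for negatives in (500, 900, 950, 990):
    positives = total - negatives
    # Always predicting negative: no TP and no FP.
    tp, fp, fn, tn = 0, 0, positives, negatives
    accuracy = (tp + tn) / total
    balanced_accuracy = (tp / (tp + fn) + tn / (tn + fp)) / 2
    rows.append((negatives, accuracy, balanced_accuracy, mcc(tp, tn, fp, fn)))

for negatives, accuracy, balanced, score in rows:
    print(f"{negatives / total:.0%} negative: accuracy={accuracy:.2f} "
          f"balanced={balanced:.2f} MCC={score:.2f}")
```

Accuracy simply tracks the share of the majority class, while balanced accuracy stays at 0.5 and MCC stays at 0 in every row.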

Example 3 -- Medical Diagnostic Test

Disease screening: TP=80, FP=20, FN=15, TN=385 (500 patients)

Numerator: 80 x 385 - 20 x 15 = 30,800 - 300 = 30,500

Denominator: sqrt((80+20)(80+15)(385+20)(385+15)) = sqrt(100 x 95 x 405 x 400) = sqrt(1,539,000,000) = 39,230

MCC = 30,500 / 39,230 = 0.7775. Sensitivity = 80/95 = 84.2%, Specificity = 385/405 = 95.1%

MCC = 0.7775 • Sensitivity = 84.21% • Specificity = 95.06%

Example 4 -- Negative MCC (Inverted Classifier)

TP=5, FP=90, FN=95, TN=10 (classifier is systematically wrong)

Numerator: TP x TN - FP x FN = 5 x 10 - 90 x 95 = 50 - 8,550 = -8,500

Denominator: sqrt((5+90)(5+95)(10+90)(10+95)) = sqrt(95 x 100 x 100 x 105) = 9,987.5

MCC = -8,500 / 9,987.5 = -0.8511. Accuracy = 15/200 = 7.5%. Flipping predictions would give MCC = +0.8511.

MCC = -0.8511 (Strong inverse prediction; swap labels to fix)
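
The effect of flipping predictions can be checked directly. In the sketch below, `flip_predictions` is a hypothetical helper that returns the counts obtained when every predicted label is inverted.

```python
import math

def mcc(tp, tn, fp, fn):
    den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return 0.0 if den_sq == 0 else (tp * tn - fp * fn) / math.sqrt(den_sq)

def flip_predictions(tp, tn, fp, fn):
    """Counts obtained when every predicted label is inverted."""
    # Old FN become TP, old FP become TN, old TN become FP, old TP become FN.
    return {"tp": fn, "tn": fp, "fp": tn, "fn": tp}

original = {"tp": 5, "tn": 10, "fp": 90, "fn": 95}
flipped = flip_predictions(**original)

print(round(mcc(**original), 4))  # -0.8511
print(round(mcc(**flipped), 4))   # 0.8511
```

Inverting the predictions negates the numerator and leaves the denominator unchanged, so MCC only changes sign.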

❓ Frequently Asked Questions

What is the Matthews Correlation Coefficient (MCC)?

The Matthews Correlation Coefficient is a single-number summary of a binary classifier's performance ranging from -1 to +1. It was introduced by Brian Matthews in 1975 for protein structure prediction and measures the correlation between actual and predicted binary outcomes across all four confusion matrix cells (TP, TN, FP, FN). A value of +1 means perfect prediction, 0 means no better than random guessing, and -1 means every prediction is wrong.

What is the MCC formula?

MCC = (TP x TN - FP x FN) divided by the square root of (TP + FP)(TP + FN)(TN + FP)(TN + FN). TP = true positives, TN = true negatives, FP = false positives, FN = false negatives. If the denominator equals zero (because any of its four factors is zero), MCC is defined as 0 by convention. This convention handles degenerate cases where the model predicts only one class.

Why is MCC considered better than accuracy for imbalanced datasets?

Accuracy counts all correct predictions equally, which makes it misleadingly high when one class dominates. A model that always predicts the majority class on a 99/1 dataset achieves 99% accuracy with zero predictive ability. Its MCC is 0, which correctly reflects no correlation. MCC uses all four confusion matrix cells and produces a balanced score regardless of how skewed the class distribution is, making it the recommended primary metric for fraud detection, disease screening, rare event prediction, and other imbalanced tasks.

What is a good MCC value for a machine learning model?

MCC values above 0.7 indicate strong predictive performance and are considered good in most applications. Values between 0.5 and 0.7 indicate moderate performance. Values between 0.3 and 0.5 indicate a weak but still noticeable association. Values below 0.3 are generally considered poor, and values near 0 suggest the model has no useful predictive ability. These bands are rules of thumb, and the threshold for an acceptable MCC depends on the application: a high-stakes task such as medical diagnostics may call for a higher MCC (for example above 0.8), while a noisy task such as fraud detection might accept a value around 0.5.
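
These bands can be encoded as a small lookup. The function name `interpret_mcc`, the label strings, and the handling of exact boundary values (0.7 is placed in the moderate band) are assumptions made for illustration; the calculator's own wording may differ.

```python
def interpret_mcc(value: float) -> str:
    """Map an MCC value to the rule-of-thumb bands described above."""
    if not -1.0 <= value <= 1.0:
        raise ValueError("MCC must lie in [-1, 1]")
    if value < 0:
        return "inverse (worse than random)"
    if value > 0.7:
        return "strong"
    if value >= 0.5:
        return "moderate"
    if value >= 0.3:
        return "weak"
    return "little or no predictive ability"

for score in (0.8511, 0.6, 0.4, 0.1, -0.8511):
    print(f"{score:+.4f} -> {interpret_mcc(score)}")
```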

What is the difference between MCC and the F1 score?

F1 score is the harmonic mean of Precision and Recall and ignores True Negatives entirely. As a result, F1 can be high when the model correctly identifies positives but fails on negatives. MCC includes all four confusion matrix cells including TN, so it penalises models that perform poorly on either class. On balanced datasets the two metrics often rank models similarly. On imbalanced datasets with a large negative class, MCC provides a more conservative and complete assessment of model quality.

Is MCC the same as the Phi Coefficient?

Yes. The MCC and the Phi Coefficient are mathematically identical. The Phi Coefficient is used in statistics to measure association between two binary variables in a 2x2 contingency table. MCC is the term preferred in machine learning and bioinformatics. Both formulas produce the same number when given the same TP, TN, FP, and FN values. The Phi Coefficient is also numerically equal to the Pearson product-moment correlation coefficient computed on binary (0/1) coded variables.

What does a negative MCC value mean?

A negative MCC means the classifier is predicting the wrong class more often than it should by chance. An MCC of -0.5 means there is a moderate systematic tendency to predict incorrectly. An MCC of -1 means the classifier always predicts the wrong class (a perfect inverse classifier). In practice, a clearly negative MCC usually indicates inverted class labels (the positive and negative class definitions are swapped), a bug in the prediction pipeline, or a severely miscalibrated model.

What is Cohen's kappa and how does it relate to MCC?

Cohen's kappa measures agreement between two raters, corrected for the agreement expected by chance. For binary classification it measures how much better the classifier performs compared to a random classifier with the same marginal distributions. Both kappa and MCC account for class imbalance, but they use different formulas and are not mathematically equivalent. Research by Chicco, Warrens and Jurman (2021) argued that MCC is generally more informative than kappa for evaluating binary classifiers, particularly for severely imbalanced datasets.
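
For comparison, here is a minimal sketch of Cohen's kappa computed from the same four counts, using the standard definition (observed agreement minus chance agreement, divided by one minus chance agreement). The function names and the choice to return 0 in degenerate cases are assumptions for illustration.

```python
import math

def cohens_kappa(tp: int, tn: int, fp: int, fn: int) -> float:
    n = tp + tn + fp + fn
    if n == 0:
        return 0.0  # assumed convention for an empty matrix
    observed = (tp + tn) / n
    # Agreement expected by chance, given the actual and predicted marginals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    if expected == 1.0:
        return 0.0  # assumed convention when everything falls in a single cell
    return (observed - expected) / (1 - expected)

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return 0.0 if den_sq == 0 else (tp * tn - fp * fn) / math.sqrt(den_sq)

counts = dict(tp=80, tn=385, fp=20, fn=15)  # medical example from this page
print(round(cohens_kappa(**counts), 4))     # 0.7771
print(round(mcc(**counts), 4))              # 0.7775
```

The two values are close here but not identical, which matches the point that kappa and MCC are related without being equivalent.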

Can MCC be used for multi-class classification?

Yes, but not directly with this calculator. For problems with more than two classes, the MCC generalises to the multiclass MCC (sometimes called R_K or the multiclass correlation coefficient). It is computed from a K x K confusion matrix using a more complex formula. Several research papers (including Gorodkin 2004 and Jurman et al. 2012) have formalised the multiclass MCC. This calculator covers binary classification only (two-class problems with one positive and one negative class).
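
As a sketch of what the multiclass version looks like, the code below implements one commonly cited form of the R_K statistic for a K x K confusion matrix. It assumes rows are actual classes and columns are predicted classes, and it reuses the return-0 convention for a zero denominator. The function name `multiclass_mcc` and the three-class matrix are illustrative only.

```python
import math

def multiclass_mcc(matrix):
    """R_K for a K x K confusion matrix (rows = actual class, columns = predicted)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    actual_totals = [sum(matrix[i]) for i in range(k)]
    predicted_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    numerator = correct * total - sum(
        a * p for a, p in zip(actual_totals, predicted_totals)
    )
    den_sq = (total**2 - sum(p * p for p in predicted_totals)) * (
        total**2 - sum(a * a for a in actual_totals)
    )
    return 0.0 if den_sq == 0 else numerator / math.sqrt(den_sq)

# With K = 2 the result matches the binary formula (TP=90, FP=10, FN=5, TN=95).
binary = [[95, 10],   # actual negative: TN, FP
          [5, 90]]    # actual positive: FN, TP
print(round(multiclass_mcc(binary), 4))  # 0.8511

# Hypothetical three-class matrix.
three_class = [[50, 3, 2],
               [4, 40, 6],
               [1, 5, 39]]
print(round(multiclass_mcc(three_class), 4))
```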

What is balanced accuracy and when should I use it instead of MCC?

Balanced accuracy is the arithmetic mean of Sensitivity (recall on the positive class) and Specificity (recall on the negative class), equal to (Sensitivity + Specificity) / 2. It ranges from 0 to 1, with 0.5 corresponding to chance-level performance, and is easier to interpret than MCC. Because it uses only the per-class recall rates, it does not reflect precision: with a rare positive class, a model can have high balanced accuracy while most of its positive predictions are false alarms. MCC accounts for all four confusion cells simultaneously and is generally preferred in machine learning research. Balanced accuracy is more common in clinical performance evaluation where sensitivity and specificity have direct clinical interpretations.

How do I interpret the confusion matrix in this calculator?

The confusion matrix has four cells. True Positives (TP): actual positives correctly predicted as positive. True Negatives (TN): actual negatives correctly predicted as negative. False Positives (FP): actual negatives incorrectly predicted as positive (Type I error). False Negatives (FN): actual positives incorrectly predicted as negative (Type II error). For a medical test, TP = sick patients correctly diagnosed, TN = healthy patients correctly cleared, FP = healthy patients falsely alarmed, FN = sick patients missed.

How is MCC calculated from raw labels instead of a confusion matrix?

To calculate MCC from raw labels, compare each actual label to the corresponding predicted label and count the four outcomes: TP (both 1), TN (both 0), FP (actual 0, predicted 1), FN (actual 1, predicted 0). Once you have these counts, apply the MCC formula. This calculator's Raw Labels mode does this automatically when you paste comma-separated binary arrays. Labels can be entered as 0/1, yes/no, true/false, or positive/negative.
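
The counting step can be sketched as follows. This is an illustrative reimplementation, not the calculator's actual code: the names `parse_labels` and `confusion_counts` are hypothetical, and case-insensitive matching is an assumption.

```python
import re

POSITIVE = {"1", "yes", "true", "positive"}
NEGATIVE = {"0", "no", "false", "negative"}

def parse_labels(text: str) -> list[int]:
    """Split on commas or whitespace and map each token to 1 or 0."""
    labels = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        word = token.lower()
        if word in POSITIVE:
            labels.append(1)
        elif word in NEGATIVE:
            labels.append(0)
        else:
            raise ValueError(f"unrecognised label: {token!r}")
    return labels

def confusion_counts(actual: list[int], predicted: list[int]) -> dict:
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        else:  # actual 1, predicted 0
            counts["fn"] += 1
    return counts

actual = parse_labels("1, 0, 1, 1, 0, 0, 1, 0")
predicted = parse_labels("yes no no yes no yes yes no")
print(confusion_counts(actual, predicted))  # {'tp': 3, 'tn': 3, 'fp': 1, 'fn': 1}
```

The resulting counts can then be passed to the MCC formula like any other confusion matrix.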

📌 Quick Tips

💡MCC is considered the best single metric for binary classifiers on imbalanced datasets because it accounts for all four cells of the confusion matrix (TP, TN, FP, FN), not just the correct predictions.

💡Accuracy can be misleadingly high on imbalanced data. A model that always predicts the majority class gets high accuracy but an MCC of 0. As a rule of thumb, use MCC as the primary metric when class imbalance exceeds 80/20.

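💡MCC equals +1 only when there are no false positives and no false negatives (perfect prediction). More generally, MCC is high only when the classifier does well on both classes, that is, when sensitivity, specificity, precision, and NPV are all high. Accuracy and F1 can be high while one of these is poor, making MCC a stricter measure of overall classifier quality.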

💡The Matthews Correlation Coefficient is mathematically equivalent to the Phi Coefficient used in statistics to measure association between two binary variables in a 2x2 contingency table.

💡For multi-class classification, the MCC generalises to the multiclass MCC (also called the R_K statistic). This calculator covers the binary case (two classes only).