Confusion Matrix Calculator
Calculate confusion matrix metrics instantly online. Get accuracy, precision, recall, specificity, F1 score, MCC, and more for machine learning classification models.
Enter confusion matrix values (TP, FP, FN, TN) to instantly calculate all classification metrics — accuracy, precision, recall, F1, MCC, and more. Upload a CSV for automatic matrix generation. All calculations run locally in your browser.
What Is a Confusion Matrix?
A confusion matrix is a table that summarises the performance of a classification model by comparing actual labels against predicted labels. For binary classification, it contains four values: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
From these four numbers you can derive the standard threshold-based classification metrics (accuracy, precision, recall, F1 score, specificity, MCC, and more), giving you a picture of how well a model performs across both positive and negative classes.
Confusion Matrix Layout
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | TP (True+) | FN (False-) |
| Actual Negative | FP (False+) | TN (True-) |

TP = Correctly predicted positive
TN = Correctly predicted negative
FP = Incorrectly predicted positive (Type I error)
FN = Incorrectly predicted negative (Type II error)
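As a minimal sketch of how the four cells are filled, the snippet below counts TP, FP, FN, and TN from two lists of binary labels. The function name `confusion_counts`, the sample lists, and the 1 = positive / 0 = negative encoding are assumptions made for illustration, not part of the calculator itself.

```python
from typing import Iterable, Tuple


def confusion_counts(actual: Iterable[int], predicted: Iterable[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for binary labels where 1 = positive and 0 = negative."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1      # actual positive, predicted positive
        elif a == 0 and p == 1:
            fp += 1      # actual negative, predicted positive (Type I error)
        elif a == 1 and p == 0:
            fn += 1      # actual positive, predicted negative (Type II error)
        elif a == 0 and p == 0:
            tn += 1      # actual negative, predicted negative
        else:
            raise ValueError(f"labels must be 0 or 1, got {a!r} and {p!r}")
    return tp, fp, fn, tn


# Hypothetical sample data
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
print(tp, fp, fn, tn)  # 3 1 1 3
```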
All Metric Formulas
Accuracy = (TP + TN) / (TP + FP + FN + TN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Specificity = TN / (TN + FP)
F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
False Positive Rate (FPR) = FP / (FP + TN)
False Negative Rate (FNR) = FN / (FN + TP)
NPV = TN / (TN + FN)
Balanced Accuracy = (Recall + Specificity) / 2
MCC = (TP×TN − FP×FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))

Example: TP=90, TN=80, FP=10, FN=20
Accuracy = (90 + 80) / 200 = 85.00%
Precision = 90 / 100 = 90.00%
Recall = 90 / 110 = 81.82%
F1 Score = 2×(0.90×0.818)/(0.90+0.818) = 85.71%
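The formulas above translate fairly directly into code. The following is a sketch rather than the calculator's actual implementation: the names `safe_div` and `classification_metrics` are hypothetical, and returning 0.0 when a denominator is zero is one possible convention (other tools report such cases as undefined).

```python
import math


def safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 for a zero denominator (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute the metrics listed above from the four confusion matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "fpr": safe_div(fp, fp + tn),
        "fnr": safe_div(fn, fn + tp),
        "npv": safe_div(tn, tn + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Same numbers as the worked example: TP=90, TN=80, FP=10, FN=20
metrics = classification_metrics(tp=90, fp=10, fn=20, tn=80)
for name, value in metrics.items():
    print(f"{name:>18}: {value:.2%}")
```

For the example values this reproduces the accuracy (85.00%), precision (90.00%), recall (81.82%), and F1 score (85.71%) shown above.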
Metric Reference Guide
| Metric | Formula | Best For |
|---|---|---|
| Accuracy | (TP+TN)/Total | Balanced datasets |
| Precision | TP/(TP+FP) | When FP is costly (e.g. spam) |
| Recall | TP/(TP+FN) | When FN is costly (e.g. medical) |
| Specificity | TN/(TN+FP) | Negative class performance |
| F1 Score | 2×P×R/(P+R) | Imbalanced datasets |
| FPR | FP/(FP+TN) | ROC curve analysis |
| FNR | FN/(FN+TP) | Miss rate analysis |
| NPV | TN/(TN+FN) | Negative prediction reliability |
| Balanced Accuracy | (Recall+Specificity)/2 | Class-imbalanced evaluation |
| MCC | (TP×TN−FP×FN)/√(…) | Overall quality (−1 to +1) |
When to Use Each Metric
Use Accuracy when…
Your dataset is balanced (roughly equal class sizes) and all misclassification types carry equal cost.
Use Precision when…
False positives are expensive. In spam detection, flagging a legitimate email as spam (FP) damages user trust more than missing a spam email.
Use Recall when…
False negatives are dangerous. In cancer screening, missing a true positive (FN) has far greater consequences than a false alarm.
Use F1 Score when…
Your dataset is imbalanced and you need a single metric that balances precision and recall equally.
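To illustrate why F1 is used as the balancing metric, the sketch below compares it with a plain arithmetic mean for a hypothetical model with perfect precision but very low recall. The helper name `f1_score` and the input values are assumptions chosen for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical model: every positive prediction is right, but most positives are missed
precision, recall = 1.0, 0.1

arithmetic_mean = (precision + recall) / 2
f1 = f1_score(precision, recall)

print(f"arithmetic mean = {arithmetic_mean:.3f}")  # 0.550
print(f"F1 score        = {f1:.3f}")               # 0.182
```

Because F1 is a harmonic mean, it stays close to the weaker of the two values, so a model cannot score well by maximising only precision or only recall.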
Use MCC when…
You want a single comprehensive metric that accounts for all four quadrants of the confusion matrix, especially for heavily imbalanced datasets.
Frequently Asked Questions
What does a confusion matrix tell you?
It breaks all prediction outcomes into four categories (TP, TN, FP, FN), letting you see exactly where your model succeeds and where it fails. From these counts you can calculate every metric on this page without needing the raw predictions; metrics based on predicted scores, such as ROC AUC, are the exception.
What is a good accuracy for a classification model?
It depends heavily on the dataset. For balanced datasets, 90%+ is often considered good, though the bar varies by task. For heavily imbalanced datasets, accuracy can be misleading: if 99% of samples belong to the majority class, a model that always predicts that class scores 99% accuracy while never detecting a single minority-class case.
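The imbalance problem can be made concrete with a small sketch. The counts below are hypothetical: 1,000 samples, of which 10 are positive, scored against a model that always predicts the negative (majority) class.

```python
# Hypothetical dataset: 990 negatives, 10 positives.
# A model that always predicts "negative" produces these counts:
tp, fp, fn, tn = 0, 0, 10, 990

accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)            # share of actual positives found
specificity = tn / (tn + fp)       # share of actual negatives found
balanced_accuracy = (recall + specificity) / 2

print(f"accuracy          = {accuracy:.2%}")           # 99.00%
print(f"recall            = {recall:.2%}")             # 0.00%
print(f"balanced accuracy = {balanced_accuracy:.2%}")  # 50.00%
```

Accuracy looks excellent, while recall and balanced accuracy show that the model finds none of the positive cases.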
What is the difference between sensitivity and specificity?
Sensitivity (recall) measures how well the model finds actual positives. Specificity measures how well it correctly identifies actual negatives. A good diagnostic test aims for high values of both.
Why is MCC considered a better metric than F1?
MCC (Matthews Correlation Coefficient) uses all four values of the confusion matrix and is not inflated by class imbalance. F1 ignores true negatives entirely. For datasets where TN is large (common in fraud detection), MCC gives a more balanced assessment.
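A short sketch can show the point about true negatives. The two hypothetical matrices below differ only in TN; the helper names `f1` and `mcc` are assumptions for illustration, and returning 0.0 for a zero denominator is an assumed convention.

```python
import math


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from counts; note that TN does not appear anywhere in the formula."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews Correlation Coefficient; 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Two hypothetical models with identical TP, FP, FN but very different TN
few_negatives = dict(tp=50, fp=50, fn=50, tn=50)
many_negatives = dict(tp=50, fp=50, fn=50, tn=5000)

for name, m in [("TN=50", few_negatives), ("TN=5000", many_negatives)]:
    print(f"{name:>8}: F1 = {f1(m['tp'], m['fp'], m['fn']):.3f}, MCC = {mcc(**m):.3f}")
# TN=50:   F1 = 0.500, MCC = 0.000
# TN=5000: F1 = 0.500, MCC = 0.490
```

F1 is identical in both cases, whereas MCC changes because it also reflects how the model handles the negative class.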
Can I upload predictions directly?
Yes. Use the CSV Upload mode and provide a two-column CSV with columns 'actual' and 'predicted'. The calculator parses binary values (1/0) or string labels (positive/negative, yes/no) and builds the confusion matrix automatically.
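The sketch below shows one way such a CSV could be turned into a confusion matrix. It is not the calculator's actual parser: the function name `matrix_from_csv`, the `LABELS` mapping, and the sample data are assumptions, and only the label spellings mentioned above are handled.

```python
import csv
import io

# Assumed mapping from accepted label spellings to 1 (positive) / 0 (negative)
LABELS = {"1": 1, "0": 0, "positive": 1, "negative": 0, "yes": 1, "no": 0}


def matrix_from_csv(text: str) -> dict:
    """Build TP/FP/FN/TN counts from CSV text with 'actual' and 'predicted' columns."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for row in csv.DictReader(io.StringIO(text)):
        # An unrecognised label raises KeyError here
        actual = LABELS[row["actual"].strip().lower()]
        predicted = LABELS[row["predicted"].strip().lower()]
        if actual and predicted:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1   # predicted positive, actually negative
        elif actual:
            counts["FN"] += 1   # predicted negative, actually positive
        else:
            counts["TN"] += 1
    return counts


# Hypothetical upload mixing the supported label styles
sample = "actual,predicted\n1,1\n0,1\nyes,no\nnegative,negative\npositive,positive\n"
result = matrix_from_csv(sample)
print(result)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 1}
```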