Productive Toolbox

Precision Recall Calculator

Calculate precision, recall, F1 score, accuracy, and specificity from confusion matrix values. Free online machine learning evaluation metrics calculator.

Enter your confusion matrix values (TP, FP, FN, TN) to instantly calculate precision, recall, F1 score, accuracy, specificity, and more. All calculations run locally in your browser.

Confusion Matrix

  • TP: Correctly predicted positive
  • FN: Missed positive (Type II)
  • FP: Wrong positive (Type I)
  • TN: Correctly predicted negative

Total samples: 1,000

Confusion Matrix

                   Predicted Positive    Predicted Negative
Actual Positive    TP = 90 (9.0%)        FN = 15 (1.5%)
Actual Negative    FP = 10 (1.0%)        TN = 885 (88.5%)

Key Metrics

Metric | Value | Meaning
Precision | 90.00% | Of all predicted positives, how many were correct?
Recall | 85.71% | Of all actual positives, how many were detected?
F1 Score | 87.80% | Harmonic mean of precision and recall.
Accuracy | 97.50% | Fraction of all predictions that were correct.
Specificity | 98.88% | Of all actual negatives, how many were correctly identified?
NPV | 98.33% | Negative Predictive Value: accuracy of negative predictions.
FPR | 1.12% | False Positive Rate (1 − Specificity).
FNR | 14.29% | False Negative Rate (1 − Recall).
MCC | 0.8644 | Matthews Correlation Coefficient (−1 to +1).

How the Precision Recall Calculator Works

This tool computes the core classification evaluation metrics used in machine learning and AI (precision, recall, F1 score, accuracy, and specificity) directly from the four values of a binary confusion matrix (TP, FP, FN, TN). All calculations run instantly in your browser with no data uploaded.

Enter your confusion matrix values and every metric updates in real time. Toggle the formula panel to see the step-by-step calculation for each metric.

The Confusion Matrix

                   Predicted Positive     Predicted Negative
Actual Positive    TP (True Positive)     FN (False Negative)
Actual Negative    FP (False Positive)    TN (True Negative)
  • TP (True Positive): Model correctly predicted Positive
  • FP (False Positive): Model predicted Positive, but actual was Negative (Type I error)
  • FN (False Negative): Model predicted Negative, but actual was Positive (Type II error)
  • TN (True Negative): Model correctly predicted Negative
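
As a rough illustration of the four definitions above, the sketch below tallies the cells from a list of true labels and a list of predictions. The function name `confusion_counts`, the `positive` parameter, and the sample labels are assumptions made for this example; they are not part of the calculator.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # correctly predicted Positive
        elif predicted == positive:
            fp += 1  # predicted Positive, actual Negative (Type I error)
        elif actual == positive:
            fn += 1  # predicted Negative, actual Positive (Type II error)
        else:
            tn += 1  # correctly predicted Negative
    return tp, fp, fn, tn


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 1, 3)
```

The resulting four numbers are what you would type into the TP, FP, FN, and TN fields.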

All Formulas

Precision    = TP ÷ (TP + FP)
Recall       = TP ÷ (TP + FN)        [also called Sensitivity]
F1 Score     = 2 × (Precision × Recall) ÷ (Precision + Recall)
Accuracy     = (TP + TN) ÷ (TP + FP + FN + TN)
Specificity  = TN ÷ (TN + FP)        [also called True Negative Rate]
NPV          = TN ÷ (TN + FN)        [Negative Predictive Value]
FPR          = FP ÷ (FP + TN)        [False Positive Rate = 1 - Specificity]
FNR          = FN ÷ (FN + TP)        [False Negative Rate = 1 - Recall]
MCC          = (TP×TN − FP×FN) ÷ √((TP+FP)(TP+FN)(TN+FP)(TN+FN))
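
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the names `safe_div` and `classification_metrics` are made up for this example, and returning 0.0 when a denominator is zero is an assumed convention (the page does not say how the tool handles that case).

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero.

    The 0.0 fallback is an assumption of this sketch.
    """
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute the metrics listed above from the four confusion matrix cells."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "specificity": safe_div(tn, tn + fp),
        "npv": safe_div(tn, tn + fn),
        "fpr": safe_div(fp, fp + tn),
        "fnr": safe_div(fn, fn + tp),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Sample inputs from this page: TP = 90, FP = 10, FN = 15, TN = 885.
metrics = classification_metrics(tp=90, fp=10, fn=15, tn=885)
for name, value in metrics.items():
    print(f"{name:12s} {value:.4f}")
```

With the sample inputs from this page, the sketch should reproduce the figures shown under Key Metrics, for example precision 0.9000, recall 0.8571, and accuracy 0.9750.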

Metric Reference Guide

Metric | Answers | Best used when
Precision | Of all positive predictions, how many were correct? | False positives are costly (e.g. spam filter)
Recall | Of all actual positives, how many were found? | False negatives are costly (e.g. cancer screening)
F1 Score | Balanced trade-off between precision and recall | Imbalanced datasets where both FP and FN matter
Accuracy | Of all predictions, how many were correct? | Balanced datasets with equal class distribution
Specificity | Of all actual negatives, how many were correctly identified? | Minimising false alarms
MCC | Overall quality of binary classifier, range -1 to +1 | Imbalanced datasets, where it is more informative than accuracy

Precision vs Recall Trade-off

Precision and recall usually trade off against each other. Raising the decision threshold typically increases precision (fewer false positives) but lowers recall (more false negatives); lowering it does the opposite. The optimal balance depends on your application:

  • Medical diagnosis: maximise recall, since missing a disease is far worse than a false alarm.
  • Spam detection: maximise precision, since blocking legitimate email is more disruptive than missing spam.
  • Search ranking: precision@k matters more than overall recall for top results.
  • Fraud detection: recall is critical; missing fraud is expensive, while false alerts are manageable.
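
To make the threshold effect concrete, here is a small sketch that recomputes precision and recall at three thresholds. The scores and labels are hypothetical values invented for illustration, and `precision_recall_at` is a name made up for this example.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Falling back to 0.0 on an empty denominator is an assumed convention.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical model scores and true labels (1 = positive), for illustration only.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

for threshold in (0.25, 0.55, 0.85):
    precision, recall = precision_recall_at(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={precision}  recall={recall}")
```

On this toy data, precision climbs from 0.625 to 0.8 to 1.0 as the threshold rises, while recall drops from 1.0 to 0.8 to 0.4. Real classifiers will not always move this cleanly, but the direction is typical.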

Frequently Asked Questions

What is the difference between precision and recall?

Precision answers: of all predictions labeled positive, how many were correct? Recall answers: of all actual positives in the dataset, how many did the model identify? A model can have high precision with low recall (conservative) or high recall with low precision (aggressive).

When should I use F1 score instead of accuracy?

Use F1 when your dataset is imbalanced. If 95% of samples are class 0, a model that always predicts class 0 achieves 95% accuracy yet never detects a single positive. F1 balances precision and recall and ignores true negatives, so it is not inflated by a dominant negative class.
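
The 95% scenario can be checked with a few lines of arithmetic. The sample size of 1,000 is an assumption chosen for this sketch, as is the convention of treating an undefined precision (0 ÷ 0) as 0.0.

```python
# 1,000 samples: 950 negatives (class 0) and 50 positives (class 1).
# The "model" always predicts class 0, so it never produces a positive.
tp, fp, fn, tn = 0, 0, 50, 950

accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)
# Precision is 0/0 here; treating it as 0.0 is an assumed convention.
precision = tp / (tp + fp) if tp + fp else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(f"accuracy={accuracy:.2%}  recall={recall:.2%}  f1={f1:.2%}")
# accuracy=95.00%  recall=0.00%  f1=0.00%
```

Accuracy looks strong while F1 exposes that no positive was ever found.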

What is MCC (Matthews Correlation Coefficient)?

MCC is a correlation coefficient between actual and predicted binary classifications. It ranges from -1 (every prediction inverted) to +1 (perfect prediction), with 0 meaning no better than random guessing. Because it uses all four confusion matrix cells, it is often regarded as one of the most informative single metrics for binary classification on imbalanced data.
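
The three reference points of the MCC range can be reproduced with small, made-up confusion matrices. In this sketch the function name `mcc` and the choice to return 0.0 when the denominator is zero are assumptions, not documented behaviour of the calculator.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient; returns 0.0 if the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


print(mcc(tp=50, fp=0, fn=0, tn=50))    # perfect prediction: 1.0
print(mcc(tp=0, fp=50, fn=50, tn=0))    # every prediction inverted: -1.0
print(mcc(tp=25, fp=25, fn=25, tn=25))  # predictions unrelated to labels: 0.0
```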

What does a False Positive mean?

A False Positive (Type I error) occurs when the model predicts Positive but the true label is Negative. In spam detection: a legitimate email flagged as spam. In medical testing: a healthy patient testing positive for a disease.

Can this calculator handle multi-class classification?

This tool is designed for binary classification (one positive class vs one negative class). For multi-class problems, compute per-class TP/FP/FN/TN using a one-vs-rest approach and then macro/micro average the metrics.
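
One way to sketch the one-vs-rest approach is shown below for precision only. The function names and the three-class sample data are hypothetical, and TN is left out because precision does not need it; recall and F1 would follow the same pattern using the FN counts.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """One-vs-rest TP/FP/FN per class (TN is not needed for precision or recall)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            tp[actual] += 1
        else:
            fp[predicted] += 1  # a false positive for the predicted class
            fn[actual] += 1     # a false negative for the actual class
    return tp, fp, fn


def macro_micro_precision(y_true, y_pred):
    """Macro average: mean of per-class precision. Micro average: pooled counts."""
    tp, fp, _ = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    per_class = [tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in classes]
    macro = sum(per_class) / len(per_class)
    total_tp, total_fp = sum(tp.values()), sum(fp.values())
    micro = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    return macro, micro


# Made-up three-class labels for illustration only.
y_true = ["cat", "cat", "cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "cat", "cat", "dog", "dog", "cat", "bird", "dog"]
macro, micro = macro_micro_precision(y_true, y_pred)
print(f"macro precision={macro:.3f}  micro precision={micro:.3f}")
```

Macro averaging weights every class equally, so a weak minority class pulls the score down; micro averaging pools the counts, so frequent classes dominate.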