Simple Logistic Regression: which treatment gives a higher probability of cure?
In this article, we address a regression technique that relates a categorical dependent variable (for example, cured/not cured) to one or more quantitative and/or categorical independent variables. This technique is logistic regression.
We will focus on simple binary logistic regression, which relates a dichotomous dependent variable (one with two possible values) to a single independent variable.
What is logistic regression?
The objective of this statistical technique is to express the probability that an event occurs depending on certain variables, which are considered potentially influential. We will have a categorical dependent variable, which can be dichotomous or polytomous, and one or more quantitative and/or categorical independent variables.
A dichotomous dependent variable has only two possible answers: yes or no, true or false, sick or not sick, cured or not cured, success or failure. These responses are coded with a value of 1 if a certain event occurs and with a value of 0 if it does not. The coding of the variables is not trivial: it influences the way in which the mathematical calculations are carried out, and we must take it into account when interpreting the results.
In this type of process, where there are only two possible outcomes (0/1) and the probability of each outcome stays constant over a series of independent repetitions, the number of times the event occurs follows a binomial distribution.
Simple logistic regression is a statistical model used to predict a single binary variable from one other variable.
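As a minimal sketch of how such a model turns a value of the independent variable into a probability, the snippet below uses the standard logistic function, p = 1 / (1 + exp(-(b0 + b1·x))). The names `logistic`, `event_probability`, `b0` and `b1` are chosen here for illustration, and the coefficient values are arbitrary placeholders, not estimates from any data.

```python
import math


def logistic(z: float) -> float:
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def event_probability(x: float, b0: float, b1: float) -> float:
    """P(Y = 1 | X = x) under a simple logistic model with intercept b0 and slope b1."""
    return logistic(b0 + b1 * x)


# Hypothetical coefficients, for illustration only (not fitted to data).
b0, b1 = 0.0, 1.0

p_at_0 = event_probability(0, b0, b1)  # probability of the event when X = 0
p_at_1 = event_probability(1, b0, b1)  # probability of the event when X = 1

print(f"P(Y = 1 | X = 0) = {p_at_0:.3f}")
print(f"P(Y = 1 | X = 1) = {p_at_1:.3f}")
```

Whatever the coefficients, the output always lies between 0 and 1, which is why this function suits a dependent variable coded as 0/1.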
The problem and solution using a simple logistic regression
Let us see with a practical example how to perform and interpret a simple logistic regression model.
We are going to check the effectiveness of two alternative treatments on the cure of a disease.
The objective is to study whether cure is associated with the treatment or not. In other words, we are testing whether the probability of cure under treatment A is equal to, or different from, the probability of cure under treatment B.
To do this, suppose that we have performed an experiment on a random sample of 40 sick animals, randomly divided into two groups of 20 animals, each of which is given a treatment (A or B). The results obtained in the experiment are shown in the following table:
| |Treatment A (X = 1)|Treatment B (X = 0)|
|Cure (Y = 1)|18|13|
|Non-cure (Y = 0)|2|7|
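To make the 0/1 coding concrete, the following sketch expands the four counts of the table into one (X, Y) record per animal, which is the shape of data a logistic regression routine typically expects. The variable names (`counts`, `observations`, `x_values`, `y_values`) are assumptions made for this illustration.

```python
# Counts from the table, keyed by (X, Y):
# X = 1 for treatment A, X = 0 for treatment B; Y = 1 for cure, Y = 0 for non-cure.
counts = {
    (1, 1): 18,  # treatment A, cured
    (1, 0): 2,   # treatment A, not cured
    (0, 1): 13,  # treatment B, cured
    (0, 0): 7,   # treatment B, not cured
}

# One (x, y) pair per animal.
observations = [(x, y) for (x, y), n in counts.items() for _ in range(n)]

x_values = [x for x, _ in observations]
y_values = [y for _, y in observations]

print("animals:", len(observations))
print("animals given treatment A:", sum(x_values))
print("animals cured:", sum(y_values))
```

Because both variables are coded as 0/1, summing a column simply counts the animals for which the coded event occurred.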
Before proposing a logistic regression model, we can make a series of calculations:
We can estimate the probability of cure (p) for both treatments:
- Treatment B: p(Y = 1 | X = 0) = 13/20 = 0.65;
- Treatment A: p(Y = 1 | X = 1) = 18/20 = 0.90
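These two estimates are simply the proportion of cured animals in each group. A short sketch of the arithmetic, with dictionary names (`cured`, `group_size`, `p_cure`) chosen for this example:

```python
# Cured animals and group sizes for each treatment, taken from the table.
cured = {"A": 18, "B": 13}
group_size = {"A": 20, "B": 20}

# Estimated probability of cure = cured animals / animals treated.
p_cure = {t: cured[t] / group_size[t] for t in cured}

for treatment in ("B", "A"):
    print(f"Treatment {treatment}: p = {p_cure[treatment]:.2f}")
```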