As in the previous example, we will load the Titanic dataset and apply the same manipulations to the data. We will then fit a logistic regression on the survival outcome (survived versus died) and create a table of the model coefficients.
Code
library(tidyverse)
library(skimr)
library(titanic)
library(janitor)
library(gtsummary)
library(bstfun)

df_titanic <- titanic::titanic_train %>%
  select(-c(PassengerId, Name, Ticket, Parch, Cabin)) %>% # Removing unwanted variables
  mutate(
    # First converting SibSp into a factor variable and then collapsing it
    # to 3 levels and an "other" level
    SibSp = factor(SibSp),
    SibSp = fct_lump_n(SibSp, n = 3),
    SibSp = fct_recode(SibSp, ">=3" = "Other"), # Renaming the new level
    Sex = fct_recode(Sex, # Renaming levels of categorical variables
      "Female" = "female",
      "Male" = "male"
    ),
    Embarked = fct_recode(Embarked,
      "Cherbourg" = "C",
      "Southampton" = "S",
      "Cobh" = "Q"
    )
  ) %>%
  rename( # Renaming variables
    "Passenger class" = Pclass,
    "Number of siblings" = SibSp,
    "Ticket price" = Fare,
    "City of embarkation" = Embarked
  )
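The factor collapsing inside `mutate()` can be hard to picture, so here is a minimal sketch of the same two steps on a made-up vector. The values and the names `sibsp_toy`, `lumped` and `relabelled` are illustrative assumptions, not Titanic data.

```r
library(forcats)

# Made-up sibling counts, for illustration only
sibsp_toy <- factor(c(0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4))

# Keep the 3 most frequent levels and pool the rest into "Other"
lumped <- fct_lump_n(sibsp_toy, n = 3)

# Give the pooled level a more descriptive label
relabelled <- fct_recode(lumped, ">=3" = "Other")

levels(relabelled)
table(relabelled)
```

Pooling rare levels this way keeps the later coefficient table short and avoids estimating a separate coefficient for categories with only a handful of passengers.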
Creating the model
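Standard logistic regression model with survival as the outcome. The “.” after the “~” indicates that all remaining variables in the data are used to predict the outcome. The “PassengerId” variable was already dropped when preparing the data, so it does not need to be subtracted from the formula. Note that “Survived” is coded 1 for passengers who survived and 0 for those who died, so the coefficients describe the log-odds of survival; reverse their sign to read them in terms of death.
Code
model_titanic <- glm(Survived ~ ., data = df_titanic, family = "binomial")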
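To see what the “.” shorthand and the “-” operator do inside a formula, the following sketch uses R's built-in `mtcars` data rather than the Titanic data. The column choice and the names `toy`, `fit_dot` and `fit_explicit` are assumptions made purely for illustration.

```r
# Small illustrative data set: a 0/1 outcome (am) and three candidate predictors
toy <- mtcars[, c("am", "hp", "wt", "qsec")]

# "." expands to every column except the outcome; "- qsec" then removes one of them
expanded <- terms(am ~ . - qsec, data = toy)
attr(expanded, "term.labels")

# The shorthand and the written-out formula describe the same model
fit_dot      <- glm(am ~ . - qsec, data = toy, family = "binomial")
fit_explicit <- glm(am ~ hp + wt,  data = toy, family = "binomial")

all.equal(coef(fit_dot), coef(fit_explicit))
```

A variable named after “-” must exist in the data, so subtracting a column that has already been removed will raise an error rather than be silently ignored.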