Data Updated 2019-07-12
My partner brought home a box of three week old kittens that had been brought to her work. The man who found them said that during a heavy rainstorm heard tiny meows in his yard and then saw a little kitten face scrambling to get out of a depression. He went to rescue the baby kitten and found FIVE kittens, most of them were totally submerged in rising water.
When we got the kittens home, I did the best thing I knew how to do: start collecting data on their growth.
We color coded their tail tips with the nail polish we had on hand which led to their color names.
I made a quick Excel workbook to collect the date, weight in grams, and comments for each cat. A few days later, I forgot to weigh before dinner so I weighed after eating. I realized that it would be a good way to track their food consumption. I’m glad I took the two measures. The kittens got a little older and very wiggly before meals. The measurement error would have another measurement to track average growth.
The First Week of Fostering
The Second Week of Fostering
Now Plot Their Progress
Load some packages and data. I used the forcats package to re-code factors and was looking for a way to use purrr but didn’t have a good use case.
library(knitr)
library(tidyverse)
library(gganimate)
library(sjPlot)
load("Kittens.Rda")
kable(head(kittens))
Date | Cat | Weight | Pre_post.eating | Comments |
---|---|---|---|---|
2019-06-08 | Blue | 215 | Pre-eating | Goopy eyes. Applied antibiotics. |
2019-06-08 | Green | 249 | Pre-eating | Gave eye antibiotics |
2019-06-08 | Lavender | 233 | Pre-eating | NA |
2019-06-08 | Maroon | 252 | Pre-eating | NA |
2019-06-08 | No Tip | 255 | Pre-eating | NA |
2019-06-09 | Green | 265 | Pre-eating | Diarrhea; eye antibiotics |
Plotting Kitten Growth
The raw data need a bit of re-coding. I do several things in the next block:
- Make a color variable to use in the plots.
- Re-code the “weight.taken” variable so that it will be correct in the plots.
- Order the weight.taken variable levels so that the before eating comes first.
- Make a sex variable based on the cat’s sex.
- Make a numeric date variable that will be used in the regression. Zero is the first day.
kittens.coded<-kittens %>%
mutate(color=fct_recode(Cat,
"black" = "No Tip",
"violet" = "Lavender" ),
color=tolower(color),
weight.taken=fct_recode(Pre_post.eating,
"Before Eating" = "Pre-eating" ,
"After Eating" = "Post-eating"),
weight.taken=fct_relevel(weight.taken,"Before Eating"),
sex=fct_collapse(Cat,
female= c("Blue", "Maroon"),
male= c("Green", "Lavender", "No Tip")),
day.num=as.numeric(Date)-18055)
Now to plot with ggplot2.
Note: No Tip has some missing days because someone peed on my data collection form before I had a chance to record those days
kittentraj<-kittens.coded %>%
ggplot(aes(Date, Weight, color=Cat)) +
geom_line() +
scale_color_manual(values = kittens.coded$color)+
facet_grid(~ weight.taken) +
ggtitle("Foster Kitten Weight (Grams)")
kittentraj
It is also helpful to split the plots by each kitten so that we can see their weight pre- and post-eating easier.
kittens.sep<-kittens.coded %>%
ggplot(aes(Date, Weight, color=weight.taken)) +
geom_line() +
facet_wrap(~ Cat) +
ggtitle("Foster Kitten Weight (Grams)") +
labs(color = "Weight Taken")
kittens.sep
Animating Plots
I wanted to play around a bit more with the gganimate package so I made gifs of the growth charts.
kittentraj +
geom_point(aes(group = seq_along(Date))) +
transition_reveal(Date)
kittens.sep +
geom_point(aes(group = seq_along(Date))) +
transition_reveal(Date)
Averaging growth trends
The daily data are a bit messy but look remarkably linear. I’m interested in how fast they are growing every day and whether the cats are differing in their growth trajectories. To test this, I can estimate an OLS regression model that includes the day number and controlling for measurement time and which cat is weighed.
Update 7/5/2019: The growth trend is no longer looking as linear. After the linear model, below, I estimate a curvilinear model
The intercept is when all independent variables equal 0 or substantively: the average weight in grams for Blue after eating on the first day she came home.
I added sex to the model but ran into an identification issue. Since all the measurements for each cat are either male or female controls for the cat and sex can’t be included in the same model. A multilevel model with random intercept for cats can do this and shows that the female cats are slightly smaller than the males but there is not a lot of variation between cats.
weightlm<-lm(Weight ~ day.num + Pre_post.eating + Cat, kittens.coded)
summary(weightlm)
##
## Call:
## lm(formula = Weight ~ day.num + Pre_post.eating + Cat, data = kittens.coded)
##
## Residuals:
## Min 1Q Median 3Q Max
## -68.113 -21.317 -3.033 17.126 118.144
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 193.2344 6.1112 31.620 < 2e-16 ***
## day.num 19.8228 0.2127 93.194 < 2e-16 ***
## Pre_post.eatingPre-eating -18.7584 4.0055 -4.683 4.58e-06 ***
## CatGreen 57.1807 6.2310 9.177 < 2e-16 ***
## CatLavender 53.2267 6.2310 8.542 1.18e-15 ***
## CatMaroon 20.3889 6.2306 3.272 0.00121 **
## CatNo Tip 43.3185 6.4322 6.735 1.07e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 32.38 on 257 degrees of freedom
## (6 observations deleted due to missingness)
## Multiple R-squared: 0.9722, Adjusted R-squared: 0.9716
## F-statistic: 1498 on 6 and 257 DF, p-value: < 2.2e-16
The Kittens grow 19.8g per day on average!!
A few plots can be made to show the predicted weight of a cat over time and differences in the average weight of the cats.
plot_model(weightlm, type = "pred", terms = c("day.num"))
plot_model(weightlm, type = "pred", terms = c( "Cat"),
colors = kittens.coded$color)
Blue looks to be really small in the last graph, but it does not take time into account.
To examine trajectory differences, create an interaction term between day and Cat.
The resulting plot shows the average trajectories of each cat.
weightlm.int<-lm(Weight ~ day.num + Pre_post.eating + Cat*day.num, kittens.coded)
summary(weightlm.int)
##
## Call:
## lm(formula = Weight ~ day.num + Pre_post.eating + Cat * day.num,
## data = kittens.coded)
##
## Residuals:
## Min 1Q Median 3Q Max
## -71.579 -17.624 1.222 16.539 66.429
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 236.3490 7.2124 32.770 < 2e-16 ***
## day.num 17.2723 0.3610 47.849 < 2e-16 ***
## Pre_post.eatingPre-eating -18.6419 3.1212 -5.973 7.87e-09 ***
## CatGreen -4.7568 9.9081 -0.480 0.631577
## CatLavender -48.4132 9.9060 -4.887 1.82e-06 ***
## CatMaroon 1.9328 9.9057 0.195 0.845452
## CatNo Tip 10.9804 10.0857 1.089 0.277318
## day.num:CatGreen 3.6592 0.5103 7.171 8.17e-12 ***
## day.num:CatLavender 6.0051 0.5102 11.771 < 2e-16 ***
## day.num:CatMaroon 1.0904 0.5102 2.137 0.033530 *
## day.num:CatNo Tip 1.8447 0.5452 3.384 0.000829 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25.22 on 253 degrees of freedom
## (6 observations deleted due to missingness)
## Multiple R-squared: 0.9834, Adjusted R-squared: 0.9827
## F-statistic: 1499 on 10 and 253 DF, p-value: < 2.2e-16
plot_model(weightlm.int, type = "pred", terms = c("day.num", "Cat"),
colors = kittens.coded$color)
They grow up so fast!
Curvilinear Growth Model
In the OLS, I add a polynomial term for day number. The I() function lets you evaluate the term inside the function without the +, *, or ^ being interpreted as a part of the model equation.
weight.curv<-lm(Weight ~ day.num + I(day.num^2) + Pre_post.eating + Cat, kittens.coded)
summary(weight.curv)
##
## Call:
## lm(formula = Weight ~ day.num + I(day.num^2) + Pre_post.eating +
## Cat, data = kittens.coded)
##
## Residuals:
## Min 1Q Median 3Q Max
## -60.977 -17.430 0.809 15.679 86.522
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 238.87616 6.60877 36.145 < 2e-16 ***
## day.num 12.50165 0.70128 17.827 < 2e-16 ***
## I(day.num^2) 0.21268 0.01971 10.788 < 2e-16 ***
## Pre_post.eatingPre-eating -22.71086 3.34772 -6.784 8.07e-11 ***
## CatGreen 57.25390 5.17651 11.060 < 2e-16 ***
## CatLavender 53.15350 5.17651 10.268 < 2e-16 ***
## CatMaroon 20.38889 5.17614 3.939 0.000106 ***
## CatNo Tip 45.13954 5.34632 8.443 2.33e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 26.9 on 256 degrees of freedom
## (6 observations deleted due to missingness)
## Multiple R-squared: 0.9809, Adjusted R-squared: 0.9804
## F-statistic: 1878 on 7 and 256 DF, p-value: < 2.2e-16
plot_model(weight.curv, type = "pred", terms = c("day.num"),
colors = kittens.coded$color)
Adding interaction terms for day number and the squared term with Cat estimates a curvilinear trajectory for each cat.
weight.curv.int<-lm(Weight ~ day.num*Cat + I(day.num^2)*Cat + Pre_post.eating + Cat, kittens.coded)
summary(weight.curv.int)
##
## Call:
## lm(formula = Weight ~ day.num * Cat + I(day.num^2) * Cat + Pre_post.eating +
## Cat, data = kittens.coded)
##
## Residuals:
## Min 1Q Median 3Q Max
## -44.075 -8.861 0.361 8.925 38.651
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 268.65596 6.80965 39.452 < 2e-16 ***
## day.num 12.25007 0.88625 13.822 < 2e-16 ***
## CatGreen 28.99428 9.48089 3.058 0.00247 **
## CatLavender -4.22161 9.47675 -0.445 0.65637
## CatMaroon -1.66805 9.47666 -0.176 0.86042
## CatNo Tip 2.27275 9.51064 0.239 0.81133
## I(day.num^2) 0.14481 0.02483 5.833 1.69e-08 ***
## Pre_post.eatingPre-eating -22.57542 1.91595 -11.783 < 2e-16 ***
## day.num:CatGreen -1.91283 1.25280 -1.527 0.12808
## day.num:CatLavender -1.34980 1.25251 -1.078 0.28223
## day.num:CatMaroon 1.68849 1.25243 1.348 0.17883
## day.num:CatNo Tip 3.21027 1.26743 2.533 0.01193 *
## CatGreen:I(day.num^2) 0.16083 0.03510 4.582 7.30e-06 ***
## CatLavender:I(day.num^2) 0.21280 0.03510 6.063 4.94e-09 ***
## CatMaroon:I(day.num^2) -0.01730 0.03509 -0.493 0.62246
## CatNo Tip:I(day.num^2) -0.03500 0.03617 -0.968 0.33412
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.37 on 248 degrees of freedom
## (6 observations deleted due to missingness)
## Multiple R-squared: 0.994, Adjusted R-squared: 0.9936
## F-statistic: 2718 on 15 and 248 DF, p-value: < 2.2e-16
plot_model(weight.curv.int, type = "pred", terms = c("day.num", "Cat"),
colors = kittens.coded$color)