R进行两因素重复测量方差分析并可视化(双组折线图)

藏宝库编辑 7 天前 3637 0 来自 中国
在仙桃学术上的生信工具内里,有一个折线图的绘图工具,可以很快速便捷的得出结论并可视化结果,当然不是说这个功能有多强大,而是统计学方法非常专业。
好比用它自带的数据https://bioinfomatics.xiantao.love/biotools/data/demo/free/linePlot/%E6%8A%98%E7%BA%BF%E5%9B%BE.xlsx
通过无脑式的鼠标点击,可得到下面一系列的结果。

1.png
可以看到有很多结果,包括各种统计学形貌、各种格式的图片,以及一个demo.R
好比统计形貌
组别1组别2数目最小值最大值中位数(Median)四分位距(IQR)下四分位上四分位均值(Mean)标准差(SD)标准误(SE)Time1Control122.8333.3062.9670.2542.8853.1393.0200.1540.044Time1Intervene122.6243.4183.0740.2972.8763.1733.0320.2580.074Time2Control122.7333.2402.9300.1872.8613.0472.9570.1480.043Time2Intervene123.4194.3184.0820.3883.8244.2124.0050.2850.082Time3Control122.6113.4142.9380.3922.8053.1972.9790.2460.071Time3Intervene125.7996.2856.0650.1855.9166.1016.0300.1390.040又好比异常值分析、正态性查验(Shapiro-Wilk normality test)、方差齐性查验(Levene's test)、协方差同质假设(Homogeneity of covariances assumption/Box’s M-test)、球形假设(Mauchly's Test for Sphericity)、两因素单向重复测量数据方差分析(Two-way mixed ANOVA)、单独效应分析(Simple effect)、多重成对比较(Pairwise Comparisons of Estimated Marginal Means)等等专业术语,而且还给了详细的统计学结果。

2.png 这种统计学结果让各人很欣慰,那么这些结果都是如何计算出来的呢???
我们可以点开demo.R这个结果,读一读代码
https://bioinfomatics.xiantao.love/biotools/code/open/lineplot.R
现在我们复现一下:
加载必要的包并导入数据

library(ggplot2)library(reshape2)library(car)library(rstatix)## 导入数据,好比我们把下载的折线图数据生存在桌面library(readxl)data <- read_excel("~/Desktop/折线图.xlsx") ## mac系统代码data # 显示结果trtTime1Time2Time3Control3.1859682.8758023.414224Control2.8326322.8621112.801333Control3.1235332.9841422.909648Control2.8809273.1469243.256431Control3.0909362.7326202.966887Control2.9209492.8652872.820161Control3.3060132.8563902.806807Control2.8858423.0024802.611145Control2.9559193.2395963.182478Control2.9785722.8185672.734901Control3.1906473.0420653.001664Control2.8838673.0632263.240626Intervene2.6239733.7719826.041055Intervene2.6639263.8416856.098856Intervene3.0452864.3013426.108293Intervene3.0547124.0655936.085687Intervene3.1278574.0976336.055522Intervene3.1476864.1349475.926182Intervene3.2480953.4191615.855298Intervene3.3260864.1826295.885426Intervene3.0938904.3180146.151859Intervene3.4183474.3036156.284906Intervene2.9353963.9868795.798966Intervene2.6977903.6400846.073529新增一列id,id即为数字,用于后续分析,必不可少

data$id <- 1:nrow(data)trtTime1Time2Time3idControl3.1859682.8758023.4142241Control2.8326322.8621112.8013332Control3.1235332.9841422.9096483Control2.8809273.1469243.2564314Control3.0909362.7326202.9668875Control2.9209492.8652872.8201616Control3.3060132.8563902.8068077Control2.8858423.0024802.6111458Control2.9559193.2395963.1824789Control2.9785722.8185672.73490110Control3.1906473.0420653.00166411Control2.8838673.0632263.24062612Intervene2.6239733.7719826.04105513Intervene2.6639263.8416856.09885614Intervene3.0452864.3013426.10829315Intervene3.0547124.0655936.08568716Intervene3.1278574.0976336.05552217Intervene3.1476864.1349475.92618218Intervene3.2480953.4191615.85529819Intervene3.3260864.1826295.88542620Intervene3.0938904.3180146.15185921Intervene3.4183474.3036156.28490622Intervene2.9353963.9868795.79896623Intervene2.6977903.6400846.07352924将短数据转换为长数据

data2 <- gather(data, key = "x", value = "value", -trt, -id)对各组进行统计学形貌

data3 <- data2 %>%   group_by(trt, x) %>%   get_summary_stats(value)trtxvariablenminmaxmedianq1q3iqrmadmeansdseciControlTime1value122.8333.3062.9672.8853.1390.2540.1563.0200.1540.0440.098ControlTime2value122.7333.2402.9302.8613.0470.1870.1372.9570.1480.0430.094ControlTime3value122.6113.4142.9382.8053.1970.3920.2522.9790.2460.0710.156InterveneTime1value122.6243.4183.0742.8763.1730.2970.2323.0320.2580.0740.164InterveneTime2value123.4194.3184.0823.8244.2120.3880.3274.0050.2850.0820.181InterveneTime3value125.7996.2856.0655.9166.1010.1850.0976.0300.1390.0400.088批量运行正态性查验(Shapiro-Wilk normality test)

for(i in unique(data[,1])){  data1 <- data[data[,1] == i,]  print(lapply(data1[,-1], function(x) shapiro.test(x)))}
$Time1
Shapiro-Wilk normality test
data:  x
W = 0.91913, p-value = 0.2788
$Time2
Shapiro-Wilk normality test
data:  x
W = 0.88011, p-value = 0.08792
$Time3
Shapiro-Wilk normality test
data:  x
W = 0.73897, p-value = 0.002055
$id
Shapiro-Wilk normality test
data:  x
W = 0.95933, p-value = 0.7742
批量运行方差齐性查验(Levene's Test)

for(i in unique(data2$x)){  data1 <- data2[data2$x == i,]  print(leveneTest(value~trt, data = data1))}
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group  1  1.5617 0.2245
22
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group  1  2.5864  0.122
22
Levene's Test for Homogeneity of Variance (center = median)
Df F value  Pr(>F)
group  1  3.8375 0.06291 .
22
Signif. codes:  0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1
两因素重复测量方差分析

anova_test(data = data2, dv = value, wid = id, within = x, between = trt)
ANOVA Table (type II tests)
$ANOVA
Effect DFn DFd       F        p p<.05   ges
1    trt   1  22 510.657 1.02e-16     * 0.918
2      x   2  44 391.826 9.19e-29     * 0.902
3  trt:x   2  44 407.706 4.01e-29     * 0.905
$Mauchly's Test for Sphericity
Effect     W     p p<.05
1      x 0.966 0.699
2  trt:x 0.966 0.699
$Sphericity Corrections
Effect   GGe      DF[GG]    p[GG] p[GG]<.05   HFe      DF[HF]    p[HF] p[HF]<.05
1      x 0.968 1.94, 42.57 6.65e-28         * 1.059 2.12, 46.61 9.19e-29         *
2  trt:x 0.968 1.94, 42.57 2.98e-28         * 1.059 2.12, 46.61 4.01e-29         *
可视化结果

ggplot(data = data3, aes(x = x, y = mean, color = trt, group = trt)) +  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), color = "black", width = 0.2) +  geom_line() +  geom_point() +  theme_bw() 5.png

当然这个图跟原图不一样,而且没有两两比较,我们可以使用ggpubrrstatix进行二次绘图。
关于rstatix进行统计学分析,着实可以Repeated Measures ANOVA in R: The Ultimate Guide - Datanovia
得到答案,详细的分析,我们暂不说了,只说如何进行统计学分析,而且自动两两比较
res.aov <- anova_test(data = data2, dv = value, wid = id, within = x, between = trt) # 组间两两比较pwc <- data2 %>%    group_by(x) %>%    pairwise_t_test(        value ~ trt, paired = TRUE,        p.adjust.method = "bonferroni" ## 校正方法,包括 "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".等    )pwcx.y.group1group2n1n2statisticdfpp.adjp.adj.signifTime1valueControlIntervene1212-0.1453235118.87e-018.87e-01nsTime2valueControlIntervene1212-12.3908123111.00e-071.00e-07****Time3valueControlIntervene1212-40.9352857110.00e+000.00e+00****可以看到已经自动进行了两两比较,我们使用rstatix包的add_xy_position()函数可以添加两两比较列表和x轴y轴位置。
pwc <- pwc %>% add_xy_position(x = "x")# 按时间分列pwcx.y.group1group2n1n2statisticdfpp.adjp.adj.signify.positiongroupsxminxmax1valueControlIntervene1212-0.1453235118.87e-018.87e-01ns3.7834Control  , Intervene0.81.22valueControlIntervene1212-12.3908123111.00e-071.00e-07****4.6834Control  , Intervene1.82.23valueControlIntervene1212-40.9352857110.00e+000.00e+00****6.6504Control  , Intervene2.83.2可视化绘图

library(ggpubr)ggline(data2,x='x',y='value',color = 'trt', #按组配色add = 'mean_sd', #添加均数标准差,也可以设置均数标准误,CI等。palette = "aaas" # aaas杂志配色)+stat_pvalue_manual(pwc, tip.length = 0, hide.ns = TRUE) + # 添加两两比较,隐藏偶然义    labs(        subtitle = get_test_label(res.aov, detailed = TRUE), # 添加整体差别        caption = get_pwc_label(pwc) # 右下角条件两两比较方法。    )当然,我们还可以继续修改图片,好比全部显示结果,取消显著性标记的下划线,大概显示详细的p值等
ggline(data2,x='x',y='value',       color = 'trt', #按组配色       add = 'mean_sd', #添加均数标准差,也可以设置均数标准误,CI等。       palette = "aaas",# aaas杂志配色ggtheme = theme_bw(), #背景xlab = F,ylab = "Score", #xy轴名称legend.title="Treatment" ,legend="top", #标签名称和位置title = "Comparison of two groups at different times" #标题)+stat_pvalue_manual(pwc, bracket.size = 0) + # 添加两两比较,显示所有结果    labs(        subtitle = get_test_label(res.aov, detailed = TRUE), # 添加整体差别        caption = get_pwc_label(pwc) # 右下角显示事后两两比较方法。    )也可以显示数值
ggline(data2,x='x',y='value',       color = 'trt', #按组配色       add = 'mean_sd', #添加均数标准差,也可以设置均数标准误,CI等。       palette = "aaas",# aaas杂志配色#背景       xlab = F,ylab = "Score", #xy轴名称       legend.title="Treatment" ,legend="right",       title = "Comparison of two groups at different times" #标题)+stat_pvalue_manual(pwc, bracket.size = 0,label = "p.adj") + # 添加两两比较,显示所有结果    labs(        subtitle = get_test_label(res.aov, detailed = TRUE), # 添加整体差别        caption = get_pwc_label(pwc) # 右下角显示事后两两比较方法。    ) 8.png
您需要登录后才可以回帖 登录 | 立即注册

Powered by CangBaoKu v1.0 小黑屋藏宝库It社区( 冀ICP备14008649号 )

GMT+8, 2024-10-18 14:26, Processed in 0.178344 second(s), 36 queries.© 2003-2025 cbk Team.

快速回复 返回顶部 返回列表