Paper
Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies
https://www.nature.com/articles/s41588-022-01051-w
Local PDF: s41588-022-01051-w.pdf
Code links
https://zenodo.org/record/6332981#.YroV0nZBzic
https://github.com/Jingning-Zhang/PlasmaProtein/tree/v1.2
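Today's post reproduces Figure 3 from the paper. It has four panels: a plain box plot, a grouped box plot, a faceted box plot, and one more box plot. The final topic is how to combine these four plots into a single figure.
First, define a ggplot2 theme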
    library(ggplot2)

    # Shared theme: blank panel background, small title and text sizes
    My_Theme <- theme(
      panel.background = element_blank(),
      title = element_text(size = 7),
      text = element_text(size = 6)
    )

The first plot: a plain box plot
Part of the example dataset
Read the dataset

    library(readxl)

    dat01 <- read_excel("data/20220627/Fig3.xlsx", sheet = "3a")

Plotting code
    p1 <- ggplot(data = dat01, aes(x = group)) +
      geom_boxplot(alpha = 0.6, notch = TRUE, notchwidth = 0.5,
                   aes(y = hsq, fill = kind)) +
      coord_cartesian(ylim = c(0, 0.5)) +
      labs(y = expression(paste("cis-", h^2)), x = NULL, title = NULL) +
      theme(legend.position = "top",
            legend.title = element_blank(),
            # One color per x-axis label, in axis order
            axis.text.x = element_text(
              color = c("#4a1486", "#4a1486", "#cb181d", "#cb181d"),
              vjust = 0.5, hjust = 0.5, angle = 15)) +
      My_Theme +
      scale_fill_manual(values = c("#4a1486", "#cb181d")) +
      theme(axis.line = element_line())

    p1
Grouped box plot
Plotting code

    dat02 <- read_excel("data/20220627/Fig3.xlsx", sheet = "3b")
    head(dat02)

    p2 <- ggplot(data = dat02, aes(x = group)) +
      geom_boxplot(alpha = 0.8, notch = TRUE, notchwidth = 0.5,
                   aes(y = acc, fill = Model)) +
      coord_cartesian(ylim = c(0, 1.2)) +
      labs(title = NULL, x = NULL,
           y = expression(paste(R^2, "/cis-", h^2))) +
      theme(legend.position = "top",
            axis.text.x = element_text(
              color = c("#4a1486", "#4a1486", "#cb181d", "#cb181d"),
              vjust = 0.5, hjust = 0.5, angle = 15)) +
      My_Theme +
      scale_fill_manual(values = c("#feb24c", "#41b6c4")) +
      theme(axis.line = element_line())

    p2

Faceted box plot
    dat03 <- read_excel("data/20220627/Fig3.xlsx", sheet = "3c")
    head(dat03)

    p3 <- ggplot(data = dat03, aes(x = model)) +
      geom_boxplot(alpha = 0.8, notch = TRUE, notchwidth = 0.5,
                   aes(y = acc, fill = model)) +
      facet_wrap(~race, ncol = 2) +
      labs(title = NULL, x = NULL,
           y = expression(paste(R^2, "/cis-", h^2))) +
      coord_cartesian(ylim = c(0, 1.2)) +
      theme(axis.text.x = element_text(color = c("#238b45", "#2171b5"),
                                       vjust = 0.5, hjust = 0.5, angle = 15),
            legend.position = "none") +
      My_Theme +
      scale_fill_manual(values = c("#238b45", "#2171b5")) +
      theme(axis.line = element_line(),
            panel.spacing.x = unit(0, 'lines'),
            strip.background = element_rect(color = "white"))

    p3

Two small tips here:
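- By default there is a gap between the two facet panels. To remove it, set panel.spacing.x = unit(0,'lines') in the theme.
- With no gap between the panels, the gray strips at the top run together. To keep them visually separate, set the strip border color to white: strip.background = element_rect(color="white")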
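The effect of these two theme settings is easiest to see side by side. The following is a minimal sketch that uses R's built-in mtcars data rather than the paper's data; the object names p_default and p_tight are only illustrative.

```r
library(ggplot2)

# Toy data (mtcars), NOT the paper's dataset; used only to show the theme settings
p_default <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
  geom_boxplot(alpha = 0.8) +
  facet_wrap(~am, ncol = 2) +
  theme(legend.position = "none")

# Same plot, with the horizontal gap between panels removed and
# a white border drawn around each strip so the strips stay distinguishable
p_tight <- p_default +
  theme(
    panel.spacing.x = unit(0, "lines"),
    strip.background = element_rect(color = "white")
  )

p_default  # default spacing between the two panels
p_tight    # panels touch; strips separated by a white outline
```

Printing both plots one after the other shows the difference: in p_tight the two panels share an edge, and only the white strip outline marks where one facet ends and the next begins.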
The last box plot

    dat04 <- read_excel("data/20220627/Fig3.xlsx", sheet = "3d")
    head(dat04)

    gtex.colors <- read_excel("data/20220627/gtex_colors.xlsx")
    gtex.colors

    # Named color vector: names are tissues (V1), values are colors (V2)
    myColors <- gtex.colors$V2
    names(myColors) <- gtex.colors$V1
    colScale <- scale_fill_manual(name = "gtex.colors", values = myColors)

    p4 <- ggplot(data = dat04, aes(x = tissue, fill = tissue)) +
      geom_boxplot(alpha = 0.8, notch = TRUE, notchwidth = 0.5,
                   aes(y = cor)) +
      theme(axis.text.x = element_text(angle = 90, hjust = 1),
            legend.position = "none",
            axis.title.y = element_text(hjust = 1)) +
      My_Theme +
      coord_cartesian(ylim = c(-0.25, 1)) +
      colScale +
      labs(x = "GTEx V7 tissue",
           y = "Correlation between cis-regulated gene \nexpression and plasma protein SOMAmers ",
           title = NULL) +
      theme(axis.line = element_line())

    p4

Combining the four plots into one figure
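    library(ggpubr)

    # Top row: panels a, b, c side by side; bottom row: panel d at full width
    p <- ggarrange(
      ggarrange(p1, p2, p3, ncol = 3,
                labels = c("a", "b", "c"),
                widths = c(0.29, 0.4, 0.31)),
      p4,
      nrow = 2,
      heights = c(0.5, 0.5),
      labels = c(NA, "d")
    )

    p

You can get the example data and code from the paper yourself, or like this post, tap "Wow", and leave a comment to request them.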
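The nested ggarrange() layout used for the combined figure can be tried without the paper's data. Below is a minimal sketch with the built-in mtcars data; the helper make_box() and the object names pa, pb, pc, pd, top_row, and fig are hypothetical, and the empty string is used here as the label of the top row in place of NA.

```r
library(ggplot2)
library(ggpubr)

# Hypothetical helper: a simple box plot of one mtcars column by cylinder count
make_box <- function(y) {
  ggplot(mtcars, aes(x = factor(cyl), y = .data[[y]])) +
    geom_boxplot() +
    labs(x = "cyl", y = y)
}

pa <- make_box("mpg")
pb <- make_box("hp")
pc <- make_box("wt")
pd <- make_box("qsec")

# Inner call: three panels in one row with relative widths
top_row <- ggarrange(pa, pb, pc, ncol = 3,
                     labels = c("a", "b", "c"),
                     widths = c(0.29, 0.4, 0.31))

# Outer call: the assembled row on top, the fourth panel below at full width
fig <- ggarrange(top_row, pd, nrow = 2,
                 heights = c(0.5, 0.5),
                 labels = c("", "d"))

fig
```

The key idea is that the result of the inner ggarrange() is itself a plot object, so it can be passed to the outer call as a single cell. Adjusting widths in the inner call and heights in the outer call controls the relative size of each panel.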
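You are welcome to follow my WeChat official account, 小明的数据分析笔记本.
It mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics literature related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.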