@agpwhy 2022-03-07T12:33:18.000000Z

Wang Pang's Bioinformatics Notes, Issue 40: Some Common Plots for Research Papers

You have probably seen figures like this in research papers:

image-20220307190212226

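That is, a paired comparison of two groups: a box plot is drawn for each group, and each pair of observations is also connected by a line.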

How do you draw it? This STHDA tutorial is a good one, so I'm sharing it here.

http://www.sthda.com/english/articles/32-r-graphics-essentials/132-plot-grouped-data-box-plot-bar-plot-and-more/

Of course you don't have to use R; if you're more comfortable with GraphPad, that works too.

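Setting up the environment

We mainly need the following packages.

    library(dplyr)    # data wrangling
    library(ggplot2)  # plotting (also provides the diamonds dataset)
    library(ggpubr)   # ggpaired() and stat_compare_means()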

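The data is the built-in diamonds dataset, with prices for roughly fifty thousand diamonds. To keep the plots readable, I only use three colors: D, E and F. (The original tutorial used two groups with the color scheme mentioned in its introduction, which is a sensitive one at the moment, so I use the colors of the French flag instead: https://www.flagcolorcodes.com/france.)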

Bar plots

    # Count diamonds per cut and color, keeping only colors D, E and F.
    # The result stays grouped by cut, which the stacked labels rely on later.
    df <- diamonds %>%
      filter(color %in% c("D", "E", "F")) %>%
      group_by(cut, color) %>%
      summarise(counts = n())

    # Stacked bar plot
    ggplot(df, aes(x = cut, y = counts)) +
      geom_bar(
        aes(color = color, fill = color),
        stat = "identity", position = position_stack()
      ) +
      scale_color_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      scale_fill_manual(values = c("#002395", "#FFFFFF", "#ED2939"))

    # Grouped bar plot
    ggplot(df, aes(x = cut, y = counts)) +
      geom_bar(
        aes(color = color, fill = color),
        stat = "identity", position = position_dodge(0.8),
        width = 0.7
      ) +
      scale_color_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      scale_fill_manual(values = c("#002395", "#FFFFFF", "#ED2939"))

p1

You can also label the counts directly on the bars.

    # Grouped bar plot with a count label above each bar
    ggplot(df, aes(x = cut, y = counts)) +
      geom_bar(
        aes(color = color, fill = color),
        stat = "identity", position = position_dodge(0.8),
        width = 0.7
      ) +
      scale_color_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      scale_fill_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      ggtitle("Grouped bar plot") +
      geom_text(
        aes(label = counts, group = color),
        position = position_dodge(0.8),  # same dodge width as the bars
        vjust = -0.3, size = 3.5
      ) +
      theme_dark()
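    # Compute the label position for each segment of the stacked bars.
    # df is still grouped by cut, so cumsum() restarts for every bar.
    df <- df %>%
      arrange(cut, desc(color)) %>%
      mutate(lab_ypos = cumsum(counts) - .5 * counts)
    # The .5 puts the text in the middle of each stacked segment

    ggplot(df, aes(x = cut, y = counts)) +
      geom_bar(
        aes(color = color, fill = color),
        stat = "identity", position = position_stack()
      ) +
      # The listing computed lab_ypos but never drew it; this layer uses it.
      # The gray text color is my choice so that it shows on all three fills.
      geom_text(
        aes(y = lab_ypos, label = counts, group = color),
        color = "gray50", size = 3.5
      ) +
      scale_color_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      scale_fill_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      ggtitle("Stacked bar plot")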

p3
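To see what the expression cumsum(counts) - .5 * counts works out to, here is a small check on made-up numbers. The three counts are hypothetical segment heights for a single bar, not values from diamonds.

```r
# Hypothetical segment heights for one bar, listed from bottom to top
counts <- c(10, 20, 30)

# Top edge of each segment minus half its height gives the segment midpoint
lab_ypos <- cumsum(counts) - 0.5 * counts
lab_ypos
# [1]  5 20 45
```

Replacing 0.5 with a smaller factor would move each label toward the top of its segment, and a larger factor would move it toward the bottom.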

Lollipop chart

p2

    ggplot(df, aes(cut, counts)) +
      # Stick: a light gray segment from 0 up to the count
      # (ggplot2 3.4.0 and later prefer linewidth over size for lines)
      geom_linerange(
        aes(x = cut, ymin = 0, ymax = counts, group = color),
        color = "lightgray", size = 1.5,
        position = position_dodge(0.3)
      ) +
      # Candy: a colored point at the top of each stick
      geom_point(
        aes(color = color),
        position = position_dodge(0.3), size = 3
      ) +
      scale_color_manual(values = c("#002395", "#FFFFFF", "#ED2939")) +
      theme_dark() +
      ggtitle("Lollipop chart")

In fact, ggdotchart would probably need fewer arguments, but I won't go into it here.
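For readers who want to try it, a rough sketch of the ggdotchart version might look like the following. The argument choices here are my own assumptions modeled on the geom_linerange code above, not something the post itself shows, and the conversion to a plain data frame is only a precaution.

```r
library(dplyr)
library(ggplot2)
library(ggpubr)

# Rebuild the counts table so the example is self-contained
df <- diamonds %>%
  filter(color %in% c("D", "E", "F")) %>%
  count(cut, color, name = "counts") %>%
  as.data.frame()  # plain data frame, as a precaution for ggpubr

# One call draws both the segments and the points
p <- ggdotchart(
  df, x = "cut", y = "counts",
  color = "color",
  palette = c("#002395", "#FFFFFF", "#ED2939"),
  size = 3,
  add = "segment",
  add.params = list(color = "lightgray", size = 1.5),
  position = position_dodge(0.3),
  ggtheme = theme_dark()
)
p
```

Note that ggdotchart sorts the x axis by value unless told otherwise, so the order of the cuts may differ from the ggplot2 version.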

Line plot with error bars

Let's switch to another dataset: the effect of vitamin C on tooth growth in guinea pigs (ToothGrowth).

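    df <- ToothGrowth
    df$dose <- as.factor(df$dose)

    # Mean and standard deviation of tooth length for each dose.
    # sd must be computed first, because len is overwritten by its mean.
    df.summary <- df %>%
      group_by(dose) %>%
      summarise(
        sd = sd(len, na.rm = TRUE),
        len = mean(len)
      )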

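Now draw the plot with error bars. Note that converting dose with as.numeric makes the x axis continuous, while keeping it as a factor (as.factor) makes the axis show discrete groups. Because dose was already turned into a factor above, as.numeric here returns the level codes 1, 2, 3 rather than the doses 0.5, 1, 2; use as.numeric(as.character(...)) if you want the axis to show the actual dose values.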

    df.summary$dose <- as.numeric(df.summary$dose)
    ggplot(df.summary, aes(dose, len)) +
      geom_line() +
      geom_errorbar(aes(ymin = len - sd, ymax = len + sd), width = 0.2) +
      geom_point(size = 2) +
      theme_classic()

p4
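The difference between the two conversions is easy to check in base R. This small example uses a standalone factor holding the three ToothGrowth dose values; the variable names are just for illustration.

```r
# A factor holding the three dose values
dose <- factor(c(0.5, 1, 2))

# as.numeric() on a factor returns the internal level codes
codes <- as.numeric(dose)
codes
# [1] 1 2 3

# Going through character first recovers the original numbers
values <- as.numeric(as.character(dose))
values
# [1] 0.5 1.0 2.0
```

With the level codes the three doses are evenly spaced on the x axis; with the recovered values the gap between 1 and 2 is twice the gap between 0.5 and 1.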

You can also add the individual data points of each group.

    df$dose <- as.numeric(df$dose)
    ggplot(df, aes(dose, len)) +
      # Raw observations, jittered horizontally to reduce overlap
      geom_jitter(position = position_jitter(0.2), color = "darkgray") +
      # Group means with error bars, taken from the summary table
      geom_line(aes(group = 1), data = df.summary) +
      geom_errorbar(
        aes(ymin = len - sd, ymax = len + sd),
        data = df.summary, width = 0.2
      ) +
      geom_point(data = df.summary, size = 2) +
      theme_classic()

p5

Drawing the paired comparison plot from the introduction

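    # With no id column given, observations are paired by their order
    # within each supp group
    ggpaired(ToothGrowth, x = "supp", y = "len",
             color = "supp", line.color = "gray", line.size = 0.4,
             palette = c("#002395", "#ED2939")) +
      stat_compare_means(paired = TRUE)  # adds the paired-test p-value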

p6
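If your own paired data sits in two columns (one per condition) rather than in long format, ggpaired can also take the two column names through cond1 and cond2. The sketch below uses made-up before/after values for six hypothetical subjects, so treat the numbers purely as placeholders.

```r
library(ggpubr)

# Hypothetical before/after measurements for six subjects (made-up numbers);
# each row is one subject, so the pairing is explicit
d <- data.frame(
  before = c(20.1, 18.4, 22.3, 19.8, 21.0, 17.6),
  after  = c(23.5, 19.9, 24.1, 22.7, 21.8, 20.3)
)

# cond1/cond2 name the two columns; ggpaired reshapes them internally
# and exposes the grouping as "condition"
p <- ggpaired(d, cond1 = "before", cond2 = "after",
              color = "condition", line.color = "gray", line.size = 0.4,
              palette = c("#002395", "#ED2939")) +
  stat_compare_means(paired = TRUE)
p
```

Keeping one subject per row makes it harder to pair the wrong observations by accident than relying on row order in long format.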
