By Gengxin, 30 November, 2025

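1. Because there are no biological replicates, DESeq cannot be used for the differential expression analysis (so senior labmate Huadong's snakemake pipeline cannot be used either).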

It throws an error; I handed the error message to an AI to interpret.
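Whether a design is unreplicated can be checked before choosing a tool by counting samples per group. The following is a minimal base-R sketch; the `metadata_example` table and its sample names are hypothetical, and only the `sample_id` and `group` column names follow the script later in this post.

```r
# Hypothetical metadata: one sample per condition (no biological replicates)
metadata_example <- data.frame(
  sample_id = c("ctrl_1", "treat_1"),
  group     = c("control", "treatment"),
  stringsAsFactors = FALSE
)

# Count how many samples belong to each group
samples_per_group <- table(metadata_example$group)
print(samples_per_group)

# TRUE only when every group has at least two samples
has_replicates <- all(samples_per_group >= 2)
if (!has_replicates) {
  message("At least one group has a single sample: no biological replicates.")
}
```

With a single sample per group there is no within-group variation to estimate from, which is why replicate-based methods refuse this kind of design.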

2. Solution: use NOISeq instead.

Reference:

Sonia Tarazona, Pedro Furió-Tarí, David Turrà, Antonio Di Pietro, María José Nueda, Alberto Ferrer, Ana Conesa, Data quality aware analysis of differential expression in RNA-seq with NOISeq R/Bioc package, Nucleic Acids Research, Volume 43, Issue 21, 2 December 2015, Page e140, https://doi.org/10.1093/nar/gkv711

NOISeq example (requires the NOISeq package)
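NOISeq is distributed through Bioconductor, so it is usually installed with BiocManager rather than directly from CRAN. A minimal setup sketch, assuming an internet connection and a working R installation (the post itself does not show how the packages were installed):

```r
# Install BiocManager from CRAN if it is missing
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install NOISeq from Bioconductor if it is missing
if (!requireNamespace("NOISeq", quietly = TRUE)) {
  BiocManager::install("NOISeq")
}

# dplyr and tibble are CRAN packages used by the script
for (pkg in c("dplyr", "tibble")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}
```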

# Load packages
library(NOISeq)
library(dplyr)
library(tibble)  # provides rownames_to_column(), used in step 4

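# ----------------------
# 1. Make sure the sample information matches (verify first)
# ----------------------
# Confirm that the count matrix (columns = samples) and metadata (rows = samples) hold the same number of samples
cat("Samples in count matrix:", ncol(counts_filtered), "\n")
cat("Samples in metadata:", nrow(metadata), "\n")
cat("Count matrix sample names:", colnames(counts_filtered), "\n")
cat("Metadata sample names:", as.character(metadata$sample_id), "\n")

# If the order differs, reorder metadata to follow the matrix columns (critical!)
metadata <- metadata[match(colnames(counts_filtered), metadata$sample_id), ]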

# ----------------------
# 2. Build the NOISeq input object
# ----------------------
noiseq_data <- readData(
  data = counts_filtered,
  factors = metadata  # only needs to contain the sample grouping (e.g. a group column)
)

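# ----------------------
# 3. Use noiseq() (which supports unreplicated data) instead of noiseqbio()
# ----------------------
# Key arguments:
# - k = 0.5: counts equal to 0 are replaced by k (the default value)
# - norm = "n": no normalization is applied, so counts_filtered should already be
#   normalized; otherwise choose "rpkm", "uqua" or "tmm"
# - replicates = "no": no replicates are available, so NOISeq simulates them (NOISeq-sim)
# - factor = "group": name of the grouping column (must match metadata)
noiseq_res <- noiseq(
  input = noiseq_data,
  k = 0.5,
  norm = "n",
  replicates = "no",
  factor = "group",
  conditions = c("control", "treatment")  # name both groups explicitly (change to your actual group names!)
)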

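# ----------------------
# 4. Select differentially expressed genes (based on the probability threshold q)
# ----------------------
# q = 0.8: probability of differential expression >= 0.8 (adjustable; q = 0.9 is stricter,
# and the NOISeq vignette suggests a higher threshold such as 0.9 when replicates are simulated)
deg_noiseq <- degenes(noiseq_res, q = 0.8)

# Convert to a data frame and add a gene ID column
deg_df <- as.data.frame(deg_noiseq) %>%
  tibble::rownames_to_column("gene_id") %>%  # rownames_to_column() comes from tibble, not dplyr
  arrange(desc(prob))  # sort by probability of differential expression, descending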

# Write the differentially expressed genes to a CSV file
write.csv(deg_df, "D:/NOISeq_DEGs.csv", row.names = FALSE, quote = FALSE)
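One pitfall in step 1 of the script: reordering with `match()` silently produces `NA` rows when a sample name in the count matrix is missing from the metadata. Below is a minimal base-R sketch on toy data showing the reordering together with a check that fails early; `counts_toy`, `metadata_toy`, and all values in them are hypothetical.

```r
# Toy count matrix: genes in rows, samples in columns (hypothetical values)
counts_toy <- matrix(
  c(10, 0, 5, 20, 3, 8),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("treat_1", "ctrl_1"))
)

# Toy metadata listed in a different order than the matrix columns
metadata_toy <- data.frame(
  sample_id = c("ctrl_1", "treat_1"),
  group     = c("control", "treatment"),
  stringsAsFactors = FALSE
)

# Position of each matrix column within the metadata rows
idx <- match(colnames(counts_toy), metadata_toy$sample_id)

# Stop early if any matrix column has no metadata row
if (anyNA(idx)) {
  stop("Samples missing from metadata: ",
       paste(colnames(counts_toy)[is.na(idx)], collapse = ", "))
}

# Reorder metadata rows to follow the matrix columns
metadata_toy <- metadata_toy[idx, ]

# After reordering, the names must agree position by position
stopifnot(identical(colnames(counts_toy), metadata_toy$sample_id))
```

The same check can be applied to `counts_filtered` and `metadata` before calling `readData()`, since the factors are paired with the matrix columns by position.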