By yangyulei, 28 April, 2026

[Kingsoft Docs | WPS Cloud Docs] Visualization: regression scatter plot of divergence versus male expression decoupling (ΔR) https://www.kdocs.cn/l/cjgX5MGZij1T

1. Input file

merged.csv: contains the divergence values and the male expression decoupling (ΔR) data.
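
Judging from the code in section 2, merged.csv is expected to provide the columns cds_div, dna_div, promoter_div and delta_raw. A minimal sketch of a pre-flight check using only the standard library follows; the helper name missing_columns and the demo values are hypothetical and not part of the original script.

```python
import csv
import io

# Columns read by the plotting code in section 2.
REQUIRED_COLUMNS = ("cds_div", "dna_div", "promoter_div", "delta_raw")


def missing_columns(handle):
    """Return the required columns that are absent from a CSV header row."""
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Demo on an in-memory file with made-up values; for real data, pass
# open("merged.csv", newline="") instead.
demo = io.StringIO("gene,cds_div,dna_div,delta_raw\ng1,0.1,0.2,0.05\n")
print(missing_columns(demo))
```

Running a check like this before plotting turns a missing column into a readable message rather than a KeyError in the middle of the figure code.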

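2. Core code

from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.linear_model import LinearRegression

# Setup that this excerpt relies on but does not show.
df = pd.read_csv("merged.csv")   # input file from section 1
OUTPUT_DIR = Path(".")           # assumption: the original output directory is not shown
# Assumption: type_map is not defined in this excerpt; the legend labels
# below are guessed from the palette keys.
type_map = {"cds_div": "CDS", "dna_div": "DNA", "promoter_div": "Promoter"}

# Colors for CDS, DNA, and Promoter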
palette = {
    "CDS": "#eba198", 
    "DNA": "#8cc0cc", 
    "Promoter": "#9CC287"
}

# Map the column names used in script 1 to the shared colors
colors_div = {
    "cds_div": palette["CDS"],
    "dna_div": palette["DNA"],
    "promoter_div": palette["Promoter"]
}

# Scatter plot with regression lines
X_raw = df[["cds_div","dna_div","promoter_div"]].values
y_raw = df["delta_raw"].values

plt.figure(figsize=(8, 6), dpi=300)
plt.grid(axis='both', color='lightgrey', linestyle='--', linewidth=0.5, alpha=0.7, zorder=0)

y_plot_limit = np.percentile(y_raw, 98) * 1.15
plt.ylim(-0.02, y_plot_limit) 

legend_handles = []

for i, col in enumerate(["cds_div","dna_div","promoter_div"]):
    Xi = X_raw[:, i].reshape(-1,1)
    reg = LinearRegression().fit(Xi, y_raw)
    slope = reg.coef_[0]
    r2 = reg.score(Xi, y_raw)
    
    # Draw the scatter points
    sns.scatterplot(
        x=Xi.flatten(), y=y_raw, s=5, alpha=0.5, 
        color=colors_div[col],
        zorder=1,
        legend=False
    )
    
    # Draw the regression line (the legend label reports R² to three decimals)
    x_min, x_max = Xi.min(), Xi.max()
    x_range = np.linspace(x_min, x_max, 100).reshape(-1,1)
    y_pred = reg.predict(x_range)
    line_handle, = plt.plot(
        x_range, y_pred, color=colors_div[col], linewidth=2.5, zorder=2,
        label=f"{type_map[col]} ($R^2$={r2:.3f})"
    )
    legend_handles.append(line_handle)

    # Annotate the slope β near the end of the regression line
    text_x = x_min + (x_max - x_min) * 0.90
    text_y = reg.predict([[text_x]])[0]
    # Angle of the line in data coordinates (radians to degrees).
    # With transform_rotates_text=True (recent Matplotlib releases) the angle is
    # converted to screen space at draw time, so the label stays parallel to the
    # line even after the axes are autoscaled and tight_layout() is applied.
    angle = np.rad2deg(np.arctan(slope))
    
    # Place the β label just above the line
    plt.text(text_x, text_y, f'  β={slope:.3f}', 
             color='black', fontsize=9, fontweight='regular',
             rotation=angle, rotation_mode='anchor', va='bottom', zorder=3,
             transform_rotates_text=True)

plt.xlabel("Divergence", fontsize=12, fontweight='bold')
plt.ylabel(r"$\Delta$ ($R_f$ - $R_m$)", fontsize=12, fontweight='bold')
plt.legend(handles=legend_handles, loc='upper left', fontsize=9, frameon=True)
plt.title("Regression Comparison", fontsize=13, fontweight='bold')
plt.tight_layout()

plt.savefig(OUTPUT_DIR / "Combined_Regression_Analysis.png", dpi=300)
plt.savefig(OUTPUT_DIR / "Combined_Regression_Analysis.pdf", transparent=True)
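
The upper y-limit in the code above is 1.15 times the 98th percentile of delta_raw, and the lower limit is fixed at -0.02, so the most extreme points fall outside the visible area while still contributing to the fitted lines. A rough standard-library sketch of that calculation follows, assuming NumPy's default linear interpolation between order statistics; the function names are illustrative and the sample values are made up.

```python
import math


def percentile_linear(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0   # fractional index into the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def y_upper_limit(values, q=98, pad=1.15):
    """Upper y-limit: the q-th percentile scaled by a padding factor."""
    return percentile_linear(values, q) * pad


def count_hidden(values, upper, lower=-0.02):
    """Number of points that fall outside the [lower, upper] window."""
    return sum(1 for v in values if v > upper or v < lower)


# Made-up data: 100 small values plus one extreme value.
demo = [0.01 * i for i in range(100)] + [5.0]
limit = y_upper_limit(demo)
print(limit, count_hidden(demo, limit))
```

Counting the hidden points on the real data is a quick way to judge whether the 98th-percentile cutoff is trimming only outliers or a visible share of the distribution.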

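3. Output files

Combined_Regression_Analysis.png: comparison of the regression lines for the three divergence variables (with β annotations).
Combined_Regression_Analysis.pdf: the same figure saved as a PDF with a transparent background.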
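
For reference, the β and R² values in the figure come from a separate single-predictor fit for each divergence column. A minimal sketch of those two quantities in plain Python follows, assuming ordinary least squares with an intercept (the default behavior of scikit-learn's LinearRegression); the function name simple_ols and the sample numbers are illustrative only.

```python
def simple_ols(x, y):
    """Return (slope, intercept, r2) for the fit y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0:
        raise ValueError("x is constant, so the slope is undefined")
    slope = sxy / sxx                      # β in the figure
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # R² = 1 - SS_res / SS_tot; left undefined (NaN) when y is constant.
    r2 = 1.0 - ss_res / syy if syy > 0 else float("nan")
    return slope, intercept, r2


# Made-up, perfectly linear data: y = 1 + 2x.
slope, intercept, r2 = simple_ols([0, 1, 2, 3], [1, 3, 5, 7])
print(f"β={slope:.3f}, intercept={intercept:.3f}, R²={r2:.3f}")
```

Because each column is fitted on its own, the three β values are marginal slopes and are not adjusted for the other two divergence measures.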