By fanbingbing, 30 June, 2025

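1. We regressed the length and width measured with ImageJ against the area, width, and length measured by Tianyu's program. The regression results are shown below; the fit is poor.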

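2. Cause: the length and width reported by Tianyu's program are those of the detected bounding rectangle. If the microspore is aligned with the image axes, these values are accurate, but if it is tilted, the rectangle no longer matches the spore's own length and width, and error is introduced. This is why the regression fits poorly. We should therefore define a standard for photographing microspores: each spore must be placed straight before imaging.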

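3. To further check the reliability of the program, we also measured area and ran the same regression. The fit is good, so area is the more accurate measurement and should be used as the phenotypic data.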

4. Regression code:

import os

import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Excel workbook; every sheet must contain the columns 'area1' and 'area2'.
file_path = r"D:\回归\S2area3.xlsx"
xls = pd.ExcelFile(file_path)

plt.figure(figsize=(10, 6))

# Range of observed values across all sheets, used for the y = x reference line.
y_min, y_max = float("inf"), float("-inf")

for sheet_name in xls.sheet_names:
    df = pd.read_excel(xls, sheet_name=sheet_name)

    print(f"Data Overview for {sheet_name}:")
    print(df.head())

    # Regress area1 on area2.
    X = df[['area2']]
    y = df['area1']

    model = LinearRegression()
    model.fit(X, y)
    y_pred = model.predict(X)

    mse = mean_squared_error(y, y_pred)
    r2 = r2_score(y, y_pred)

    print(f"Overall Model Results for {sheet_name}:")
    print(f"Coefficients: {model.coef_}")
    print(f"Intercept: {model.intercept_}")
    print(f"Mean Squared Error: {mse}")
    print(f"R^2 Score: {r2}\n")

    # Fit the same model with statsmodels to get the p-value of the slope.
    X_sm = sm.add_constant(X)
    model_sm = sm.OLS(y, X_sm).fit()
    p_value = model_sm.pvalues['area2']

    plt.scatter(y, y_pred, label=f'{sheet_name} - R^2: {r2:.2f}, P-value: {p_value:.4f}')

    y_min, y_max = min(y_min, y.min()), max(y_max, y.max())

# Reference line: points on it have predicted area equal to actual area.
plt.plot([y_min, y_max], [y_min, y_max], '--', color='red', linewidth=1)
plt.xlabel('Actual area')
plt.ylabel('Predicted area')
plt.title('Linear Regression on Entire Dataset')
plt.legend()
plt.tight_layout()

# Save the figure, creating the output directory if needed.
save_dir = r"D:\wty"
os.makedirs(save_dir, exist_ok=True)
save_path = os.path.join(save_dir, 'overall_regression.png')
plt.savefig(save_path)
print(f"Saved plot to: {save_path}")

plt.show()
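The orientation effect described in point 2, and the reason area is unaffected (point 3), can be illustrated with a small geometric sketch. It is not Tianyu's program. It assumes the program reports an axis-aligned bounding rectangle, as point 2 describes, and it models the spore as an ideal 10 x 4 rectangle, which is a simplification. The function names and the dimensions are hypothetical and chosen only for illustration.

```python
import math


def rotated_rectangle(length, width, angle_deg):
    """Corners of a length x width rectangle rotated about its centre."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    corners = [(-length / 2, -width / 2), (length / 2, -width / 2),
               (length / 2, width / 2), (-length / 2, width / 2)]
    return [(x * c - y * s, x * s + y * c) for x, y in corners]


def bounding_box_size(points):
    """Width and height of the axis-aligned rectangle enclosing the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs), max(ys) - min(ys)


def polygon_area(points):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical spore: true length 10, true width 4 (arbitrary units).
for angle in (0, 15, 30, 45):
    shape = rotated_rectangle(10.0, 4.0, angle)
    box_w, box_h = bounding_box_size(shape)
    print(f"angle={angle:2d}  box={box_w:.2f} x {box_h:.2f}  "
          f"area={polygon_area(shape):.2f}")
```

Only at 0 degrees does the bounding box match the true length and width. As the tilt grows, both box dimensions drift away from the true values, while the area of the shape itself does not change. This is consistent with the good fit seen for area and the poor fit seen for length and width.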

[Kingsoft Docs | WPS Cloud Docs] Linear regression: discussion of problems and solutions
https://kdocs.cn/l/crwHwEjNSqOf