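Prof. Dong sent me the FPKM values for the male/female drought transcriptome from Prof. Lou's group, but the gene IDs follow Prof. Sun's annotation version, so they need to be converted first.
Stored on 92: /data3/liruiyuan/blast/home/work/positive/biada/cixiongganhan_fpkm.csv
The sun-ncc ID mapping table came from Peng.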
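Since the replacement depends entirely on this table, it may be worth checking whether any ID occurs more than once in either column, because a duplicated ID would make the conversion ambiguous in one direction. A minimal sketch, assuming the table is two tab-separated columns without a header; `parse_pairs` and `duplicated_ids` are hypothetical helpers, not part of the original script:

```python
from collections import Counter

def parse_pairs(lines):
    """Parse tab-separated lines into (first_column, second_column) pairs."""
    pairs = []
    for line in lines:
        parts = line.rstrip("\r\n").split("\t")
        # Skip blank or malformed lines (anything that is not exactly two non-empty fields).
        if len(parts) == 2 and all(parts):
            pairs.append((parts[0], parts[1]))
    return pairs

def duplicated_ids(pairs):
    """Return the IDs that occur more than once in each column."""
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    return (sorted(k for k, n in first.items() if n > 1),
            sorted(k for k, n in second.items() if n > 1))

# Usage (file name taken from the example call in these notes):
# with open("sun-ncc.csv") as f:
#     dup_first, dup_second = duplicated_ids(parse_pairs(f))
```

If both returned lists are empty, the table is one-to-one and can be used in either direction.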
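The IDs were replaced with the `replace_gene_ids` function below. It converts the sun version to the ncc version, and the same approach works for converting sun to other versions.
It does not necessarily work for converting ncc to sun.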
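def replace_gene_ids(expression_file, id_map_file, output_file):
    """Replace the gene IDs in the first column of a tab-separated expression table."""
    # Build the lookup table from the two-column, tab-separated mapping file.
    id_map = {}
    with open(id_map_file, 'r') as f:
        for line in f:
            line = line.strip()
            if not line or '\t' not in line:
                continue
            parts = line.split('\t')
            if len(parts) == 2:
                replacement_id, current_id = parts
                # Replacement direction: the key must be the ID that appears in the
                # expression file (column 2 here); the value is the ID written out (column 1).
                id_map[current_id] = replacement_id

    with open(expression_file, 'r') as infile, open(output_file, 'w') as outfile:
        # Copy the header line unchanged.
        header = infile.readline()
        outfile.write(header)
        for line in infile:
            # Strip only the line ending so that empty trailing columns are preserved.
            parts = line.rstrip('\r\n').split('\t')
            gene_id = parts[0]
            # IDs missing from the mapping table are kept as they are.
            new_gene_id = id_map.get(gene_id, gene_id)
            outfile.write('\t'.join([new_gene_id] + parts[1:]) + '\n')

# Example call
replace_gene_ids("cxganhan_fpkm.xls", "sun-ncc.csv", "cxrenamed_fpkm.xls")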
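Because `id_map.get(gene_id, gene_id)` silently keeps any ID that is missing from the mapping table, it can help to count how many IDs were actually converted. A small sketch; `split_by_mapping` is a hypothetical helper that takes the gene IDs from the expression file and a dictionary built the same way as `id_map`:

```python
def split_by_mapping(gene_ids, id_map):
    """Split gene IDs into those present in id_map and those that would stay unchanged."""
    mapped = [g for g in gene_ids if g in id_map]
    unmapped = [g for g in gene_ids if g not in id_map]
    return mapped, unmapped

# Usage: collect the first column of the expression file, then
# mapped, unmapped = split_by_mapping(gene_ids, id_map)
# print(len(mapped), "converted;", len(unmapped), "left unchanged")
```

A large number of unchanged IDs would suggest that the replacement direction or the table layout is not what the script expects.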
After the replacement, extract the expression values of the positively selected genes for the downstream analysis.
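The extraction step could look roughly like the sketch below. It assumes the renamed table is tab-separated with the gene ID in the first column and that the positively selected genes are available as a plain list of IDs; `extract_genes` and the file name `positive_genes.txt` are hypothetical and do not come from these notes:

```python
def extract_genes(expression_lines, wanted_ids):
    """Keep the header plus every row whose first column is in wanted_ids."""
    wanted = set(wanted_ids)
    lines = iter(expression_lines)
    header = next(lines, None)
    if header is None:
        return []  # empty input
    kept = [header.rstrip("\r\n")]
    for line in lines:
        row = line.rstrip("\r\n")
        # Compare only the first tab-separated field (the gene ID).
        if row.split("\t", 1)[0] in wanted:
            kept.append(row)
    return kept

# Usage (positive_genes.txt is a hypothetical file with one gene ID per line):
# with open("positive_genes.txt") as f:
#     wanted = [line.strip() for line in f if line.strip()]
# with open("cxrenamed_fpkm.xls") as f:
#     rows = extract_genes(f, wanted)
```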