Mapreads output to Eland
From Icbwiki
A tool called mapreadsToEland.py is developed to convert mapreads output to Eland default format.
Contents |
[edit]
Availability:
~maqc3/bin/
[edit]
Usage:
./mapreadsToEland.py stdin stdout
[edit]
Example:
mapreads F3.csfasta human.genome.fa M=1 Z=2 I=1 | mapreadsToEland.py > F3.eland
[edit]
Source script:
#!/usr/bin/env python
#convert mapreads output to eland default format
import sys
while 1:
line=sys.stdin.readline()
if not line: break
if line[0]=='#': continue
if line[0]=='>': sequence=sys.stdin.readline().strip() #next line is the actual sequence
fields=line.split(',')
annotation ={} # all hits in a dictionary of position and number of mismatches
for hit in fields[1:]:
annotation[hit.split('.')[0]]=int(hit.split('.')[1])
keys, values=annotation.keys(), annotation.values()
bestmatch,type,sign,ref='_','NM','F', '_'
counts=[values.count(0), values.count(1), values.count(2)]
if counts[0]==1: type, bestmatch='U0', keys[values.index(0)]
elif counts[0]>1: type='R0'
elif counts[1]==1: type, bestmatch='U1', keys[values.index(1)]
elif counts[1]>1:type='R1'
elif counts[2]==1:type, bestmatch='U2', keys[values.index(2)]
elif counts[2]>1:type='R2'
print fields[0].strip('>\n'), sequence, type,
if bestmatch[0]=='-': sign='R'
if '_' in bestmatch: ref, bestmatch = bestmatch.split('_')
if type[0]=='U': print ref, bestmatch.strip('-'), sign,'_'
else: print
