Tuesday, November 3, 2015

Python: Calculate GC content for a BAM file

import pysam
from sys import argv

bam_file = argv[1]
bed_file = argv[2]

bam = pysam.AlignmentFile(bam_file, 'rb')
with open(bed_file) as bf:
    for line in bf:
        line_parts = line.strip().split()
        chr = line_parts[0]
        start = int(line_parts[1])
        end = int(line_parts[2])
        read_data = bam.fetch(chr, start, end)
        total_bases = 0
        gc_bases = 0
        for read in read_data:
            seq = read.query_sequence
            total_bases += len(seq)
            gc_bases += len([x for x in seq if x == 'C' or x == 'G'])
        if total_bases == 0:
            gc_percent = 'No Reads'
        else:
            gc_percent = '{0:.2f}%'.format(float(gc_bases)/total_bases * 100)
        print '{0}\t{1}'.format(line.strip(), gc_percent) 
You need to have Pysam installed.

1 comment:

  1. Wang Zheng Yuan: Python: Calculate Gc Content For A Bam File >>>>> Download Now

    >>>>> Download Full

    Wang Zheng Yuan: Python: Calculate Gc Content For A Bam File >>>>> Download LINK

    >>>>> Download Now

    Wang Zheng Yuan: Python: Calculate Gc Content For A Bam File >>>>> Download Full

    >>>>> Download LINK M0

    ReplyDelete