Thursday, February 27, 2014

A script for querying the UCSC database: converting UCSC IDs to gene symbols



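assembly='hg19'       ## Genome assembly (database) to query, e.g. hg19 for human
ucsc_ids="'uc001bek.2', 'uc001hku.1', 'uc001jal.3'" ## Quoted, comma-separated UCSC IDs to convert

## Query the public UCSC MySQL server; kgXref maps each knownGene ID to its gene symbol
mysql --user=genome --host=genome-mysql.cse.ucsc.edu --disable-auto-rehash -e \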
"SELECT DISTINCT knownGene.name, kgXref.geneSymbol
FROM $assembly.knownGene INNER JOIN $assembly.kgXref ON
    knownGene.name = kgXref.kgID
WHERE knownGene.name IN ($ucsc_ids);" > ucsc2symbols.txt

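## Sample output (tab-separated, with a header row). The rows below appear to come
## from a run against a mouse assembly, so the hg19 query above will return other names.
cat ucsc2symbols.txt
name	geneSymbol
uc007aet.1	mKIAA1889
uc007aeu.1	Xkr4
uc007aev.1	AK149000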
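
Typing the quoted ID list by hand gets tedious for more than a few IDs. A minimal sketch of one way to build it from a file, assuming one UCSC ID per line; the helper name quote_ids and the file name ucsc_ids.txt are hypothetical and not part of the original script. It does no escaping, so it is only suitable for trusted input that contains plain IDs.

```shell
# Hypothetical helper: turn one-ID-per-line input into a quoted,
# comma-separated list suitable for the IN (...) clause of the query.
quote_ids() {
    # Skip blank lines, wrap the first field of each line in single
    # quotes (\047), and join the results with ", ".
    awk 'NF { printf "%s\047%s\047", sep, $1; sep = ", " } END { print "" }' "$@"
}
```

With that in place, the hard-coded assignment could be replaced by something like ucsc_ids=$(quote_ids ucsc_ids.txt), leaving the rest of the script unchanged.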
