Monday, September 23, 2013

MySQL: commonly used syntax example

this is a quick dirty list of some commonly used SQL syntax, summarized by WZY:-

The delimited dump of data:


SELECT * INTO OUTFILE '/tmp/result.txt'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  FROM mydb1.table1;


LOAD DATA INFILE '/tmp/test.txt' INTO TABLE mydb1.table1 
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES STARTING BY 'xxx',
  LINES TERMINATED BY '\n';
 

Note: 
(1)Backing up your data using the delimited dump is more efficient than mysqldump, and the dump file is smaller on disk.
(2)To dump or restore multiple tables at once, you can use mk-parallel-dump.
 
(3)Here you should make sure the in/out file is readable/writeable by mysql (for example, the file is owned by mysql) . Otherwise, you will have a permission problem.


ALTER TABLE t ADD COLUMN c1_newtype NEWTYPE
UPDATE t SET c1_newtype = c1
ALTER TABLE t DROP COLUMN c1
RENAME COLUMN t.c1_newtype TO c1
----
example: change a column name

ALTER TABLE genes CHANGE oldName newName VARCHAR(9);
----

SHOW GLOBAL VARIABLES;
SHOW GLOBAL STATUS:
SHOW TABLE STATUS;
CHECK TABLE mytable;
OPTIMIZE TABLE mytable;
ANALYZE TABLE mytable;
REPAIR TABLE mytable;
 
 
-------------------------
Example: to create a table
 
 
CREATE TABLE test.genes(
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
gene VARCHAR(15),
chrom VARCHAR(6),
direction CHAR(1),
start1 INT,
stop1 INT,
start2 INT,
stop2 INT,
exons INT,
exonStarts VARCHAR(60),
exonStops VARCHAR(60),
unknown varchar(8),
remark VARCHAR(15)
) ENGINE=InnoDB; 

---------------------------
Here is the official table from UCSC

CREATE TABLE `knownGene` (
  `name` varchar(255) NOT NULL DEFAULT '',
  `chrom` varchar(255) NOT NULL DEFAULT '',
  `strand` char(1) NOT NULL DEFAULT '',
  `txStart` int(10) unsigned NOT NULL DEFAULT '0',
  `txEnd` int(10) unsigned NOT NULL DEFAULT '0',
  `cdsStart` int(10) unsigned NOT NULL DEFAULT '0',
  `cdsEnd` int(10) unsigned NOT NULL DEFAULT '0',
  `exonCount` int(10) unsigned NOT NULL DEFAULT '0',
  `exonStarts` longblob NOT NULL,
  `exonEnds` longblob NOT NULL,
  `proteinID` varchar(40) NOT NULL DEFAULT '',
  `alignID` varchar(255) NOT NULL DEFAULT '',
  KEY `name` (`name`),
  KEY `chrom` (`chrom`(16),`txStart`),
  KEY `chrom_2` (`chrom`(16),`txEnd`),
  KEY `protein` (`proteinID`(16)),
  KEY `align` (`alignID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;


(to make this post easier to find, I leave a few unique/weird labels here: mysql import infile wzy 
alex138671 
)

Friday, September 20, 2013

Tools for Monitoring Network Performance

Multi Router Traffic Grapher (MRTG)

http://oss.oetiker.ch/mrtg
This is a tool to monitor devices (routers, switches, servers, networks).

What it does
You have a router, you want to know what it does all day long? Then MRTG is for you. It will monitor SNMP network devices and draw pretty pictures showing how much traffic has passed through each interface.
Routers are only the beginning. MRTG is being used to graph all sorts of network devices as well as everything else from weather data to vending machines.
MRTG is written in perl and works on Unix/Linux as well as Windows and even Netware systems. MRTG is free software licensed under the Gnu GPL.

Sample Graph


Smokeping

http://oss.oetiker.ch/smokeping

SmokePing keeps track of your network latency:
  • Best of breed latency visualisation.
  • Interactive graph explorer.
  • Wide range of latency measurment plugins.
  • Master/Slave System for distributed measurement.
  • Highly configurable alerting system.
  • Live Latency Charts with the most 'interesting' graphs.
  • Free and OpenSource Software written in Perl written by Tobi Oetiker, the creator of MRTG and RRDtool

Cacti

http://www.cacti.net




Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices.

Friday, September 13, 2013

IP tables startup script example

#! /bin/sh
### BEGIN INIT INFO
# Provides:          iptables_custom
# Required-Start:    $networking
# Required-Stop:
# Default-Start:
# Default-Stop:      0 6
# Short-Description: Custom bridged iptables rules
### END INIT INFO

PATH=/sbin:/bin
IPTABLES=/sbin/iptables
LOCALIP=10.0.0.253
LOCALLAN=10.0.0.0/24
WEBPROXY=10.0.0.111

. /lib/lsb/init-functions

do_start () {
      log_action_msg "Loading custom iptables rules"

      # Flush active rules, custom tables
      $IPTABLES --flush
      $IPTABLES --delete-chain

      # Set default-deny policies for all three default tables
      $IPTABLES -P INPUT DROP
      $IPTABLES -P FORWARD DROP
      $IPTABLES -P OUTPUT DROP

      # Don't restrict loopback (local process intercommunication)
      $IPTABLES -A INPUT -i lo -j ACCEPT
      $IPTABLES -A OUTPUT -o lo -j ACCEPT

      # Block attempts at spoofed loopback traffic
      $IPTABLES -A INPUT -s $LOCALIP -j DROP

      # pass DHCP queries and responses
      $IPTABLES -A FORWARD -p udp --sport 68 --dport 67 -j ACCEPT
      $IPTABLES -A FORWARD -p udp --sport 67 --dport 68 -j ACCEPT

      # Allow SSH to firewall from the local LAN
      $IPTABLES -A INPUT -p tcp -s $LOCALLAN --dport 22 -j ACCEPT
      $IPTABLES -A OUTPUT -p tcp --sport 22 -j ACCEPT

      # pass HTTP and HTTPS traffic only to/from the web proxy
      $IPTABLES -A FORWARD -p tcp -s $WEBPROXY --dport 80 -j ACCEPT
      $IPTABLES -A FORWARD -p tcp --sport 80 -d $WEBPROXY -j ACCEPT
      $IPTABLES -A FORWARD -p tcp -s $WEBPROXY --dport 443 -j ACCEPT
      $IPTABLES -A FORWARD -p tcp --sport 443 -d $WEBPROXY -j ACCEPT

      # pass DNS queries and their replies
      $IPTABLES -A FORWARD -p udp -s $LOCALLAN --dport 53 -j ACCEPT
      $IPTABLES -A FORWARD -p tcp -s $LOCALLAN --dport 53 -j ACCEPT
      $IPTABLES -A FORWARD -p udp --sport 53 -d $LOCALLAN -j ACCEPT
      $IPTABLES -A FORWARD -p tcp --sport 53 -d $LOCALLAN -j ACCEPT

      # cleanup-rules
      $IPTABLES -A INPUT -j LOG --log-prefix "Dropped by default (INPUT):"
      $IPTABLES -A INPUT -j DROP
      $IPTABLES -A OUTPUT -j LOG --log-prefix "Dropped by default (OUTPUT):"
      $IPTABLES -A OUTPUT -j DROP
      $IPTABLES -A FORWARD -j LOG --log-prefix "Dropped by default (FORWARD):"
      $IPTABLES -A FORWARD -j DROP
}

do_unload () {
      $IPTABLES --flush
      $IPTABLES -P INPUT ACCEPT
      $IPTABLES -P FORWARD ACCEPT
      $IPTABLES -P OUTPUT ACCEPT
}

case "$1" in
  start)
        do_start
        ;;
  restart|reload|force-reload)
        echo "Reloading iptables rules"
        do_unload
        do_start
        ;;
  stop)
        echo "DANGER: Unloading firewall's Packet Filters!"
        do_unload
        ;;
  *)
        echo "Usage: $0 start|stop|restart" >&2
        exit 3
        ;;
esac

Monday, September 9, 2013

Top Next Generation Sequencing Forums

Forums offer a great opportunity to mind share with like minded professionals across the globe. They offer a vehicle to ask and potentially receive quick answers to just about any industry related question. Forums also offer a vehicle for collaboration and improved decision making by harnessing the power of the crowd. So which are the top next gen sequencing forums? These 4 were selected for having the largest number of active members discussing next gen sequencing specifically.

Top Next Generation Sequencing Forums

  1. Seqanswers: A top list of NGS forums would not be complete without SEQanswers. This forum is dedicated solely to this topic. Seqanswers was founded to be an information resource and user-driven community focused on all aspects of next-generation genomics.
  2. Nature Network: Nature Network is a professional networking website for scientists around the world. It is not dedicated specifically to next generation sequencing but they do have many members active on the subject. Nature is an online meeting place where you and your colleagues can gather, share and discuss ideas, and keep in touch. It’s also where you can consult the community for answers to scientific questions or offer your expertise to help others.
  3. The Science Advisory Board: This forum has as its mission to improve communications between medical and life science professionals and the companies who provide this community with products and services. To accomplish this mission, The Science Advisory Board conducts studies about the products and services NGS professionals currently use. Companies value this information and use it to improve existing products and services or to develop new ones that better meet the needs of their customers.
  4. LinkedIn: Little explanation required. Here are the top 3 forums (based on the number of members)
Forums can be an extremely valuable tool in a career involving next generation sequencing (or otherwise), enabling both researchers and NGS providers a fantastic way to collaborate and build relationships. Undoubtedly there are other great next gen sequencing forums out there. If you have any you would like to share we would love to here about it!

Sunday, September 8, 2013

12 Best MySQL Database Books

All open source applications use MySQL database.
With the introduction of stored procedures in MySQL 5, it is getting widely adopted by Oracle developers and commercial organizations.
If you are sysadmin, DBA, or developer, or someone who manages IT development, you should have good knowledge on MySQL database.

A while back, I provided a list of 12 best Linux books.
In this article, I’ve provided a list of 12 best MySQL books that covers various topics in MySQL for various skill levels. You’ll not be disappointed with any of these MySQL books. These books will be a great addition to your technical library.
  1. High Performance MySQL: Optimization, Backups, Replication, and More, by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, Derek J. Balling. This is hands-down the best book on MySQL performance. If you are running a big MySQL database in production, this book is a must. Every MySQL DBA (sysadmins, and developers too) should read this book. One of the authors of the book worked as performance engineer at the MySQL AB company. This book explains everything you need to know about MySQL performance including MySQL architecture, MySQL benchmarking tools, MySQL schema optimization, SQL query performance, optimizing server settings, MySQL replication and high availability, and MySQL security.
  2. MySQL Cookbook, by Paul DuBois. This is a big book with 900+ pages. If you are a developer who has to store and retrieve data from MySQL database, this book is a must. This explains how to use mySQL client, how to write MySQL program that connects to a DB, selects a DB, and queries data. This book explains about table management, strings, date and time, query sorting, and summary report generation. You’ll also learn about how to import and export data, and create stored procedures.
  3. MySQL Administrator’s Bible, by Sheeri K. Cabral and Keith Murphy. This is probably the only book that covers MySQL 5.1 in full detail. This also explains about MySQL 6. This covers all aspects of MySQL database administration including MySQL server tuning, query optimization, index tuning, managing storage engines, caching, authentication and managing users, partitioning and replication, logging, database and performance monitoring, security, backup and recovery, managing availability and scaling.
  4. MYSQL in a Nutshell, by Russell Dyer. If you’ve used other “In a Nutshell” series from O’Reilly, you would love this book. Please keep in mind that this is a reference manual to MySQL statements, functions, and administrative utilities. This contains MySQL programming lanague API for PHP, Perl, and C. MySQL replication, triggers, and stored procedures are explained with examples.
  5. MySQL (4th Edition), by Paul DuBois. This is a huge book with around 1200 pages. The fact that this book is in 4th edition shows how popular this book is. It is well organized around three topics 1) General MySQL, 2) MySQL programming interface for C, Perl and PHP, and 3) MySQL Administration. The appendix in this book is well structured with data type reference, MySQL operator reference, functions reference, and SQL syntax reference.
  6. MySQL High Availability: Tools for Building Robust Data Centers, by Charles Bell, Mats Kindahl, Lars Thalmann. This book is written by the team who devloped the MySQL replication functionality. If you are planning to implement MySQL replication, this book is a must. This is the best book that covers MySQL replication in a great depth. The book is organized into three high level parts. Part 1 covers MySQL replication fundamentals, binary log, replication for high availability, replication for scale-out, and advanced replication. Part 2 covers storage engine monitoring, replication monitoring, replication troubleshooting in great depth. Part 3 covers MySQL in cloud computing, and MySQL cluster.
  7. Understanding MySQL Internals, by Sasha Pachev. Just like the title says, this book explains about how the MySQL internal works by walking us through the source code. This book was written by a former MySQL developer. This book is not for MySQL beginner. But if you’ve been using MySQL for a while, you’ll learn a lot of details about how MySQL works, how to add your own storage engine, understand the core classes, structures, variables and API of MySQL source code.
  8. MySQL Stored Procedure Programming, by Guy Harrison, Steven Feuerstein. Oracle developers never took MySQL seriously, until the stored procedures were implemented in MySQL 5. This book explains everything you need to know about how to implement stored procedures in MySQL with lot of practical examples. This book is organized in four high level parts. Part 1 covers the stored procedure fundamentals, including language fundamentals, blocks, conditional statements, and error handling. Part 2 covers stored procedure construction, including creating and maintaining stored procedures, transation management, MySQL built-in functions, stored functions, and triggers. Part 3 covers how to use MySQL stored procedures in PHP, Java, and Perl. Part 4 covers optimizing stored procedures, including security, and tunning.
  9. Pro MySQL (Expert’s Voice in Open Source), by Michael Kruckenberg, Jay Pipes. Just like the name of the book suggests, this is not for MySQL beginners. This is for intermediate to advance MySQL users and administrators. On a high level, this book is organized into two parts 1) For MySQL design and development that covers MySQL architecture, indexes, stored procedures, functions, cursors, views, and triggers in a very detailed level 2) MySQL administration that discuss about security, backup and restore, replication, cluster in a very detailed level.
  10. Learning MySQL, by Seyed Tahaghoghi, Hugh Williams. This is a wonderful book for those who are new to MySQL, and would like someone to walk them through step-by-step on various MySQL features. This book is organized into 6 high level parts that covers MySQL introduction, Using MySQL, advanced MySQL topics, web development with MySQL and PHP, web development with MySQL and Perl.
  11. MySQL Pocket Reference: SQL Functions and Utilities (Pocket Reference (O’Reilly)), by George Reese. Even for those who use MySQL database everyday, it is hard to remember all MySQL commands and its syntax. This is a reference book that should be on your desk for a quick lookup on MySQL commands and options. This reference guide provides syntax and options for data types, sql, operators, and functions.
  12. Expert MySQL, by Dr. Charles A. Bell. This is not for beginners. Once you understand the basics of MySQL, if you are looking to extend your MySQL knowledge, you should read this book. This book is organized into three high level parts. Part 1 covers anotomy of MySQL database, and provides a tour of the MySQL source code. Part 2 explains how to extend MySQL including debugging, building custom storage engine, adding custom functions and commands to MySQL database. Part 3 covers advanced database internals including system internals, query representation, query optimization, and query execution in a great depth.

Thursday, September 5, 2013

Calculate FDR-adjusted p values in R

#This example shows how to calculate FDR-adjusted p values.


p<-c(...);

p2<-c(p.adjust(p, method=p.adjust.methods[7],n=length(p)));

write.table(p2,file="X://wzy/out.txt", row.names=FALSE, col.names=FALSE,sep=",");


Note:
> p.adjust.methods
[1] "holm"       "hochberg"   "hommel"     "bonferroni" "BH"      
[6] "BY"         "fdr"        "none"