Thursday, December 19, 2013

Browser based Shell connection

Most of the time I need to log in to the server to submit jobs, check the status of running jobs, or do some programming; at my desk I use MTPuTTY for this. But what if you are on vacation and want a quick look at the jobs you are running, or want to submit a new one? I started googling and found what I was looking for: "A web browser based secure shell connection extension for Google Chrome".


Install it and give it a shot; it's pretty impressive. You can customize it (Tools -> Extensions -> Options in the Secure Shell extension).

Monday, December 2, 2013

File checking in Perl

A list of checks you can perform on a file in Perl.
Commonly used checks are in bold font.
-r      File is readable by effective uid/gid.
-w      File is writable by effective uid/gid.
-x      File is executable by effective uid/gid.
-o      File is owned by effective uid.
-R      File is readable by real uid/gid.
-W      File is writable by real uid/gid.
-X      File is executable by real uid/gid.
-O      File is owned by real uid.
-e      File exists.
-z      File has zero size (is empty).
-s      File has nonzero size (returns size in bytes).
-f      File is a plain file.
-d      File is a directory.
-l      File is a symbolic link.
-p      File is a named pipe (FIFO), or Filehandle is a pipe.
-S      File is a socket.
-b      File is a block special file.
-c      File is a character special file.
-t      Filehandle is opened to a tty.
-u      File has setuid bit set.
-g      File has setgid bit set.
-k      File has sticky bit set.
-T      File is an ASCII text file (heuristic guess).
-B      File is a "binary" file (opposite of -T).
-M      Script start time minus file modification time, in days.
-A      Same for access time.
-C      Same for inode change time (Unix; may differ on other platforms).

Split BAM files

In the GATK local realignment process, it's recommended to use as many BAMs as possible so that the samples help each other during realignment. At the same time, MuTect requires two separate BAM files, one for blood and one for tumor. This dilemma means the BAM file generated after the LR/BQSR process has to be split into sample-level BAM files.

The following Python code takes a BAM file, splits it based on the sample read ID, and indexes the newly generated BAM files.


To denote that the new BAM files are already locally realigned and quality recalibrated, I added "BQSR.LR" to the names of the new BAMs I generate; it can be removed or replaced if you don't need it.
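
A minimal sketch of this kind of split, under a few assumptions of mine: it uses the third-party pysam library, it assumes every read carries an RG tag, and it assumes each @RG header line names its sample in the SM field. The function names (sample_bam_name, read_group_to_sample, sample_header, split_bam) are hypothetical, chosen just for this illustration. Each output keeps only the read groups of its own sample in the header, and the input is assumed to be coordinate sorted so the outputs can be indexed directly.

```python
import os

SUFFIX = "BQSR.LR"  # marks outputs as realigned and recalibrated; change or drop as needed


def sample_bam_name(sample, out_dir=".", suffix=SUFFIX):
    """Build the output path for one sample, e.g. ./tumor.BQSR.LR.bam."""
    parts = [sample, suffix, "bam"] if suffix else [sample, "bam"]
    return os.path.join(out_dir, ".".join(parts))


def read_group_to_sample(header):
    """Map each read group ID to its sample name using the @RG header lines."""
    return {rg["ID"]: rg["SM"] for rg in header.get("RG", []) if "SM" in rg}


def sample_header(header, sample):
    """Copy the header dict, keeping only the @RG lines of one sample."""
    new_header = dict(header)
    new_header["RG"] = [rg for rg in header.get("RG", []) if rg.get("SM") == sample]
    return new_header


def split_bam(bam_path, out_dir="."):
    """Write one indexed BAM per sample; return {sample: output path}."""
    import pysam  # third-party; imported here so the helpers above work without it

    paths, writers = {}, {}
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        header = bam.header.to_dict()
        rg_sample = read_group_to_sample(header)
        for sample in sorted(set(rg_sample.values())):
            paths[sample] = sample_bam_name(sample, out_dir)
            writers[sample] = pysam.AlignmentFile(
                paths[sample], "wb", header=sample_header(header, sample)
            )
        try:
            for read in bam.fetch(until_eof=True):
                # Reads without an RG tag, or with an unknown one, are skipped.
                if read.has_tag("RG"):
                    sample = rg_sample.get(read.get_tag("RG"))
                    if sample in writers:
                        writers[sample].write(read)
        finally:
            for writer in writers.values():
                writer.close()
    for path in paths.values():
        pysam.index(path)  # needs coordinate-sorted input
    return paths
```

Calling split_bam("merged.bam") (a made-up file name) would write one file per sample next to the script, named like tumor.BQSR.LR.bam, plus its index. Set SUFFIX to an empty string to get plain tumor.bam instead.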