Learn the data skills necessary for turning large sequencing datasets into reproducible and robust biological findings. With this practical guide, you'll learn how to use freely available open source tools to extract meaning from large, complex biological datasets.

At no other point in human history has our ability to understand life's complexities been so dependent on our skills to work with and analyze data. This intermediate-level book teaches the general computational and data skills you need to analyze biological data. If you have experience with a scripting language like Python, you're ready to get started.

- Go from handling small problems with messy scripts to tackling large problems with clever methods and tools
- Process bioinformatics data with powerful Unix pipelines and data tools
- Learn how to use exploratory data analysis techniques in the R language
- Use efficient methods to work with genomic range data and range operations
- Work with common genomics data file formats like FASTA, FASTQ, SAM, and BAM
- Manage your bioinformatics project with the Git version control system
- Tackle tedious data processing tasks with Bash scripts and Makefiles
About the Author

Vince Buffalo is a bioinformatician at the UC Davis Department of Plant Sciences, in Jorge Dubcovsky's wheat genomics lab. Before this, he was the primary statistical programmer at the UC Davis Genome Center's Bioinformatics Core, where he analyzed many diverse genomics datasets. An obsessive programmer since he was a young teenager, Vince was drawn to the statistical and computational problems of genomics. He works on open source bioinformatics tools in his work and free time, and enjoys fly fishing and cooking when away from the computer.
Book Information

ISBN 9781449367374
Author Vince Buffalo
Format Paperback
Page Count 300
Imprint O'Reilly Media
Publisher O'Reilly Media