Overview
The Arkansas Children’s Nutrition Center Biostatistics and Data Innovation Core supports the center’s statistical and bioinformatics needs. It is directed by Keith Williams, Ph.D., vice chair of education in the Department of Biostatistics at UAMS. The Biostatistics and Data Innovation team collaborates with Arkansas Children’s Nutrition Center scientists, providing statistical consultation and analysis. Brian Piccolo, Ph.D., serves as associate director of the core, with a focus on statistical analysis and visualization of -omics data. An additional faculty member in the Department of Biostatistics and a data scientist collaborate with Williams and Piccolo.
Access to an Innovative Computing Facility
Team members have dedicated office space with multiple workstations and access to some of the latest computing technology.
Next-gen Sequencing
Assets for next-gen sequencing are available through the ACRI core lab, including access to a NextSeq 2000 and a MiSeq desktop sequencer.
Sequencing and Array-based Platforms
Gene expression analysis (transcriptomics), epigenetics (DNA methylation and ChIP-seq), and microbial genomics (microbial ecology and metagenomics)
Practicing in the Latest Software
Tools for mathematical modeling, bioinformatics analysis, genomics, microbiomics, and transcriptomics analyses
In Greater Detail
For bioinformatics analysis, the Arkansas Children’s Nutrition Center has an Illumina Compute Server (running Red Hat Linux), a networked 36-core 128 GB RAM/12 TB computation server (Ubuntu Linux), a networked 20-core 128 GB RAM/16 TB computation server, a stand-alone Mac Pro (12-core, 64 GB RAM, 16 TB), and two quad-core iMac computers with 32 GB RAM. Software tools for genomics, microbiomics, and transcriptomics analyses include short-read aligners (Bowtie2, BWA, STAR) and other tools such as QIIME, QIIME2, MetAMOS, SeqMonk, TopHat, Cufflinks, FastQC, MACS, Qeseq, IGV, Circos, R/Bioconductor, and Cytoscape. Tools for mathematical modeling, such as Biopython, Mathematica, MATLAB, ChromHMM, and ChromDiff, are available on separate workstations.
The R statistical language is our primary tool because of its flexibility, its reproducibility, and recent advances in interactive data visualization (Shiny, R Markdown). The Biostatistics and Data Innovation Core also develops interactive HTML-based apps that streamline our data analytics approaches and give collaborating scientists user-friendly tools to analyze high-dimensional data with little or no knowledge of R. We have recently released DAME, a free Shiny app that allows rapid, interactive exploration of 16S rRNA amplicon sequencing data. Apps for other data pipelines are in development.
Team members have access to ‘Grace’, the high-performance computing facility at UAMS. Grace is composed of 96 traditional Xeon CPU nodes, each with 28 cores and 128 GB of memory (2688 CPU cores); 96 Xeon Phi nodes, each with 64 cores, of which 80 have 384 GB of memory and 16 have 192 GB of memory (6144 Phi cores total); 3 Xeon nodes, each with 24 cores, 128 GB of memory, and 2 NVIDIA GPUs (72 CPU cores, 6 GPUs with 27,264 total cores); 7 management/login/storage interface nodes; and 1.9 PB of high-speed storage (DDN GS14KX GridScaler).
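The per-group core totals quoted for Grace follow directly from the node counts and cores per node. A minimal sketch in Python that reproduces that arithmetic is shown here; the `NodeGroup` class, the group labels, and the function names are hypothetical and chosen only for illustration, and the management nodes and GPU cores are left out because their per-node figures are not given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One homogeneous group of compute nodes (hypothetical helper type)."""
    name: str
    nodes: int
    cores_per_node: int
    memory_gb_per_node: int


# Figures copied from the description of Grace; labels are illustrative.
GRACE_GROUPS = [
    NodeGroup("Xeon CPU", 96, 28, 128),
    NodeGroup("Xeon Phi (384 GB)", 80, 64, 384),
    NodeGroup("Xeon Phi (192 GB)", 16, 64, 192),
    NodeGroup("Xeon + GPU host", 3, 24, 128),
]


def cores_by_group(groups):
    """Return a mapping of group name to total cores in that group."""
    return {g.name: g.nodes * g.cores_per_node for g in groups}


def total_cores(groups):
    """Sum the cores across all listed node groups."""
    return sum(g.nodes * g.cores_per_node for g in groups)


def total_memory_gb(groups):
    """Sum the node memory (in GB) across all listed node groups."""
    return sum(g.nodes * g.memory_gb_per_node for g in groups)


if __name__ == "__main__":
    for name, cores in cores_by_group(GRACE_GROUPS).items():
        print(f"{name}: {cores} cores")
    print(f"All listed groups: {total_cores(GRACE_GROUPS)} cores")
```

The two Xeon Phi groups together account for the 6144 Phi cores quoted above; they are kept separate here only because their memory per node differs.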