1) Computational
genomics of single-molecule
long reads
The long reads from
single-molecule sequencing (SMS)
technologies such as PacBio
single-molecule real-time (SMRT) and
Oxford Nanopore have many advantages in
genomic studies. However, applying SMS
to large genomes has suffered from
high computational cost. We developed an
ultra-fast Mapping, Error Correction,
and de novo Assembly Tool (MECAT) for
SMS reads [Nature
Methods 2017]. MECAT is computationally
more efficient than current tools, while
its results are comparable or improved.
Another advantage of SMS technology is
its ability to detect the less-studied
DNA 6mA and 4mC modifications at
single-nucleotide resolution. We
developed MethSMRT, the first resource
hosting DNA 6mA and 4mC methylomes [Nucleic
Acids Research 2017].
2) Translational regulation of
gene expression
Ribosome profiling is a
technique that enables genome-wide
investigation of in vivo translation at
sub-codon resolution, to understand the
composition, regulation and mechanism of
translation. Using ribosome profiling,
we can study not only the
translational regulation of
protein-coding genes [Journal
of Virology 2016; BMC
Genomics 2017; Protein
& Cell 2018], but also the coding
potential of non-coding genes [Nucleic
Acids Research 2017]. A
comprehensive collection of ribosome
profiling datasets can be found in [Nucleic
Acids Research 2015]. In addition,
a review of computational resources and
tools for ribosome profiling can be
found in [Briefings
in Bioinformatics 2017].
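To illustrate what sub-codon resolution means in practice, the minimal sketch below counts how many ribosome footprints fall in each of the three reading frames of a coding sequence. It is not taken from any of the tools cited above: the function name frame_fractions, the assumption that P-site positions have already been inferred for each footprint, and the toy coordinates are all hypothetical. In a good ribosome profiling library, most footprints are expected to fall in a single frame.

```python
from collections import Counter


def frame_fractions(p_site_positions, cds_start):
    """Return the fraction of footprints in reading frames 0, 1 and 2.

    p_site_positions: transcript coordinates of inferred P-sites
                      (assumed to be precomputed; hypothetical input).
    cds_start:        transcript coordinate of the first base of the
                      start codon, which defines frame 0.
    """
    counts = Counter((pos - cds_start) % 3 for pos in p_site_positions)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts[frame] / total for frame in (0, 1, 2)]


# Toy coordinates, made up for illustration only.
positions = [100, 103, 106, 109, 110, 112, 115, 118, 121, 122]
fractions = frame_fractions(positions, cds_start=100)
print(fractions)
```

A strong bias toward one frame reflects the three-nucleotide periodicity of translating ribosomes, which is the signal used to assess the coding potential of a region.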
3) Multi-omics and
systems biology
Recent advances in omics
technologies, including genomics,
transcriptomics, proteomics and
metabolomics, enable us to build an
integrative and comprehensive model and
to systematically understand molecular
changes. (1) By
integrating transcriptomic, proteomic
and metabolomic datasets, we found that
mitochondrial and protein quality
control play an important role in
the response to hibernation, which could
potentially protect non-hibernating
species from cold stress [Journal
of Cellular Physiology 2017; Cell
2018]. (2) Using an integrative
analysis of transcriptomics and
proteomics, we revealed that oxidative
phosphorylation and mitochondrial
biogenesis were involved in induced
differentiation of glioblastoma cells
into astrocytes [Cell
Reports 2017]. (3) To
systematically understand human
immunity, we analyzed immune parameters
in depth both at baseline and in
response to influenza vaccination.
Peripheral blood mononuclear cell
transcriptomes, serum titers, cell
subpopulation frequencies, and B cell
responses were assessed in 63
individuals before and after
vaccination. Strikingly, independent of
age and preexisting antibody titers,
accurate models of the vaccination
response could be constructed using
pre-perturbation cell populations
alone, and these were validated using
independent baseline time points [Cell
2014]. Overall, multi-omic
analysis can provide important insights
into complex biological systems.
4) Deep learning in
bioinformatics and medicine
Deep learning is a recent
and fast-growing field of machine
learning. It attempts to model
high-level abstractions in large-scale
data by employing multi-layered deep
neural networks, thus making sense of
data such as genomic data, text and
images [Genomics,
Proteomics and Bioinformatics 2018].
Compared to conventional analysis
strategies, deep learning is well suited
to leveraging very large and complex
data sets. Our lab is currently working
on a number of projects in
bioinformatics and medical imaging to
solve computational challenges in
biology and medicine.
5) Single-cell
genomics
Recent technological
advances have enabled unprecedented
insight into genomics at the level of
single cells. Single-cell
transcriptomics can measure the
transcriptomes of thousands of
individual cells in a single
experiment. However, many technical
challenges exist in improving
sensitivity and specificity of the
experiments. In addition, the volume and
complexity of data make it a paradigm of
big data. Consequently, the field is
presented with new technical and, in
particular, analytical challenges. Our
lab is interested in developing both
computational and experimental methods
to explore single-cell gene expression
variation in development and human
diseases.