Week of May 30
Computational biology is an interdisciplinary field, and part of my goal for the summer is to develop a better toolkit for studying and succeeding in a field with as much breadth as this one. Thus, there are two facets to my project this summer: one computational, and one biological. I’ll be focusing both on my computational skills and on my understanding of molecular biology.
To advance the molecular biology goal, this week I sat in on Tufts molecular biology lectures. This week was a primer on nucleic acids and proteins, with a focus on the methods used to identify and characterize them. Understanding the basics of methods used in molecular biology will provide important context when dealing with their data.
On the administrative side, Cowen Group and BCB Group meetings kicked off this week: the Cowen Group meetings serve as brainstorming sessions for my advisor’s students, and the BCB Group meetings are a larger weekly gathering of both Drs. Lenore Cowen and Donna Slonim’s students. At BCB Group meetings, one student presents the state of their research each week and gathers feedback from the group.
On the practical side, I met with Nick Woodward, Ph.D. student in the McVey Lab, to discuss how to handle sequencing errors in our data. DNA sequencing refers to the process of “reading” nucleotide sequences, and mistakes can occur. The trick with our data is that we need to separate miscalled bases from true mutations that occurred during repair. We previously decided to treat mutations near the break as true, and short mutations far from the break as errors. Earlier this week, I wrote a script to identify these errors based on each sequence’s CIGAR string (a shorthand representation of an alignment between sequences that identifies matches, insertions, and deletions). However, due an upstream process, the data format is not optimal for catching all potential sequencing artifacts. At this week’s meeting, we discussed new methods for identifying potential errors, which I will pursue next week.
What I’m Reading:
I plan to read 1-2 papers a week for the duration of the summer, and update this section accordingly. This week, I read a journal submission by another BCB Group student ahead of her presentation, as well as the following:
M. Jinek, K. Chylinski, I. Fonfara, M. Hauer, J. Doudna, and E. Charpentier, “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity,” Science, vol. 337, no. 6096, pp. 816-821, June 2012. doi: 10.1126/science.1225829