Week of August 22

We’ve reached “Week 10”, my final formal week as a DREAM student. This is the week in which I had intended to wrap up my analysis before writing my final paper, but after discovering the design flaw, I first spent some time making the necessary code changes before finally getting down to the business of parameter tuning. I ran 81 different sets of parameters through my aligner, producing plots and graphics for each set that will allow me to begin assessing various scoring schemes. Preliminary results indicate that my aligner can produce results similar to those of a commercial alignment tool previously used on the same dataset.

Week of July 25

Parameter optimization is the process of selecting the coefficients, weights, or other parameters that produce the best results for a given algorithm. For the Needleman-Wunsch algorithm with an affine gap penalty, those parameters are the match, mismatch, gap initialization, and gap extension scores. Our modified algorithm adds one more parameter, which accounts for the quality of each query sequence base and makes a mismatch (rather than a gap) more likely when the quality is poor.
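
To make those parameters concrete, here is a minimal Python sketch of global alignment scoring with an affine gap penalty (the three-matrix formulation often attributed to Gotoh). It is not the code from my aligner: the `Scoring` defaults, the `quality_weight` name, and the rule that scales the mismatch penalty by Phred quality are all assumptions made up for illustration.

```python
from dataclasses import dataclass

NEG_INF = float("-inf")


@dataclass(frozen=True)
class Scoring:
    # Illustrative default values only; these are not tuned results.
    match: float = 1.0
    mismatch: float = -1.0
    gap_open: float = -3.0      # score for the first base of a gap
    gap_extend: float = -1.0    # score for each additional gap base
    quality_weight: float = 1.0  # hypothetical: 0 turns quality scaling off


def substitution(params, ref_base, query_base, phred):
    """Score one aligned pair; low-quality mismatches are penalized less."""
    if ref_base == query_base:
        return params.match
    # Assumed rule: map Phred 0..40 to a confidence of 0..1, then shrink
    # the mismatch penalty as confidence drops.
    confidence = min(phred, 40) / 40.0
    scale = 1.0 - params.quality_weight * (1.0 - confidence)
    return params.mismatch * scale


def align_score(ref, query, quals, params):
    """Return the best global alignment score of query against ref."""
    n, m = len(ref), len(query)
    # M: last column is an aligned pair.
    # D: last column is a ref base opposite a gap in the query.
    # I: last column is a query base opposite a gap in the ref.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    I = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        D[i][0] = params.gap_open + (i - 1) * params.gap_extend
    for j in range(1, m + 1):
        I[0][j] = params.gap_open + (j - 1) * params.gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = substitution(params, ref[i - 1], query[j - 1], quals[j - 1])
            M[i][j] = max(M[i - 1][j - 1], D[i - 1][j - 1], I[i - 1][j - 1]) + s
            D[i][j] = max(M[i - 1][j] + params.gap_open,
                          I[i - 1][j] + params.gap_open,
                          D[i - 1][j] + params.gap_extend)
            I[i][j] = max(M[i][j - 1] + params.gap_open,
                          D[i][j - 1] + params.gap_open,
                          I[i][j - 1] + params.gap_extend)
    return max(M[n][m], D[n][m], I[n][m])
```

With these made-up defaults, a mismatch at a base with Phred quality 0 costs nothing, while the same mismatch at quality 40 costs the full penalty, so poor-quality bases are more likely to be absorbed as mismatches than to trigger a gap.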

Week of July 18

This week, I finished a working, multithreaded draft of the custom aligner and was able to run it on the test files with a few different parameters.

Week of July 11

I wrapped up the bulk of the writing for the aligner this week and tested it on toy examples. The next challenge will be formatting the output and further speeding up the program. The output will need to be in SAM format to support a downstream analysis tool. SAM is an acronym for “Sequence Alignment/Map.” I’m working with three file formats this summer that are common in bioinformatics: FASTA, FASTQ, and SAM.
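
For a feel of how two of these formats fit together, the Python sketch below reads one four-line FASTQ record (header, sequence, `+` separator, quality string) and writes a SAM alignment line with its 11 mandatory tab-separated fields. It is only an illustration, not my aligner's output code: the function names, the reference name `chr_demo`, and the assumption that qualities use the Phred+33 encoding are mine.

```python
def parse_fastq_record(lines):
    """Parse one 4-line FASTQ record into (name, seq, qual, phred_scores)."""
    header, seq, plus, qual = [line.rstrip("\n") for line in lines]
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    # Assumes Phred+33 encoding: ASCII code minus 33 gives the quality score.
    phred = [ord(c) - 33 for c in qual]
    name = header[1:].split()[0]
    return name, seq, qual, phred


def sam_line(qname, rname, pos, cigar, seq, qual, flag=0, mapq=255):
    """Build one SAM alignment line for an unpaired read.

    pos is 1-based; MAPQ 255 means the mapping quality is unavailable.
    RNEXT, PNEXT, and TLEN are set to their "no mate" values.
    """
    fields = [qname, str(flag), rname, str(pos), str(mapq),
              cigar, "*", "0", "0", seq, qual]
    return "\t".join(fields)


record = ["@read1 demo\n", "ACGT\n", "+\n", "II5!\n"]
name, seq, qual, phred = parse_fastq_record(record)
line = sam_line(name, "chr_demo", 1, "4M", seq, qual)
```

FASTA is the simplest of the three: a `>` header line followed by one or more lines of sequence, with no quality information.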

Week of July 4

This week marked aligner attempt #2: the first program I wrote ran too slowly on my test input, even with the help of the Tufts high-performance cluster (HPC). This gave me an opportunity to rewrite it in C++ with some performance and modularity improvements. I spent some time planning and re-designing, and got back to work. Despite the abstract nature of programming, I tend to plan best when I can work with my hands, so I started by drawing a diagram in a notebook of the code I planned to write.

Week of June 27

This week was my first week back at work after a 10-day trip to the West Coast. The view from my desk isn’t quite the same as the one from my tent in Olympic National Park, but I came home to the news that a paper I contributed to last year had been accepted to the 2022 ACM-BCB conference. I spent Monday helping prep the final “camera-ready” submission for the conference.

Week of June 6

McVey Lab

This week was my first McVey Lab meeting; I sat in on a practice talk about the roles of structure-specific endonucleases in DNA damage tolerance. Attending these meetings will reinforce the biological context for my project this summer and hopefully future projects as well.

Week of May 30

Computational biology is an interdisciplinary field, and part of my goal for the summer is to develop a better toolkit for studying and succeeding in a field with as much breadth as this one. Thus, there are two facets to my project this summer: one computational, and one biological. I’ll be focusing both on my computational skills and on my understanding of molecular biology.
