maxhartke/cs576

CS576

Intro to bioinformatics.

  • Implemented Eulerian path algorithm for spectral assembly allowing for k-mer multiplicity.
  • Constructed and tested euler_assemble_multi for superstring assembly.
  • Assembled the SARS-CoV-2 genome from reads, analyzed k-mer histograms, and identified the variant.
  • Explored greedy assembly and correctness for read sets.
  • Provided overlap graphs, edge orders, and analysis.
  • Completed conceptual questions about greedy and spectral assembly paradigms.
  • Developed and tested overlap_align for optimal pairwise overlap alignment with affine gap penalties.
  • Calculated overlap graphs from real SARS-CoV-2 Nanopore reads and determined read ordering.
  • Performed hand-computed global and local alignments for provided sequences.
  • Modeled read overlap probabilities using a probabilistic framework and computed likelihoods and posteriors.
  • Included code, worked examples, and matrix visualizations for alignments.
  • Implemented a UPGMA-based alignment order strategy, including tie-breaking.
  • Constructed guide trees for progressive multiple alignments of SARS-CoV-2 spike proteins.
  • Used substitution matrices (e.g., BLOSUM62) and pairwise edit distances.
  • Performed multiple alignment, visualized domains, and analyzed evolutionary changes.
  • Explored star alignment strategies and sum-of-pairs scoring.
  • Analyzed DNA substitution matrix scenarios for zero, positive, and negative entries.
  • Implemented tree likelihood calculations using dynamic programming.
  • Constructed and applied a PhyloHMM to detect recombination in SARS-CoV-2 spike gene alignments and analyzed the resulting Viterbi paths.
  • Compared HMM predictions under different transition probabilities.
  • Manually performed branch-and-bound for unweighted parsimony on five taxa, documenting queue states and tree count comparisons.
  • Estimated Markov chain parameters and likelihoods for DNA sequence data using uniform, MLE, and Laplace approaches.
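
As an illustration of the spectral assembly item at the top of this list, here is a minimal sketch of Eulerian-path assembly with k-mer multiplicity. It is not the repository's euler_assemble_multi; the function name assemble_from_kmers and its input format (a mapping from k-mer to count) are assumptions made for this example. It also assumes the de Bruijn graph is connected and actually has an Eulerian path.

```python
from collections import Counter, defaultdict


def assemble_from_kmers(kmer_counts):
    """Return one superstring that uses each k-mer as often as its count.

    Hypothetical helper: kmer_counts maps k-mer -> multiplicity.
    """
    graph = defaultdict(list)  # (k-1)-mer -> list of successor (k-1)-mers
    balance = Counter()        # out-degree minus in-degree per node
    for kmer, count in kmer_counts.items():
        prefix, suffix = kmer[:-1], kmer[1:]
        for _ in range(count):  # one edge per k-mer occurrence
            graph[prefix].append(suffix)
        balance[prefix] += count
        balance[suffix] -= count

    # Path case: start at the node with one surplus outgoing edge.
    # Cycle case: any node with outgoing edges works.
    start = next((n for n, b in balance.items() if b == 1), next(iter(graph)))

    # Hierholzer's algorithm, written iteratively.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())         # dead end: emit the node
    path.reverse()

    # Spell the path: first node, then the last character of each next node.
    return path[0] + "".join(node[-1] for node in path[1:])


print(assemble_from_kmers(Counter({"ATA": 2, "TAT": 2})))  # ATATAT
```

When several Eulerian paths exist, the sketch returns whichever one the edge order happens to produce, so different valid superstrings are possible for the same spectrum.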

HW5

  • Estimating HMM parameters with Viterbi training
  • K-means clustering
  • Forward and Backward algorithm
  • The Baum–Welch algorithm
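
The forward algorithm listed above can be sketched as follows. This is a minimal version for a discrete HMM, not the homework solution; the function name forward, the dictionary-based parameter layout, and the two-state example model are all assumptions chosen for illustration. It works in plain probabilities, so long sequences would need scaling or log-space arithmetic.

```python
def forward(obs, init, trans, emit):
    """Return (alpha table, P(obs)) for a discrete HMM.

    Assumed layout: init[s], trans[s][t], emit[s][symbol] are probabilities.
    """
    states = list(init)
    # alpha[0][s] = P(first symbol, first state = s)
    alpha = [{s: init[s] * emit[s][obs[0]] for s in states}]
    for symbol in obs[1:]:
        prev = alpha[-1]
        # Sum over all predecessor states, then emit the current symbol.
        alpha.append({
            t: emit[t][symbol] * sum(prev[s] * trans[s][t] for s in states)
            for t in states
        })
    return alpha, sum(alpha[-1].values())


# Hypothetical two-state model over the symbols "A" and "C".
init = {"H": 0.5, "L": 0.5}
trans = {"H": {"H": 0.9, "L": 0.1}, "L": {"H": 0.2, "L": 0.8}}
emit = {"H": {"A": 0.7, "C": 0.3}, "L": {"A": 0.4, "C": 0.6}}

alpha, likelihood = forward("AC", init, trans, emit)
print(likelihood)
```

The backward algorithm has the same shape run from the end of the sequence, and Baum–Welch combines the two tables to re-estimate the parameters.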

HW6

  • Gaussian mixture model-based clustering
  • Bottom-up hierarchical clustering
  • Determining causal relationships
  • Scoring Bayesian networks with model evidence
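
For the bottom-up hierarchical clustering item above, a minimal sketch is shown here. The function name agglomerate, the choice of average linkage, and the 1-D example data are assumptions for illustration and are not taken from the assignment. The pairwise search is deliberately naive and recomputes distances on every round.

```python
def agglomerate(points, distance, num_clusters=1):
    """Merge the two closest clusters until num_clusters remain.

    Uses average linkage: the mean pairwise distance between two clusters.
    Returns (clusters, merges), where merges records each merge in order.
    """
    clusters = [[p] for p in points]  # start with singleton clusters
    merges = []
    while len(clusters) > num_clusters:
        best = None  # (distance, i, j) of the closest pair seen so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(distance(a, b)
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


points = [1.0, 1.2, 5.0, 5.1, 10.0]
clusters, merges = agglomerate(points, lambda a, b: abs(a - b), num_clusters=3)
print(sorted(sorted(c) for c in clusters))
```

Ties are broken by taking the first closest pair found; swapping the linkage rule (single or complete) only changes how d is computed.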
