CS576

Intro to bioinformatics.

Sequence assembly, graphs, k-mer multiplicity, SARS-CoV-2 assembly

Implemented Eulerian path algorithm for spectral assembly allowing for k-mer multiplicity.
Constructed and tested euler_assemble_multi for superstring assembly.
Assembled the SARS-CoV-2 genome from reads, analyzed k-mer histograms, and identified the variant.
Explored greedy assembly and whether it produces a correct result for given read sets.
Provided overlap graphs, edge orders, and analysis.
Completed conceptual questions about greedy and spectral assembly paradigms.
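
As an illustration of the spectral assembly idea above, here is a minimal sketch of Eulerian-path assembly that respects k-mer multiplicity. It is not the repository's euler_assemble_multi; the function name `euler_assemble_sketch` and the input format (a dict from k-mer to count) are assumptions made for this example. Each copy of a k-mer becomes one edge from its (k-1)-prefix to its (k-1)-suffix, and Hierholzer's algorithm walks every edge exactly once.

```python
from collections import Counter, defaultdict


def euler_assemble_sketch(kmer_counts):
    """Assemble a superstring from a k-mer multiset via an Eulerian path.

    kmer_counts maps each k-mer to its multiplicity (assumed input format).
    Assumes the resulting graph actually has an Eulerian path.
    """
    graph = defaultdict(list)   # (k-1)-mer -> list of successor (k-1)-mers
    balance = Counter()         # out-degree minus in-degree per node
    for kmer, count in sorted(kmer_counts.items()):
        prefix, suffix = kmer[:-1], kmer[1:]
        graph[prefix].extend([suffix] * count)  # one edge per k-mer copy
        balance[prefix] += count
        balance[suffix] -= count

    # Start at the node with one surplus outgoing edge; if the graph is
    # balanced (an Eulerian cycle), any node with edges will do.
    start = next((n for n, b in balance.items() if b == 1), None)
    if start is None:
        start = next(iter(graph))

    # Iterative Hierholzer: follow unused edges, backtrack when stuck.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()

    # Spell the path: first node in full, then the last base of each node.
    return path[0] + "".join(node[-1] for node in path[1:])
```

For example, `euler_assemble_sketch({"ATG": 2, "TGA": 1, "GAT": 1, "TGC": 1})` spells a string whose 3-mer multiset matches the input, with ATG used twice.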

Sequence alignment, dynamic programming, probabilistic models

Developed and tested overlap_align for optimal pairwise overlap alignment with affine gap penalties.
Calculated overlap graphs from real SARS-CoV-2 Nanopore reads and determined read ordering.
Performed hand-computed global and local alignments for provided sequences.
Modeled read overlap probabilities using a probabilistic framework and computed likelihoods and posteriors.
Included code, worked examples, and matrix visualizations for alignments.
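
The overlap alignment recurrence can be sketched as follows. This is a simplified version with a linear gap penalty, not the affine-gap overlap_align described above; the function name `overlap_align_linear` and the scoring parameters are assumptions for illustration. It aligns a suffix of x to a prefix of y: starting anywhere in x is free, and the best score is read from the last row.

```python
def overlap_align_linear(x, y, match=1, mismatch=-1, gap=-2):
    """Best score for aligning a suffix of x to a prefix of y.

    Simplification: linear gap penalty (an affine version would track
    separate matrices for gap opening and extension).
    Returns (score, j) where y[:j] is the overlapped prefix of y.
    """
    n, m = len(x), len(y)
    # F[i][j]: best score aligning some suffix of x[:i] with y[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        F[0][j] = j * gap  # y must be consumed from its first character
    for i in range(1, n + 1):
        F[i][0] = 0        # skipping a prefix of x costs nothing
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(
                F[i - 1][j - 1] + s,  # align x[i-1] with y[j-1]
                F[i - 1][j] + gap,    # gap in y
                F[i][j - 1] + gap,    # gap in x
            )
    # The alignment must end at the end of x, at any column of y.
    best_j = max(range(m + 1), key=lambda j: F[n][j])
    return F[n][best_j], best_j
```

Calling `overlap_align_linear("ACGTTGCA", "TGCAGGT")` scores the four-base overlap TGCA, where the end of the first read matches the start of the second.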

Multiple sequence alignment, UPGMA, substitution matrices

Implemented a UPGMA-based alignment order strategy, including tie-breaking.
Constructed guide trees for progressive multiple alignments of SARS-CoV-2 spike proteins.
Used substitution matrices (e.g., BLOSUM62) and pairwise edit distances.
Performed multiple alignment, visualized domains, and analyzed evolutionary changes.
Explored star alignment strategies and sum-of-pairs scoring.
Analyzed DNA substitution matrix scenarios for zero, positive, and negative entries.
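
To illustrate how UPGMA yields an alignment order, the sketch below returns the sequence of cluster merges. The function name `upgma_merge_order`, the distance format (a dict keyed by frozenset pairs of leaf names), and the tie-breaking rule (the lexicographically smallest pair of clusters wins) are all assumptions for this example, not necessarily what the homework used.

```python
from itertools import combinations


def upgma_merge_order(names, dist):
    """Return UPGMA merges as (cluster_a, cluster_b, height) tuples.

    dist maps frozenset({a, b}) of leaf names to a distance.
    Clusters are represented as sorted tuples of leaf names.
    """
    clusters = [(name,) for name in sorted(names)]

    def avg_dist(a, b):
        # UPGMA cluster distance: mean over all cross-cluster leaf pairs.
        total = sum(dist[frozenset((p, q))] for p in a for q in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        # Tuples compare by distance first, then by cluster labels, so
        # ties go to the lexicographically smallest pair (assumed rule).
        d, a, b = min((avg_dist(a, b), a, b)
                      for a, b in combinations(clusters, 2))
        merged = tuple(sorted(a + b))
        clusters = sorted([c for c in clusters if c not in (a, b)] + [merged])
        merges.append((a, b, d / 2))  # node height is half the distance
    return merges
```

In a progressive alignment, the merges would be processed in the returned order, aligning the two clusters (sequences or profiles) joined at each step.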

Tree-based dynamic programming, HMMs, parsimony, Markov chains

Implemented tree likelihood calculations using dynamic programming.
Constructed and applied a PhyloHMM to detect recombination in SARS-CoV-2 spike gene alignments and analyzed the resulting Viterbi paths.
Compared HMM predictions under different transition probabilities.
Manually performed branch-and-bound for unweighted parsimony on five taxa, documenting queue states and tree count comparisons.
Estimated Markov chain parameters and likelihoods for DNA sequence data using uniform, MLE, and Laplace approaches.
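
The difference between MLE and Laplace estimates for a first-order Markov chain can be sketched as below. The function names, the uniform initial distribution, and the uniform fallback for never-observed source symbols are assumptions for illustration.

```python
import math
from collections import Counter


def estimate_transitions(seqs, alphabet="ACGT", pseudocount=0.0):
    """Estimate P(b | a) from adjacent character pairs in seqs.

    pseudocount=0 gives the maximum-likelihood estimate;
    pseudocount=1 gives Laplace (add-one) smoothing.
    """
    counts = Counter((s[i], s[i + 1]) for s in seqs for i in range(len(s) - 1))
    probs = {}
    for a in alphabet:
        total = sum(counts[(a, b)] for b in alphabet) + pseudocount * len(alphabet)
        for b in alphabet:
            if total:
                probs[(a, b)] = (counts[(a, b)] + pseudocount) / total
            else:
                # Assumed fallback: uniform row when `a` never occurs
                # as a source and no pseudocount is used.
                probs[(a, b)] = 1 / len(alphabet)
    return probs


def log_likelihood(seq, probs, alphabet="ACGT"):
    """Log-likelihood of seq, assuming a uniform initial distribution."""
    ll = math.log(1 / len(alphabet))
    for a, b in zip(seq, seq[1:]):
        p = probs[(a, b)]
        if p == 0:
            return -math.inf  # an unseen transition under the MLE
        ll += math.log(p)
    return ll
```

With the MLE, any transition absent from the training data makes a test sequence's likelihood zero; the Laplace estimate avoids this by giving every transition a nonzero probability.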

HW5

Estimating HMM parameters with Viterbi training
K-means clustering
Forward and Backward algorithm
The Baum–Welch algorithm
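
As a small illustration of the forward algorithm listed above, here is a sketch for a discrete HMM. The function name `forward` and the dict-based parameter layout are assumptions for this example. It works in plain probability space, which is fine for short sequences; longer ones would need scaling or log-space arithmetic to avoid underflow.

```python
def forward(obs, states, init, trans, emit):
    """Forward algorithm for a discrete HMM.

    init[s], trans[s][t], and emit[s][o] are probabilities.
    Returns (table, likelihood), where table[i][s] is the joint
    probability of obs[:i+1] and being in state s at position i.
    """
    # Initialization: start in s and emit the first symbol.
    table = [{s: init[s] * emit[s][obs[0]] for s in states}]
    # Recursion: sum over all predecessor states, then emit.
    for o in obs[1:]:
        prev = table[-1]
        table.append({
            t: emit[t][o] * sum(prev[s] * trans[s][t] for s in states)
            for t in states
        })
    # Termination: total probability of the observations.
    return table, sum(table[-1].values())
```

The backward algorithm mirrors this recursion from the end of the sequence, and Baum-Welch combines the two tables to compute expected transition and emission counts.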

HW6

Gaussian mixture model-based clustering
Bottom-up hierarchical clustering
Determining causal relationships
Scoring Bayesian networks with model evidence

maxhartke/cs576

Folders and files

Latest commit

History

Repository files navigation

CS576

Sequence assembly, graphs, k-mer multiplicity, SARS-CoV-2 assembly.

Sequence alignment, dynamic programming, probabilistic models

Multiple sequence alignment, UPGMA, substitution matrices

Tree-based dynamic programming, HMMs, parsimony, Markov chains

H5

HW6

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages