Hi, I’m Zach Maas 👋
PhD computer scientist, bioinformatician, and general tinkerer. I like building models we can actually trust, and software that's fast.
I'm actively interviewing for full-time roles (research or applied ML/SWE)! Reach out: [email protected]
TL;DR
- 6+ yrs heavy-Python data/ML stack & HPC, 10+ yrs Linux/Unix
- PhD: Bayesian + interpretable transformer models for sequencing data (spring ’25)
- Current: mechanistic interpretability & graph-based views of LLMs, full-stack dev
- Built ETL/HPC infra for thousands of genomics experiments, and built a biotech startup's data infra from scratch
What I’m Doing Right Now
Project | Stack |
---|---|
Memeory: semantic search on all the pictures you have saved on your phone | Expo, React Native, TypeScript, ML Kit |
Graphical Activation Probes: finding structure in sparse LLM representations | Python, PyTorch, HuggingFace, scikit-learn |
This Website: static site + progressive enhancement | Eleventy, TypeScript, Cloudflare Workers |
DESeq2 Web Visualizer: web app for sanity-checking and exploring bulk RNA-seq stats | JS, d3, R |
tsdav: secure WebDAV over Tailscale | Go, Tailscale SDK |
Experience
Independent
AI/ML Researcher • May 2025 – Present
- Mechanistic interpretability, grant writing, and open-source tool dev.
- Building linear-graph techniques for feature merging in SAEs
- Open Philanthropy grant application under consideration
LincSwitch Therapeutics
Data Science Consultant • Jun 2024 – Dec 2024
- Greenfield data & infra for an early-stage biotech (Nextflow, HPC, R).
- Translated model results into plain English for stakeholders and designed reproducible data workflows.
University of Colorado Boulder
PhD Researcher (CS) • Aug 2020 – May 2025
- BERT + Sparse Autoencoders for discovering genomic regulatory grammar
- Bayesian MCMC “virtual spike-in” method → quantified uncertainty when external controls were missing
- Fine-tuned U-Net variants → SOTA cyanobacteria segmentation
Professional Research Assistant • May 2019 – Aug 2020
- Architected a 30 TB SQL-backed ETL system with associated data pipelines
Toolchain & Languages
Category | What I actually use |
---|---|
ML / DL | PyTorch, HuggingFace, scikit-learn, PyMC, CUDA, Weights & Biases |
Data | Pandas, NumPy, SQL (Postgres & friends), Nextflow, Docker |
Code | Python (power-user), Bash/Awk, R, C++ (perf work), JS/TS (frontend mostly), Go & Zig for fun + tools |
Infra | Linux, HPC (SLURM), Terraform, AWS (EC2/S3) |
Soft | Turning dense math into readable prose, mentoring grad students, explaining data to everyone |
Publications & Pre-prints
asterisk = first author
Year | Title | Notes |
---|---|---|
2025 | *Supervised and Unsupervised Methods for Transcriptional Sequencing Data | PhD Thesis |
2025 | TFIIH kinase CDK7 drives cell proliferation through a common core TF network | Sci Adv |
2024 | *Internal and External Normalization of Nascent RNA Sequencing Run-On Experiments | BMC Bioinf |
2024 | *Deconvolution of Nascent Sequencing Data Using Transcriptional Regulatory Elements | PSB 2024 |
2023 | Atlas of nascent RNA transcripts reveals enhancer→gene linkages | Under review |
2021 | Transcription Factor Enrichment Analysis (TFEA) | Commun Biol |
2020 | Selective inhibition of CDK7 reveals high-confidence targets | Genes & Dev |
2020 | TFIID enables RNA pol II promoter-proximal pausing | Mol Cell |
Education
PhD, Computer Science – Univ. of Colorado Boulder (2025)
- Dissertation: “Supervised and Unsupervised Methods for Transcriptional Sequencing Data”
- IQ Biology Certificate
B.A., Mathematics & Chemistry – Univ. of Colorado Boulder (2019)
- Summa Cum Laude
Odds & Ends
- Ham Radio (/AE), enthusiastic about digimodes
- 3-D printing to solve problems around the house
- Home lab with 15+ self-hosted services (Docker Compose, Tailscale, etc.)