Transfer learning, decoded
Weekly dispatch for ML engineers fine-tuning foundation models on deadline. Read it. Argue about it. Ship faster.
LoRA at Scale: Why 0.1% of Parameters Do 80% of the Work
Last week, three papers dropped that collectively reframe how we think about parameter-efficient fine-tuning. The short version: rank matters more than we thought, and the right initialization can cut your training time in half.
§1 — The Rank Selection Problem
Choosing LoRA rank is still mostly vibes. r=8 is the default because someone wrote it in the original paper. But new ablations from the Mistral team show rank-4 often matches rank-64 on domain adaptation — if you initialize from the top singular vectors of the target-domain gradient.
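# Initialize LoRA from SVD of domain gradient
# Assumes domain_grad has the weight's shape (d_out, d_in) and rank is the LoRA rank.
import torch
U, S, Vh = torch.linalg.svd(domain_grad, full_matrices=False)
sqrt_S = S[:rank].sqrt()
lora_B = U[:, :rank] * sqrt_S              # (d_out, rank), following the LoRA convention delta_W = B @ A
lora_A = sqrt_S.unsqueeze(-1) * Vh[:rank]  # (rank, d_in); B @ A is the rank-truncated SVD of domain_grad
↑ This init cuts convergence from ~800 steps to ~200 on MedQA. The gradient signal is already pointing at what the base model needs to unlearn.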
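For a sense of what the rank choice costs, here is a minimal sketch of the parameter arithmetic. It assumes a single adapter on one d_out x d_in weight, with B of shape (d_out, rank) and A of shape (rank, d_in). The 4096 x 4096 layer size is a hypothetical example, not a figure from the ablations above.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in one adapter: B is (d_out, rank), A is (rank, d_in)."""
    return rank * (d_out + d_in)


def lora_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Adapter parameters as a fraction of the frozen weight it wraps."""
    return lora_param_count(d_out, d_in, rank) / (d_out * d_in)


# Hypothetical layer size, chosen only for illustration.
D_OUT = D_IN = 4096

for r in (4, 8, 64):
    count = lora_param_count(D_OUT, D_IN, r)
    share = lora_fraction(D_OUT, D_IN, r)
    print(f"rank={r:>2}  params={count:>7}  share of layer={share:.3%}")
```

The cost is linear in rank, so if rank-4 really does match rank-64 on your task, the adapter is 16x smaller for that layer.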
§2 — Vision Transformers on Niche Datasets
Satellite imagery fine-tuning remains a pain point. The spectral distribution shift between ImageNet and multispectral data breaks the first few attention layers. The fix is surprisingly simple: freeze the patch embedding, replace the positional encoding, and let the rest adapt.
EuroSAT-MS (5-shot): 61.3% acc before, 84.7% acc after
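The freeze / replace / adapt recipe can be written down as a per-parameter plan. This is a rough sketch, not the exact setup behind the numbers above: the name prefixes `patch_embed.` and `pos_embed` are assumptions, and real ViT implementations name these parameters differently, so check your own model first.

```python
from typing import Dict, Iterable

# Hypothetical name prefixes; adjust to match your model's parameter names.
FROZEN_PREFIXES = ("patch_embed.",)
REINIT_PREFIXES = ("pos_embed",)


def plan_finetune(param_names: Iterable[str]) -> Dict[str, str]:
    """Map each parameter name to 'freeze', 'reinit', or 'train'."""
    plan = {}
    for name in param_names:
        if name.startswith(FROZEN_PREFIXES):
            plan[name] = "freeze"   # keep the pretrained patch embedding fixed
        elif name.startswith(REINIT_PREFIXES):
            plan[name] = "reinit"   # replace the positional encoding
        else:
            plan[name] = "train"    # let the rest adapt
    return plan


# Illustrative names only, not taken from a specific library.
example_names = [
    "patch_embed.proj.weight",
    "patch_embed.proj.bias",
    "pos_embed",
    "blocks.0.attn.qkv.weight",
    "head.weight",
]
plan = plan_finetune(example_names)
for name, action in plan.items():
    print(f"{action:<7} {name}")
```

In PyTorch you would feed it the names from `model.named_parameters()` and set `requires_grad = False` on every parameter marked "freeze". How to re-initialize the positional encoding depends on the new input resolution and band layout, so that step is left out here.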
§3 — This Week's Reading List
LoRA vs full fine-tune on medical imaging
Freezing strategy for ViT-L/14 on satellite data
DoRA paper — anyone actually run this?
Val loss plot that made me cry (happy tears)
// filterable archive
47 issues. Every technique, indexed.
LoRA at Scale: Why 0.1% of Parameters Do 80% of the Work
Rank selection is still mostly vibes. New ablations change that.
Spectral Shift: Fine-Tuning ViT on Satellite Imagery
The patch embedding is your enemy. Here's how to neutralize it.
CLIP Adaptation for Medical Imaging Without Labels
Prompt engineering gets you 70% of the way. The last 30% needs this.
Quantized Models on Edge: INT8 vs FP16 in Production
Your Raspberry Pi can run a fine-tuned ViT. Here's the catch.
Prefix Tuning vs Prompt Tuning: A Practitioner's Audit
We benchmarked both on 6 NLP tasks. The results surprised us.
Multi-Modal Alignment: Teaching LLaVA to See Your Data
The projection layer is where domain adaptation actually happens.
bitsandbytes 0.43 breaking change — heads up
// interactive model explorer
Click a layer. Read the issue.
Every architectural decision in transfer learning has a Synapse issue behind it. Explore the ViT fine-tuning stack.
Quick tips:
❄ Frozen = covered in freezing strategy issues
⚡ Trainable = covered in fine-tuning technique issues
// get it on your phone
Faster pipeline. Same signal.
The Synapse app delivers issues the moment they drop, with offline reading, code snippet copy, and a direct feed into the Slack community. No email client. No browser tab. Just the signal.
Push notification when issues drop — before your inbox
Offline reading with syntax-highlighted code blocks
Community feed integrated — reply without switching apps
Bookmark techniques, build your own transfer learning reference
Download on the App Store
Get it on Google Play
Text yourself the install link
We'll detect your platform and send the right store link. One SMS. No spam.
847 members active
47 issues to date
4,200+ ML engineers across 40 countries
47 issues published since Jan 2025
94% open rate (vs. 21% industry avg)
2 AM peak activity, when the real debates happen
// from the community
What ML engineers actually say
Priya Nambiar
ML Eng @ Recursion Pharma
"The LoRA rank selection issue alone saved us 2 weeks of ablations. We just ran the SVD init and it worked first try."
Marcus Osei
Research Eng @ Orbital Insight
"I read every issue the day it drops. The community threads are where I actually learn — the arguments at 2 AM are more useful than most papers."
Selin Çelik
PhD → ML Eng @ Merantix
"Synapse is the only newsletter I've never unsubscribed from. The specificity is insane — it assumes you already know the basics."