Issue #47 just dropped

Transfer learning, decoded_

Weekly dispatch for ML engineers fine-tuning foundation models on deadline. Read it. Argue about it. Ship faster.

synapse_issue_047.md

#47 · Feb 24, 2026

LoRA at Scale: Why 0.1% of Parameters Do 80% of the Work

Last week, three papers dropped that collectively reframe how we think about parameter-efficient fine-tuning. The short version: rank matters more than we thought, and the right initialization can cut your training time in half.
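For a sense of scale behind the headline, here is a minimal sketch of the standard LoRA parameter arithmetic: one adapter on a d_out x d_in weight adds rank * (d_out + d_in) parameters. The 4096 x 4096 shape and the function names are illustrative assumptions, not figures from the papers.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: a (d_out x rank) and a (rank x d_in) factor."""
    return rank * (d_out + d_in)


def trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """LoRA parameters as a fraction of the full d_out x d_in weight matrix."""
    return lora_param_count(d_out, d_in, rank) / (d_out * d_in)


# Hypothetical 4096 x 4096 projection, chosen only for illustration.
for rank in (4, 8, 64):
    added = lora_param_count(4096, 4096, rank)
    share = trainable_fraction(4096, 4096, rank)
    print(f"r={rank:>2}: {added:>7,} params, {share:.3%} of the full matrix")
```

Per adapted matrix, dropping from rank 64 to rank 4 cuts the adapter to one sixteenth of its size. The whole-model fraction is lower than these per-matrix numbers whenever only some of the weight matrices get adapters.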

§1 — The Rank Selection Problem

Choosing LoRA rank is still mostly vibes. r=8 is the default because someone wrote it in the original paper. But new ablations from the Mistral team show rank-4 often matches rank-64 on domain adaptation — if you initialize from the top singular vectors of the target-domain gradient.

rank-init.py

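# Initialize LoRA from SVD of domain gradient
import torch
# domain_grad: (out, in) gradient of the adapted weight; rank: target LoRA rank
U, S, Vh = torch.linalg.svd(domain_grad, full_matrices=False)
# sqrt(S) is split across both factors, so lora_A @ lora_B is the rank-r truncation.
# Shapes: lora_A (out, rank), lora_B (rank, in); the LoRA paper's B @ A names them the other way around.
lora_A = U[:, :rank] * S[:rank].sqrt()
lora_B = Vh[:rank] * S[:rank].sqrt().unsqueeze(-1)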

↑ This init cuts convergence from ~800 steps to ~200 on MedQA. The gradient signal is already pointing at what the base model needs to unlearn.
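The snippet above leaves domain_grad undefined. One plausible way to obtain it, sketched here as an assumption rather than the method from the ablations, is to average the weight gradient over a few target-domain batches before any training step. The toy nn.Linear layer, the synthetic batches, the MSE loss, and both helper names are hypothetical stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def accumulate_domain_grad(layer: nn.Linear, batches, loss_fn) -> torch.Tensor:
    """Average the weight gradient of `layer` over a few target-domain batches."""
    grad = torch.zeros_like(layer.weight)
    for inputs, targets in batches:
        layer.zero_grad()
        loss_fn(layer(inputs), targets).backward()
        grad += layer.weight.grad.detach()
    return grad / len(batches)


def svd_lora_init(domain_grad: torch.Tensor, rank: int):
    """Split the rank-`rank` truncated SVD of the gradient into two factors."""
    U, S, Vh = torch.linalg.svd(domain_grad, full_matrices=False)
    scale = S[:rank].sqrt()
    lora_A = U[:, :rank] * scale              # (out, rank), same naming as rank-init.py
    lora_B = Vh[:rank] * scale.unsqueeze(-1)  # (rank, in)
    return lora_A, lora_B


torch.manual_seed(0)
layer = nn.Linear(16, 8)  # toy stand-in for one adapted weight matrix
batches = [(torch.randn(4, 16), torch.randn(4, 8)) for _ in range(3)]  # synthetic data

domain_grad = accumulate_domain_grad(layer, batches, F.mse_loss)
lora_A, lora_B = svd_lora_init(domain_grad, rank=4)
full_A, full_B = svd_lora_init(domain_grad, rank=8)  # full rank recovers the gradient
```

At full rank the product lora_A @ lora_B reproduces domain_grad up to floating-point error, which is a quick sanity check before truncating to a smaller rank.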

§2 — Vision Transformers on Niche Datasets

Satellite imagery fine-tuning remains a pain point. The spectral distribution shift between ImageNet and multispectral data breaks the first few attention layers. The fix is surprisingly simple: freeze the patch embedding, replace the positional encoding, and let the rest adapt.

EuroSAT-MS (5-shot) accuracy: 61.3% before, 84.7% after
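The three-step recipe from §2 (freeze the patch embedding, replace the positional encoding, let the rest adapt) can be sketched in PyTorch roughly as follows. TinyViT is a toy model written for this example; its attribute names (patch_embed, pos_embed, blocks, head) and sizes are assumptions and will differ in a real ViT implementation.

```python
import torch
import torch.nn as nn


class TinyViT(nn.Module):
    """Toy ViT-shaped model; layer names are illustrative, not from a real library."""

    def __init__(self, in_chans=3, patch=16, dim=64, depth=2, n_patches=196, n_classes=10):
        super().__init__()
        self.patch_embed = nn.Conv2d(in_chans, dim, kernel_size=patch, stride=patch)
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches, dim))
        block = nn.TransformerEncoderLayer(dim, nhead=4, dim_feedforward=128, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (batch, patches, dim)
        tokens = self.blocks(tokens + self.pos_embed)
        return self.head(tokens.mean(dim=1))  # mean-pool tokens, then classify


def prepare_for_adaptation(model: TinyViT) -> TinyViT:
    # 1. Freeze the patch embedding.
    for param in model.patch_embed.parameters():
        param.requires_grad = False
    # 2. Replace the positional encoding with a freshly initialized, trainable one.
    model.pos_embed = nn.Parameter(torch.randn_like(model.pos_embed) * 0.02)
    # 3. Everything else keeps requires_grad=True and adapts during fine-tuning.
    return model


model = prepare_for_adaptation(TinyViT())
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
frozen = [name for name, p in model.named_parameters() if not p.requires_grad]
logits = model(torch.randn(2, 3, 224, 224))  # 224x224 input gives 14 x 14 = 196 patches
```

Passing only the parameters with requires_grad=True to the optimizer keeps the frozen patch embedding out of the update step entirely.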

§3 — This Week's Reading List

arxiv · DoRA: Weight-Decomposed LoRA

arxiv · Spectral Normalization for ViT Adaptation

blog · Flash Attention 3 on A100 vs H100

Get Every Issue →

Synapse Community

847 online

Priya Nambiar · 📌 pinned · 2:14 AM

#fine-tuning

LoRA vs full fine-tune on medical imaging

34 replies

🔥 12 · 🧪 8 · 💀 5

Marcus Osei · 📌 pinned · 11:52 PM

#architecture

Freezing strategy for ViT-L/14 on satellite data

19 replies

🛰️ 7 · 👀 11

Selin Çelik · 1:33 AM

#papers

DoRA paper — anyone actually run this?

27 replies

📄 6 · 🤔 14

James Adewale · 3:07 AM

#show-and-tell

Val loss plot that made me cry (happy tears)

41 replies

📉 23 · 🎉 18 · 💯 9


Join the conversation

✓ No ads. No sponsors. · ✓ 47 issues published · ✓ 4,200+ ML engineers · ✓ Unsubscribe anytime

LoRA Fine-Tuning · Domain Adaptation · Vision Transformers · Parameter-Efficient FT · Foundation Models · QLoRA · DoRA · Flash Attention · Spectral Normalization · Multi-Modal Transfer · Edge Deployment · Adapter Layers

// filterable archive

47 issues. Every technique, indexed.


#47 · nlp

8 min · Feb 24, 2026

LoRA at Scale: Why 0.1% of Parameters Do 80% of the Work

Rank selection is still mostly vibes. New ablations change that.

LoRA · PEFT · Fine-tuning

#46 · vision

10 min · Feb 17, 2026

Spectral Shift: Fine-Tuning ViT on Satellite Imagery

The patch embedding is your enemy. Here's how to neutralize it.

ViT · Domain Shift · Satellite

#45 · multimodal

7 min · Feb 10, 2026

CLIP Adaptation for Medical Imaging Without Labels

Prompt engineering gets you 70% of the way. The last 30% needs this.

CLIP · Zero-shot · Medical

#44 · edge

9 min · Feb 3, 2026

Quantized Models on Edge: INT8 vs FP16 in Production

Your Raspberry Pi can run a fine-tuned ViT. Here's the catch.

Quantization · ONNX · Edge

#43 · nlp

11 min · Jan 27, 2026

Prefix Tuning vs Prompt Tuning: A Practitioner's Audit

We benchmarked both on 6 NLP tasks. The results surprised us.

Prefix Tuning · Soft Prompts · GPT

#42 · multimodal

8 min · Jan 20, 2026

Multi-Modal Alignment: Teaching LLaVA to See Your Data

The projection layer is where domain adaptation actually happens.

LLaVA · Vision-Language · Alignment

related discussions · community →

#fine-tuning · 🔥 hot

LoRA vs full fine-tune on medical imaging

34 replies

#architecture

Freezing strategy for ViT-L/14 on satellite data

19 replies

#papers · 🔥 hot

DoRA paper — anyone actually run this?

27 replies

#show-and-tell · 🔥 hot

Val loss plot that made me cry (happy tears)

41 replies

#tools

bitsandbytes 0.43 breaking change — heads up

12 replies

// interactive model explorer

Click a layer. Read the issue.

Every architectural decision in transfer learning has a Synapse issue behind it. Explore the ViT fine-tuning stack.

Frozen layer · Trainable layer · Selected

INPUT

224×224 image patches

OUTPUT

class_logits [batch, n_classes]

Select a layer

Click any layer in the model to see which Synapse issue covers that technique in depth.

Quick tips:

❄ Frozen = covered in freezing strategy issues

⚡ Trainable = covered in fine-tuning technique issues

// get it on your phone

Faster pipeline. Same signal.

The Synapse app delivers issues the moment they drop, with offline reading, code snippet copy, and a direct feed into the Slack community. No email client. No browser tab. Just the signal.

⚡

Push notification when issues drop — before your inbox

📱

Offline reading with syntax-highlighted code blocks

💬

Community feed integrated — reply without switching apps

🔖

Bookmark techniques, build your own transfer learning reference

Download on the App Store

Get it on Google Play

Get It On Your Phone

Text yourself the install link

We'll detect your platform and send the right store link. One SMS. No spam.

What ML engineers actually say

Priya Nambiar

ML Eng @ Recursion Pharma

"The LoRA rank selection issue alone saved us 2 weeks of ablations. We just ran the SVD init and it worked first try."

Marcus Osei

Research Eng @ Orbital Insight

"I read every issue the day it drops. The community threads are where I actually learn — the arguments at 2 AM are more useful than most papers."

Selin Çelik

PhD → ML Eng @ Merantix

"Synapse is the only newsletter I've never unsubscribed from. The specificity is insane — it assumes you already know the basics."

// 847 members online right now · next issue drops Monday

