SYNTRAD RESEARCH LAB / FINANCIAL REASONING
FIN R1
Fin-R1 is a 7B financial reasoning model built on Qwen2.5-7B-Instruct and trained with SFT followed by GRPO reinforcement learning. It targets financial code, calculations, compliance, risk control, ESG, and multi-step finance QA.
LOCAL REPO READY
MERQUAN QFUSION Workbench
Objective: build a single vLLM-loadable financial reasoning model for SYNTRAD, named MERQUAN FINR1 QFUSION.
Fin-R1 remains the base because it already combines Qwen2.5-7B-Instruct with finance SFT and GRPO reasoning training.
We do not pull it backward by blindly blending stock Qwen weights into it.
Final Model Target
/opt/merquan/models/MERQUAN_FINR1_QFUSION
Base
/opt/merquan/models/Fin-R1
Qwen Role
Teacher, judge, and reference model only; not a default 50/50 weight merge.
Training Lab
/opt/merquan/finr1_loop_lab
SFT Seed
60,000 audited examples at /opt/merquan/finr1_loop_lab/data/merquan_finr1_sft_seed.jsonl
DPO Seed
10,000 preference pairs at /opt/merquan/finr1_loop_lab/data/merquan_finr1_dpo_seed.jsonl
Loop learning: attempt -> verify -> score -> classify failure -> create SFT/DPO/RL record -> train MERQUAN adapter -> holdout evaluation -> promote only if better.
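The routing and promotion steps of the loop can be sketched as follows. This is a minimal illustration, not the lab's implementation: `make_record`, `should_promote`, the record field names, and the thresholds are all hypothetical, and `reference` stands for a corrected answer that has already passed the lab's own verifier.

```python
from typing import Optional


def make_record(prompt: str, attempt: str, reference: Optional[str],
                score: float, pass_threshold: float = 1.0) -> dict:
    """Turn one verified and scored attempt into a training record (hypothetical schema)."""
    if score >= pass_threshold:
        # The attempt passed verification: keep it as a supervised example.
        return {"kind": "sft", "prompt": prompt, "completion": attempt}
    if reference is not None:
        # The attempt failed and a verified correction exists: build a preference pair.
        return {"kind": "dpo", "prompt": prompt,
                "chosen": reference, "rejected": attempt}
    # The attempt failed with no correction available: keep the scalar reward for RL.
    return {"kind": "rl", "prompt": prompt, "completion": attempt, "reward": score}


def should_promote(candidate_score: float, baseline_score: float,
                   min_gain: float = 0.0) -> bool:
    """Promotion gate: the candidate must beat the baseline on the holdout set."""
    return candidate_score > baseline_score + min_gain
```

With these assumptions, a candidate adapter that only ties the current model on the holdout set is not promoted; setting `min_gain` above zero would additionally guard against gains that are within evaluation noise.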
Deterministic finance rewards come first: numerical tolerance, no-arbitrage checks, schema validity, cashflow equality, Greeks/PDE consistency, portfolio accounting, risk constraints, and compliance structure.
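Two of these rewards are simple enough to sketch directly. The example below assumes a binary 0/1 reward, a verified reference value drawn from the lab's own training tasks, and European options on a non-dividend-paying asset for the put-call parity check; the function names and tolerances are illustrative, not the lab's actual settings.

```python
import math


def numeric_reward(predicted: float, expected: float,
                   rel_tol: float = 1e-4, abs_tol: float = 1e-8) -> float:
    """Return 1.0 if the predicted value matches the reference within tolerance."""
    return 1.0 if math.isclose(predicted, expected,
                               rel_tol=rel_tol, abs_tol=abs_tol) else 0.0


def put_call_parity_gap(call: float, put: float, spot: float,
                        strike: float, rate: float, maturity: float) -> float:
    """C - P - (S - K * exp(-r * T)); zero when put-call parity holds."""
    return call - put - (spot - strike * math.exp(-rate * maturity))


def no_arbitrage_reward(call: float, put: float, spot: float, strike: float,
                        rate: float, maturity: float, tol: float = 1e-6) -> float:
    """Return 1.0 if the quoted call and put prices respect put-call parity."""
    gap = put_call_parity_gap(call, put, spot, strike, rate, maturity)
    return 1.0 if abs(gap) <= tol else 0.0
```

Because both checks are deterministic, the same model output always receives the same reward, which keeps them usable both as RL signals and as filters when building SFT and DPO records.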
ONE vLLM MODEL
FIN-R1 BASE PRESERVED
MERQUAN LORA ADAPTER
SFT -> DPO -> RL
PROMOTION GATE
NO BENCHMARK LEAKAGE
Integrity rule: no QFBench tests, reference solutions, expected outputs, oracle artifacts, or benchmark-specific leaked answer files are allowed in training data.
The deployable model is produced by merging the trained MERQUAN adapter into Fin-R1 after evaluation, then serving that single folder through vLLM.
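The merge step can be sketched with the Hugging Face `transformers` and `peft` libraries, assuming the MERQUAN adapter was trained as a PEFT LoRA adapter. The adapter directory below is hypothetical (its real location is not stated here), and the bfloat16 dtype is an assumption.

```python
BASE_DIR = "/opt/merquan/models/Fin-R1"
OUT_DIR = "/opt/merquan/models/MERQUAN_FINR1_QFUSION"
# Hypothetical path: the actual adapter location is not given in this document.
ADAPTER_DIR = "/opt/merquan/finr1_loop_lab/adapters/merquan_lora"


def merge_adapter(base_dir: str, adapter_dir: str, out_dir: str) -> str:
    """Fold LoRA weights into the base model and save one standalone folder."""
    # Imported lazily so this module can be imported without the GPU stack installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_dir, torch_dtype=torch.bfloat16)
    # merge_and_unload() returns a plain transformers model with the adapter folded in.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(out_dir, safe_serialization=True)
    # The tokenizer is copied so the output folder is self-contained for serving.
    AutoTokenizer.from_pretrained(base_dir).save_pretrained(out_dir)
    return out_dir


if __name__ == "__main__":
    merge_adapter(BASE_DIR, ADAPTER_DIR, OUT_DIR)
```

The resulting folder could then be served with `vllm serve /opt/merquan/models/MERQUAN_FINR1_QFUSION`, so no adapter needs to be loaded at inference time.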
S3 Training Corpus Inventory
S3 Training Prefix
s3://merqintel-data/training/ - 1.962 GiB, 4 JSONL files
finance_instruct_500k
518,185 system/user/assistant records. Direct SFT candidate.
fincorpus_train
38,061 text/meta records, 1.401 GiB. Domain continued-pretraining candidate.
TFNS
9,543 train + 2,388 validation sentiment records. Convert into finance sentiment instruction/eval tasks.
S3 Backups Prefix
s3://merqintel-data/backups/ - 63.421 GiB total
MERQUAN Corpus Tar
MERQUAN-CORPUS-20260426-BACKUP.tar contains merquan-corpus PDFs, source tarballs, and abstracts. Use for CPT/RAG after extraction and cleaning, not as raw SFT data.
Local Cache
/opt/merquan/finr1_loop_lab/s3_training_cache
Inventory Report
/opt/merquan/finr1_loop_lab/reports/s3_training_inventory.json
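The TFNS conversion listed in the inventory can be sketched as a mapping into the same system/user/assistant shape used by finance_instruct_500k. This assumes each TFNS record has `text` and `label` fields and that the labels follow the mapping 0 = Bearish, 1 = Bullish, 2 = Neutral; both assumptions should be checked against the cached files before use. The prompt wording is illustrative.

```python
# Assumed label mapping; verify against the dataset files before use.
TFNS_LABELS = {0: "Bearish", 1: "Bullish", 2: "Neutral"}

SYSTEM_PROMPT = (
    "You are a financial sentiment classifier. "
    "Answer with exactly one word: Bearish, Bullish, or Neutral."
)


def tfns_to_chat(record: dict) -> dict:
    """Convert one TFNS record into a system/user/assistant instruction record."""
    text = record["text"].strip()
    return {
        "system": SYSTEM_PROMPT,
        "user": f"Classify the sentiment of this financial headline: {text}",
        "assistant": TFNS_LABELS[int(record["label"])],
    }
```

Keeping the 2,388 validation records out of the SFT set and converting them with the same function would give a matching held-out sentiment eval.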
Use policy: JSONL instruction data can feed SFT after deduplication and leakage checks. Corpus archives must first be extracted into clean text chunks. Raw backups, app code, verifier artifacts, and benchmark answer files are never training data.
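The dedupe step for JSONL instruction data can be sketched as exact-match deduplication over normalized text. The field names follow the system/user/assistant layout described for finance_instruct_500k; the normalization (lowercasing and whitespace collapsing) and the function names are assumptions, and near-duplicate detection would need a fuzzier method than this.

```python
import hashlib
import re
from typing import Iterable, Iterator


def record_key(record: dict) -> str:
    """Hash of the record's text fields after case and whitespace normalization."""
    parts = []
    for field in ("system", "user", "assistant"):
        value = str(record.get(field, ""))
        parts.append(re.sub(r"\s+", " ", value.lower()).strip())
    # The unit separator keeps field boundaries unambiguous in the hashed string.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def dedupe(records: Iterable[dict]) -> Iterator[dict]:
    """Yield each record the first time its normalized key is seen."""
    seen = set()
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            yield record
```

Because `dedupe` is a generator, it can stream over a large JSONL file line by line while holding only the set of hashes in memory.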