Generating High-Quality Resumes with a Reflection Chain

21 January 2026
Reflection,
LLM,
Resume,
LangChain,
Prompting


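A resume generated in a single pass often "doesn't match the JD closely enough and sounds too AI-written." This post shares how a Writer → Critic → Reviser reflection loop lets the LLM improve the quality of its own output.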

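The Problem: Limits of Single-Pass Generation

Common problems when an LLM generates a resume in a single pass:

Poor JD fit - wording is not tailored to the specific skills the JD asks for
Too generic - lacks quantified results and concrete details
AI tone - reads as machine-generated
Timeline errors - company order and dates may be swapped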

The Solution: Reflection Chain

┌─────────────────────────────────────┐
│ Context Preparation                 │
│ - JD analysis (focus_items, skills) │
│ - Material search (per focus item)  │
│ - Timeline loading (raw_profile)    │
└────────────────┬────────────────────┘
                 ↓
┌─────────────────────────────────────┐
│ Writer (first draft)                │
│ - Generate from JD + materials      │
│ - Use timeline to keep dates right  │
│ - Structured output (Pydantic)      │
└────────────────┬────────────────────┘
                 ↓
┌─────────────────────────────────────┐
│ Critic (scores + suggestions)       │
│ - JD Alignment: 0-100               │
│ - Quantification: 0-100             │
│ - Relevance: 0-100                  │
│ - Natural Tone: 0-100               │
│ - Specific improvement suggestions  │
└────────────────┬────────────────────┘
                 ↓
        score >= min_score?
           ├── Yes → output
           └── No ↓
┌─────────────────────────────────────┐
│ Reviser (revision)                  │
│ - Improve based on Critic feedback  │
│ - Only from materials, no invention │
└────────────────┬────────────────────┘
                 ↓
        Reached max_iterations?
           ├── Yes → output the best version
           └── No → back to Critic

LangChain Implementation

Pydantic Schema

Structured output keeps the format consistent:

from pydantic import BaseModel, Field

# ResumeSection is another Pydantic model defined elsewhere in the project (not shown here).
class GeneratedResume(BaseModel):
    summary: str = Field(description="Professional summary")
    experience: list[ResumeSection] = Field(description="Experience sections")
    skills_alignment: str = Field(description="How skills align with JD")

class CriticFeedback(BaseModel):
    score: int = Field(description="Overall score 0-100")
    jd_alignment: int = Field(description="JD alignment score")
    quantification: int = Field(description="Quantification quality")
    relevance: int = Field(description="Material relevance")
    natural_tone: int = Field(description="Natural tone score")
    feedback: str = Field(description="Improvement suggestions")
    pass_threshold: bool = Field(description="Whether passes quality threshold")
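
The Reviser further down calls resume.to_markdown(), which the schema above does not show. Here is a minimal sketch of what such a method might look like. To keep it dependency-free, plain dataclasses stand in for the Pydantic models, and the ResumeSection fields (title, company, period, bullets) are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ResumeSection:
    # Hypothetical fields; the article does not show this model.
    title: str
    company: str
    period: str
    bullets: list[str] = field(default_factory=list)


@dataclass
class GeneratedResume:
    summary: str
    experience: list[ResumeSection]
    skills_alignment: str

    def to_markdown(self) -> str:
        """Render the resume as Markdown text for the Critic and Reviser prompts."""
        lines = ["## Summary", "", self.summary, "", "## Relevant Experience", ""]
        for section in self.experience:
            lines.append(f"### {section.title} - {section.company} ({section.period})")
            lines.append("")
            lines.extend(f"- {bullet}" for bullet in section.bullets)
            lines.append("")
        lines += ["## Skills Alignment", "", self.skills_alignment]
        return "\n".join(lines)
```

Rendering to Markdown before calling the Critic means the Critic scores the same text a human reader would see, rather than a raw object dump.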

Writer

def run_writer(jd_context, materials, timeline) -> GeneratedResume:
    llm = get_chat_model()
    structured_llm = llm.with_structured_output(GeneratedResume)
    
    prompt = f"""You are a professional resume writer.

=== CAREER TIMELINE (Authoritative dates) ===
{timeline}

=== JD CONTEXT ===
{jd_context}

=== CANDIDATE MATERIALS ===
{materials}

Generate the resume:"""

    return structured_llm.invoke(prompt)

Critic

def run_critic(resume_md, jd_context, min_score) -> CriticFeedback:
    llm = get_chat_model()
    structured_llm = llm.with_structured_output(CriticFeedback)
    
    prompt = f"""Evaluate this resume against JD requirements.

Minimum passing score: {min_score}

=== JD CONTEXT ===
{jd_context}

=== RESUME DRAFT ===
{resume_md}"""

    return structured_llm.invoke(prompt)

Reviser

def run_reviser(resume, feedback, jd_context, materials) -> GeneratedResume:
    llm = get_chat_model()
    structured_llm = llm.with_structured_output(GeneratedResume)
    
    prompt = f"""Improve the resume based on feedback.

=== CRITIC FEEDBACK ===
Score: {feedback.score}/100
{feedback.feedback}

=== JD CONTEXT ===
{jd_context}

=== CURRENT RESUME ===
{resume.to_markdown()}

=== CANDIDATE MATERIALS (source of truth) ===
{materials}

RULE: Use ONLY information from CANDIDATE MATERIALS.
DO NOT invent new experiences.

Revise to address the feedback:"""

    return structured_llm.invoke(prompt)

Timeline Enforcement

Timeline accuracy relies on two things:

1. Load raw_profile

from pathlib import Path

def load_timeline_from_raw_profile():
    content = Path("materials/raw/raw_profile.md").read_text(encoding="utf-8")
    
    # Extract the experience section: keep every line from the first one
    # that mentions "經歷" or "experience" through the end of the file
    lines = content.split("\n")
    in_timeline = False
    timeline_lines = []
    
    for line in lines:
        if "經歷" in line.lower() or "experience" in line.lower():
            in_timeline = True
        if in_timeline:
            timeline_lines.append(line)
    
    return "\n".join(timeline_lines)
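
The loader above keeps everything from the first matching line to the end of the file. If raw_profile.md has later sections (education, projects, and so on), a bounded variant may be preferable. The sketch below assumes sections are marked with Markdown `#` headings, which the article does not confirm, and the function name extract_experience_section is hypothetical.

```python
def extract_experience_section(content: str) -> str:
    """Return the experience section, stopping at the next sibling or parent heading."""
    timeline_lines = []
    section_level = None  # heading depth of the experience section, once found
    for line in content.split("\n"):
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            lowered = line.lower()
            if section_level is None:
                if "經歷" in lowered or "experience" in lowered:
                    section_level = level
            elif level <= section_level:
                break  # a heading at the same or a higher level ends the section
        if section_level is not None:
            timeline_lines.append(line)
    return "\n".join(timeline_lines).strip()
```

Deeper headings inside the section (for example one `###` heading per company) are kept, so only a heading at the same or a higher level closes it.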

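2. Inject it into the prompt

=== CAREER TIMELINE (Authoritative dates and companies) ===
## Work Experience
- 2022.01 - Present: ABC Company, Technical Manager
- 2019.03 - 2021.12: DEF Corp, Senior Engineer
...

This makes the model treat the timeline as authoritative fact instead of inferring dates and companies from the materials.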
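
A prompt instruction alone does not guarantee the model complies, so a cheap post-generation check can catch reordered companies. The following is only a sketch: it assumes timeline entries follow the `- period: Company, Title` shape shown in the example, and the function names are hypothetical. It compares the first occurrence of each company name in the resume and does not validate the dates themselves.

```python
import re

# Matches lines such as "- 2022.01 - Present: ABC Company, Technical Manager"
TIMELINE_ENTRY = re.compile(r"^-\s*(?P<period>[^:]+):\s*(?P<company>[^,]+),")


def timeline_companies(timeline: str) -> list[str]:
    """Extract company names from the timeline, in timeline order."""
    companies = []
    for line in timeline.splitlines():
        match = TIMELINE_ENTRY.match(line.strip())
        if match:
            companies.append(match.group("company").strip())
    return companies


def check_company_order(resume_md: str, timeline: str) -> list[str]:
    """Return a list of problems; an empty list means the order matches."""
    problems = []
    last_pos = -1
    for company in timeline_companies(timeline):
        pos = resume_md.find(company)
        if pos == -1:
            continue  # the resume may legitimately omit some roles
        if pos < last_pos:
            problems.append(f"{company} appears out of timeline order")
        last_pos = max(last_pos, pos)
    return problems
```

Any reported problem could be fed back to the Reviser as extra feedback, or simply used to reject the draft.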

Per-Focus-Item Search

Instead of running one search with the whole JD, search separately for each entry in focus_items:

import json

def search_materials_for_focus_items(jd_info, materials_table):
    all_materials = []
    seen_ids = set()  # de-duplicate materials matched by several focus items
    
    # focus_items is stored as a JSON string on the JD record
    focus_items = json.loads(jd_info.get("focus_items", "[]"))
    
    for item in focus_items:
        # One vector search per focus item; embed_query is the project's embedding helper
        query = f"{item['area']} {item['description']}"
        results = materials_table.search(embed_query(query)).limit(5).to_list()
        
        for r in results:
            if r["id"] not in seen_ids:
                seen_ids.add(r["id"])
                all_materials.append(r)
    
    return all_materials

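Why do it this way?

A JD may have 5 focus areas, each needing different materials
A single search over the whole JD may skew toward one area
Searching separately ensures every focus area has supporting materials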
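
To make the coverage argument concrete, here is a small self-contained sketch that runs the same per-item loop against an in-memory corpus and also records how many hits each focus area received. The toy keyword_search is a stand-in for the real vector search, and both function names are hypothetical.

```python
def keyword_search(corpus):
    """Build a toy search function that ranks documents by word overlap."""
    def search(query, limit):
        words = set(query.lower().split())
        scored = [(len(words & set(doc["text"].lower().split())), doc) for doc in corpus]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [doc for _, doc in ranked[:limit]]
    return search


def search_per_focus_item(focus_items, search, per_item_limit=5):
    """Search once per focus item; return de-duplicated hits and per-area hit counts."""
    all_materials = []
    coverage = {}
    seen_ids = set()
    for item in focus_items:
        query = f"{item['area']} {item['description']}"
        hits = search(query, per_item_limit)
        coverage[item["area"]] = len(hits)  # counted before de-duplication
        for hit in hits:
            if hit["id"] not in seen_ids:
                seen_ids.add(hit["id"])
                all_materials.append(hit)
    return all_materials, coverage
```

An area whose count is 0 has no supporting material at all, which is worth surfacing before the Writer runs rather than leaving the model to fill the gap.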

CLI Usage

# Full Reflection Chain
uv run career-kb generate --jd-id "abc123" --output "resume.md"

# Fast single-pass generation (saves LLM calls)
uv run career-kb generate --jd-id "abc123" --no-reflection

# Custom thresholds (stricter)
uv run career-kb generate --jd-id "abc123" --max-iterations 5 --min-score 80

# Track which materials were used
uv run career-kb generate --jd-id "abc123" --mark-used

Example Output

✓ Resume saved to: ./output/resume.md

╭─────────── Generation Stats ───────────╮
│ Iterations: 2                          │
│ Final Score: 78/100                    │
│ Materials used: 8                      │
╰────────────────────────────────────────╯

The generated Markdown:

# Resume for AI Product Manager
**Company:** Google  
**Generated from JD:** abc123

---

## Summary

Experienced technical leader with 8+ years building 
AI-powered products. Led cross-functional teams of 
12+ engineers to deliver recommendation systems 
serving 100M+ users.

## Relevant Experience

### Senior Product Manager - ABC Corp (2022-Present)

Led AI chatbot product from 0-1, achieving 85% 
containment rate and reducing support tickets by 40%.
...

## Skills Alignment

- **LLM/RAG**: Built production RAG systems with 
  LangChain and vector databases
- **Product Strategy**: Defined 3-year AI roadmap 
  aligned with business OKRs
- **Team Leadership**: Managed 8-person cross-functional 
  scrum team

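Critic Scoring Dimensions

Dimension	What it checks
JD Alignment	Does it address the skills/experience the JD mentions?
Quantification	Are there concrete numbers? (%, $, headcount)
Relevance	Are the materials relevant to the role?
Natural Tone	Does it read as if a person wrote it?

The final score is the average of the four dimensions.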
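
Because the Critic returns score and pass_threshold as model output, one option is to recompute both in code so the threshold check is deterministic. This is a sketch under that assumption, not the project's implementation; DimensionScores, overall_score, and passes are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DimensionScores:
    jd_alignment: int
    quantification: int
    relevance: int
    natural_tone: int


def overall_score(scores: DimensionScores) -> int:
    """Average the four dimensions, rounded to the nearest integer."""
    values = [
        scores.jd_alignment,
        scores.quantification,
        scores.relevance,
        scores.natural_tone,
    ]
    if any(not 0 <= v <= 100 for v in values):
        raise ValueError("each dimension score must be between 0 and 100")
    return round(sum(values) / len(values))


def passes(scores: DimensionScores, min_score: int) -> bool:
    """Deterministic replacement for a model-reported pass flag."""
    return overall_score(scores) >= min_score
```

This keeps the model responsible for judging each dimension while the arithmetic and the comparison against min_score stay in ordinary code.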

Performance and Cost

Mode	LLM calls	Latency
--no-reflection	1	~3s
Default (max 3 iterations)	2-7	~10-20s
max 5 iterations	2-11	~15-30s

Cost estimate:

Writer: ~1000 tokens input, ~500 output
Critic: ~800 tokens input, ~200 output
Reviser: ~1500 tokens input, ~500 output

A full 3-iteration run (1 Writer, 3 Critic, and 2 to 3 Reviser calls) comes to roughly 8,500 to 10,500 tokens.
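
The total depends on how many Critic and Reviser calls actually run, so a tiny estimator makes the arithmetic explicit. The per-call figures are the rough estimates listed above; the function name estimate_tokens is hypothetical.

```python
# Per-call token estimates (input + output), taken from the list above.
TOKENS_PER_CALL = {
    "writer": 1000 + 500,
    "critic": 800 + 200,
    "reviser": 1500 + 500,
}


def estimate_tokens(critic_calls: int, reviser_calls: int) -> int:
    """Total tokens for one Writer call plus the given Critic and Reviser calls."""
    return (
        TOKENS_PER_CALL["writer"]
        + critic_calls * TOKENS_PER_CALL["critic"]
        + reviser_calls * TOKENS_PER_CALL["reviser"]
    )
```

For example, estimate_tokens(1, 0) covers a draft that passes on the first critique, and estimate_tokens(3, 2) covers a run that passes on the third.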

Common Pitfalls

1. The Reviser fabricates experience

# ❌ No restriction on where content comes from
prompt = "Improve this resume to better match the JD"

# ✅ Explicit restriction
prompt = """Improve the resume...
RULE: Use ONLY information from CANDIDATE MATERIALS.
DO NOT invent new experiences."""

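2. Infinite loops

If min_score is set too high, the draft may never reach it:

# Use max_iterations as a hard cap
for iteration in range(max_iterations):
    feedback = run_critic(resume.to_markdown(), jd_context, min_score)
    if feedback.pass_threshold:
        break
    resume = run_reviser(resume, feedback, jd_context, materials)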
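
The flow diagram says to output the best version once the cap is reached, and a revision can score lower than the draft before it. The sketch below tracks the best-scoring draft across iterations. The critic and reviser are passed in as plain callables so it runs without an LLM; reflect and Feedback are hypothetical names, and skipping the final, never-scored revision is a design choice of this sketch that saves one Reviser call compared with the loop above.

```python
from typing import Callable, NamedTuple


class Feedback(NamedTuple):
    score: int
    feedback: str


def reflect(
    draft: str,
    critic: Callable[[str], Feedback],
    reviser: Callable[[str, Feedback], str],
    min_score: int,
    max_iterations: int,
) -> tuple[str, int, list[Feedback]]:
    """Critique and revise until the score passes or the iteration cap is hit."""
    best_draft, best_score = draft, -1
    history: list[Feedback] = []
    for iteration in range(max_iterations):
        fb = critic(draft)
        history.append(fb)
        if fb.score > best_score:
            best_draft, best_score = draft, fb.score
        if fb.score >= min_score:
            break
        if iteration < max_iterations - 1:
            # Only revise when another critique will follow to score the result.
            draft = reviser(draft, fb)
    return best_draft, best_score, history
```

The returned history plays the same role as a per-iteration feedback log: it shows whether scores are actually improving or just oscillating.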

3. Timeline inconsistencies

# Emphasize this in the Writer prompt
"""
=== CAREER TIMELINE (Authoritative - DO NOT CHANGE) ===
{timeline}

RULE: Employment dates and company names MUST match 
the timeline exactly. Do not reorder or modify dates.
"""

Extension Ideas

1. Multi-Critic consensus

Run the Critic from several perspectives and average the scores:

critics = [
    run_critic(resume, perspective="recruiter"),
    run_critic(resume, perspective="hiring_manager"),
    run_critic(resume, perspective="technical_lead"),
]
avg_score = sum(c.score for c in critics) / len(critics)

2. A/B versions

Generate several versions and let the user choose:

versions = [
    generate_with_style("formal"),
    generate_with_style("conversational"),
    generate_with_style("metric-focused"),
]

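3. Incremental improvement

Save the version from each iteration so the user can compare them:

result.feedback_history  # List[CriticFeedback]
# Shows how the score changes from one iteration to the next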

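Summary

Component	Role
Writer	Generates the first draft from the JD + materials
Critic	Scores on 4 dimensions and suggests improvements
Reviser	Improves the draft based on feedback
Timeline	Keeps dates and company order accurate
Focus Items	Per-item material search

The Reflection Chain turns resume generation from a one-shot gamble into iterative quality control.

Career Knowledge Base is a local-first resume knowledge base system built with Python + LanceDB + LangChain.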
