Workshop 2025

Dify x Hong Kong OSS

From Zero to Production: Building GenAI Applications

Workflow Orchestration, Knowledge Base Management, and Production Deployment with Dify

crazywoola (Banana)
Developer Relations @ Dify
banana@dify.ai

Workshop Agenda

01
Opening & Goals
Audience alignment, expected outcomes
3min
02
Dify & LLM-Native Mindset
Platform role, app types, why now
5min
03
Workflow Core
Node roles, orchestration patterns, live demo
9min
04
Knowledge / RAG Core
Chunking, retrieval optimization, quality loop
7min
05
Production Essentials
Deployment, guardrails, observability
6min
S2
Session 2 (15:50)
Advanced Practice · Deep Dives · Hands-on Projects
30min
Session 1 Total 30 min

What is Dify?

An open-source LLM application development platform connecting large model capabilities to production-grade applications.

Visual Orchestration
Workflow, Agent, and Chatflow modes
Knowledge Base
End-to-end RAG, documents to retrieval
Plugin Ecosystem
Open marketplace for models, tools, extensions
Observability
Full-stack logging, tracing, cost tracking
Human-in-the-Loop
Manual review and approval gates in pipelines
Skills
Reusable encapsulated tool capability units
Triggers
Schedule, event, and webhook-driven activation

Global Community

Driven by open source; a GitHub Top 100 project

1M+
Powered by Dify
130K+
GitHub Stars
150+
Countries Covered
1000+
Open Source Contributors
60+
Industry Applications
550M+
Total Downloads

LLM Application Development Lifecycle

2017
Transformer
Attention is all you need
2020
GPT-3
Few-shot emergent abilities
2023-24
App Frameworks
Rise of platforms like Dify
2025+
Engineering
Production-ready, standardized
Why Now?
Compute, models, and data are all ready. LLM applications are entering the engineering phase.

Traditional vs LLM-Native Development

Traditional Software

Core Logic Deterministic
If-Then-Else control
Data Processing Structured (SQL/JSON)
Schema-first
Testing Unit Tests (Pass/Fail)
Logic coverage

LLM-Native Development

Core Logic Probabilistic
Prompt + Context guidance
Data Processing Unstructured (Embedding)
Semantic retrieval
Testing Evaluation (Eval)
Accuracy, hallucination, relevance

Workflow Orchestration

Visually build complex AI processing pipelines, from simple conversations to multi-step automation.

1
Input Definition
Schema validation
2
Knowledge Retrieval
RAG enhancement
3
LLM Processing
Intelligent generation
4
Output Formatting
Result delivery
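The four stages can be sketched as plain functions (a minimal illustration; `retrieve` and `call_llm` are hypothetical stubs, not Dify's node API):

```python
def validate_input(payload: dict) -> dict:
    # 1. Input Definition: enforce a simple schema
    if not isinstance(payload.get("question"), str):
        raise ValueError("payload must contain a string 'question'")
    return payload

def retrieve(question: str) -> list[str]:
    # 2. Knowledge Retrieval: stubbed RAG lookup
    return [f"context chunk for: {question}"]

def call_llm(question: str, chunks: list[str]) -> str:
    # 3. LLM Processing: stubbed generation
    return f"Answer to '{question}' using {len(chunks)} chunk(s)"

def format_output(answer: str) -> dict:
    # 4. Output Formatting: wrap the result for delivery
    return {"answer": answer, "status": "ok"}

def run_workflow(payload: dict) -> dict:
    payload = validate_input(payload)
    chunks = retrieve(payload["question"])
    return format_output(call_llm(payload["question"], chunks))
```

In Dify these stages are nodes on a canvas; the point is the same data contract flowing stage to stage.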

Four Application Types

Chatbot

Low Complexity

Conversational AI with context memory, ideal for customer service and Q&A scenarios

Multi-turn Context Retention

Agent

Medium Complexity

Autonomous task execution with tool calling, ideal for research and analysis

Tool Calling Autonomous Decisions

Core Node Types

Category | Node | Description
I/O | Start | Define input parameter schema, trigger workflow execution
I/O | End | Define output format, terminate workflow
Processing | Code | Python/Node.js execution, data transformation
Processing | Template | String template rendering, prompt construction
AI | LLM | Large model invocation, multi-provider support, parameter config
AI | Knowledge | Knowledge base retrieval, RAG context injection
Control Flow | If/Else | Conditional branching, expression-based routing
Control Flow | Iteration | List iteration, batch data processing

Observability & Debugging

Execution Trace • Run ID: wf_8f3a9b2c
0.01s Start Input validated
0.45s Knowledge Retrieval 3 chunks retrieved (score > 0.75)
0.78s LLM Call 342 tokens generated
1.24s End Execution complete
Tokens: 1,245 prompt | 342 completion Cost: $0.0042
Variable Inspector
View variables at each step
Node Retry
Re-execute failed nodes
Diff Comparison
Compare multiple executions

Knowledge Base

Complete RAG solution: full-cycle management from document ingestion to intelligent retrieval.

Ingest Pipeline
1. Document Intake PDF / DOCX / MD
2. Chunking Strategy Recursive / Semantic / Markdown
3. Embedding text-embedding-3-large
4. Vector Storage Weaviate / Milvus / Qdrant
Query Pipeline
1. Query Rewrite Normalize intent + missing context
2. Hybrid Search Vector + keyword retrieval
3. Rerank Reorder Top-K candidates
4. Context Assembly Build final context for LLM
Key takeaway: Ingest quality sets the retrieval ceiling; query strategy determines answer quality.
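A toy ingest pass illustrating chunk → embed → store (the hash-based `embed` is a stand-in for a real embedding model such as text-embedding-3-large):

```python
import hashlib

def embed(text: str) -> list[float]:
    # Fake embedding: hash bytes scaled into [0, 1]; real pipelines
    # call an embedding model instead.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def ingest(doc: str, chunk_size: int = 40) -> list[dict]:
    # Chunk, embed, and store each piece alongside its vector.
    chunks = [doc[i:i + chunk_size] for i in range(0, len(doc), chunk_size)]
    return [{"text": c, "vector": embed(c)} for c in chunks]

store = ingest("Dify is an open-source LLM application development platform.")
```

The query pipeline then embeds the question the same way and ranks stored vectors by similarity.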

Chunking Strategies

Strategy | Use Case | Chunk Size | Overlap
Recursive | General document processing | 500-1000 chars | 50-100 chars
Semantic | Technical content, code | Sentence-level | N/A
Fixed | Structured data | Custom | 0
Markdown | Documentation, Wiki | Header hierarchy | Context-based
Recommended Configuration
CHUNKING
method: recursive
chunk_size: 800
chunk_overlap: 80
separators: "\n\n", "\n", ". ", " "
INDEXING
embedding_model: text-embedding-3-large
vector_db: weaviate
rerank: enabled
top_k: 5
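The recommended recursive strategy can be sketched as a splitter that tries coarse separators first and falls back to finer ones (an illustration, not Dify's implementation):

```python
def recursive_split(text, chunk_size=800, separators=("\n\n", "\n", ". ", " ")):
    # Text that already fits is kept whole.
    if len(text) <= chunk_size:
        return [text]
    # Try separators from coarsest (paragraph) to finest (space).
    for sep in separators:
        if sep in text:
            chunks, buf = [], ""
            for part in text.split(sep):
                candidate = buf + sep + part if buf else part
                if len(candidate) <= chunk_size:
                    buf = candidate
                else:
                    if buf:
                        chunks.append(buf)
                    buf = part
            if buf:
                chunks.append(buf)
            # Recurse on any piece that is still too large.
            out = []
            for c in chunks:
                out.extend(recursive_split(c, chunk_size, separators))
            return out
    # No separator found: hard cut at chunk_size.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

Overlap (the 80-char setting above) is omitted here for brevity; production splitters also carry a tail of each chunk into the next.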

Retrieval Optimization

Basic Retrieval

method: semantic_search
top_k: 5
score_threshold: 0.75

Hybrid Search + Rerank

method: hybrid_search
top_k: 20 (candidate)
rerank_model: cohere-rerank-v3
top_n: 5 (final)
filter: metadata.source = official_docs
Semantic Search
Vector similarity based
Keyword Search
BM25 algorithm
Hybrid Search
Semantic + Keyword
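One common way to merge the semantic and keyword result lists is Reciprocal Rank Fusion; a sketch (Dify's exact fusion strategy may differ):

```python
def rrf(semantic: list[str], keyword: list[str], k: int = 60) -> list[str]:
    # Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank).
    # Documents ranked highly in either list float to the top.
    scores: dict[str, float] = {}
    for ranking in (semantic, keyword):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

merged = rrf(["a", "b", "c"], ["b", "d"])  # "b" wins: it appears in both lists
```

The fused list is then truncated to the candidate `top_k` and handed to the reranker for the final `top_n`.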

Deployment & Integration

Deploy Dify quickly with Docker, configure for production environments.


# 1. Clone repository
git clone https://github.com/langgenius/dify.git
cd dify/docker

# 2. Configure environment
cp .env.example .env
# Edit .env to set OPENAI_API_KEY, SECRET_KEY, etc.

# 3. Start services
docker compose up -d

# 4. Access console
open http://localhost/install

Production Architecture

Entry Layer
CDN / WAF Nginx Load Balancer
Application Layer
Dify API (Multi Pods) Web / Console Worker Nodes
Data Layer
PostgreSQL (HA) Redis (Sentinel) Weaviate / Milvus Cluster
Async Layer
Celery Queue Scheduled Jobs Retry / Dead Letter

Security Checklist

  • Change default passwords
  • Enable HTTPS / SSL certificates
  • Configure CORS policies
  • Set rate limiting policies
  • Enable audit logging
  • Isolate the vector database at the network level

Key Environment Variables

    Variable | Description | Example
    CONSOLE_API_URL | Console API URL | http://localhost:5001
    APP_API_URL | App API URL | http://localhost:5001
    DB_DATABASE | PostgreSQL database name | dify
    REDIS_HOST | Redis host | redis
    VECTOR_STORE | Vector database type | weaviate/qdrant/milvus
    S3_BUCKET_NAME | File storage bucket name | dify-files

    Session 2

    Advanced Practice · Hands-on Projects

    Advanced Practice
    Code Node · Low-Code vs Pro-Code · Plugin Ecosystem
    Deep Dives & Hands-on Projects
    RAG Eval · Agent Architecture · Guardrails
    Apple Watch Workflow · GDPR Compliance Bot

    Code Node Best Practices

    Extend Workflow capabilities with Python/Node.js

    Use Cases

    • Complex data transformation & cleaning
    • Custom business logic decisions
    • Invoking external algorithms and models
    • Sensitive data masking

    Security Limits

    • Execution timeout: 60s default
    • Memory limit: 512 MB
    • No network requests (sandboxed)
    • Standard library only, limited third-party packages
    Example: Risk Scoring Logic
    
    def main(payload: dict) -> dict:
        country = payload.get("country", "")
        score = payload.get("risk_score", 0)
        
        # Custom routing logic
        if country in ["US", "UK"] and score < 0.3:
            return {"route": "fast-lane", "reason": "low_risk"}
        elif score > 0.8:
            return {"route": "block", "reason": "high_risk"}
        
        return {"route": "standard", "score": score}

    Low-Code vs Pro-Code

    Finding the sweet spot between visual orchestration and code

    Dimension | Low-Code Orchestration (DSL) | Pro-Code
    Core Use | Flow orchestration, conditionals, RAG pipelines | Complex algorithms, custom business logic
    Delivery Speed | Fast iteration, business users can participate | Requires dev cycle, but highly flexible
    Maintainability | Visual and intuitive, easy to understand | Requires documentation, code review
    Best Practice | 80% DSL + 20% Code: glue logic visually, core computation in code
    Golden Rule
    Visual handles "orchestration", code handles "computation". Data flows seamlessly between them via variables.

    Plugin Ecosystem

    Extend Dify's capabilities, integrate third-party services and custom tools.

    Model Plugins

    Connect OpenAI, Claude, Llama and other LLMs

    Tool Plugins

    Google Search, code execution, database queries

    Agent Strategies

    Function Calling, ReAct, Plan & Execute

    Plugin Marketplace
    Official + Community maintained, growing ecosystem
    Visit Marketplace

    RAG Evaluation Methods (Optional Deep Dive)

    Systematically evaluate retrieval quality and generation effectiveness

    Retrieval Quality Metrics

    Recall Proportion of relevant documents successfully retrieved
    Precision Proportion of retrieved results that are relevant
    MRR Mean Reciprocal Rank of first relevant document
    NDCG Ranking-aware weighted relevance score

    Generation Quality Metrics

    Faithfulness Is the answer based on retrieved context?
    Relevance How well does the answer match the question?
    Completeness Does it cover all aspects of the question?
    Hallucination Rate Proportion of fabricated information
    Gold Test Set
    Build 50-100 annotated Q&A pairs as a regression baseline; run it before every strategy change to ensure no degradation.
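The retrieval metrics above can be computed per query roughly like this (a sketch for a gold-test-set loop; document IDs are illustrative):

```python
def retrieval_metrics(retrieved: list[str], relevant: set[str]) -> dict:
    # Recall: share of relevant docs that were retrieved.
    # Precision: share of retrieved docs that are relevant.
    # Reciprocal rank: 1 / rank of the first relevant hit (0 if none);
    # averaging this over a test set gives MRR.
    hits = [doc for doc in retrieved if doc in relevant]
    recall = len(set(hits)) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    rr = 0.0
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            rr = 1.0 / rank
            break
    return {"recall": recall, "precision": precision, "reciprocal_rank": rr}

m = retrieval_metrics(["d1", "d3", "d2"], relevant={"d2", "d9"})
```

Generation-side metrics (faithfulness, hallucination rate) usually need an LLM judge or human annotation rather than set arithmetic.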

    Agent Architecture Design (Optional Deep Dive)

    Evolution from simple conversations to autonomous task execution

    Reasoning
    Understand task goals
    Acting
    Invoke tools to execute
    Observing
    Get execution results
    Iterating
    Loop until complete
    Strategy
    Function Calling
    Structured tool invocation
    Strategy
    ReAct
    Reasoning and acting alternately
    Strategy
    Plan & Execute
    Plan first, then execute
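The Reasoning → Acting → Observing → Iterating loop, skeletonized (the tool and stop condition are hypothetical, not Dify's agent runtime):

```python
def react_loop(goal: str, tools: dict, max_steps: int = 5) -> str:
    # Toy ReAct loop: choose a tool, act, observe, repeat until done.
    observations = []
    for _ in range(max_steps):
        # Reasoning: a real agent asks the LLM to pick the next action;
        # this sketch naively takes the first registered tool.
        name, tool = next(iter(tools.items()))
        # Acting + Observing
        result = tool(goal, observations)
        observations.append(result)
        # Iterating: stop when the tool signals completion
        if result.get("done"):
            return result["answer"]
    return "max steps reached"

def lookup(goal, observations):
    # Hypothetical tool that answers on the first call.
    return {"done": True, "answer": f"result for: {goal}"}

answer = react_loop("summarize GDPR Article 17", {"lookup": lookup})
```

Function Calling and Plan & Execute differ mainly in *when* the reasoning step happens: per action, or once up front as a plan.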

    Knowledge Pipeline Deep Dive (Optional Deep Dive)

    Complete journey from raw documents to usable memories

    Ingest Pipeline
    Parse PDF / Markdown / HTML
    Chunking Semantic / Markdown / Fixed
    Enrich Summary Add summary + metadata
    Embedding + Store Milvus / Weaviate
    Query Pipeline
    Query Rewrite Clarify intent and context
    Hybrid Search Vector + BM25
    Rerank Sort best Top-N chunks
    Context Assembly Build final context for LLM
    Garbage In, Garbage Out — Upstream quality determines downstream results

    Security & Guardrails (Optional Deep Dive)

    Protect your applications from abuse and attacks

    Input Protection

    • Sensitive word filtering & content moderation
    • Prompt injection detection
    • Request rate limiting
    • User authentication & authorization

    Output Protection

    • Hallucination detection & fact verification
    • PII data masking
    • Output length limits
    • Harmful content filtering
    Budget Guardrails
    Set per-request token limits and daily spending caps to prevent cost overruns; auto-downgrade or circuit-break when thresholds are hit.
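A budget guardrail can be as simple as a counter with two thresholds (the limits and downgrade behavior here are illustrative assumptions, not Dify settings):

```python
class BudgetGuard:
    # Tracks daily token spend against a per-request cap and a daily cap.
    def __init__(self, per_request_limit: int, daily_limit: int):
        self.per_request_limit = per_request_limit
        self.daily_limit = daily_limit
        self.spent_today = 0

    def check(self, requested_tokens: int) -> str:
        if requested_tokens > self.per_request_limit:
            return "reject"       # single request too large: circuit-break
        if self.spent_today + requested_tokens > self.daily_limit:
            return "downgrade"    # over daily cap: e.g. switch to a cheaper model
        self.spent_today += requested_tokens
        return "allow"

guard = BudgetGuard(per_request_limit=4000, daily_limit=10000)
```

In production the counter would live in shared storage (e.g. Redis) and reset on a schedule.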

    Observability Deep Dive (Optional Deep Dive)

    From logs to metrics, full-stack tracking of application performance

    Logs

    Execution traces, variable values, error stacks

    Metrics

    Latency, token consumption, success rate

    Traces

    Cross-node call chains, dependencies

    Cost Analysis Dashboard
    1.2M
    Total Tokens
    $42.5
    Today's Cost
    2.3s
    P99 Latency
    99.8%
    Success Rate

    Debugging Tips & Troubleshooting (Optional Deep Dive)

    Systematically identify and resolve issues

    Common Issues

    Timeout Error → Check LLM node timeout settings
    Variable Not Found → Check node output variable names
    Poor Retrieval Quality → Adjust chunking strategy or thresholds
    API Rate Limit → Add retry or fallback logic

    Troubleshooting Steps

    1 Check execution logs, identify error nodes
    2 Inspect variable panel, verify data flow
    3 Use preview mode, step-through testing
    4 Enable Trace, view detailed call chains
    5 Compare differences between successful/failed runs

    Community Contribution (Optional)

    Grow from user to contributor, build the open source ecosystem together.

    Contribute Code

    Submit PRs, fix bugs, implement new features

    Develop Plugins

    Create Model/Tool/Agent plugins for the community

    Translate Docs

    Help localize documentation, lower barriers to entry

    Discord Community
    GitHub Discussions
    @dify_ai

    Hands-on Project 1

    Apple Watch Voice Transcription Workflow

    Automatically transcribe Apple Watch voice memos, generate structured notes and sync to Notion.

    Voice Input
    Apple Watch
    Whisper Transcribe
    Speech → Text
    LLM Summary
    Extract key points
    Sync to Notion
    Structured storage

    Hands-on Project 2

    GDPR Compliance Q&A Bot

    Knowledge Base

    • GDPR official documentation
    • ICO guidelines
    • Company privacy policies
    • Data processing agreements

    Compliance Features

    • Source citation for every answer
    • Confidence scoring
    • Human handoff for low confidence
    • Auto-append disclaimer
    
    chatbot:
      system_prompt: |
        You are a GDPR compliance assistant. You must:
        1. Cite specific articles when answering
        2. Label confidence (High/Medium/Low)
        3. Recommend legal consultation for complex cases
        4. Provide information only, not legal advice
      guardrails:
        confidence_threshold: 0.6
        fallback: "Need more context, connecting you to a human agent."

    Learning Resources

    Documentation

    docs.dify.ai

    Visit →

    GitHub

    github.com/langgenius/dify

    Star →

    Community

    Discord Community

    Join →

    Next Steps

    Join Discord Community
    Star GitHub Repo
    Deploy Your First Instance
    Share Your Workflows
    🙏

    Thank You

    Questions? Feel free to reach out anytime.

    crazywoola (Banana)
    banana@dify.ai
    Dify Developer Relations