Chris Hayes' Journal

Building a Semantic Bible Search Engine with RAG: From 61% to 80% Accuracy

A journey through embeddings, hybrid search, and making the KJV Bible searchable by meaning, not just keywords


The Problem: Searching the Bible is Hard

Traditional Bible search tools are limited. Type “God’s love” and you get verses containing those exact words. But what about the hundreds of verses about divine compassion, mercy, or grace that never use the word “love”?

What if you search for “trusting God in difficult times” but the KJV says “trust in the LORD” or “lean not on thine own understanding”? Keyword search fails.

We needed semantic search - the ability to find verses by meaning, not just exact words.

The Solution: RAG (Retrieval-Augmented Generation)

We built a semantic search engine for the King James Bible using:

- Local embeddings from Ollama's nomic-embed-text model (768 dimensions)
- Hybrid retrieval: 70% semantic similarity + 30% keyword matching
- Query expansion for archaic KJV vocabulary
- Popularity boosting for 76 curated famous verses
- A FastAPI REST layer, running 100% locally

The result? 80% top-10 accuracy on 5,000 test queries.


The Journey: Three Critical Breakthroughs

1. The Parser Crisis: When Verses Were 7,837 Characters Long

Early on, we hit a wall. Our embeddings were failing with mysterious EOF errors from Ollama:

Error embedding verse Mark 12:24: EOF (status code: 500)

The Investigation: We discovered verses were abnormally long:

Matthew 5:3: 5,990 characters (should be ~74)
Mark 12:24: 2,889 characters
242 verses total > 1,000 characters

The Root Cause: The KJV XML uses OSIS milestone markers (sID and eID) that can cross element boundaries:

<q sID="q1"/>
  <verse sID="Gen.3.1" osisID="Gen.3.1"/>
  Now the serpent was more subtil...
  <verse eID="Gen.3.1"/>
<q eID="q1"/>

Quote boundaries can cross verse boundaries, putting markers at different nesting levels. Our recursive parser was concatenating multiple verses together.

The Fix: Complete parser rewrite using single-pass traversal with state tracking:

# Single-pass traversal: track the current verse via milestone markers
current_verse_id = None
collecting = False
verse_elements = []

for elem in root.iter():  # root: the parsed OSIS ElementTree root
    if elem.get('sID') and elem.get('osisID'):  # Verse start milestone
        current_verse_id = elem.get('osisID')
        verse_elements = []
        collecting = True

    if collecting:
        verse_elements.append(elem)

    if elem.get('eID') == current_verse_id:  # Matching verse end milestone
        build_verse(verse_elements)
        collecting = False

Result: 0 verses over 1,000 characters. 100% parsing accuracy.


2. The “Subtil” Problem: When Modern Language Fails

Early testing showed poor accuracy on archaic KJV language:

Query: "serpent was crafty"
Expected: Genesis 3:1 ("serpent was more subtil")
Actual: Not in top 10 ❌

The KJV uses archaic words that modern embeddings don’t understand: “subtil” for crafty, “slay”/“slew”/“slain” for kill, “thine” for your.

Attempt 1: Query Expansion

We built a synonym expander:

query = "you shall not kill"
expanded = "you shall not kill slay slew slain"  # Add KJV variants

Result: It helped, but only modestly. Accuracy improved from 61% to 65%.
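The expander can be sketched in a few lines. The synonym map here is a tiny illustrative sample, not the project's actual table:

```python
# Minimal query-expansion sketch. KJV_SYNONYMS is a small illustrative
# sample; a real expander would carry a much larger archaic-word map.
KJV_SYNONYMS = {
    "kill": ["slay", "slew", "slain"],
    "crafty": ["subtil"],
    "your": ["thine", "thy"],
}

def expand_query(query: str) -> str:
    """Append KJV-era variants of any modern words found in the query."""
    words = query.lower().split()
    extras = [syn for w in words for syn in KJV_SYNONYMS.get(w, [])]
    return " ".join(words + extras)

print(expand_query("you shall not kill"))
# you shall not kill slay slew slain
```

The expanded string is what gets embedded and keyword-matched, so archaic variants participate in both retrieval paths.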

Attempt 2: Hybrid Search

We combined semantic search (70%) with keyword search (30%):

semantic_score = cosine_similarity(query_embedding, verse_embedding)
keyword_score = exact_word_matches(query, verse_text)
final_score = 0.7 * semantic_score + 0.3 * keyword_score

Result: Massive improvement! Accuracy jumped to 80%.

| Query Type | Semantic Only | Hybrid | Improvement |
|---|---|---|---|
| Keywords | 57.3% | 84.7% | +27.3% |
| Paraphrases | 37.7% | 61.4% | +23.8% |
| Modern | 84.6% | 95.6% | +11.0% |
| Overall | 61.1% | 79.7% | +18.6% |

Trade-off: 7.5x slower (277ms vs 37ms), but worth it for the accuracy gain.
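The scoring above reduces to a small pure-Python sketch (stdlib only; a real system would run numpy over the cached 768-dimension vectors):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query: str, verse_text: str) -> float:
    """Fraction of query words that appear verbatim in the verse."""
    q, v = set(query.lower().split()), set(verse_text.lower().split())
    return len(q & v) / len(q) if q else 0.0

def hybrid_score(q_emb, v_emb, query, verse_text,
                 w_sem: float = 0.7, w_kw: float = 0.3) -> float:
    """Weighted blend: 70% semantic similarity, 30% exact-word overlap."""
    return (w_sem * cosine_similarity(q_emb, v_emb)
            + w_kw * keyword_score(query, verse_text))
```

Because the keyword term rewards verbatim overlap, exact KJV phrases rank high even when the embedding model misreads the archaic vocabulary.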


3. The John 3:16 Problem: Famous Verses Not Ranking High

Users expect famous verses to rank higher. When searching “God loved the world,” John 3:16 should be #1, not buried on page 2.

The Solution: Popularity Boosting

We curated a database of 76 famous verses with popularity weights:

{
  "John 3:16": {"weight": 3.0, "category": "salvation"},
  "Psalm 23:1": {"weight": 3.0, "category": "comfort"},
  "Genesis 1:1": {"weight": 3.0, "category": "creation"},
  "Romans 3:23": {"weight": 2.5, "category": "salvation"},
  // ... 72 more
}

Boost formula:

boost = (popularity_weight - 1.0) * 0.3
new_score = original_score + boost
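In code, the boost is a small lookup-and-add; the weight table here abbreviates the curated 76:

```python
# Popularity-boost sketch; FAMOUS_WEIGHTS abbreviates the curated table.
FAMOUS_WEIGHTS = {
    "John 3:16": 3.0,
    "Romans 3:23": 2.5,
}

def boosted_score(reference: str, score: float) -> float:
    """Add (weight - 1.0) * 0.3; verses outside the table get no boost."""
    weight = FAMOUS_WEIGHTS.get(reference, 1.0)
    return score + (weight - 1.0) * 0.3

print(boosted_score("Romans 3:23", 0.925))  # ~1.375, matching the results below
print(boosted_score("Obadiah 1:2", 0.925))  # unchanged: 0.925
```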

Results:

| Query | Without Boost | With Boost |
|---|---|---|
| “beginning of everything” | Genesis 1:1 at #3 | Genesis 1:1 at #1 ⬆️ |
| “saved by faith” | Ephesians 2:8 at #2 | Ephesians 2:8 at #1 ⬆️ |
| “all have sinned” | Romans 3:23 at #1 (0.925) | Romans 3:23 at #1 (1.375) ✅ |

Impact: 20-50% of queries benefit from boosting, with famous verses rising 1-3 positions on average.


The Architecture: How It All Works

1. Build Phase (One Time)

python build.py

What happens:

  1. Parse 31,102 KJV verses from OSIS XML
  2. Generate embeddings using nomic-embed-text (768 dimensions)
  3. Cache everything to disk (174 MB)
  4. Validate with test queries

Time: 8-10 minutes first run, <1 second subsequent runs
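The embed-and-cache loop can be sketched against Ollama's local `/api/embeddings` endpoint. The `build_cache` helper and pickle path are illustrative, not the project's actual build.py:

```python
# Build-phase sketch: embed every verse once, persist vectors to disk.
# Requires a running `ollama serve` with nomic-embed-text pulled.
import json
import pickle
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"

def embed(text: str, model: str = "nomic-embed-text") -> list:
    """Fetch one embedding vector (768 floats for nomic-embed-text)."""
    payload = json.dumps({"model": model, "prompt": text}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def build_cache(verses: dict, path: str = "embeddings.pkl") -> None:
    """Embed every verse and cache the vectors so later runs skip Ollama."""
    cache = {ref: embed(text) for ref, text in verses.items()}
    with open(path, "wb") as f:
        pickle.dump(cache, f)
```

The on-disk cache is why the first build takes minutes while later startups take under a second.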

2. Search Phase (Every Query)

from src.rag_search import RAGBibleSearch

rag = RAGBibleSearch(
    model='nomic-embed-text',
    use_query_expansion=True,      # Handle archaic language
    use_popularity_boost=True      # Boost famous verses
)

results = rag.hybrid_search("trusting God in difficult times", top_k=10)

What happens:

  1. Query expansion: Add archaic synonyms
  2. Semantic search: Convert query to embedding, find similar verses (70% weight)
  3. Keyword search: Find exact word matches (30% weight)
  4. Popularity boost: Add (popularity_weight - 1.0) × 0.3 to famous verses
  5. Return ranked results

Time: ~277ms average
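Stripped of embeddings, the five steps reduce to a toy pipeline (keyword overlap stands in for semantic similarity; the verse and weight tables are illustrative):

```python
# Toy end-to-end ranking: score each verse, apply popularity boost, sort.
VERSES = {
    "John 3:16": "for god so loved the world that he gave his only begotten son",
    "Genesis 1:1": "in the beginning god created the heaven and the earth",
}
FAMOUS_WEIGHTS = {"John 3:16": 3.0, "Genesis 1:1": 3.0}

def search(query: str, top_k: int = 10):
    q = set(query.lower().split())
    scored = []
    for ref, text in VERSES.items():
        overlap = len(q & set(text.split())) / len(q)       # stand-in for hybrid score
        boost = (FAMOUS_WEIGHTS.get(ref, 1.0) - 1.0) * 0.3  # popularity boost
        scored.append((ref, overlap + boost))
    return sorted(scored, key=lambda rs: rs[1], reverse=True)[:top_k]

print(search("god loved the world")[0][0])  # John 3:16
```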

3. API (Optional)

python api.py

RESTful API with FastAPI:

curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "God so loved the world",
    "top_k": 10,
    "search_mode": "hybrid"
  }'

Response:

{
  "query": "God so loved the world",
  "results": [
    {
      "reference": "John 3:16",
      "text": "For God so loved the world...",
      "confidence": 1.623,
      "rank": 1
    }
  ],
  "processing_time_ms": 245
}

The Stack

Core Technology:

- Python 3
- Ollama running nomic-embed-text (768-dimension embeddings, fully local)
- FastAPI for the REST layer

Data:

- King James Version in OSIS XML (31,102 verses)
- 174 MB embedding cache on disk

Techniques:

- Retrieval-Augmented Generation (RAG)
- Hybrid search (70% semantic / 30% keyword)
- Query expansion and popularity boosting


Key Learnings

1. Hybrid > Pure Semantic

Pure semantic search sounds cool, but real-world accuracy demands hybrid approaches. Combining semantic understanding (70%) with keyword matching (30%) gave us the best of both worlds.

2. Archaic Language is Hard

Modern embedding models trained on contemporary text struggle with 400-year-old English. Query expansion helps, but hybrid search is the real solution.

3. Parser Edge Cases Matter

OSIS milestone markers are tricky. What seems like simple XML parsing becomes complex when elements can cross boundaries. Single-pass traversal with state tracking was the key.

4. User Expectations Drive Features

Users expect John 3:16 to rank #1 when relevant. Popularity boosting addresses this without sacrificing accuracy for less famous verses.

5. Benchmarking is Essential

We ran 5,000 query benchmarks to validate improvements. Without hard numbers, we’d never know if changes helped or hurt.


Performance at Scale

5,000 Query Benchmark Results:

| Metric | Semantic | Hybrid | Change |
|---|---|---|---|
| Rank #1 | 39.0% | 57.3% | +18.3% |
| Top 3 | 50.9% | 71.1% | +20.2% |
| Top 10 | 61.1% | 79.7% | +18.6% |
| Queries/sec | 27.0 | 3.6 | -85.7% |
| Latency | 37ms | 277ms | +647% |

Trade-off Analysis: hybrid search pays ~7.5× higher latency (37ms → 277ms) for an 18.6-point gain in top-10 accuracy. At well under a second per query, the trade clearly favors accuracy.

Query Type Breakdown:

| Type | Semantic | Hybrid | Improvement |
|---|---|---|---|
| Keywords | 57.3% | 84.7% | +27.3% |
| Modern | 84.6% | 95.6% | +11.0% |
| Paraphrase | 37.7% | 61.4% | +23.8% |
| Partial | 61.7% | 83.0% | +21.2% |
| Typo | 66.7% | 80.7% | +14.0% |

Key insight: Hybrid search excels at keywords (+27%) and paraphrases (+24%), precisely where pure semantic search struggles.


Try It Yourself

Quick Start

# 1. Clone the repo
git clone https://github.com/chrishayescodes/biblesearch.git
cd biblesearch

# 2. Setup
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# 3. Install Ollama
# Visit https://ollama.ai
ollama serve
ollama pull nomic-embed-text

# 4. Build (first time: ~10 minutes)
python build.py

# 5. Search!
python tests/test_search_interactive.py

Example Queries

Try these to see semantic search in action:

"God made light" → Genesis 1:3
"serpent was subtil" → Genesis 3:1 (archaic spelling!)
"you shall not kill" → Exodus 20:13 (modern language)
"love your enemies" → Matthew 5:44
"beginning of everything" → Genesis 1:1 (popularity boost)

The Road Ahead

What’s Next?

Potential improvements:

  1. Contextual search - Search within specific books or chapters
  2. Multi-verse results - Return passage ranges, not just single verses
  3. Translation comparison - Add NIV, ESV, etc.
  4. Question answering - Use LLM to generate answers from retrieved verses
  5. Cross-references - Link related verses automatically

Performance Optimizations

  1. FAISS indexing - Speed up vector similarity search
  2. Keyword term indexing - Faster exact matching
  3. Query result caching - Cache popular queries
  4. Incremental embedding updates - Add verses without rebuilding

Conclusion

Building a semantic Bible search engine taught us valuable lessons about:

- XML parsing edge cases (OSIS milestone markers)
- the limits of modern embeddings on 400-year-old English
- combining semantic and keyword retrieval
- benchmark-driven iteration

The result? A search tool that understands meaning, not just keywords. Type “trusting God in difficult times” and get Proverbs 3:5-6, even though those exact words don’t appear.

80% accuracy. 31,102 verses. 100% local. 0% cloud APIs.


Resources


Acknowledgments

Built with:

- Ollama and nomic-embed-text
- FastAPI
- The King James Version in OSIS XML


Questions? Issues? PRs welcome!

Open an issue on GitHub or reach out at @chrishayescodes


This project demonstrates practical RAG implementation for educational purposes. May it help others learn about semantic search, embeddings, and building intelligent text retrieval systems.

📖 “Search the scriptures” - John 5:39 📖