Tools: How I Reduced My Test Suite by 72% Using TestIQ

Tools: How I Reduced My Test Suite by 72% Using TestIQ

The Problem: AI Generates Tests Fast, But Redundantly ## The Scale of the Problem ## Why Text Similarity Doesn't Work ## The Solution: Coverage Analysis ## Types of Duplicates Detected ## How It Works ## 1. Install TestIQ ## 2. Collect Coverage Data ## 3. Analyze for Duplicates ## 4. Compare Tests Visually ## Real-World Results ## Calculator Example (AI-Generated) ## Production Django Application ## CI/CD Integration ## Configuration ## Try It Yourself ## Key Takeaways ## What's Next? AI coding assistants (GitHub Copilot, Cursor, ChatGPT) are game-changers for test generation. But there's a critical issue: they don't check if tests duplicate existing coverage. Here's what happened when I let AI generate tests for a simple calculator: Different function names, slightly different formatting, but identical coverage. Each test executes the exact same code paths. I analyzed the complete calculator test suite: Only 15 unique tests were actually needed. That's a 72% redundancy rate! 🤯 My first instinct: "Just compare the test code!" But these tests look different yet do the same thing: Text similarity would rate these as "different" - but they execute identical code paths. I built TestIQ to analyze execution patterns instead of text: Exact Duplicates - Identical coverage Subset Duplicates - One test covered by another Similar Tests - Significant overlap (configurable) TestIQ includes a built-in pytest plugin that tracks per-test coverage. No configuration needed! Get an interactive HTML report showing: Click any test pair to see a split-screen comparison: Prevent duplicate tests from merging: Builds fail if duplicates are detected. No manual review needed! Fine-tune duplicate detection: See full configuration docs. TestIQ includes a demo with the actual AI-generated calculator example: This generates and opens an HTML report showing all 207 redundancies in 54 tests. TestIQ is open source (MIT) and actively developed: Contributions welcome! Issues, PRs, and feedback appreciated. Have you dealt with test suite bloat from AI-generated code? Share your experiences in the comments! 👇 Want to try it? pip install testiq && testiq demo Templates let you quickly answer FAQs or store snippets for re-use. Are you sure you want to ? It will become hidden in your post, but will still be visible via the comment's permalink. as well , this person and/or COMMAND_BLOCK: # AI cheerfully created all 5 of these: def test_addition(): assert Calculator().add(2, 3) == 5 def test_add_numbers(): assert Calculator().add(2, 3) == 5 def test_sum_calculation(): assert Calculator().add(2, 3) == 5 def test_calculate_sum(): assert Calculator().add(2, 3) == 5 def test_basic_addition(): assert Calculator().add(2, 3) == 5 COMMAND_BLOCK: # AI cheerfully created all 5 of these: def test_addition(): assert Calculator().add(2, 3) == 5 def test_add_numbers(): assert Calculator().add(2, 3) == 5 def test_sum_calculation(): assert Calculator().add(2, 3) == 5 def test_calculate_sum(): assert Calculator().add(2, 3) == 5 def test_basic_addition(): assert Calculator().add(2, 3) == 5 COMMAND_BLOCK: # AI cheerfully created all 5 of these: def test_addition(): assert Calculator().add(2, 3) == 5 def test_add_numbers(): assert Calculator().add(2, 3) == 5 def test_sum_calculation(): assert Calculator().add(2, 3) == 5 def test_calculate_sum(): assert Calculator().add(2, 3) == 5 def test_basic_addition(): assert Calculator().add(2, 3) == 5 COMMAND_BLOCK: # Test 1: Simple assertion def test_addition(): result = Calculator().add(2, 3) assert result == 5 # Test 2: Different style, same coverage def test_sum(): calc = Calculator() total = calc.add(2, 3) self.assertEqual(total, 5) # Test 3: One-liner, same coverage def test_calc_add(): assert Calculator().add(2, 3) == 5 COMMAND_BLOCK: # Test 1: Simple assertion def test_addition(): result = Calculator().add(2, 3) assert result == 5 # Test 2: Different style, same coverage def test_sum(): calc = Calculator() total = calc.add(2, 3) self.assertEqual(total, 5) # Test 3: One-liner, same coverage def test_calc_add(): assert Calculator().add(2, 3) == 5 COMMAND_BLOCK: # Test 1: Simple assertion def test_addition(): result = Calculator().add(2, 3) assert result == 5 # Test 2: Different style, same coverage def test_sum(): calc = Calculator() total = calc.add(2, 3) self.assertEqual(total, 5) # Test 3: One-liner, same coverage def test_calc_add(): assert Calculator().add(2, 3) == 5 CODE_BLOCK: test_addition → lines [9, 11, 12] test_add_numbers → lines [9, 11, 12] ❌ DUPLICATE CODE_BLOCK: test_addition → lines [9, 11, 12] test_add_numbers → lines [9, 11, 12] ❌ DUPLICATE CODE_BLOCK: test_addition → lines [9, 11, 12] test_add_numbers → lines [9, 11, 12] ❌ DUPLICATE CODE_BLOCK: test_basic_add → lines [9, 11] test_comprehensive_add → lines [9, 11, 15, 20, 25] ⚠️ SUBSET CODE_BLOCK: test_basic_add → lines [9, 11] test_comprehensive_add → lines [9, 11, 15, 20, 25] ⚠️ SUBSET CODE_BLOCK: test_basic_add → lines [9, 11] test_comprehensive_add → lines [9, 11, 15, 20, 25] ⚠️ SUBSET CODE_BLOCK: test_addition → lines [9, 11, 15] test_subtraction → lines [9, 11, 20] ⚠️ 66% similar CODE_BLOCK: test_addition → lines [9, 11, 15] test_subtraction → lines [9, 11, 20] ⚠️ 66% similar CODE_BLOCK: test_addition → lines [9, 11, 15] test_subtraction → lines [9, 11, 20] ⚠️ 66% similar COMMAND_BLOCK: pip install testiq COMMAND_BLOCK: pip install testiq COMMAND_BLOCK: pip install testiq CODE_BLOCK: pytest --testiq-output=coverage.json CODE_BLOCK: pytest --testiq-output=coverage.json CODE_BLOCK: pytest --testiq-output=coverage.json CODE_BLOCK: testiq analyze coverage.json --format html --output report.html CODE_BLOCK: testiq analyze coverage.json --format html --output report.html CODE_BLOCK: testiq analyze coverage.json --format html --output report.html COMMAND_BLOCK: # .github/workflows/test-quality.yml name: Test Quality Check on: [pull_request] jobs: check-duplicates: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - name: Set up Python uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install dependencies run: | pip install testiq pytest pytest-cov pip install -r requirements.txt - name: Collect coverage run: pytest --testiq-output=coverage.json - name: Check for duplicates run: | testiq analyze coverage.json \ --quality-gate \ --max-duplicates 0 \ --format html \ --output testiq-report.html - name: Upload report if: always() uses: actions/upload-artifact@v4 with: name: testiq-report path: testiq-report.html COMMAND_BLOCK: # .github/workflows/test-quality.yml name: Test Quality Check on: [pull_request] jobs: check-duplicates: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - name: Set up Python uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install dependencies run: | pip install testiq pytest pytest-cov pip install -r requirements.txt - name: Collect coverage run: pytest --testiq-output=coverage.json - name: Check for duplicates run: | testiq analyze coverage.json \ --quality-gate \ --max-duplicates 0 \ --format html \ --output testiq-report.html - name: Upload report if: always() uses: actions/upload-artifact@v4 with: name: testiq-report path: testiq-report.html COMMAND_BLOCK: # .github/workflows/test-quality.yml name: Test Quality Check on: [pull_request] jobs: check-duplicates: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - name: Set up Python uses: actions/setup-python@v5 with: python-version: '3.11' - name: Install dependencies run: | pip install testiq pytest pytest-cov pip install -r requirements.txt - name: Collect coverage run: pytest --testiq-output=coverage.json - name: Check for duplicates run: | testiq analyze coverage.json \ --quality-gate \ --max-duplicates 0 \ --format html \ --output testiq-report.html - name: Upload report if: always() uses: actions/upload-artifact@v4 with: name: testiq-report path: testiq-report.html COMMAND_BLOCK: # .testiq.toml [analysis] similarity_threshold = 0.3 # 30% overlap = similar min_coverage_lines = 5 # Ignore trivial tests max_results = 100 # Limit report size [reporting] format = "html" split_similar_threshold = 0.5 # Split similar tests [performance] enable_parallel = true max_workers = 8 COMMAND_BLOCK: # .testiq.toml [analysis] similarity_threshold = 0.3 # 30% overlap = similar min_coverage_lines = 5 # Ignore trivial tests max_results = 100 # Limit report size [reporting] format = "html" split_similar_threshold = 0.5 # Split similar tests [performance] enable_parallel = true max_workers = 8 COMMAND_BLOCK: # .testiq.toml [analysis] similarity_threshold = 0.3 # 30% overlap = similar min_coverage_lines = 5 # Ignore trivial tests max_results = 100 # Limit report size [reporting] format = "html" split_similar_threshold = 0.5 # Split similar tests [performance] enable_parallel = true max_workers = 8 COMMAND_BLOCK: # .testiq.yaml analysis: similarity_threshold: 0.3 min_coverage_lines: 5 reporting: format: html performance: enable_parallel: true COMMAND_BLOCK: # .testiq.yaml analysis: similarity_threshold: 0.3 min_coverage_lines: 5 reporting: format: html performance: enable_parallel: true COMMAND_BLOCK: # .testiq.yaml analysis: similarity_threshold: 0.3 min_coverage_lines: 5 reporting: format: html performance: enable_parallel: true COMMAND_BLOCK: pip install testiq testiq demo COMMAND_BLOCK: pip install testiq testiq demo COMMAND_BLOCK: pip install testiq testiq demo - 54 tests generated by AI - 39 exact duplicates (same coverage) - 168 subset duplicates (tests completely covered by others) - 45 similar test pairs (significant overlap) - Collect coverage - Track which lines each test executes - Compare patterns - Find tests with identical/overlapping coverage - Visualize results - Interactive HTML reports with side-by-side comparison - All duplicate test relationships - Side-by-side coverage comparison - Recommended actions (keep/remove) - Before: 54 tests, ~10 minute CI run - After: 15 tests, ~3 minute CI run - Savings: 70% faster CI, 39 fewer tests to maintain - Coverage: 100% maintained ✅ - Before: 847 tests - After: 612 tests (removed 235 duplicates) - Savings: 28% faster CI runs - Coverage: No reduction ✅ - AI tools generate tests fast - but don't check for coverage duplication - Coverage analysis beats text similarity - catches duplicates that look different - Significant reduction possible - 72% in AI-generated suites, 28% in production - No coverage loss - Remove duplicates while maintaining 100% coverage - CI/CD ready - Block PRs with duplicate tests automatically - GitHub: https://github.com/pydevtools/TestIQ - PyPI: https://pypi.org/project/testiq/ - Documentation: https://github.com/pydevtools/TestIQ/tree/main/README.md