babylon-mcp/scripts/test-source-indexing.ts
Michael Mainguy 779fa53363 Add source code indexing and search with comprehensive documentation
Features:
- Implemented SourceCodeIndexer class for indexing TypeScript/JavaScript source files
  - Chunks large files into 200-line segments with 20-line overlap
  - Extracts imports, exports, and metadata
  - Generates semantic embeddings using Xenova/all-MiniLM-L6-v2
  - Creates GitHub URLs with line numbers for easy navigation

- Enhanced LanceDBSearch with source code search capabilities
  - Added searchSourceCode() method for semantic source code search
  - Added getSourceFile() method for retrieving specific files or line ranges
  - Supports package filtering and configurable table names
  - Fixed score calculation to ensure values between 0-100%

- Added two new MCP tools
  - search_babylon_source: Search Babylon.js source code with semantic search
  - get_babylon_source: Retrieve full source files or specific line ranges
  - Both tools include comprehensive error handling and JSON responses

- Created indexing and testing scripts
  - scripts/index-source.ts: Production script for indexing all packages
  - scripts/test-source-indexing.ts: Test script for core package only
  - scripts/test-source-search.ts: Test script for search functionality

- Updated package.json with comprehensive indexing commands
  - npm run index:docs - Index documentation only
  - npm run index:api - Index API documentation only
  - npm run index:source - Index source code only
  - npm run index:all - Master script to index everything

- Created comprehensive README.md
  - Complete setup and installation instructions
  - Claude Desktop integration guide with configuration examples
  - Documentation of all 5 MCP tools with parameters and examples
  - Project structure, development commands, and troubleshooting guide
  - Architecture overview and disk space requirements

Testing:
- All 118 tests passing
- TypeScript compilation successful
- Source code search verified with real queries
- Successfully indexed 1,561 files into 5,650 searchable chunks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 06:34:00 -06:00

33 lines
890 B
TypeScript

import { SourceCodeIndexer } from '../src/search/source-code-indexer.js';
async function main() {
// Start with just core package for testing
const packages = ['core'];
console.log('Testing source code indexing with core package...');
console.log();
const indexer = new SourceCodeIndexer(
'./data/lancedb',
'babylon_source_test',
'./data/repositories/Babylon.js',
100, // smaller chunk size for testing
10 // smaller overlap for testing
);
try {
await indexer.initialize();
await indexer.indexSourceCode(packages);
await indexer.close();
console.log('\n✓ Test source code indexing completed successfully!');
} catch (error) {
console.error('Error during test indexing:', error);
if (error instanceof Error) {
console.error('Stack trace:', error.stack);
}
process.exit(1);
}
}
main().catch(console.error);