babylon-mcp/scripts/index-source.ts
Michael Mainguy 779fa53363 Add source code indexing and search with comprehensive documentation
Features:
- Implemented SourceCodeIndexer class for indexing TypeScript/JavaScript source files
  - Chunks large files into 200-line segments with 20-line overlap
  - Extracts imports, exports, and metadata
  - Generates semantic embeddings using Xenova/all-MiniLM-L6-v2
  - Creates GitHub URLs with line numbers for easy navigation

- Enhanced LanceDBSearch with source code search capabilities
  - Added searchSourceCode() method for semantic source code search
  - Added getSourceFile() method for retrieving specific files or line ranges
  - Supports package filtering and configurable table names
  - Fixed score calculation to ensure values between 0-100%

- Added two new MCP tools
  - search_babylon_source: Search Babylon.js source code with semantic search
  - get_babylon_source: Retrieve full source files or specific line ranges
  - Both tools include comprehensive error handling and JSON responses

- Created indexing and testing scripts
  - scripts/index-source.ts: Production script for indexing all packages
  - scripts/test-source-indexing.ts: Test script for core package only
  - scripts/test-source-search.ts: Test script for search functionality

- Updated package.json with comprehensive indexing commands
  - npm run index:docs - Index documentation only
  - npm run index:api - Index API documentation only
  - npm run index:source - Index source code only
  - npm run index:all - Master script to index everything

- Created comprehensive README.md
  - Complete setup and installation instructions
  - Claude Desktop integration guide with configuration examples
  - Documentation of all 5 MCP tools with parameters and examples
  - Project structure, development commands, and troubleshooting guide
  - Architecture overview and disk space requirements

Testing:
- All 118 tests passing
- TypeScript compilation successful
- Source code search verified with real queries
- Successfully indexed 1,561 files into 5,650 searchable chunks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 06:34:00 -06:00

40 lines
1013 B
TypeScript

import { SourceCodeIndexer } from '../src/search/source-code-indexer.js';
async function main() {
// Define packages to index
const packages = [
'core',
'gui',
'materials',
'loaders',
'serializers',
];
console.log('Starting source code indexing for Babylon.js packages...');
console.log(`Indexing ${packages.length} packages:`, packages.join(', '));
console.log();
const indexer = new SourceCodeIndexer(
'./data/lancedb',
'babylon_source_code',
'./data/repositories/Babylon.js',
200, // chunk size (lines)
20 // chunk overlap (lines)
);
try {
await indexer.initialize();
await indexer.indexSourceCode(packages);
await indexer.close();
console.log('\n✓ Source code indexing completed successfully!');
} catch (error) {
console.error('Error during source code indexing:', error);
if (error instanceof Error) {
console.error('Stack trace:', error.stack);
}
process.exit(1);
}
}
main().catch(console.error);