[Feat] KB Export, Folder Upload & Vision LLM for Image Processing#1207
CREDO23 wants to merge 26 commits into MODSetter:dev
Review by RecurseML
🔍 Review performed on b96c049..7e14df6
✨ No bugs found, your code is sparkling clean
✅ Files analyzed, no issues (22)
• surfsense_backend/app/etl_pipeline/etl_pipeline_service.py
• surfsense_backend/app/etl_pipeline/file_classifier.py
• surfsense_backend/app/etl_pipeline/parsers/vision_llm.py
• surfsense_backend/app/routes/__init__.py
• surfsense_backend/app/routes/documents_routes.py
• surfsense_backend/app/routes/export_routes.py
• surfsense_backend/app/services/export_service.py
• surfsense_backend/app/tasks/connector_indexers/local_folder_indexer.py
• surfsense_backend/app/tasks/document_processors/file_processors.py
• surfsense_backend/app/utils/file_extensions.py
• surfsense_backend/tests/unit/etl_pipeline/test_etl_pipeline_service.py
• surfsense_backend/tests/unit/utils/test_file_extensions.py
• surfsense_web/components/documents/DocumentsFilters.tsx
• surfsense_web/components/documents/FolderNode.tsx
• surfsense_web/components/documents/FolderTreeView.tsx
• surfsense_web/components/layout/ui/sidebar/DocumentsSidebar.tsx
• surfsense_web/components/sources/DocumentUploadTab.tsx
• surfsense_web/messages/en.json
• surfsense_web/messages/es.json
• surfsense_web/messages/hi.json
• surfsense_web/messages/pt.json
• surfsense_web/messages/zh.json
@CREDO23 is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel. A member of the Team first needs to authorize it.
Description
API Changes
• GET /search-spaces/{id}/export?folder_id=: export KB/folder as ZIP
• POST /documents/folder-upload: upload folder with structure
Change Type
Testing Performed
Checklist
High-level PR Summary
This PR adds three major features: KB-level and folder-level export as ZIP files preserving folder structure, web folder upload with structure preservation, and Vision LLM integration for image processing with document-parser fallback. The Vision LLM integration allows images to be analyzed using a language model with vision capabilities (like GPT-4V or Claude with vision), falling back to the existing document parser if the vision LLM is unavailable or fails. The export functionality creates ZIP archives of markdown documents organized by folder hierarchy, with batch processing to handle large knowledge bases efficiently. The folder upload feature enables users to upload entire directory structures via the web interface, maintaining the folder hierarchy. Additionally, a hydration error in the mobile upload drop zone was fixed by replacing a button-in-button structure with a proper interactive div.
⏱️ Estimated Review Time: 30-90 minutes
💡 Review Order Suggestion
• surfsense_backend/app/utils/file_extensions.py
• surfsense_backend/app/etl_pipeline/file_classifier.py
• surfsense_backend/app/etl_pipeline/parsers/vision_llm.py
• surfsense_backend/app/etl_pipeline/etl_pipeline_service.py
• surfsense_backend/app/tasks/document_processors/file_processors.py
• surfsense_backend/app/tasks/connector_indexers/local_folder_indexer.py
• surfsense_backend/tests/unit/etl_pipeline/test_etl_pipeline_service.py
• surfsense_backend/tests/unit/utils/test_file_extensions.py
• surfsense_backend/app/services/export_service.py
• surfsense_backend/app/routes/export_routes.py
• surfsense_backend/app/routes/__init__.py
• surfsense_backend/app/routes/documents_routes.py
• surfsense_web/lib/apis/documents-api.service.ts
• surfsense_web/messages/en.json
• surfsense_web/messages/es.json
• surfsense_web/messages/hi.json
• surfsense_web/messages/pt.json
• surfsense_web/messages/zh.json
• surfsense_web/components/sources/DocumentUploadTab.tsx
• surfsense_web/components/documents/DocumentsFilters.tsx
• surfsense_web/components/documents/FolderNode.tsx
• surfsense_web/components/documents/FolderTreeView.tsx
• surfsense_web/components/layout/ui/sidebar/DocumentsSidebar.tsx