Dump-ster Diver

A Knowledge Graph Forensics System for Document Analysis

Dump-ster Diver is an AI-powered document analysis system that extracts entities, relationships, and insights from large document collections. It transforms unstructured documents into an interconnected knowledge graph, making it easier to discover patterns, connections, and key information across thousands of files, whether you're analyzing legal documents, corporate archives, email dumps, or any other large-scale document repository.


Features

🎯 Processing Mode

Simple Processing Mode (Recommended for Large Collections)

Generates concise 1-2 sentence summaries for each document
Automatic document type classification (email, memo, legal-document, chat, etc.)
Intelligent tag extraction for easy filtering and search
Faster processing, ideal for initial document review
Review status tracking and flagging system

📄 Supported Document Types

Text Files: .txt
Images: .jpg, .jpeg, .png, .gif, .tif (OCR via vision models)
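
As a rough illustration of how files could be routed by extension, here is a minimal Python sketch. The function name `route_document` and the "text" / "vision" labels are assumptions made for this example, not the project's actual code; only the extension lists come from the section above.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the list above; the routing labels are hypothetical.
TEXT_EXTENSIONS = {".txt"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif"}


def route_document(path: str) -> Optional[str]:
    """Return "text", "vision", or None for an unsupported file."""
    suffix = Path(path).suffix.lower()  # case-insensitive match on the extension
    if suffix in TEXT_EXTENSIONS:
        return "text"
    if suffix in IMAGE_EXTENSIONS:
        return "vision"  # images would go through OCR via a vision model
    return None
```

Returning None for anything else lets a caller skip or log unsupported files instead of failing partway through a large collection.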

🎨 Interactive Web Interface

Windows 95-inspired retro UI for nostalgia and clarity
Filter documents by type, tags, review status, and flags
Document detail viewer with inline text/image display
Real-time processing progress monitoring
Dark/light mode toggle
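
The filtering described above could be modelled roughly as follows. This is only a sketch: the `DocumentRecord` fields and the `filter_documents` helper are hypothetical names chosen for illustration, and the real interface may store and query documents differently.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class DocumentRecord:
    """Hypothetical per-document metadata; field names are assumptions."""
    path: str
    doc_type: str
    tags: Set[str] = field(default_factory=set)
    reviewed: bool = False
    flagged: bool = False


def filter_documents(
    docs: Iterable[DocumentRecord],
    doc_type: Optional[str] = None,
    tags: Optional[Iterable[str]] = None,
    reviewed: Optional[bool] = None,
    flagged: Optional[bool] = None,
) -> List[DocumentRecord]:
    """Keep documents matching every filter that is not None."""
    required_tags = set(tags or ())
    matches = []
    for doc in docs:
        if doc_type is not None and doc.doc_type != doc_type:
            continue
        if not required_tags <= doc.tags:  # document must carry all requested tags
            continue
        if reviewed is not None and doc.reviewed != reviewed:
            continue
        if flagged is not None and doc.flagged != flagged:
            continue
        matches.append(doc)
    return matches
```

Leaving a filter as None means "don't care", so the same function covers a single-criterion search as well as a combined one.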

🤖 Configurable AI Models

Note: the models listed here are just the ones I used while testing this project
Text Models: llama3.2:3b, qwen3:4b-instruct, qwen3:8b, qwen3:30b
Vision Models: gemma3:4b, gemma3:12b, qwen3-vl:8b, qwen3-vl:32b
Switch models on-the-fly through the UI
Powered by Ollama for local, privacy-focused AI
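
To give a feel for what talking to a local Ollama model looks like, here is a minimal sketch against Ollama's `/api/generate` endpoint on its default port. The helper names `build_generate_payload` and `generate` are made up for this example, and this is not necessarily how Dump-ster Diver calls Ollama internally.

```python
import base64
import json
import urllib.request
from typing import Optional

# Ollama's default local endpoint; adjust if your instance runs elsewhere.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_payload(
    model: str, prompt: str, image_bytes: Optional[bytes] = None
) -> dict:
    """Build the JSON body for a non-streaming /api/generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Vision models take images as a list of base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(payload: dict, url: str = OLLAMA_GENERATE_URL) -> str:
    """Send the payload to a running Ollama instance and return the model's text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]
```

With Ollama running and a model pulled, something like `generate(build_generate_payload("qwen3:8b", "Summarize this memo in 1-2 sentences: ..."))` would return the summary text. Switching models is then just a matter of changing the `model` string.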

💾 Graph Database Storage

Neo4j 5.13 with APOC plugins for advanced graph operations
Efficient querying of entity relationships
Built-in similarity relationships between documents
Persistent storage with Docker volumes
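
As a hedged example of the kind of query the similarity relationships make possible, the sketch below builds a parameterized Cypher query. The `Document` label, the `SIMILAR_TO` relationship type, and the `path` and `score` properties are assumptions for illustration; the project's actual graph schema may use different names.

```python
from typing import Any, Dict, Tuple


def similar_documents_query(path: str, limit: int = 10) -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher query and its parameters for documents similar to `path`.

    The schema names used here (Document, SIMILAR_TO, path, score) are hypothetical.
    """
    query = (
        "MATCH (d:Document {path: $path})-[r:SIMILAR_TO]-(other:Document) "
        "RETURN other.path AS path, r.score AS score "
        "ORDER BY score DESC LIMIT $limit"
    )
    # Passing values as parameters avoids string interpolation into Cypher.
    return query, {"path": path, "limit": limit}
```

With the official `neo4j` Python driver, the result could be passed to `session.run(query, params)` inside an open session.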
