a specialized and trainable vision agent
Analyze and act upon events found in your images, videos and live camera feeds.
Available templates:
- Sewer inspections
- Face recognition
- Build your own...
Vidsy.ai 0.6 is out! Check out the release notes for more details. In short, it's a major update with an entirely new backend and database. Not all models and features have been ported yet from the previous version, so both versions will be available for a while.
For sewer inspectors, v0.6 is the one you need. Get it from the getting started page.
Home use
Search
Easily search through all your images and videos using a simple text prompt.
Note: disabled in v0.6 due to large changes in the backend. Coming back soon!
Local
Perform searches directly on your local machine. All your data remains securely on your computer.
Share
Share files effortlessly with family and friends. Upload and share your files with a click using a downloadable link.
For professionals
Index
Create large, efficient indexes of your data for fast search and analysis, stored on external databases accessible by multiple users.
Cameras
Analyze live camera streams for events and anomalies using local or network cameras (RTSP).
Inspect & report
Automate visual inspections of single or synchronized recordings and generate detailed reports.
For power users
Limitless cameras
Analyze a (virtually) unlimited number of cameras simultaneously, all in sync.
Synchronized analysis
Process large sets of video recordings that require synchronized analysis.
Interactive responses
Automate predetermined and interactive responses to detected events.
How it works
LLM
The LLM converts your search text into a structured query. This process currently uses a cloud-based service. Once converted, the results are stored locally, allowing you to reuse the same search text without needing to reprocess it. You can review and adjust the structured query as needed, ensuring protection against potential LLM hallucinations.
CNN
Multiple pre-trained CNNs are used to analyze images and videos locally on your machine. This ensures data security and privacy. The more memory, CPU cores, and GPUs you have, the faster the analysis.
Vidsy engine
The structured query is utilized to analyze CNN results, enabling real-time content, time, and spatial analysis. This adaptive system can detect new features without requiring specialized CNNs, saving time for one-off tasks. For high-performance, real-time tasks, this method also simplifies the collection of training data for future specialized CNNs.
Outputs
Various outputs can be triggered when queries match, such as event triggers, notifications, or other actions. In the future, LLMs may also be used as an output channel to enable conversations with users or subjects of interest.
Status and progress
In June 2025, a new version of Vidsy was released with a major rework of the backend and database. The new backend supports custom models, which in turn required a major database redesign. This version is tailored for sewer inspections and contains pre-trained models for this use case only.
Read all about this major release in the release notes.
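To make the flow described under "How it works" more concrete, here is a minimal Python sketch of its three steps: a search text is converted once and stored locally, the resulting structured query is checked against CNN detections by content, time, and position, and a callback fires on every match. This is an illustration only, not Vidsy's actual code. Every name in it (Detection, StructuredQuery, QueryCache, run, and the convert callable standing in for the cloud LLM) is hypothetical, and the query fields are assumed for the example.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Detection:
    """One CNN result (hypothetical shape)."""
    label: str   # class reported by a CNN
    t: float     # seconds since the start of the recording
    x: float     # box centre, normalised to 0..1
    y: float


@dataclass(frozen=True)
class StructuredQuery:
    """Assumed query fields: content, time window, and image region."""
    label: str
    t_min: float = 0.0
    t_max: float = float("inf")
    region: tuple = (0.0, 0.0, 1.0, 1.0)  # x0, y0, x1, y1

    def matches(self, d: Detection) -> bool:
        x0, y0, x1, y1 = self.region
        return (d.label == self.label
                and self.t_min <= d.t <= self.t_max
                and x0 <= d.x <= x1
                and y0 <= d.y <= y1)


class QueryCache:
    """Keeps converted queries in a local JSON file, keyed by search text.

    The file is plain JSON, so a stored query can be reviewed and edited
    by hand before it is reused.
    """

    def __init__(self, path, convert):
        self.path = Path(path)
        self.convert = convert  # stand-in for the LLM call; returns a dict
        self.entries = (json.loads(self.path.read_text())
                        if self.path.exists() else {})

    def get(self, text: str) -> StructuredQuery:
        if text not in self.entries:
            # Only unseen search texts reach the converter.
            self.entries[text] = self.convert(text)
            self.path.write_text(json.dumps(self.entries, indent=2))
        return StructuredQuery(**self.entries[text])


def run(query, detections, on_match):
    """Call on_match for every detection that satisfies the query."""
    hits = [d for d in detections if query.matches(d)]
    for d in hits:
        on_match(d)
    return hits
```

In this sketch the on_match callback plays the role of an output: it could send a notification, write a report entry, or trigger any other action. Keeping the cache as a readable file is one simple way to let a user inspect a converted query before trusting it.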