Introduction
dvinfo is a cross‑platform command‑line utility designed for the extraction and presentation of metadata from a wide range of digital media files. Its name derives from “digital video information,” reflecting its primary focus on video formats, though it also supports audio, image, and generic container files. dvinfo provides a single, consistent interface for retrieving technical details such as codec type, resolution, bit rate, and embedded subtitle tracks. The tool is distributed under the permissive MIT license and is available for Windows, macOS, Linux, and FreeBSD. dvinfo is often employed in media libraries, batch processing pipelines, and quality‑control scripts where concise metadata reporting is required.
The utility was conceived to address a common problem in media workflows: different formats each required a different probing tool. Existing solutions either lacked depth, were limited to a single platform, or required a graphical user interface that was inconvenient for automation. dvinfo consolidates functionality from several libraries, including FFmpeg’s libavformat, the ExifTool suite, and GStreamer, into a lightweight executable. By offering consistent JSON and plain‑text output, dvinfo simplifies integration with other programs and scripting environments.
dvinfo’s user interface is intentionally minimalistic. It accepts a file or directory path, processes each file sequentially, and writes the output to standard output or a specified file. Optional flags enable filtering, recursion, or verbose diagnostic information. The output format can be chosen among plain text, JSON, XML, or CSV, allowing developers to select the representation that best suits downstream applications.
History and Development
The project began in 2014 as a research prototype by a software engineer working in a multimedia production company. The initial goal was to create a diagnostic tool that could be invoked from a video editing pipeline to verify codec compliance. The prototype was written in C++ and relied heavily on FFmpeg’s libraries. Within a year, the codebase was reorganized into a modular architecture that could accommodate additional media formats.
In 2016, the project transitioned to an open‑source model. The source repository was hosted on a public version‑control platform, and the first stable release, version 1.0, was announced. The release included support for MP4, MKV, AVI, MOV, MP3, WAV, and JPEG formats, along with an initial JSON output scheme. A community of users began to contribute bug reports, feature requests, and patches. The project adopted a Gitflow workflow, enabling parallel development of experimental features while maintaining a stable main branch.
Subsequent releases introduced significant enhancements. Version 2.0 added multi‑threaded processing, allowing the tool to scan large directories in parallel. The command line interface was refined to support glob patterns, and the JSON schema was expanded to include detailed subtitle metadata, chapter markers, and DRM information. Version 2.5 integrated a small HTTP API that could be used to serve metadata in real time, and version 3.0 introduced a plugin system for custom parsers.
As of 2025, dvinfo has reached version 3.7, which includes native support for newer image formats such as AVIF and HEIC, improved error handling, and a comprehensive test suite covering over 1,000 media files across multiple operating systems. The project maintains a monthly release cadence, with community input guiding the priority of new features.
Key Concepts and Architecture
dvinfo is built around three core concepts: the media probe, the metadata representation, and the command‑line interface. The media probe layer abstracts the extraction of information from a file. It delegates to format‑specific parsers that are responsible for interpreting container structures and codec headers. The design ensures that adding support for a new format only requires implementing a new parser and registering it with the probe manager.
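The registration idea can be illustrated with a small Python model. This is only a sketch of the pattern, not dvinfo's C++ code: the ProbeManager class, its register and probe methods, and the "size_field" key are hypothetical names chosen for the example. The only real-world fact it relies on is that RIFF/WAVE files begin with "RIFF", a 4-byte little-endian size, and "WAVE".

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical names throughout; this models the pattern, not dvinfo itself.
Detector = Callable[[bytes], bool]
Parser = Callable[[bytes], dict]


class ProbeManager:
    """Maps a format name to a detector and a parser."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Tuple[Detector, Parser]] = {}

    def register(self, name: str, detect: Detector, parse: Parser) -> None:
        # Adding a format touches only this registry, not the probe loop.
        self._parsers[name] = (detect, parse)

    def probe(self, data: bytes) -> Optional[dict]:
        # Ask each registered parser, in registration order, whether it
        # recognizes the data; the first match wins.
        for name, (detect, parse) in self._parsers.items():
            if detect(data):
                return {"format": name, **parse(data)}
        return None


manager = ProbeManager()
manager.register(
    "wav",
    detect=lambda d: d[:4] == b"RIFF" and d[8:12] == b"WAVE",
    parse=lambda d: {"size_field": int.from_bytes(d[4:8], "little")},
)
```

Supporting another format in this model means one more register call; the probe method itself does not change.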
The metadata representation is defined by a flexible, hierarchical schema. At the root, the representation contains file attributes such as size, creation date, and modification time. Nested within are sections for each media stream - video, audio, subtitle, and data streams - each of which contains properties relevant to that stream type. For example, a video stream entry includes codec name, resolution, frame rate, bit depth, and color profile. Audio streams provide sample rate, channel layout, and bit depth. Subtitle streams include language code, format, and encoding.
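As a rough illustration of such a hierarchy, the following Python sketch models file attributes at the root with per-stream sections nested beneath. The class and field names (MediaFile, size_bytes, frame_rate, and so on) are assumptions made for the example; the actual dvinfo schema may name and group these properties differently.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class VideoStream:
    codec: str
    width: int
    height: int
    frame_rate: float


@dataclass
class AudioStream:
    codec: str
    sample_rate: int
    channel_layout: str


@dataclass
class MediaFile:
    # Root-level file attributes (hypothetical field names).
    path: str
    size_bytes: int
    # One list per stream type, nested under the file entry.
    video: List[VideoStream] = field(default_factory=list)
    audio: List[AudioStream] = field(default_factory=list)


report = MediaFile(
    path="sample.mp4",
    size_bytes=1_048_576,
    video=[VideoStream("h264", 1920, 1080, 25.0)],
    audio=[AudioStream("aac", 48000, "stereo")],
)

# asdict() converts the nested dataclasses into plain dicts and lists,
# which serialize directly to JSON.
print(json.dumps(asdict(report), indent=2))
```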
The command‑line interface serves as the user-facing component. It parses options, invokes the probe manager, and serializes the resulting metadata into the selected output format. Internally, the interface leverages a lightweight argument‑parsing library that supports short and long options, positional arguments, and sub‑commands. The design deliberately avoids external dependencies beyond the standard C++ library and the media parsing libraries.
The architecture follows a modular, service‑oriented pattern. Each component - parser, serializer, and command‑line handler - is isolated behind an interface. This allows the project to support alternative backends (e.g., a Python wrapper that uses the same parser logic) without duplicating code. The plugin system is implemented via a dynamic library loading mechanism. Plugins expose a simple C interface and are loaded at runtime if their corresponding format is encountered.
Features
- Broad Format Support: Handles common containers such as MP4, MKV, AVI, MOV, WebM, FLV, and WAV, as well as less common formats like AVIF and HEIC.
- Stream‑Level Metadata: Extracts detailed properties for each stream, including codec configuration, color space, audio channel layout, and subtitle language.
- Multi‑Threaded Scanning: Processes multiple files concurrently, maximizing throughput on multi‑core systems.
- Extensible Plugin System: Allows third‑party developers to add support for new formats or custom parsers without modifying the core code.
- Output Formats: Supports plain text, JSON, XML, CSV, and a compact binary representation for high‑volume pipelines.
- Recursive Directory Traversal: Recursively scans directories with optional glob filtering and depth limits.
- Diagnostic Logging: Verbose mode provides detailed logs of parser decisions, errors, and performance metrics.
- Cross‑Platform Build System: Uses CMake for unified builds across Windows, macOS, Linux, and FreeBSD.
- HTTP API: Optional built‑in server exposes metadata over RESTful endpoints for integration with web services.
- License: Distributed under the MIT license, allowing free use in proprietary and open‑source projects.
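The multi‑threaded scanning feature listed above can be pictured with a short Python sketch. It is not how dvinfo is implemented internally; scan_all and its probe parameter are hypothetical names, and the probe function stands in for whatever extracts metadata from one file.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def scan_all(
    paths: Iterable[str],
    probe: Callable[[str], dict],
    workers: int = 4,
) -> Dict[str, dict]:
    """Probe each path on a worker thread and collect results by path."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order even though the calls
        # themselves run concurrently.
        results = pool.map(probe, path_list)
        return dict(zip(path_list, results))
```

Threads suit this workload when the probe spends most of its time waiting on disk I/O or on an external process; a CPU-bound pure-Python probe would gain little from this approach.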
Installation and Setup
dvinfo can be installed from pre‑compiled binaries, package managers, or source code. For most users, the easiest method is to use a distribution’s package manager. On Debian‑based systems, the command sudo apt-get install dvinfo will install the latest stable version. On macOS, brew install dvinfo is available. Windows users can download the binary from the official releases page or use Chocolatey with choco install dvinfo.
When installing from source, the user must first clone the repository. The build requires CMake version 3.10 or higher and a C++17‑compatible compiler. The following sequence fetches the source, then builds and installs the tool:
- Clone the repository:
git clone https://example.com/dvinfo.git
- Enter the directory:
cd dvinfo
- Create a build folder:
mkdir build && cd build
- Run CMake:
cmake .. -DCMAKE_BUILD_TYPE=Release
- Build and install:
cmake --build . --target install
After installation, the executable dvinfo is available in the system path. Running dvinfo --help displays usage information. The configuration file dvinfo.conf located in the user’s home directory allows the user to set default options, such as preferred output format or maximum recursion depth.
Usage Examples
The simplest invocation scans a single file and prints plain‑text metadata:
dvinfo sample.mp4
To obtain the same data in JSON format suitable for ingestion by a script, use the -o json option:
dvinfo -o json sample.mkv
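A script might consume this output along the following lines. The sketch assumes that dvinfo is on the system path and that it writes a single JSON document to standard output; build_command and probe_json are hypothetical helper names, not part of dvinfo.

```python
import json
import subprocess
from typing import List


def build_command(path: str) -> List[str]:
    # "-o json" is the option described above; passing the arguments as
    # a list avoids shell quoting problems with unusual file names.
    return ["dvinfo", "-o", "json", path]


def probe_json(path: str) -> dict:
    """Run dvinfo on one file and decode its JSON output."""
    completed = subprocess.run(
        build_command(path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)
```

Calling probe_json("sample.mkv") would then return the decoded metadata, or raise an exception if the command fails or prints something that is not valid JSON.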
Scanning a directory recursively and saving the CSV output to a file can be accomplished by redirecting standard output:
dvinfo -r -o csv /media/videos > /tmp/media_report.csv
For advanced use, the HTTP API can be launched with the --serve flag. Once running, a GET request to http://localhost:8080/metadata?file=sample.mov returns JSON metadata. The server supports basic authentication via the --auth option.
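A client for this endpoint could look roughly like the Python sketch below. The host, port, and path come from the example URL above; the helper names are hypothetical, and the sketch assumes an unauthenticated server that returns a JSON body. How the server decodes escaped characters in the file parameter is not documented here, so the encoding shown is a guess at sensible behaviour.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def metadata_url(file_name: str, host: str = "localhost", port: int = 8080) -> str:
    # urlencode() escapes characters that are unsafe in a query string.
    query = urlencode({"file": file_name})
    return f"http://{host}:{port}/metadata?{query}"


def fetch_metadata(file_name: str) -> dict:
    """GET the metadata for one file from a running dvinfo server."""
    with urlopen(metadata_url(file_name), timeout=10) as response:
        return json.load(response)
```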
Custom parsers can be loaded by placing a shared library in the plugins directory and specifying its name with --plugin. For example, a proprietary format parser named libfoo.so would be loaded by:
dvinfo --plugin libfoo.so proprietary_file.foo
Integration and Extensibility
dvinfo’s plugin architecture allows developers to extend the tool without modifying the core codebase. A plugin must expose two C functions: probe_format and extract_metadata. The former examines the file header to determine if the plugin is applicable; the latter returns a metadata structure that the core system serializes.
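The division of labour between the two functions can be modelled in a few lines of Python. Real plugins expose C functions whose exact signatures are not given here, so this is only a behavioural sketch: the "FOO1" signature, the byte layout, and the run_plugin helper are all invented for an imaginary .foo format.

```python
from typing import Optional

# Hypothetical signature for an imaginary ".foo" format.
FOO_MAGIC = b"FOO1"


def probe_format(header: bytes) -> bool:
    """Return True when the header looks like this plugin's format."""
    return header.startswith(FOO_MAGIC)


def extract_metadata(data: bytes) -> dict:
    """Return a metadata mapping for the core to serialize."""
    # Assumed layout: 4-byte magic, then big-endian 16-bit width and height.
    width = int.from_bytes(data[4:6], "big")
    height = int.from_bytes(data[6:8], "big")
    return {"format": "foo", "width": width, "height": height}


def run_plugin(data: bytes) -> Optional[dict]:
    # Stand-in for the core: probe the header first, extract only on a match.
    return extract_metadata(data) if probe_format(data[:16]) else None
```

Keeping the probe cheap matters in this design, because the core may call it for every file before deciding which parser to run.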
Many media management systems embed dvinfo into their pipelines. For example, a video hosting platform may run dvinfo as a background job whenever a new file is uploaded. The resulting JSON is stored in a database, enabling quick search and retrieval of media attributes. In automated transcoding workflows, dvinfo is used to confirm that a source file meets certain codec or resolution constraints before proceeding.
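A pre-transcode gate of this kind might be sketched as follows. The function name and the metadata keys ("video", "codec", "height") are assumptions for illustration; a real pipeline would use whatever keys dvinfo's JSON output actually contains.

```python
from typing import List, Set


def constraint_violations(
    metadata: dict,
    allowed_codecs: Set[str],
    max_height: int,
) -> List[str]:
    """List the reasons a source file fails the pipeline's constraints."""
    problems = []
    for index, stream in enumerate(metadata.get("video", [])):
        codec = stream.get("codec")
        height = stream.get("height", 0)
        if codec not in allowed_codecs:
            problems.append(f"video[{index}]: codec {codec!r} is not allowed")
        if height > max_height:
            problems.append(f"video[{index}]: height {height} exceeds {max_height}")
    return problems
```

An empty list means the file may proceed; returning the reasons rather than a bare boolean makes rejected uploads easier to diagnose.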
dvinfo also exposes a minimal C API for integration into other programs. The API provides functions for initializing the probe manager, processing a single file, and retrieving the metadata as a tree of key‑value pairs. Libraries written in other languages (Python, Go, Rust) can wrap the API via language bindings or by invoking the command line and parsing its output.
For large‑scale media libraries, dvinfo’s CSV and XML output formats are particularly useful. They can be imported into spreadsheet applications or data warehouses. The binary format, introduced in version 3.0, is optimized for speed and low memory footprint, making it suitable for nightly batch jobs on high‑volume servers.
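As an example of downstream use, the following Python sketch tallies a CSV report by one column. The column names in the sample (path, video_codec, width, height) are invented for the illustration; the header row of an actual dvinfo CSV report may differ.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text: str, column: str) -> Counter:
    """Count how often each value appears in the named column."""
    # DictReader takes the column names from the first row.
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Hypothetical report contents.
sample = (
    "path,video_codec,width,height\n"
    "a.mp4,h264,1920,1080\n"
    "b.mkv,hevc,3840,2160\n"
    "c.mp4,h264,1280,720\n"
)
codec_counts = count_by_column(sample, "video_codec")
```

For a report saved on disk, the same function could be applied to the text read from the file.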
Community and Support
The dvinfo community is organized around several communication channels. An issue tracker on the repository hosts bug reports, feature requests, and discussion threads. A mailing list is available for developers interested in the core architecture or plugin development. Users can also join a chat room where developers and power users exchange tips and troubleshoot problems.
Documentation is available in a multi‑language format, including English, Spanish, and Chinese. The user guide covers installation, command‑line options, output formats, and plugin development. The developer manual delves into the internal architecture, data structures, and API reference.
Contributions are encouraged via pull requests. The project follows a code‑review workflow: contributors must include unit tests for new features, provide documentation updates, and ensure that the test suite passes on continuous‑integration builds. Licensing is permissive, allowing users to incorporate dvinfo into proprietary software under the terms of the MIT license.
Future Directions
dvinfo’s roadmap includes several ambitious goals. One priority is to expand support for streaming protocols such as HLS and DASH, enabling metadata extraction from live or on‑demand sources. This will involve integrating with networking libraries and handling partial manifests.
Another focus is to improve DRM awareness. Current support for encrypted containers is limited to simple detection; future releases aim to provide deeper insight into encryption schemes and key management information, where legally permissible.
The project also plans to develop a graphical user interface for users who prefer a visual representation of metadata. This interface will be built using Qt and will allow batch processing, real‑time monitoring, and visualization of stream properties.
Finally, dvinfo aims to adopt a machine‑learning‑based approach to format detection, improving accuracy on corrupted or non‑standard files. By training a lightweight classifier on header fingerprints, the tool can make probabilistic predictions even when standard parsers fail.
Related Tools
While dvinfo focuses on metadata extraction, several related tools exist. The ffprobe component of FFmpeg provides comprehensive probing but is tightly coupled to FFmpeg’s libraries and lacks a simple plugin system. ExifTool offers extensive metadata extraction for images and documents, yet its output format is more verbose and less suited to media stream analysis. MediaInfo is a cross‑platform GUI and command‑line tool that shares some functionality with dvinfo; its default output is a human‑readable report, although it can also emit structured formats such as XML and JSON.
For users requiring deeper analysis, libraries such as libavformat and GStreamer offer programmatic access to media data. However, dvinfo abstracts many of the complexities associated with these libraries, allowing non‑programmers to obtain detailed metadata quickly.