Tools: How to Build a Python Passport Scanner with Portrait, MRZ and Document Detection

2026-02-09 admin

Source: Dev.to

Demo Video: Passport Scanner with Face, MRZ and Document Detection ## What You Will Build ## Why Dynamsoft Capture Vision SDK? ## Prerequisites ## Getting Your Trial License ## Step 1: Project Setup and Dependencies ## Step 2: Understanding the SDK Architecture ## Step 3: Creating the Data Models ## Step 4: Implementing Portrait Detection with Intermediate Results ## Step 5: Parsing MRZ Results ## Step 6: Building the GUI with PySide6 ## Step 7: Processing Images ## Step 8: Adding Visual Overlays ## Step 9: Running the Application ## Testing the Application ## Source Code In this tutorial, you will learn how to build a professional desktop application for reading Machine Readable Zones (MRZ) from passports and ID cards with portrait and document detection capabilities. By the end of this guide, you'll have a fully functional GUI application built with PySide6 and powered by Dynamsoft Capture Vision SDK. A cross-platform desktop passport scanner featuring: Dynamsoft Capture Vision SDK provides a comprehensive solution for document processing with these key advantages: Before starting, ensure you have: Dynamsoft provides a free trial license for evaluation: Create your project directory and install the required packages: Install the dependencies: Create a requirements.txt file for dependency management: Dynamsoft Capture Vision SDK uses a pipeline-based architecture: First, define data structures to store MRZ results: This data class captures all essential information from an MRZ: raw text lines, parsed personal data, document location, and portrait zone coordinates. Portrait detection requires accessing intermediate processing results. Create a receiver class: The IdentityProcessor.find_portrait_zone() method intelligently locates the portrait area by analyzing the spatial relationship between MRZ text lines and document boundaries—a powerful feature exclusive to Dynamsoft's SDK. 
Create a processor to convert SDK results into your data model: Create the main application window with dual-panel layout: Implement the core image processing logic: Create a custom widget for displaying images with MRZ and portrait overlays: Complete the application with dark theme and entry point: https://github.com/yushulx/python-mrz-scanner-sdk/tree/main/examples/official/gui Templates let you quickly answer FAQs or store snippets for re-use. Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment's permalink. Hide child comments as well For further actions, you may consider blocking this person and/or reporting abuse COMMAND_BLOCK: mkdir mrz_scanner_gui cd mrz_scanner_gui python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: mkdir mrz_scanner_gui cd mrz_scanner_gui python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate COMMAND_BLOCK: mkdir mrz_scanner_gui cd mrz_scanner_gui python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate COMMAND_BLOCK: pip install dynamsoft-capture-vision-bundle opencv-python PySide6 Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: pip install dynamsoft-capture-vision-bundle opencv-python PySide6 COMMAND_BLOCK: pip install dynamsoft-capture-vision-bundle opencv-python PySide6 CODE_BLOCK: dynamsoft-capture-vision-bundle PySide6 opencv-python Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: dynamsoft-capture-vision-bundle PySide6 opencv-python CODE_BLOCK: dynamsoft-capture-vision-bundle PySide6 opencv-python CODE_BLOCK: from dataclasses import dataclass from typing import List, Optional @dataclass class MRZResult: """Stores parsed MRZ result data.""" raw_lines: List[str] doc_type: str doc_id: str surname: str given_name: str nationality: str issuer: str gender: str date_of_birth: str date_of_expiry: str is_passport: bool 
mrz_locations: List['Quadrilateral'] portrait_zone: Optional['Quadrilateral'] = None Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: from dataclasses import dataclass from typing import List, Optional @dataclass class MRZResult: """Stores parsed MRZ result data.""" raw_lines: List[str] doc_type: str doc_id: str surname: str given_name: str nationality: str issuer: str gender: str date_of_birth: str date_of_expiry: str is_passport: bool mrz_locations: List['Quadrilateral'] portrait_zone: Optional['Quadrilateral'] = None CODE_BLOCK: from dataclasses import dataclass from typing import List, Optional @dataclass class MRZResult: """Stores parsed MRZ result data.""" raw_lines: List[str] doc_type: str doc_id: str surname: str given_name: str nationality: str issuer: str gender: str date_of_birth: str date_of_expiry: str is_passport: bool mrz_locations: List['Quadrilateral'] portrait_zone: Optional['Quadrilateral'] = None COMMAND_BLOCK: from dynamsoft_capture_vision_bundle import * class NeededResultUnit: """Container for intermediate results needed for portrait extraction.""" def init(self): self.deskewed_image_unit = None self.localized_text_lines_unit = None self.scaled_colour_img_unit = None self.detected_quads_unit = None self.recognized_text_lines_unit = None class MyIntermediateResultReceiver(IntermediateResultReceiver): """Captures intermediate processing stages for portrait detection.""" def init(self, cvr: CaptureVisionRouter): super().init() self.cvr = cvr self.unit_groups: Dict[str, NeededResultUnit] = {} def on_deskewed_image_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].deskewed_image_unit = result def on_scaled_colour_image_unit_received(self, result, info): id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() 
self.unit_groups[id].scaled_colour_img_unit = result def on_localized_text_lines_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].localized_text_lines_unit = result def on_recognized_text_lines_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].recognized_text_lines_unit = result def on_detected_quads_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].detected_quads_unit = result def get_portrait_zone(self, hash_id: str) -> Optional['Quadrilateral']: """Extract portrait zone using IdentityProcessor.""" if self.unit_groups.get(hash_id) is None: return None id_processor = IdentityProcessor() units = self.unit_groups[hash_id] ret, portrait_zone = id_processor.find_portrait_zone( units.scaled_colour_img_unit, units.localized_text_lines_unit, units.recognized_text_lines_unit, units.detected_quads_unit, units.deskewed_image_unit ) if ret != EnumErrorCode.EC_OK: return None return portrait_zone Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: from dynamsoft_capture_vision_bundle import * class NeededResultUnit: """Container for intermediate results needed for portrait extraction.""" def init(self): self.deskewed_image_unit = None self.localized_text_lines_unit = None self.scaled_colour_img_unit = None self.detected_quads_unit = None self.recognized_text_lines_unit = None class MyIntermediateResultReceiver(IntermediateResultReceiver): """Captures intermediate processing stages for portrait detection.""" def init(self, cvr: CaptureVisionRouter): super().init() self.cvr = cvr self.unit_groups: Dict[str, 
NeededResultUnit] = {} def on_deskewed_image_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].deskewed_image_unit = result def on_scaled_colour_image_unit_received(self, result, info): id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].scaled_colour_img_unit = result def on_localized_text_lines_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].localized_text_lines_unit = result def on_recognized_text_lines_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].recognized_text_lines_unit = result def on_detected_quads_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].detected_quads_unit = result def get_portrait_zone(self, hash_id: str) -> Optional['Quadrilateral']: """Extract portrait zone using IdentityProcessor.""" if self.unit_groups.get(hash_id) is None: return None id_processor = IdentityProcessor() units = self.unit_groups[hash_id] ret, portrait_zone = id_processor.find_portrait_zone( units.scaled_colour_img_unit, units.localized_text_lines_unit, units.recognized_text_lines_unit, units.detected_quads_unit, units.deskewed_image_unit ) if ret != EnumErrorCode.EC_OK: return None return portrait_zone COMMAND_BLOCK: from dynamsoft_capture_vision_bundle import * class NeededResultUnit: """Container for intermediate results needed for portrait extraction.""" def init(self): 
self.deskewed_image_unit = None self.localized_text_lines_unit = None self.scaled_colour_img_unit = None self.detected_quads_unit = None self.recognized_text_lines_unit = None class MyIntermediateResultReceiver(IntermediateResultReceiver): """Captures intermediate processing stages for portrait detection.""" def init(self, cvr: CaptureVisionRouter): super().init() self.cvr = cvr self.unit_groups: Dict[str, NeededResultUnit] = {} def on_deskewed_image_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].deskewed_image_unit = result def on_scaled_colour_image_unit_received(self, result, info): id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].scaled_colour_img_unit = result def on_localized_text_lines_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].localized_text_lines_unit = result def on_recognized_text_lines_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].recognized_text_lines_unit = result def on_detected_quads_received(self, result, info): if info.is_section_level_result: id = result.get_original_image_hash_id() if self.unit_groups.get(id) is None: self.unit_groups[id] = NeededResultUnit() self.unit_groups[id].detected_quads_unit = result def get_portrait_zone(self, hash_id: str) -> Optional['Quadrilateral']: """Extract portrait zone using IdentityProcessor.""" if self.unit_groups.get(hash_id) is None: return None id_processor = IdentityProcessor() units = self.unit_groups[hash_id] ret, portrait_zone = 
id_processor.find_portrait_zone( units.scaled_colour_img_unit, units.localized_text_lines_unit, units.recognized_text_lines_unit, units.detected_quads_unit, units.deskewed_image_unit ) if ret != EnumErrorCode.EC_OK: return None return portrait_zone COMMAND_BLOCK: class DCPResultProcessor: """Converts ParsedResultItem into structured MRZ data.""" def init(self, item: ParsedResultItem): self.doc_type = item.get_code_type() self.raw_text = [] self.doc_id = None self.surname = None self.given_name = None self.nationality = None self.issuer = None self.gender = None self.date_of_birth = None self.date_of_expiry = None self.is_passport = False # Handle passport-specific fields if self.doc_type == "MRTD_TD3_PASSPORT": if item.get_field_value("passportNumber") is not None: self.doc_id = item.get_field_value("passportNumber") elif item.get_field_value("documentNumber") is not None: self.doc_id = item.get_field_value("documentNumber") self.is_passport = True # Extract raw MRZ lines for i in range(1, 4): line = item.get_field_value(f"line{i}") if line is not None: if item.get_field_validation_status(f"line{i}") == EnumValidationStatus.VS_FAILED: line += " [Validation Failed]" self.raw_text.append(line) # Extract structured fields with validation checks if item.get_field_value("nationality") is not None: self.nationality = item.get_field_value("nationality") if item.get_field_value("issuingState") is not None: self.issuer = item.get_field_value("issuingState") if item.get_field_value("dateOfBirth") is not None: self.date_of_birth = item.get_field_value("dateOfBirth") if item.get_field_value("dateOfExpiry") is not None: self.date_of_expiry = item.get_field_value("dateOfExpiry") if item.get_field_value("sex") is not None: self.gender = item.get_field_value("sex") if item.get_field_value("primaryIdentifier") is not None: self.surname = item.get_field_value("primaryIdentifier") if item.get_field_value("secondaryIdentifier") is not None: self.given_name = 
item.get_field_value("secondaryIdentifier") Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: class DCPResultProcessor: """Converts ParsedResultItem into structured MRZ data.""" def init(self, item: ParsedResultItem): self.doc_type = item.get_code_type() self.raw_text = [] self.doc_id = None self.surname = None self.given_name = None self.nationality = None self.issuer = None self.gender = None self.date_of_birth = None self.date_of_expiry = None self.is_passport = False # Handle passport-specific fields if self.doc_type == "MRTD_TD3_PASSPORT": if item.get_field_value("passportNumber") is not None: self.doc_id = item.get_field_value("passportNumber") elif item.get_field_value("documentNumber") is not None: self.doc_id = item.get_field_value("documentNumber") self.is_passport = True # Extract raw MRZ lines for i in range(1, 4): line = item.get_field_value(f"line{i}") if line is not None: if item.get_field_validation_status(f"line{i}") == EnumValidationStatus.VS_FAILED: line += " [Validation Failed]" self.raw_text.append(line) # Extract structured fields with validation checks if item.get_field_value("nationality") is not None: self.nationality = item.get_field_value("nationality") if item.get_field_value("issuingState") is not None: self.issuer = item.get_field_value("issuingState") if item.get_field_value("dateOfBirth") is not None: self.date_of_birth = item.get_field_value("dateOfBirth") if item.get_field_value("dateOfExpiry") is not None: self.date_of_expiry = item.get_field_value("dateOfExpiry") if item.get_field_value("sex") is not None: self.gender = item.get_field_value("sex") if item.get_field_value("primaryIdentifier") is not None: self.surname = item.get_field_value("primaryIdentifier") if item.get_field_value("secondaryIdentifier") is not None: self.given_name = item.get_field_value("secondaryIdentifier") COMMAND_BLOCK: class DCPResultProcessor: """Converts ParsedResultItem into structured MRZ data.""" def init(self, item: ParsedResultItem): 
self.doc_type = item.get_code_type() self.raw_text = [] self.doc_id = None self.surname = None self.given_name = None self.nationality = None self.issuer = None self.gender = None self.date_of_birth = None self.date_of_expiry = None self.is_passport = False # Handle passport-specific fields if self.doc_type == "MRTD_TD3_PASSPORT": if item.get_field_value("passportNumber") is not None: self.doc_id = item.get_field_value("passportNumber") elif item.get_field_value("documentNumber") is not None: self.doc_id = item.get_field_value("documentNumber") self.is_passport = True # Extract raw MRZ lines for i in range(1, 4): line = item.get_field_value(f"line{i}") if line is not None: if item.get_field_validation_status(f"line{i}") == EnumValidationStatus.VS_FAILED: line += " [Validation Failed]" self.raw_text.append(line) # Extract structured fields with validation checks if item.get_field_value("nationality") is not None: self.nationality = item.get_field_value("nationality") if item.get_field_value("issuingState") is not None: self.issuer = item.get_field_value("issuingState") if item.get_field_value("dateOfBirth") is not None: self.date_of_birth = item.get_field_value("dateOfBirth") if item.get_field_value("dateOfExpiry") is not None: self.date_of_expiry = item.get_field_value("dateOfExpiry") if item.get_field_value("sex") is not None: self.gender = item.get_field_value("sex") if item.get_field_value("primaryIdentifier") is not None: self.surname = item.get_field_value("primaryIdentifier") if item.get_field_value("secondaryIdentifier") is not None: self.given_name = item.get_field_value("secondaryIdentifier") COMMAND_BLOCK: from PySide6.QtWidgets import ( QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QFileDialog, QTextEdit, QSplitter, QComboBox, QGroupBox, QListWidget, QMessageBox ) from PySide6.QtCore import Qt, QThread, Signal from PySide6.QtGui import QImage, QPixmap, QPainter, QPen, QColor, QFont class MRZScannerWindow(QMainWindow): 
"""Main application window for MRZ scanning.""" def init(self): super().init() self.setWindowTitle("MRZ Scanner - Dynamsoft Capture Vision") self.setMinimumSize(1200, 800) # Initialize SDK components self._init_sdk() # Setup UI self._setup_ui() def _init_sdk(self): """Initialize Dynamsoft Capture Vision SDK.""" # Initialize license - replace with your license key error_code, error_message = LicenseManager.init_license( "YOUR_LICENSE_KEY_HERE" ) if error_code != EnumErrorCode.EC_OK and error_code != EnumErrorCode.EC_LICENSE_WARNING: QMessageBox.warning( None, "License Error", f"License initialization failed: {error_message}" ) # Create router and intermediate result manager self.cvr = CaptureVisionRouter() self.irm = self.cvr.get_intermediate_result_manager() # Create and register intermediate result receiver self.irr = MyIntermediateResultReceiver(self.cvr) self.irm.add_result_receiver(self.irr) def _setup_ui(self): """Setup the user interface.""" central_widget = QWidget() self.setCentralWidget(central_widget) main_layout = QHBoxLayout(central_widget) # Create splitter for resizable panels splitter = QSplitter(Qt.Horizontal) main_layout.addWidget(splitter) # Left panel - Image display and controls left_panel = QWidget() left_layout = QVBoxLayout(left_panel) # Input controls input_group = QGroupBox("Input Source") input_layout = QVBoxLayout(input_group) self.source_combo = QComboBox() self.source_combo.addItems(["Image File", "Image Folder", "Camera"]) input_layout.addWidget(self.source_combo) button_layout = QHBoxLayout() self.load_btn = QPushButton("Load File/Folder") self.paste_btn = QPushButton("Paste from Clipboard") self.export_btn = QPushButton("Export Passport") button_layout.addWidget(self.load_btn) button_layout.addWidget(self.paste_btn) button_layout.addWidget(self.export_btn) input_layout.addLayout(button_layout) left_layout.addWidget(input_group) # Image display with overlay support display_group = QGroupBox("Image / Camera View") display_layout = 
QVBoxLayout(display_group) self.image_display = ImageDisplayWidget() display_layout.addWidget(self.image_display) left_layout.addWidget(display_group, 1) # Right panel - Results display right_panel = QWidget() right_layout = QVBoxLayout(right_panel) # Raw MRZ text raw_group = QGroupBox("MRZ Raw Text") raw_layout = QVBoxLayout(raw_group) self.raw_text_edit = QTextEdit() self.raw_text_edit.setReadOnly(True) self.raw_text_edit.setFont(QFont("Courier New", 10)) raw_layout.addWidget(self.raw_text_edit) right_layout.addWidget(raw_group) # Parsed results parsed_group = QGroupBox("Parsed Results") parsed_layout = QVBoxLayout(parsed_group) self.parsed_text_edit = QTextEdit() self.parsed_text_edit.setReadOnly(True) parsed_layout.addWidget(self.parsed_text_edit) right_layout.addWidget(parsed_group, 1) # Add panels to splitter splitter.addWidget(left_panel) splitter.addWidget(right_panel) splitter.setSizes([800, 400]) Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: from PySide6.QtWidgets import ( QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QFileDialog, QTextEdit, QSplitter, QComboBox, QGroupBox, QListWidget, QMessageBox ) from PySide6.QtCore import Qt, QThread, Signal from PySide6.QtGui import QImage, QPixmap, QPainter, QPen, QColor, QFont class MRZScannerWindow(QMainWindow): """Main application window for MRZ scanning.""" def init(self): super().init() self.setWindowTitle("MRZ Scanner - Dynamsoft Capture Vision") self.setMinimumSize(1200, 800) # Initialize SDK components self._init_sdk() # Setup UI self._setup_ui() def _init_sdk(self): """Initialize Dynamsoft Capture Vision SDK.""" # Initialize license - replace with your license key error_code, error_message = LicenseManager.init_license( "YOUR_LICENSE_KEY_HERE" ) if error_code != EnumErrorCode.EC_OK and error_code != EnumErrorCode.EC_LICENSE_WARNING: QMessageBox.warning( None, "License Error", f"License initialization failed: {error_message}" ) # Create router and 
intermediate result manager self.cvr = CaptureVisionRouter() self.irm = self.cvr.get_intermediate_result_manager() # Create and register intermediate result receiver self.irr = MyIntermediateResultReceiver(self.cvr) self.irm.add_result_receiver(self.irr) def _setup_ui(self): """Setup the user interface.""" central_widget = QWidget() self.setCentralWidget(central_widget) main_layout = QHBoxLayout(central_widget) # Create splitter for resizable panels splitter = QSplitter(Qt.Horizontal) main_layout.addWidget(splitter) # Left panel - Image display and controls left_panel = QWidget() left_layout = QVBoxLayout(left_panel) # Input controls input_group = QGroupBox("Input Source") input_layout = QVBoxLayout(input_group) self.source_combo = QComboBox() self.source_combo.addItems(["Image File", "Image Folder", "Camera"]) input_layout.addWidget(self.source_combo) button_layout = QHBoxLayout() self.load_btn = QPushButton("Load File/Folder") self.paste_btn = QPushButton("Paste from Clipboard") self.export_btn = QPushButton("Export Passport") button_layout.addWidget(self.load_btn) button_layout.addWidget(self.paste_btn) button_layout.addWidget(self.export_btn) input_layout.addLayout(button_layout) left_layout.addWidget(input_group) # Image display with overlay support display_group = QGroupBox("Image / Camera View") display_layout = QVBoxLayout(display_group) self.image_display = ImageDisplayWidget() display_layout.addWidget(self.image_display) left_layout.addWidget(display_group, 1) # Right panel - Results display right_panel = QWidget() right_layout = QVBoxLayout(right_panel) # Raw MRZ text raw_group = QGroupBox("MRZ Raw Text") raw_layout = QVBoxLayout(raw_group) self.raw_text_edit = QTextEdit() self.raw_text_edit.setReadOnly(True) self.raw_text_edit.setFont(QFont("Courier New", 10)) raw_layout.addWidget(self.raw_text_edit) right_layout.addWidget(raw_group) # Parsed results parsed_group = QGroupBox("Parsed Results") parsed_layout = QVBoxLayout(parsed_group) 
self.parsed_text_edit = QTextEdit() self.parsed_text_edit.setReadOnly(True) parsed_layout.addWidget(self.parsed_text_edit) right_layout.addWidget(parsed_group, 1) # Add panels to splitter splitter.addWidget(left_panel) splitter.addWidget(right_panel) splitter.setSizes([800, 400]) COMMAND_BLOCK: from PySide6.QtWidgets import ( QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QFileDialog, QTextEdit, QSplitter, QComboBox, QGroupBox, QListWidget, QMessageBox ) from PySide6.QtCore import Qt, QThread, Signal from PySide6.QtGui import QImage, QPixmap, QPainter, QPen, QColor, QFont class MRZScannerWindow(QMainWindow): """Main application window for MRZ scanning.""" def init(self): super().init() self.setWindowTitle("MRZ Scanner - Dynamsoft Capture Vision") self.setMinimumSize(1200, 800) # Initialize SDK components self._init_sdk() # Setup UI self._setup_ui() def _init_sdk(self): """Initialize Dynamsoft Capture Vision SDK.""" # Initialize license - replace with your license key error_code, error_message = LicenseManager.init_license( "YOUR_LICENSE_KEY_HERE" ) if error_code != EnumErrorCode.EC_OK and error_code != EnumErrorCode.EC_LICENSE_WARNING: QMessageBox.warning( None, "License Error", f"License initialization failed: {error_message}" ) # Create router and intermediate result manager self.cvr = CaptureVisionRouter() self.irm = self.cvr.get_intermediate_result_manager() # Create and register intermediate result receiver self.irr = MyIntermediateResultReceiver(self.cvr) self.irm.add_result_receiver(self.irr) def _setup_ui(self): """Setup the user interface.""" central_widget = QWidget() self.setCentralWidget(central_widget) main_layout = QHBoxLayout(central_widget) # Create splitter for resizable panels splitter = QSplitter(Qt.Horizontal) main_layout.addWidget(splitter) # Left panel - Image display and controls left_panel = QWidget() left_layout = QVBoxLayout(left_panel) # Input controls input_group = QGroupBox("Input Source") 
input_layout = QVBoxLayout(input_group) self.source_combo = QComboBox() self.source_combo.addItems(["Image File", "Image Folder", "Camera"]) input_layout.addWidget(self.source_combo) button_layout = QHBoxLayout() self.load_btn = QPushButton("Load File/Folder") self.paste_btn = QPushButton("Paste from Clipboard") self.export_btn = QPushButton("Export Passport") button_layout.addWidget(self.load_btn) button_layout.addWidget(self.paste_btn) button_layout.addWidget(self.export_btn) input_layout.addLayout(button_layout) left_layout.addWidget(input_group) # Image display with overlay support display_group = QGroupBox("Image / Camera View") display_layout = QVBoxLayout(display_group) self.image_display = ImageDisplayWidget() display_layout.addWidget(self.image_display) left_layout.addWidget(display_group, 1) # Right panel - Results display right_panel = QWidget() right_layout = QVBoxLayout(right_panel) # Raw MRZ text raw_group = QGroupBox("MRZ Raw Text") raw_layout = QVBoxLayout(raw_group) self.raw_text_edit = QTextEdit() self.raw_text_edit.setReadOnly(True) self.raw_text_edit.setFont(QFont("Courier New", 10)) raw_layout.addWidget(self.raw_text_edit) right_layout.addWidget(raw_group) # Parsed results parsed_group = QGroupBox("Parsed Results") parsed_layout = QVBoxLayout(parsed_group) self.parsed_text_edit = QTextEdit() self.parsed_text_edit.setReadOnly(True) parsed_layout.addWidget(self.parsed_text_edit) right_layout.addWidget(parsed_group, 1) # Add panels to splitter splitter.addWidget(left_panel) splitter.addWidget(right_panel) splitter.setSizes([800, 400]) COMMAND_BLOCK: def _process_image_file(self, file_path: str): """Process a single image file.""" try: # Read image for display image = cv2.imread(file_path) if image is None: self.statusBar().showMessage("Failed to load image") return # Process with SDK using "ReadPassportAndId" template result = self.cvr.capture(file_path, "ReadPassportAndId") self.current_captured_result = result if result is None: 
self.image_display.set_image(image, []) self._update_results_display([]) return # Extract MRZ results with portrait zones mrz_results = self._extract_mrz_results(result) doc_quad = self._get_document_quad(result) # Update display self.image_display.set_image(image, mrz_results, doc_quad) self._update_results_display(mrz_results) except Exception as e: self.statusBar().showMessage(f"Error: {str(e)}") def _extract_mrz_results(self, result: CapturedResult) -> List[MRZResult]: """Extract MRZ results from CapturedResult.""" mrz_results = [] parsed_result = result.get_parsed_result() if parsed_result is None: return mrz_results # Get MRZ text line locations mrz_locations = [] line_result = result.get_recognized_text_lines_result() if line_result is not None: for item in line_result.get_items(): mrz_locations.append(item.get_location()) hash_id = result.get_original_image_hash_id() for item in parsed_result.get_items(): processor = DCPResultProcessor(item) # Get portrait zone for passports portrait_zone = None if processor.is_passport and self.irr: portrait_zone = self.irr.get_portrait_zone(hash_id) mrz_result = processor.to_mrz_result(portrait_zone, mrz_locations) mrz_results.append(mrz_result) return mrz_results Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: def _process_image_file(self, file_path: str): """Process a single image file.""" try: # Read image for display image = cv2.imread(file_path) if image is None: self.statusBar().showMessage("Failed to load image") return # Process with SDK using "ReadPassportAndId" template result = self.cvr.capture(file_path, "ReadPassportAndId") self.current_captured_result = result if result is None: self.image_display.set_image(image, []) self._update_results_display([]) return # Extract MRZ results with portrait zones mrz_results = self._extract_mrz_results(result) doc_quad = self._get_document_quad(result) # Update display self.image_display.set_image(image, mrz_results, doc_quad) 
```python
# Methods of the main window class; they rely on self.cvr, self.irr,
# and self.image_display being set up elsewhere in the class.
def _process_image_file(self, file_path: str):
    """Process a single image file."""
    try:
        # Read image for display
        image = cv2.imread(file_path)
        if image is None:
            self.statusBar().showMessage("Failed to load image")
            return

        # Process with SDK using "ReadPassportAndId" template
        result = self.cvr.capture(file_path, "ReadPassportAndId")
        self.current_captured_result = result

        if result is None:
            self.image_display.set_image(image, [])
            self._update_results_display([])
            return

        # Extract MRZ results with portrait zones
        mrz_results = self._extract_mrz_results(result)
        doc_quad = self._get_document_quad(result)

        # Update display
        self.image_display.set_image(image, mrz_results, doc_quad)
        self._update_results_display(mrz_results)

    except Exception as e:
        self.statusBar().showMessage(f"Error: {e}")

def _extract_mrz_results(self, result: CapturedResult) -> List[MRZResult]:
    """Extract MRZ results from CapturedResult."""
    mrz_results = []

    parsed_result = result.get_parsed_result()
    if parsed_result is None:
        return mrz_results

    # Get MRZ text line locations
    mrz_locations = []
    line_result = result.get_recognized_text_lines_result()
    if line_result is not None:
        for item in line_result.get_items():
            mrz_locations.append(item.get_location())

    hash_id = result.get_original_image_hash_id()

    for item in parsed_result.get_items():
        processor = DCPResultProcessor(item)

        # Get portrait zone for passports
        portrait_zone = None
        if processor.is_passport and self.irr:
            portrait_zone = self.irr.get_portrait_zone(hash_id)

        mrz_result = processor.to_mrz_result(portrait_zone, mrz_locations)
        mrz_results.append(mrz_result)

    return mrz_results
```

```python
import cv2
from PySide6.QtCore import Qt
from PySide6.QtGui import QColor, QFont, QImage, QPainter, QPen, QPixmap
from PySide6.QtWidgets import QLabel


class ImageDisplayWidget(QLabel):
    """Custom widget with visual overlays for MRZ and portrait."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.setAcceptDrops(True)
        self.current_image = None
        self.mrz_results = []
        self.doc_quad = None
        # Ratio of display pixels to image pixels; updated on every redraw
        self.scale_factor = 1.0

    def set_image(self, image, mrz_results=None, doc_quad=None):
        self.current_image = image
        self.mrz_results = mrz_results or []
        self.doc_quad = doc_quad
        self._update_display()

    def _update_display(self):
        if self.current_image is None:
            return

        # Convert OpenCV image (BGR) to QPixmap (RGB)
        rgb_image = cv2.cvtColor(self.current_image, cv2.COLOR_BGR2RGB)
        h, w, ch = rgb_image.shape
        bytes_per_line = ch * w
        q_image = QImage(rgb_image.data, w, h, bytes_per_line, QImage.Format_RGB888)
        pixmap = QPixmap.fromImage(q_image)

        # Scale to fit widget
        scaled_pixmap = pixmap.scaled(self.size(), Qt.KeepAspectRatio, Qt.SmoothTransformation)

        # Calculate scale factor
        self.scale_factor = scaled_pixmap.width() / w

        # Draw overlays
        if self.mrz_results or self.doc_quad:
            painter = QPainter(scaled_pixmap)
            painter.setRenderHint(QPainter.Antialiasing)

            # Draw document boundary (blue)
            if self.doc_quad:
                self._draw_quadrilateral(painter, self.doc_quad, QColor(0, 0, 255, 200), "Document")

            for result in self.mrz_results:
                # Draw MRZ locations (green)
                for location in result.mrz_locations:
                    self._draw_quadrilateral(painter, location, QColor(0, 255, 0, 200), "MRZ")

                # Draw portrait zone (orange)
                if result.portrait_zone:
                    self._draw_quadrilateral(painter, result.portrait_zone, QColor(255, 165, 0, 200), "Portrait")

            painter.end()

        self.setPixmap(scaled_pixmap)

    def _draw_quadrilateral(self, painter: QPainter, quad, color: QColor, label: str):
        pen = QPen(color, 3)
        painter.setPen(pen)

        points = quad.points
        if len(points) >= 4:
            # Scale points to display coordinates
            scaled_points = []
            for p in points:
                x = int(p.x * self.scale_factor)
                y = int(p.y * self.scale_factor)
                scaled_points.append((x, y))

            # Draw quadrilateral
            for i in range(4):
                x1, y1 = scaled_points[i]
                x2, y2 = scaled_points[(i + 1) % 4]
                painter.drawLine(x1, y1, x2, y2)

            # Draw label just above the top-left of the shape
            font = QFont("Arial", 12, QFont.Bold)
            painter.setFont(font)
            painter.setPen(QPen(color, 2))
            min_y = min(p[1] for p in scaled_points)
            min_x = min(p[0] for p in scaled_points)
            painter.drawText(min_x, max(0, min_y - 5), label)
```
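The overlay widget multiplies every detected point by one scale factor taken from the scaled pixmap's width. The following is a minimal sketch of that mapping in plain Python, with no Qt dependency. `fit_scale` and `scale_quad` are hypothetical helper names, not part of the tutorial's code, and the sketch assumes the image is scaled uniformly with its aspect ratio preserved and drawn from the pixmap's top-left corner.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fit_scale(image_size: Tuple[int, int], widget_size: Tuple[int, int]) -> float:
    """Return the uniform scale that fits the image inside the widget."""
    img_w, img_h = image_size
    wid_w, wid_h = widget_size
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    # With the aspect ratio kept, the tighter axis decides the scale.
    return min(wid_w / img_w, wid_h / img_h)


def scale_quad(points: List[Point], scale: float) -> List[Tuple[int, int]]:
    """Map image-space corner points to integer display coordinates."""
    return [(int(x * scale), int(y * scale)) for x, y in points]


# A 2000x1000 image shown in an 800x600 widget is limited by its width.
scale = fit_scale((2000, 1000), (800, 600))
quad = scale_quad([(100, 50), (1900, 50), (1900, 950), (100, 950)], scale)
print(scale, quad)
```

Because the scale is uniform, a single factor serves both axes, which is why the widget can derive it from the width alone. Qt rounds the scaled pixmap to whole pixels, so the factor computed in the widget may differ slightly from this idealized value.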
```python
import sys

from PySide6.QtCore import Qt
from PySide6.QtGui import QColor, QPalette
from PySide6.QtWidgets import QApplication


def main():
    app = QApplication(sys.argv)
    app.setStyle("Fusion")

    # Apply dark theme
    palette = QPalette()
    palette.setColor(QPalette.Window, QColor(53, 53, 53))
    palette.setColor(QPalette.WindowText, Qt.white)
    palette.setColor(QPalette.Base, QColor(35, 35, 35))
    palette.setColor(QPalette.Text, Qt.white)
    palette.setColor(QPalette.Button, QColor(53, 53, 53))
    palette.setColor(QPalette.ButtonText, Qt.white)
    app.setPalette(palette)

    window = MRZScannerWindow()
    window.show()

    sys.exit(app.exec())


if __name__ == "__main__":
    main()
```

```bash
python mrz_scanner_gui.py
```

What you will build:

- Real-time MRZ recognition from passports and ID documents (TD1, TD2, TD3 formats)
- Portrait/face detection and localization on travel documents
- Multiple input methods: image files, folders, camera streams, drag-and-drop, clipboard paste
- Visual overlays showing document boundaries, MRZ locations, and detected portraits
- Parsed data display with structured extraction of personal information
- Export functionality for normalized document images

Why Dynamsoft Capture Vision SDK:

- All-in-one document processing: Combines document detection, MRZ recognition, and portrait extraction in a single SDK
- High accuracy: Advanced OCR engines optimized for machine-readable zones
- Intermediate results access: Fine-grained control over the processing pipeline
- Cross-platform support: Windows, Linux, and macOS compatibility
- Production-ready: Trusted by enterprises for identity verification systems

Prerequisites:

- Python 3.9 or higher installed
- A webcam (optional, for camera capture functionality)
- Sample passport or ID card images for testing

Getting your trial license:

- Visit the Dynamsoft License Portal
- Select "Capture Vision Suite (DCV)" as the product
- Complete the registration to receive your license key
- The trial license is valid for 30 days with full feature access

SDK architecture components:

- CaptureVisionRouter: Orchestrates the image processing workflow
- IntermediateResultManager: Provides access to intermediate processing stages
- IntermediateResultReceiver: Callback interface for capturing intermediate results like deskewed images and detected text lines
- IdentityProcessor: Specialized processor for portrait zone detection on identity documents
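The TD1, TD2, and TD3 formats mentioned above are defined by ICAO Doc 9303 and differ in line count and line length. Individual MRZ fields are protected by a check digit computed with the repeating weights 7, 3, 1. The SDK does this parsing and validation itself; the sketch below only illustrates the idea in plain Python. `classify_mrz`, `char_value`, and `check_digit` are hypothetical helper names, not part of the Dynamsoft API.

```python
from typing import List, Optional

# (line count, characters per line) for each ICAO Doc 9303 MRZ layout.
MRZ_LAYOUTS = {(3, 30): "TD1", (2, 36): "TD2", (2, 44): "TD3"}


def classify_mrz(lines: List[str]) -> Optional[str]:
    """Return "TD1", "TD2" or "TD3", or None if the shape matches none."""
    if not lines or len({len(line) for line in lines}) != 1:
        return None
    return MRZ_LAYOUTS.get((len(lines), len(lines[0])))


def char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value for the check digit."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Compute the check digit of a field with weights 7, 3, 1."""
    weights = (7, 3, 1)
    total = sum(char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


# Specimen-style TD3 lines; the first is padded to 44 characters with "<".
lines = [
    "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<"),
    "L898902C36UTO7408122F1204159ZE184226B<<<<<10",
]
print(classify_mrz(lines))       # TD3
print(check_digit("L898902C3"))  # 6, matching the 10th character of line 2
```

A check like this can serve as a quick sanity test on raw MRZ text before trusting the parsed fields, for example when a low-quality scan yields a plausible but wrong character.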
