Skip to content

Amazon Nova 2 Lite and Claude for Cost-Optimized Document Processing

The point: A two-stage pipeline using Amazon Nova 2 Lite for structured extraction and Claude Sonnet 3.5 for spatial reasoning reduces document digitization costs by two-thirds.

Amazon demonstrates a two-stage process for automated digitization of scanned documents with combined AI models: Nova 2 Lite recognizes photos and names, Claude performs spatial analysis. The combination saves approximately two-thirds of costs compared to a single-model approach.

Amazon has implemented a production process on AWS Bedrock to automatically digitize scanned yearbook pages. The first stage uses Amazon Nova 2 Lite to accomplish multiple tasks on a single page in one API call: natively multimodal recognition of photographs with bounding boxes, extraction of visible text (names) with position information, and capture of metadata such as titles and categories. The model was configured on LOW-reasoning mode, as tests across all 336 test pages showed no significant accuracy loss, but this setting minimizes token costs.

In the second stage, Claude Sonnet 3.5 analyzes spatial arrangement: based on name positions and photo bounding boxes from Nova, Claude determines which name belongs to which face. This spatial reasoning is necessary because yearbook layouts vary – names can appear above or below photos, pages mix portrait grids with group photos. Claude’s adaptive thinking handles this variability without additional prompt engineering per layout type.

In testing with 336 scanned yearbook pages, 3,122 name-to-face associations were created, of which 93 percent were rated with a confidence score of 0.95 or higher. Cost optimization works through constraints on Nova: instead of full OCR of text content (approximately 4,500 tokens per page), the model extracts only names near photos, reducing token output to approximately 1,000 tokens per page. Overall, the two-stage approach costs approximately two-thirds less per page than a single-model scenario that sends the entire task to a single vision-language model.

Amazon has also redesigned Nova 2 Lite’s image billing: a fixed price-per-image model provides cost predictability when processing hundreds of thousands of pages – a critical factor for enterprise applications with high-volume scaling.


Source: aws.amazon.com · Published June 29, 2026
Lumi AI News — AI-assisted curation pursuant to Art. 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.7.2.

Share on: