Extract all text AND visual elements from this document image, preserving formatting. Group elements into rows for DOCX generation.

CRITICAL - DETECT ALL VISUAL ELEMENTS FIRST:
Before extracting text, scan the ENTIRE document for these visual elements:
1. LOGOS/EMBLEMS: Government crests, company logos, official emblems
   - Output: "[Logo]" with element_type "logo_area"
2. PHOTOS/PORTRAITS: Passport photos, ID photos, headshots
   - Output: "[Photo affixed]" with element_type "photo_area"
3. QR CODES: Square barcodes
   - Output: "[QR Code]" with element_type "qr_code"
4. SIGNATURES: Handwritten marks
   - Output: "[Signature affixed]" with element_type "signature_area"
5. STAMPS/SEALS: Official stamps
   - Output: "[Stamp affixed]" with element_type "stamp_area"
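
For reference, the placeholder mapping above can be sketched as a lookup table. This is a minimal illustration for downstream code, not part of the required output; the names VISUAL_PLACEHOLDERS and make_visual_element and the short keys ("logo", "photo", ...) are hypothetical.

```python
# Placeholder text and element_type for each visual element listed above.
# The dict name and its keys are illustrative assumptions.
VISUAL_PLACEHOLDERS = {
    "logo": ("[Logo]", "logo_area"),
    "photo": ("[Photo affixed]", "photo_area"),
    "qr_code": ("[QR Code]", "qr_code"),
    "signature": ("[Signature affixed]", "signature_area"),
    "stamp": ("[Stamp affixed]", "stamp_area"),
}


def make_visual_element(kind, bbox):
    """Build one element dict for a detected visual element."""
    text, element_type = VISUAL_PLACEHOLDERS[kind]
    return {
        "bbox_2d": list(bbox),
        "text_content": text,
        "element_type": element_type,
    }
```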

TABLE RECOGNITION:
Detect table structures in the document. A table is identified by:
- Visible grid lines (borders) separating cells
- Aligned columns of data without visible borders
- Repeated label:value patterns
- Header row followed by data rows

Set "in_table": true on every row that is part of a table, and false otherwise.
Set "table_start": true only on the FIRST row of each table, and false otherwise.
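
A consumer of the output could sanity-check these two flags with something like the sketch below. It assumes rows arrive as a list of dicts in document order; the function name check_table_flags is hypothetical.

```python
def check_table_flags(rows):
    """Return a list of problems found in the in_table/table_start flags."""
    problems = []
    prev_in_table = False
    for row in rows:
        in_table = row.get("in_table", False)
        table_start = row.get("table_start", False)
        # table_start only makes sense on a row that is inside a table.
        if table_start and not in_table:
            problems.append(f"row {row.get('row')}: table_start without in_table")
        # A table row that follows a non-table row must open the table.
        if in_table and not prev_in_table and not table_start:
            problems.append(f"row {row.get('row')}: first table row lacks table_start")
        prev_in_table = in_table
    return problems
```

Two adjacent tables are accepted: the second table's first row carries "table_start": true even though the previous row was also in a table.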

ROW GROUPING RULES:
1. Elements at SAME Y position = same row
2. Vertically stacked elements forming a logical block = same row
3. Multi-line paragraphs = ONE element, ONE row
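
Rule 1 can be approximated in code as shown below. This is only a sketch under assumptions: bbox_2d is [x1, y1, x2, y2] with y increasing downward, and a hypothetical tolerance of 10 units decides what counts as the "same" Y position. It does not capture rule 2 (logical blocks), which needs judgment about content.

```python
def group_by_y(elements, tolerance=10):
    """Group elements whose vertical centers lie within `tolerance`
    of the first element placed in the current row."""

    def y_center(el):
        _, y1, _, y2 = el["bbox_2d"]
        return (y1 + y2) / 2

    rows = []
    for el in sorted(elements, key=y_center):
        if rows and abs(y_center(el) - y_center(rows[-1][0])) <= tolerance:
            rows[-1].append(el)
        else:
            rows.append([el])

    # Order each finished row left to right by x1.
    for row in rows:
        row.sort(key=lambda el: el["bbox_2d"][0])
    return rows
```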

PARAGRAPH MERGING:
- Merge consecutive lines of same paragraph into ONE element
- Join text with spaces
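
The merge itself amounts to the following minimal sketch; the function name merge_lines is hypothetical, and blank lines are assumed to be droppable.

```python
def merge_lines(lines):
    """Join the lines of one paragraph into a single string with spaces,
    trimming surrounding whitespace and skipping empty lines."""
    return " ".join(line.strip() for line in lines if line.strip())
```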

FOOTER GROUPING:
- All footer elements (URL, date, page number) at bottom = ONE row

Output JSON format:
[
  {
    "row": 1,
    "in_table": false,
    "table_start": false,
    "elements": [
      {
        "bbox_2d": [x1, y1, x2, y2],
        "text_content": "...",
        "element_type": "logo_area|title|header|paragraph|label|value|table_cell|photo_area|signature_area|stamp_area|qr_code|footer",
        "style": {
          "bold": true/false,
          "italic": true/false,
          "font_size": "small|normal|large|xlarge",
          "alignment": "left|center|right"
        }
      }
    ]
  }
]
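
For downstream code, a response in this format could be checked roughly as follows. This is a sketch, not a full schema validator: the function name validate_rows is hypothetical, and the allowed values are copied from the format above.

```python
import json

ELEMENT_TYPES = {
    "logo_area", "title", "header", "paragraph", "label", "value",
    "table_cell", "photo_area", "signature_area", "stamp_area",
    "qr_code", "footer",
}
FONT_SIZES = {"small", "normal", "large", "xlarge"}
ALIGNMENTS = {"left", "center", "right"}


def validate_rows(raw):
    """Parse a JSON response and return a list of format errors (empty if none)."""
    rows = json.loads(raw)
    if not isinstance(rows, list):
        return ["top level is not a list"]
    errors = []
    for row in rows:
        n = row.get("row")
        for key in ("in_table", "table_start"):
            if not isinstance(row.get(key), bool):
                errors.append(f"row {n}: {key} is not a boolean")
        for el in row.get("elements", []):
            bbox = el.get("bbox_2d")
            if not (isinstance(bbox, list) and len(bbox) == 4):
                errors.append(f"row {n}: bbox_2d is not a 4-item list")
            if el.get("element_type") not in ELEMENT_TYPES:
                errors.append(f"row {n}: unknown element_type")
            style = el.get("style", {})
            for key in ("bold", "italic"):
                if not isinstance(style.get(key), bool):
                    errors.append(f"row {n}: style.{key} is not a boolean")
            if style.get("font_size") not in FONT_SIZES:
                errors.append(f"row {n}: invalid font_size")
            if style.get("alignment") not in ALIGNMENTS:
                errors.append(f"row {n}: invalid alignment")
    return errors
```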

FORMATTING DETECTION:
- bold: true if text appears bold/heavy
- italic: true if text is slanted
- font_size: small (<10pt), normal (10-12pt), large (>12pt up to 18pt), xlarge (>18pt)
- alignment: left/center/right based on the text's horizontal position on the page
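
The font_size thresholds can be expressed as a small bucketing function. This sketch assumes a point-size estimate is available (the parameter name estimated_pt is hypothetical) and that any size above 12pt and up to 18pt counts as large.

```python
def font_size_bucket(estimated_pt):
    """Map an estimated point size to small/normal/large/xlarge."""
    if estimated_pt < 10:
        return "small"
    if estimated_pt <= 12:
        return "normal"
    # Assumption: everything above 12pt and up to 18pt is "large".
    if estimated_pt <= 18:
        return "large"
    return "xlarge"
```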