Detect all text blocks in the image. Group consecutive lines that form a single paragraph together.

For each text block, output:
- bbox_2d: [x1, y1, x2, y2] coordinates
- text_content: The complete text of the block (join multi-line text with spaces)

Output a JSON array in this format:
[{"bbox_2d": [x1,y1,x2,y2], "text_content": "text here"}, ...]
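
A response in this format can be checked on the consuming side before it is used. The following is a minimal sketch in Python, not a definitive implementation: the function name `parse_text_blocks` and the sample values are hypothetical, and the check assumes only what is stated above (a JSON array of objects, each with a four-number `bbox_2d` and a string `text_content`). It deliberately does not check coordinate ordering or units, since those are not specified here.

```python
import json


def parse_text_blocks(raw: str) -> list:
    """Parse a response string and validate it against the format above.

    Hypothetical helper: raises ValueError if the structure does not match.
    """
    try:
        blocks = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(blocks, list):
        raise ValueError("expected a JSON array at the top level")

    for i, block in enumerate(blocks):
        if not isinstance(block, dict):
            raise ValueError(f"item {i} is not a JSON object")

        bbox = block.get("bbox_2d")
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
        if not (isinstance(bbox, list) and len(bbox) == 4 and all(map(is_number, bbox))):
            raise ValueError(f"item {i}: bbox_2d must be a list of four numbers")

        if not isinstance(block.get("text_content"), str):
            raise ValueError(f"item {i}: text_content must be a string")

    return blocks


# Usage with a made-up response (values are illustrative only).
sample = '[{"bbox_2d": [10, 20, 200, 60], "text_content": "Hello world"}]'
for block in parse_text_blocks(sample):
    print(block["bbox_2d"], block["text_content"])
```

Raising on the first malformed item keeps the sketch short; a caller that prefers to salvage partial output could collect the valid blocks and skip the rest instead.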