Generate structured JSON from PDF documents
Generate speech from text using various voices
Text-to-speech (TTS)
YOLO preprocessing and LayoutLM for inference