NoteGenNOTEGEN.

Image Recognition

Image recognition configuration guide, supporting both OCR and VLM recognition methods.

Enable Image Recognition

When enabled, text recognition will be performed automatically when images are uploaded.

Primary Recognition Method

Select OCR or VLM as the primary recognition method.

OCR (Optical Character Recognition)

Uses traditional OCR algorithms to recognize text in images.

Language Packs

Select the language packs needed for recognition, supporting multiple languages. Use commas to separate multiple languages, e.g., chi_sim,eng.

When adding new language packs, corresponding data files will be downloaded automatically.

Using Tesseract

Currently uses Tesseract as the OCR engine, supporting multiple language packs.

VLM (Vision Language Model)

Uses AI large models for image understanding and text recognition.

Model Selection

Select the AI model for image recognition. Needs to be pre-configured in model configuration.

Method Comparison

FeatureOCRVLM
SpeedFastSlower
AccuracyHigher for standard textBetter for complex scenes
Network RequirementLocal executionNetwork required
CostFreeMay incur costs
UnderstandingText onlyCan understand image content

Usage Recommendations

  • Pure text extraction: Use OCR for speed and free usage
  • Complex scenarios: Use VLM for better recognition of tables, handwriting, complex layouts
  • Batch processing: Use OCR to save costs and time
  • High accuracy: Use VLM for more accurate recognition results