10 Processing Categories Explained

What Happens After Extraction

The watchtheink extraction pipeline returns a set of FieldValue records for each scan. Processing categories define what happens to those values downstream. There are ten categories, each suited to a different integration pattern.

1. Entity Definition / Evolution

Use extracted data to create or update records in an external system. A new scan of a registration form might create a contact in your CRM, or update an existing one if a matching identifier is found.
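
The create-or-update decision can be sketched as a simple upsert. This is a minimal illustration, not the platform's connector: ContactStore is a hypothetical in-memory stand-in for a CRM, and using a lower-cased email as the matching identifier is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ContactStore:
    """Hypothetical in-memory stand-in for an external CRM."""
    contacts: dict = field(default_factory=dict)

    def upsert(self, values: dict, key: str = "email") -> str:
        """Create a contact, or update the one that shares the same identifier."""
        # Assumption: identifiers match case-insensitively after trimming.
        identifier = values[key].strip().lower()
        if identifier in self.contacts:
            self.contacts[identifier].update(values)
            return "updated"
        self.contacts[identifier] = dict(values)
        return "created"


store = ContactStore()
first = store.upsert({"email": "ana@example.com", "name": "Ana"})
second = store.upsert({"email": "Ana@Example.com", "phone": "555-0100"})
```

Here the second scan matches the first contact on its identifier, so it adds a phone number to the existing record instead of creating a duplicate.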

2. Answer Validation

Compare extracted values against a known answer key. This is the core of quiz grading: each field's extracted value is checked against the expected answer, and a score is computed.
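
A minimal sketch of the comparison step, assuming extracted values and the answer key are both plain dictionaries keyed by field name. The grade function and the trim-and-case-fold normalisation are illustrative choices, not documented platform behaviour.

```python
def grade(extracted: dict, answer_key: dict) -> dict:
    """Check each extracted value against the expected answer and compute a score."""
    per_field = {}
    for field_name, expected in answer_key.items():
        given = extracted.get(field_name, "")
        # Assumption: answers match after trimming whitespace and ignoring case.
        per_field[field_name] = given.strip().casefold() == expected.strip().casefold()
    return {
        "per_field": per_field,
        "correct": sum(per_field.values()),
        "total": len(answer_key),
    }


report = grade(
    {"q1": "B", "q2": " paris ", "q3": "7"},
    {"q1": "B", "q2": "Paris", "q3": "8"},
)
```

Keeping the per-field results alongside the total makes it possible to show which answers were wrong, not just the score.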

3. Data Entry into Existing Schema

Map extracted fields to columns in a database table. The platform handles type coercion (text → date, text → number) and constraint checking before writing.
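
Type coercion and constraint checking might look roughly like the following. The SCHEMA mapping, the ISO date format, and the non-negative fee rule are all assumptions made for the example; the platform's actual coercion rules are not specified here.

```python
from datetime import datetime

# Hypothetical target table: column name -> expected type.
SCHEMA = {"name": "text", "birth_date": "date", "fee": "number"}


def coerce(raw: str, target: str):
    """Convert extracted text to the column type; raises ValueError on bad input."""
    text = raw.strip()
    if target == "date":
        # Assumption: dates arrive as YYYY-MM-DD.
        return datetime.strptime(text, "%Y-%m-%d").date()
    if target == "number":
        # Assumption: commas are thousands separators.
        return float(text.replace(",", ""))
    return text


def to_row(fields: dict) -> dict:
    """Coerce every column, then apply an example constraint before writing."""
    row = {column: coerce(fields[column], kind) for column, kind in SCHEMA.items()}
    if row["fee"] < 0:
        raise ValueError("fee must be non-negative")
    return row


row = to_row({"name": " Ana ", "birth_date": "1990-04-12", "fee": "1,250.50"})
```

A value that cannot be coerced raises ValueError, so the row is rejected before anything is written.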

4. Aggregation over Multiple Scans

Collect values from many scans of the same template and compute statistics: totals, averages, distributions. Useful for survey results and exam cohort analysis.
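
As an illustration, the statistics named above can be computed with the Python standard library. The summarize function and the field names score and grade are hypothetical; each scan is assumed to be a dictionary of extracted text values.

```python
from collections import Counter
from statistics import mean


def summarize(scans: list, numeric_field: str, choice_field: str) -> dict:
    """Compute total, average, and value distribution across scans of one template."""
    scores = [float(scan[numeric_field]) for scan in scans]
    return {
        "count": len(scans),
        "total": sum(scores),
        "average": mean(scores),
        "distribution": dict(Counter(scan[choice_field] for scan in scans)),
    }


scans = [
    {"score": "8", "grade": "B"},
    {"score": "10", "grade": "A"},
    {"score": "6", "grade": "B"},
]
summary = summarize(scans, "score", "grade")
```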

5. Delta / Diff Detection

Compare the current scan against a previous version of the same document and surface only what changed. This is useful for form re-submissions and amendment workflows.
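
A field-level diff is one straightforward way to surface changes. The sketch below assumes both versions are available as dictionaries of field name to extracted value; diff_scans is a hypothetical helper, not a platform function.

```python
def diff_scans(previous: dict, current: dict) -> dict:
    """Return only the fields whose value was added, removed, or changed."""
    changes = {}
    for name in sorted(previous.keys() | current.keys()):
        old, new = previous.get(name), current.get(name)
        if old != new:
            # None marks a field that is absent from one of the versions.
            changes[name] = {"old": old, "new": new}
    return changes


changes = diff_scans(
    {"address": "1 Main St", "phone": "555-0100"},
    {"address": "2 Oak Ave", "phone": "555-0100", "email": "ana@example.com"},
)
```

Unchanged fields such as phone are left out of the result, so downstream consumers only see the amendment.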

6. Multi-Source Aggregation

Combine data from scans of different templates into a single output record. For example, merge a registration form and a consent form into one unified participant record.
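
A rough sketch of the merge, assuming the scans share a join key (here a hypothetical participant_id field). Prefixing each field with its template name is a design choice made for the example, so that same-named fields from different templates do not overwrite each other.

```python
def merge_records(sources: dict, join_key: str) -> dict:
    """Merge field dicts from several templates into one record.

    sources maps a template name to that scan's extracted fields.
    """
    keys = {fields[join_key] for fields in sources.values()}
    if len(keys) != 1:
        raise ValueError(f"scans disagree on {join_key}: {sorted(keys)}")
    merged = {join_key: keys.pop()}
    for template, fields in sources.items():
        for name, value in fields.items():
            if name != join_key:
                merged[f"{template}.{name}"] = value
    return merged


participant = merge_records(
    {
        "registration": {"participant_id": "P-17", "name": "Ana"},
        "consent": {"participant_id": "P-17", "signed": "yes"},
    },
    join_key="participant_id",
)
```

Refusing to merge when the join keys differ avoids silently combining forms that belong to different participants.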

7. Exception / Anomaly Detection

Flag scans where extracted values fall outside expected ranges or patterns. Low-confidence fields, missing required values, and out-of-range numbers all trigger exception records.
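
The three triggers listed above can be sketched as a single pass over the extracted fields. The FieldValue shape shown here (name, value, confidence) and the 0.8 confidence threshold are assumptions for illustration; consult the API reference for the real record schema.

```python
from dataclasses import dataclass


@dataclass
class FieldValue:
    """Hypothetical shape of an extracted field; the real schema may differ."""
    name: str
    value: str
    confidence: float


def find_exceptions(fields, required, ranges, min_confidence=0.8):
    """Return (field name, reason) pairs for every rule a scan violates."""
    by_name = {f.name: f for f in fields}
    problems = []
    for name in required:
        if name not in by_name or not by_name[name].value.strip():
            problems.append((name, "missing required value"))
    for f in fields:
        if f.confidence < min_confidence:
            problems.append((f.name, "low confidence"))
        if f.name in ranges and f.value.strip():
            low, high = ranges[f.name]
            try:
                number = float(f.value)
            except ValueError:
                problems.append((f.name, "not a number"))
                continue
            if not low <= number <= high:
                problems.append((f.name, "out of range"))
    return problems


problems = find_exceptions(
    [FieldValue("age", "212", 0.95), FieldValue("name", "Ana", 0.55)],
    required=["name", "email"],
    ranges={"age": (0, 120)},
)
```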

8. Workflow Triggering via External API

Use extracted values as inputs to a webhook or external API call. When a specific field value is present (e.g. a checkbox marked "urgent"), fire a POST request to your workflow engine.
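
A minimal sketch of the conditional trigger using only the Python standard library. The webhook URL, the payload layout, and the convention that a marked checkbox is extracted as the string "checked" are all placeholders, not documented behaviour.

```python
import json
import urllib.request
from typing import Optional

# Placeholder endpoint; substitute your workflow engine's webhook URL.
WEBHOOK_URL = "https://workflow.example.com/hooks/urgent"


def build_trigger(scan_id: str, fields: dict) -> Optional[urllib.request.Request]:
    """Build a POST request when the 'urgent' checkbox is marked, else return None."""
    # Assumption: a marked checkbox is extracted as the string "checked".
    if fields.get("urgent") != "checked":
        return None
    body = json.dumps({"scan_id": scan_id, "fields": fields}).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_trigger("scan-001", {"urgent": "checked", "name": "Ana"})
# To send it for real: urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it keeps the trigger condition easy to test without network access.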

9. Classification Without Predefined Schema

Have the AI classify the document, or group its fields semantically, even when no template is defined. This is useful for inbox triage and ad-hoc document routing.

10. Compliance Activity Logging

Append an immutable audit record for every scan: who submitted it, when, what template was used, and which fields were corrected after initial extraction. Required for regulated workflows.
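
One common way to approximate an immutable log in application code is hash chaining, where each record stores the hash of the one before it. This is a swapped-in illustration, not a description of how the platform stores audit records; the AuditLog class and its field names are hypothetical. Chaining makes tampering detectable rather than impossible.

```python
import hashlib
import json


class AuditLog:
    """Append-only log in which each record is chained to the previous record's hash."""

    GENESIS = "0" * 64  # prev_hash used for the first record

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(record: dict) -> str:
        payload = {k: v for k, v in record.items() if k != "hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, submitted_by, submitted_at, template, corrected_fields) -> str:
        record = {
            "submitted_by": submitted_by,
            "submitted_at": submitted_at,
            "template": template,
            "corrected_fields": sorted(corrected_fields),
            "prev_hash": self._entries[-1]["hash"] if self._entries else self.GENESIS,
        }
        record["hash"] = self._digest(record)
        self._entries.append(record)
        return record["hash"]

    def verify(self) -> bool:
        """Recompute every hash; any edit to an earlier record breaks the chain."""
        prev = self.GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True


log = AuditLog()
log.append("ana", "2024-05-01T09:30:00Z", "registration-v2", ["birth_date"])
log.append("ben", "2024-05-01T09:45:00Z", "consent-v1", [])
intact = log.verify()
```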

Each category maps to a distinct API endpoint in the Output / Processing layer. See the API reference for request/response schemas.