The EU AI Act and your tax software: what 2 August 2026 actually means — TaxItEasy®

Invoice OCR is not Annex-III high-risk under the AI Act. But the transparency, documentation and human-oversight obligations that go live in August 2026 are exactly what you should want from a tax tool anyway.

On 2 August 2026 the full EU AI Act regime, including the high-risk obligations under Annex III, becomes enforceable. For tax software vendors and their customers, this is a useful date to think clearly about what the AI Act actually requires — and what it doesn't.

The one-line thesis

Invoice OCR and field extraction, in isolation, is not listed as high-risk under Annex III of the AI Act. That category is reserved for things like creditworthiness assessment, recruitment, and education scoring. But the AI Act also imposes transparency, documentation and governance obligations that apply across the board to any AI system on the EU market. Those obligations are good. They are also the things you should have wanted from your tax tool's AI features before the AI Act existed.

The dated timeline

The EU AI Act (Regulation (EU) 2024/1689) entered into force on 1 August 2024 and rolls out in four phases:

2024-08-01   AI Act enters into force
2025-02-02   Prohibited-AI rules apply
             AI literacy obligations apply
2025-08-02   General-purpose AI (GPAI) obligations apply
             (technical documentation, instructions for use,
             copyright-compliance, training-data summary)
             Systemic-risk GPAI: model evaluations,
             adversarial testing, incident reporting
2026-08-02   Full regime applies, including high-risk AI
             obligations under Annex III
2027-08-02   Compliance deadline for GPAI models already
             on the market before 2 August 2025
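
For readers who want to check a given date against this timeline, here is a minimal Python sketch. The dates are copied from the list above; the shortened labels and the function names `milestones_reached` and `next_milestone` are illustrative assumptions, not part of any official tooling.

```python
from datetime import date

# Milestones copied from the timeline above; labels are shortened for the sketch.
MILESTONES = [
    (date(2024, 8, 1), "AI Act enters into force"),
    (date(2025, 2, 2), "Prohibited-AI rules and AI literacy obligations apply"),
    (date(2025, 8, 2), "GPAI obligations apply"),
    (date(2026, 8, 2), "Full regime applies, including Annex III high-risk obligations"),
    (date(2027, 8, 2), "Deadline for GPAI models on the market before 2 August 2025"),
]


def milestones_reached(on: date) -> list[str]:
    """Return the labels of all milestones dated on or before `on`."""
    return [label for when, label in MILESTONES if when <= on]


def next_milestone(on: date):
    """Return the first (date, label) strictly after `on`, or None if none is left."""
    for when, label in MILESTONES:
        if when > on:
            return when, label
    return None
```

Calling `milestones_reached(date(2026, 8, 2))` returns the first four labels, and `next_milestone` for the same date points at the 2027 GPAI deadline.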

The penalty exposure under the regulation is non-trivial: up to €35 million or 7% of global annual turnover for prohibited-AI violations, and up to €15 million or 3% for high-risk system breaches, in each case whichever figure is higher. The turnover-linked structure follows the pattern set by the GDPR.
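
The arithmetic behind those ceilings is easy to sketch. The snippet below is a rough illustration only: it assumes the higher of the fixed amount and the turnover share is the ceiling, it does not model the separate treatment of SMEs, and the tier keys and the function name `fine_ceiling` are made up for this example.

```python
from decimal import Decimal

# Tiers as quoted in the paragraph above: (fixed cap in EUR, share of global turnover).
TIERS = {
    "prohibited_ai": (Decimal("35000000"), Decimal("0.07")),
    "high_risk": (Decimal("15000000"), Decimal("0.03")),
}


def fine_ceiling(tier: str, global_turnover_eur: Decimal) -> Decimal:
    """Upper bound of the fine for a tier.

    Assumption: the higher of the two figures is the ceiling.
    SME-specific rules are deliberately left out of this sketch.
    """
    fixed_cap, share = TIERS[tier]
    return max(fixed_cap, share * global_turnover_eur)
```

For a company with €1 billion in global turnover, `fine_ceiling("prohibited_ai", Decimal("1000000000"))` gives 70 million euros, because 7% of turnover exceeds the €35 million figure. At €100 million in turnover the fixed €35 million is the larger number.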

Is invoice OCR high-risk?

Annex III of the AI Act lists eight categories of AI use that count as high-risk. The financial ones are narrow: creditworthiness assessment and credit scoring, life and health insurance risk assessment, and AI-driven pricing of those insurance products. They reach into financial decision-making about an individual, not into back-office document processing.

Invoice OCR — reading a vendor name, an amount, a VAT line, a date from a PDF — is not on the Annex III list. It is not making a credit decision, it is not selecting candidates for a job, it is not allocating educational opportunity. It is structured-data extraction from a document.

This matters because some commentary in early 2025 conflated "any AI used in a financial process" with "high-risk under the AI Act". That conflation is wrong: the regulation is specific about which use cases count, and tax-document automation is not on the list.

A tax tool's AI would tip into high-risk territory if it took the extracted data and made an autonomous decision about, say, whether to extend the user credit, deny a tax-deductible claim, or score the user for fraud risk in a way that materially affected them. None of that is what TaxItEasy®'s OCR + field-extraction does. The output is structured data that the user then reviews and approves.

The obligations that do apply

Even though invoice OCR is not high-risk, a number of AI Act obligations and design principles still bear on an AI system on the EU market from 2 August 2026, some as direct requirements and others as the standard buyers will measure vendors against. The ones that matter for tax software:

Transparency

Users must be able to tell when they are interacting with an AI system (Article 50). For a tax tool the case is straightforward: every extracted field comes from a model, and we say so. There is a Help article describing exactly which fields are AI-extracted, what the confidence score means, and how to correct extraction errors. After August 2026, that level of disclosure is a regulatory expectation, not just a documentation nicety.

Technical documentation

Article 11 of the regulation requires technical documentation describing the AI system: what it does, what data it was trained on (at the GPAI level, training-data summaries), what its known limitations are, what risks it presents and how those are mitigated. For high-risk systems this is a formal requirement; for non-high-risk systems it is the de facto deliverable any serious enterprise procurement process will ask for.

Human oversight design

Article 14 requires that high-risk AI systems be designed for meaningful human oversight. Even outside Annex III, this is the design principle that distinguishes safe-by-design tax tooling from autonomous-by-design tax tooling: every extraction is a suggestion until the user (or their tax advisor) confirms it. The confirmation step is the human oversight, by construction.
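
As a rough illustration of "suggestion until confirmed", here is a minimal Python sketch. It is not TaxItEasy's implementation: the class `ExtractedField`, the helper `bookable_values` and all field names are hypothetical, chosen only to show the pattern of blocking downstream use until a human has confirmed or corrected every value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedField:
    """One model-extracted value that stays a suggestion until a human confirms it."""
    name: str
    suggested_value: str
    confidence: float                      # model confidence in [0, 1]
    confirmed_value: Optional[str] = None
    confirmed_by: Optional[str] = None

    def confirm(self, user: str, corrected_value: Optional[str] = None) -> None:
        """Record the human decision; a correction overrides the suggestion."""
        if corrected_value is not None:
            self.confirmed_value = corrected_value
        else:
            self.confirmed_value = self.suggested_value
        self.confirmed_by = user

    @property
    def is_confirmed(self) -> bool:
        return self.confirmed_by is not None


def bookable_values(fields: list[ExtractedField]) -> dict[str, str]:
    """Return values for downstream use; refuse while anything is unconfirmed."""
    pending = [f.name for f in fields if not f.is_confirmed]
    if pending:
        raise ValueError(f"unconfirmed fields: {pending}")
    return {f.name: f.confirmed_value for f in fields}
```

The design choice worth noting is that the gate sits in `bookable_values`, not in the UI: nothing reaches the books without a recorded human decision, whatever the confidence score says.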

Log-keeping

Article 12 requires logging of operations for high-risk systems sufficient to enable traceability. Outside Annex III, the equivalent expectation under GDPR (Art. 30 records of processing) and under good security practice is broadly the same: log enough to reconstruct what happened, retain it for a defined period, make it available on request. Our audit log is the practical implementation.
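
The three expectations named above (reconstruct what happened, retain for a defined period, make available on request) can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not a description of our audit log: `AuditEntry`, `AuditLog`, the action names and the retention period are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    at: datetime
    actor: str          # "model" or a user id
    document_id: str
    action: str         # e.g. "extracted", "corrected", "confirmed"
    detail: str


class AuditLog:
    """Append-only log with a fixed retention period (in-memory sketch)."""

    def __init__(self, retention: timedelta):
        self._entries: list[AuditEntry] = []
        self._retention = retention

    def record(self, actor: str, document_id: str, action: str, detail: str,
               at: Optional[datetime] = None) -> AuditEntry:
        entry = AuditEntry(at or datetime.now(timezone.utc),
                           actor, document_id, action, detail)
        self._entries.append(entry)
        return entry

    def history(self, document_id: str) -> list[AuditEntry]:
        """Everything recorded for one document, oldest first."""
        matching = [e for e in self._entries if e.document_id == document_id]
        return sorted(matching, key=lambda e: e.at)

    def purge_expired(self, now: datetime) -> int:
        """Drop entries older than the retention period; return how many were dropped."""
        keep = [e for e in self._entries if now - e.at <= self._retention]
        dropped = len(self._entries) - len(keep)
        self._entries = keep
        return dropped
```

`history` is the "available on request" part, `purge_expired` is the "defined period" part, and recording both model and user actions against the same document id is what makes reconstruction possible.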

GPAI provider obligations (already live since 2 August 2025)

If your tax software uses a third-party large language model (in our case, Anthropic's Claude family for some downstream extraction steps), the GPAI provider has separate obligations: technical documentation, instructions for use, copyright-compliance attestation, and a training-data summary. Those obligations apply to the GPAI provider, not directly to your tax tool — but the choice of GPAI provider becomes a vendor-due-diligence item for any tax tool's enterprise customers.

This is why the sub-processor disclosure matters: it is no longer enough to say "we use AI"; the question is which model, run by which provider, in which jurisdiction, with which contractual data-retention regime. Anthropic, OpenAI, Mistral, Aleph Alpha — these are different answers with different regulatory traces.
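
One way to make that disclosure checkable is to treat each AI sub-processor as a record with the four answers as fields. The Python sketch below is an assumption-laden illustration: the `AISubProcessor` type, its field names, the `disclosure_gaps` helper and the placeholder values are invented for this example and describe no real provider.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class AISubProcessor:
    """The answers a sub-processor disclosure should contain (hypothetical schema)."""
    provider: str
    model: str
    processing_jurisdiction: str
    retention_days: Optional[int]       # None means not yet disclosed
    used_for_training: Optional[bool]   # None means not yet disclosed


def disclosure_gaps(entry: AISubProcessor) -> list[str]:
    """Return the names of fields that are still undisclosed (None or empty)."""
    return [name for name, value in asdict(entry).items()
            if value is None or value == ""]


# Placeholder values only; not a statement about any actual vendor.
example = AISubProcessor(
    provider="ExampleAI",
    model="example-model-1",
    processing_jurisdiction="EU",
    retention_days=None,
    used_for_training=False,
)
```

Here `disclosure_gaps(example)` returns `["retention_days"]`, which is the kind of gap a procurement questionnaire is meant to surface.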

What enterprise buyers will ask

If you are a procurement function looking at AI-powered tax tools in late 2026 and 2027, the AI Act gives you a script. The questions to expect on the vendor questionnaire:

  1. Is your AI system listed under Annex III of the AI Act? (For an invoice-OCR tool the answer is no — but the vendor should be able to explain why no, not just say it.)
  2. Which third-party AI providers do you rely on, and what is their AI Act compliance status? (This pushes the vendor to disclose their GPAI sub-processors and confirm those providers are in good standing post-2 August 2025.)
  3. What is your technical documentation for the AI components? (Even non-high-risk vendors should have this — model, data flow, known limitations, error rates.)
  4. What is your data-governance policy for inputs to the AI system? (Inputs include the user's documents. The answer should include: where data is processed, retention period at the provider, whether the data is used to train future models.)
  5. What is your human-oversight design? (Are extractions automatically applied, or do they require a confirmation step?)
  6. Where is the AI system hosted, and where does the AI processing happen? (EU residency is not an AI Act requirement, but it helps: EU-hosted AI is structurally easier to keep compliant with both the AI Act and the GDPR than US-hosted AI.)
  7. What is your incident-reporting process for AI-related issues? (Serious incidents in high-risk systems must be reported under Article 73; for non-high-risk it is good practice, and enterprise buyers expect it.)

These are not trick questions. A vendor that has been thinking about AI governance since GDPR will have ready answers. A vendor that has been treating "AI" as a marketing-page feature for the last three years will struggle.

What it looks like from four desks

A Paris consultancy doing AI-vendor due diligence

The consultancy is selecting an invoice-management tool for itself and for ten of its small-business clients. The procurement lead opens the AI Act, scans Annex III, confirms invoice OCR is not on the list, and then asks the vendor for the technical documentation. The vendor that can hand over a clear write-up wins the contract. The vendor that says "we comply with the AI Act" without specifics gets eliminated.

A Berlin freelancer who reads policy news

The freelancer is not buying enterprise-scale software, but they care that their tax tool is operating cleanly under the new regime. The most useful thing for them is the transparency-receipt: which model reads my invoices, where does it run, what does it see, what gets logged, how do I correct it. The freelancer opens the Help article on what the AI reads and how to correct it, and gets exactly that.

An Amsterdam DPO at a 200-person consultancy

The DPO is updating the Record of Processing under Article 30 of the GDPR for 2026. Two of the firm's tools now have AI components. The DPO needs to be able to record: which AI providers are sub-processors, where they process data, what their AI Act status is, and what the human-oversight design is for each AI feature. A vendor that has published this information saves the DPO a back-and-forth; a vendor that hasn't costs the DPO a week of email.

A Madrid SMB owner reading the news cycle

The owner sees a finance-press headline about "the EU AI Act crackdown coming in August" and asks whether their tools are at risk. The honest answer is: the tools they use for OCR are not Annex III, so no, not directly. But the same question is also an invitation to ask the vendor for the transparency-receipt — because the right thing to want from your tax tool is the same thing the AI Act will increasingly require.

What TaxItEasy is doing

The transparency-receipt for the AI components in our pipeline lives in three places:

  • The Help center documents exactly what the AI reads and extracts, how the confidence scoring works, and what to do when the OCR extracts the wrong fields. Human oversight is in the product by design: every extraction is a suggestion the user can correct.
  • The sub-processor list (our-sub-processors) names every AI provider in our pipeline, where they run, and what the contractual data-retention regime is. As of the date of this article, our LLM provider's contractual default for retention is up to 30 days for safety monitoring; the zero-retention addendum we want is in negotiation. We have committed to updating the sub-processor page when that addendum is signed.
  • The data-residency page (where is my data stored) documents that the platform itself runs in DigitalOcean's Frankfurt region (FRA1), with managed PostgreSQL and DigitalOcean Spaces (S3-compatible object storage) all EU-resident. The AI processing step is a separate consideration — that data leaves the platform to the LLM provider — and is disclosed in the sub-processor list, not buried.

We are not making AI Act compliance claims that we cannot back up. Our honest position: as a SaaS provider whose AI features are not on Annex III, the August 2026 obligations that fall directly on us are mostly the transparency, documentation and human-oversight ones, and those are obligations we treated as design requirements before the AI Act forced the matter.
