Sécurité Helvétique News | Amyris
© 2023 Sécurité Helvétique NEWS by Amyris Sarl. All rights reserved.
Compliance

Where in the Loop? Testing AI Across 120 Compliance Tasks to Find Out Where Humans Are Most Needed

webmaster
Last updated: 2025/11/13 at 3:33 PM
12 Min Read

As compliance teams experiment with AI for everything from risk assessments to policy interpretation, a practical question emerges: Which tasks can be automated reliably, and which still require human judgment? Steph Holmes, director of compliance and ethics strategy at EQS Group, dives into her organization’s research on six AI models, which found that AI excels at rule-driven work but struggles in the gray zone where data meets intent and culture intersects with language. The findings suggest oversight should be strategic, not universal, but there’s no doubt the loop isn’t complete without humans.

When people talk about responsible AI, the concept of “human in the loop” tends to surface quickly, and for good reason. It sounds reassuring, almost self-evident. But as compliance teams start experimenting with AI, a practical question emerges: Where in the loop should humans actually be?

Contents
  • The ‘messy middle’ of compliance — where AI still struggles (and excels)
  • Redefining what ‘good enough’ means in compliance
  • From gatekeepers to collaborators
  • The new skillset of compliance: understanding the loop
  • Taking the lead on ethical AI

To move beyond slogans, we needed evidence. To that end, EQS Group tested six AI models across 120 real-world compliance tasks. The goal was to understand how these systems perform in the same scenarios compliance professionals face every day — from risk assessments and policy interpretation to drafting complex executive briefings.

The benchmark findings reveal the models have made remarkable progress but still have their limits. For example: On clear, rule-based work, such as classifying policies or ranking risks, AI models routinely achieved accuracy above 90%. But in more nuanced, judgment-based scenarios — interpreting complex data or assessing proportionality, for example — the results diverged dramatically. In one category, data analysis, the top-performing model scored 88%, while the weakest reached only 28%.

For compliance professionals, this isn’t just a statistic. It means two systems, given the same disclosure or risk scenario, could reach entirely different conclusions. The lesson to take away, then, is not that AI in general is unreliable but that reliability depends on the right context and tools. 

The research provides a rare, data-driven map of where the technology already supports sound decisions — and where human insight remains the deciding factor.

The ‘messy middle’ of compliance — where AI still struggles (and excels)

Much of compliance work happens in what might be called the “messy middle” — the gray zone between clear rules and ethical nuance. It’s where data meets intent and where culture, language and judgment intersect. It’s the difference between a mistake in an expense report and a deliberate scheme, or between an unclear policy and a genuine compliance gap.

Here, the results draw a clear boundary. The latest AI models like Gemini 2.5 Pro and GPT-5 excel at structured, rule-driven tasks — matching data sets, ranking risks, extracting information — with remarkable precision. But when ambiguity rises, so does divergence. Across the 120 tasks tested, the more open-ended the problem, the wider the spread between best and worst AI model performance.

This pattern doesn’t diminish the technology’s general value; AI already performs reliably where the parameters are explicit. The challenge is that compliance rarely deals in perfect clarity. Policies evolve, regulation interpretations vary, and ethical questions require more than binary answers. That is precisely where human interpretation — context, empathy, proportionality — remains irreplaceable. The report makes this visible: AI can lighten the workload, but only people can decide what the results mean.


Redefining what ‘good enough’ means in compliance

While our research highlights clear boundaries to what AI can currently do, it also shows just how far the technology has advanced. The strongest models — Gemini 2.5 Pro and GPT-5 — delivered overall accuracy above 85%, a level likely not possible just a year ago. Equally noteworthy is their reliability: Across structured output and open-ended tasks, only three clear hallucinations were identified, which translates to a rate of just 0.71%.

However, the data also highlights a key distinction often overlooked in the rush toward generative tools: Broad chatbots may feel accessible, but in compliance, accessibility without accuracy creates risk. Domain-specific AI — grounded in structure, governance and context — provides the reliability and accountability that true compliance work demands.

In compliance, reliability has a very specific meaning: not perfection but predictability within clearly defined risk thresholds. A 10-point margin of error for accuracy might be tolerable when reviewing employee sentiment analysis; it is unacceptable when classifying a whistleblower retaliation case. The difference lies not in the technology itself but in where it is applied and how results are reviewed.

This is also where human judgment remains indispensable. AI thrives when the question and expected output are clearly defined but still struggles when meaning must be inferred. Oversight, therefore, should not be universal — checking every output — but strategic. And this is crucial: Oversight must sit at the decision points where errors carry the highest ethical or legal weight. In practice, that means establishing checkpoints where judgment and proportionality matter most — final case closures, sanctions-screening exceptions or high-value third-party approvals. AI can handle the repetition, while humans must handle the risk, including knowing when AI itself introduces new risk.
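The routing logic behind such strategic oversight can be sketched in a few lines. This is a minimal, hypothetical illustration, not part of the EQS Group study: the task names, confidence threshold and function are all assumptions made for the example.

```python
# Hypothetical sketch of "strategic oversight": AI output is auto-accepted
# only for routine tasks above a confidence threshold; decision points with
# the highest ethical or legal weight always go to a human checkpoint.
# Task names and the 0.85 threshold are illustrative, not from the study.

HIGH_STAKES_TASKS = {
    "final_case_closure",
    "sanctions_screening_exception",
    "high_value_third_party_approval",
}

def route_output(task: str, ai_confidence: float, threshold: float = 0.85) -> str:
    """Decide whether an AI result can be auto-accepted or needs a human."""
    if task in HIGH_STAKES_TASKS:
        return "human_review"   # always a checkpoint, regardless of confidence
    if ai_confidence < threshold:
        return "human_review"   # low-confidence output escalates to a person
    return "auto_accept"        # repetitive, rule-based work stays automated

print(route_output("policy_classification", 0.93))   # auto_accept
print(route_output("final_case_closure", 0.99))      # human_review
print(route_output("policy_classification", 0.60))   # human_review
```

The point of the sketch is that oversight is a property of the decision point, not of the model: even a highly confident output on a high-stakes task still routes to a human.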

From gatekeepers to collaborators

The idea of compliance teams working collaboratively with AI rather than supervising it from above may sound futuristic to some of the more skeptical or risk-averse professionals, but the data hints that this future is closer than we might think. Several of the tested workflows — such as a multi-step conflict-of-interest review process — already chain together tasks that models can complete reliably: categorizing a disclosure, routing it for review, even assessing the potential risk exposure.

Yet the same sequence also shows us the limits. When the task required identification of follow-up information needed to assess the risk more fully or recommending corrective action, performance dropped, and human validation remained essential. For now, that’s the right balance: AI can prepare a decision, not make it.
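The division of labor in that sequence can be made concrete with a small sketch. This is an illustrative mock-up only: the `Disclosure` class, the stand-in model steps and the keyword heuristic are invented for the example and do not reflect any real workflow tool.

```python
# Illustrative sketch of the conflict-of-interest sequence described above:
# the model chains the steps the benchmark found it handles reliably
# (categorizing, assessing risk), then hands the final decision to a person.
# Function bodies are stand-ins, not a real API.

from dataclasses import dataclass

@dataclass
class Disclosure:
    text: str
    category: str = ""
    risk_score: float = 0.0
    status: str = "open"

def ai_categorize(d: Disclosure) -> Disclosure:
    # stand-in for a reliable, rule-driven model step
    d.category = "vendor_relationship" if "vendor" in d.text.lower() else "other"
    return d

def ai_assess_risk(d: Disclosure) -> Disclosure:
    # stand-in for the model's risk-exposure estimate
    d.risk_score = 0.8 if d.category == "vendor_relationship" else 0.2
    return d

def human_decide(d: Disclosure, reviewer_approves: bool) -> Disclosure:
    # the step the benchmark found unreliable stays with a person:
    # AI prepares the decision, a human makes it
    d.status = "closed_approved" if reviewer_approves else "closed_rejected"
    return d

d = ai_assess_risk(ai_categorize(Disclosure("Spouse owns a vendor")))
d = human_decide(d, reviewer_approves=False)
print(d.category, d.risk_score, d.status)
```

The design choice mirrors the article's conclusion: the automated steps enrich the case file, but the status field only changes inside `human_decide`.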

This gives us a clear trajectory. As model reasoning improves, oversight will need to evolve from gatekeeping to collaboration: humans shaping, training and auditing systems rather than merely reviewing the outputs. That shift won’t happen overnight, especially since most compliance teams are still in the experimentation stage with AI. But if the past year’s pace of model improvement is any indication — 17 percentage-point gains between older and the latest AI models within just nine months — then the profession must prepare now for a role that looks less like supervision and more like partnership.

The new skillset of compliance: understanding the loop

To stay credible, compliance professionals must evolve alongside the technology they increasingly use, manage and for which they are accountable. If this research makes one thing clear, it is that success with AI depends less on technical mastery and more on the ability to frame context and questions well. The strongest models performed best when the tasks and prompts were clear, structured and contextual — when human input gave them direction. In practice, that means compliance professionals must become translators between regulations, policies and AI usage within compliance programs.

The skills this requires are not exotic, but they are increasingly non-negotiable. Compliance professionals will need to understand where AI adds value, where it introduces bias or risk and how to test its reasoning against ethical and legal standards. Eventually, they’ll need to collaborate with data scientists as comfortably as with auditors and learn to read AI outputs with the same critical eye once reserved for human outputs.

Mindset matters just as much as method. Instead of viewing AI as a black box to supervise, compliance teams should see it as a system to design and refine. The most consistent performers — those achieving over 85% accuracy — succeeded when given tasks that mirrored structured workflows: defined, contextual and bounded by clear goals. In the same way, the human side of compliance should evolve more toward managing systems of judgment, not just enforcing rules.

Keeping humans in the loop, in other words, will soon mean more than approving an outcome. It will mean understanding the process that produced it as well as being able to explain it.

Taking the lead on ethical AI

As AI becomes more embedded in compliance operations — from triaging cases to analyzing disclosure patterns — accountability cannot be delegated to algorithms. It must be designed, documented and owned.

Regulatory frameworks are beginning to reflect this reality. The EU AI Act and updated guidance such as the DOJ’s 2024 update both place responsibility squarely on organizations to ensure that AI systems are explainable, auditable and aligned with governance standards. Yet compliance leaders should not wait to be directed. They must lead by defining guardrails, demanding transparency and embedding explainability into every workflow that relies on AI.

Our research provides both a warning and a roadmap. The technology is already capable of extraordinary precision when used correctly, but its variability in judgment-based tasks shows why human oversight remains irreplaceable. AI can automate the what; only humans can define the why and the how.

As compliance enters this next phase of digital transformation, the task is not to defend the human role, but to strengthen it and to ensure that integrity, accountability and ethical reasoning remain at the core of every system we build. The loop isn’t complete without us.

The post Where in the Loop? Testing AI Across 120 Compliance Tasks to Find Out Where Humans Are Most Needed appeared first on Corporate Compliance Insights.
