Case Study

How one AI team member keeps 20,000 web pages on brand

Tone Tara checks and rewrites content in our brand voice. Used thousands of times by 80+ editors. Built in months, not minutes.

20,000
pages scored
80+
editors using it
879
words in ban list
5
iterations

Where it started

Tone Tara started because one editor at HAN University was manually checking brand voice across 20,000+ web pages, a task that was not sustainable.

Five years ago, our university website (20,000+ pages) had to be migrated to a new brand style. It happened fast. Too fast to check whether all content actually matched the new tone of voice.

The result: hundreds of legacy pages that were never reviewed. One editor was responsible for the brand voice across the entire site. She had to search for pages manually and read each one to check compliance. That is not a job. That is a sentence.

I was curious: could we score all 20,000 pages automatically?

Phase 1: the scoring sheet

The first version scored all 20,000 HAN web pages on brand voice compliance in a single Google Sheet run, turning months of manual reading into a priority list.

I built a prompt that analysed web pages against our tone of voice guidelines and scored them from 0 (poor) to 10 (perfect) in a Google Sheet.

The entire page inventory, scored in one run. What used to be months of manual reading became a priority list. The editor finally knew where to start.
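For readers who want a concrete picture, the scoring run can be sketched roughly like this. Note the assumptions: `score_fn` stands in for the real tone-of-voice prompt (which ran inside a Google Sheet and is not reproduced here), and a CSV file stands in for the sheet itself.

```python
import csv
from typing import Callable


def score_pages(urls: list[str], score_fn: Callable[[str], float]) -> list[tuple[str, float]]:
    """Score every page on tone of voice (0-10) and return a worst-first priority list."""
    scored = [(url, score_fn(url)) for url in urls]
    scored.sort(key=lambda pair: pair[1])  # lowest score first: worst pages get fixed first
    return scored


def write_priority_sheet(rows: list[tuple[str, float]], path: str) -> None:
    """Write the ranked pages to a CSV file (the real project wrote into a Google Sheet)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["page", "tone_score"])
        writer.writerows(rows)
```

The design point is the sort: the output is not just scores, but an ordered to-do list, which is what turned months of reading into a starting point.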

That solved the diagnostic problem. But it did not help the other 80 editors who write new content every day.

Phase 2: from score sheet to chatbot

Phase 2 turned the scoring prompt into a chatbot that any of the 80+ editors at HAN University could use to check and rewrite their own content in the correct brand voice.

If the scoring prompt worked so well, why not turn it into something every editor could use? A chatbot where they paste their text and get it back in the right tone of voice.

The responsible editor and I started building together. I handled the AI architecture, she provided the data and quality judgment. That split mattered. She knew what good brand voice looked like. I knew how to teach it to the AI.

The iteration that made the difference

Five iterations over several months transformed Tara from mediocre to production-ready, and every improvement came from feeding the AI better knowledge, not changing the technology.

The first version was functional but not good enough. Each version got noticeably better. Not because we changed the technology, but because we gave the AI better knowledge.

Version 1

Uploaded the tone of voice guidelines as a PDF. The AI could read them but produced mediocre results. It followed the rules loosely, like a new hire who read the manual once.

Version 2

Converted the PDF to structured markdown. Better parsing, more precise output. The AI understood the guidelines more accurately.

Version 3

Added example texts that demonstrated good brand voice. Now the AI had a reference point: not just rules, but real examples of what right looks like.

Version 4

Added bad examples with the editor's rewrites. This was the biggest jump in quality. The AI could now see the gap between wrong and right, and how to close it.

Version 5

Added a list of words we do not use, paired with the words we prefer instead. This gave the AI a concrete vocabulary filter on top of the style rules.
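As a rough illustration of how a banned-word list acts as a vocabulary filter on top of style rules, a check like the one below flags outdated words and names the preferred alternative. The word pairs here are invented examples, not entries from HAN's actual 879-word list.

```python
import re

# Invented example pairs; the real list maps 879 outdated words
# to the preferred alternatives.
BANNED = {
    "utilise": "use",
    "commence": "start",
    "in order to": "to",
}


def flag_banned_words(text: str, banned: dict[str, str]) -> list[tuple[str, str]]:
    """Return (banned word, preferred alternative) pairs found in the text."""
    hits = []
    for word, preferred in banned.items():
        if re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE):
            hits.append((word, preferred))
    return hits
```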

The lesson: every version improvement came from better data. The AI platform stayed the same. The knowledge got sharper. That is always the pattern.

Going live early, improving continuously

Tara went live before she was perfect, and real usage from 80+ editors produced better feedback than any internal testing round.

We did not wait until Tara was perfect. We went live relatively early and improved based on real usage.

Active feedback collection. We asked editors directly: where does Tara get it wrong? What suggestions feel off? Every piece of feedback became a concrete improvement.

Training integration. We brought Tara into the existing tone of voice training sessions. Editors saw her in action, tried her during the training, and gave immediate input on what worked and what did not. This created a feedback loop we could not have manufactured otherwise.

What Tara looks like today

Tara analyses text against four brand voice dimensions, scores each dimension, gives specific improvement suggestions, and rewrites text in the correct tone. She knows 12 audience types, each with its own accent on the brand voice. She has a list of 879 outdated words with modern alternatives.
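One way to picture what Tara returns per text is a small report structure like the sketch below. The dimension names are placeholders, since the case study does not name the four actual brand voice dimensions.

```python
from dataclasses import dataclass


@dataclass
class ToneReport:
    scores: dict[str, int]  # one 0-10 score per brand voice dimension
    suggestions: list[str]  # specific improvement suggestions
    rewrite: str            # full rewrite in the correct tone

    def overall(self) -> float:
        """Average the per-dimension scores into a single headline score."""
        return sum(self.scores.values()) / len(self.scores)
```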

She has been used thousands of times by over 80 editors. The editor who was once responsible for manually checking every page now manages Tara instead of managing pages.

The impact

Tone Tara reduced brand voice review time from 15-20 minutes to 2-3 minutes per text and scaled coverage from 1 editor to 80+ editors across HAN University.

Metric | Before Tara | After Tara
Brand voice checks | 1 editor, manual, page by page | 80+ editors, instant, self-service
Time per text review | 15-20 minutes | 2-3 minutes
Coverage | Reactive, one person's capacity | Built into every editor's workflow
Knowledge files | 1 PDF, loosely interpreted | 7 structured files, continuously updated
Feedback cycle | Editor corrects after the fact | AI corrects during writing
Word list | Not enforced | 879 words with preferred alternatives
Usage | N/A | Thousands of uses by 80+ editors

Why this case matters

The Tone Tara case demonstrates four principles that apply to any AI team member built with the BUILD framework: start with a real problem, build with the domain expert, iterate on knowledge not technology, and go live early.

Start with a real problem. Not "let's try AI" but "one person is manually checking 20,000 pages and that is not sustainable."

Build with the domain expert. The responsible editor was not a user of Tara. She was a co-builder. Her knowledge of what good brand voice looks like was the most important input.

Iterate on knowledge, not technology. Every version improvement came from better data: structured guidelines, good examples, bad examples with corrections, word lists. The AI platform stayed the same. The knowledge got sharper.

Go live early, collect feedback. Perfection before launch is a myth. Real usage produces real feedback. Training sessions produced the best feedback of all.

Give it time. Tara was not built in an hour. She was built in months. Some AI team members are quick wins. Others need sustained investment. Both are valid.

The BUILD framework in practice

Tara follows the same BUILD framework as Social Media Maik, but the timescale is different:

Begin with goal: Score 20,000 pages on brand voice compliance, then help 80+ editors write in the correct tone.

Unpack skills: How does the responsible editor actually evaluate brand voice? What does she look for? What mistakes does she correct most often?

Identify knowledge: Tone of voice guidelines, audience descriptions, example texts (good and bad), word lists, spelling rules. Seven knowledge files in total.

Layout instructions: Structured analysis workflow with scoring per dimension, concrete suggestions, and full rewrites.

Debug and improve: Five iterations over several months, driven by editor feedback and training session input.

The difference is that Maik's Debug step took a week. Tara's took months. The principle is the same. The investment scales with the complexity of the task.

How do I build something like this for my team?

Start with the BUILD framework. The five steps are the same whether you are building a quick social media assistant or a complex brand voice checker. What changes is the depth of each step.

For a project like Tara, plan for iteration. Your first version will be mediocre. That is normal. The quality comes from feeding the AI better knowledge over time, ideally together with the person who knows the domain best.

The full BUILD framework guide is available at guuswitjes.com/build-framework.

Frequently asked questions

How long does it take to build an AI brand voice checker?

Tone Tara took several months to build through five iterations. Each version improved because the knowledge behind it got sharper: from a PDF upload to structured guidelines, example texts, bad-to-good rewrites, and an 879-word ban list. Simple AI team members can be built in a day, but complex ones like Tara need sustained investment.

Can AI really check tone of voice accurately?

Yes, when given the right data. A generic AI produces generic results. But with structured brand voice guidelines, good and bad examples with corrections, audience profiles, and a concrete word list, an AI team member can check tone of voice consistently across thousands of pages. The quality depends on the knowledge you feed it, not the tool itself.

What is the BUILD framework for AI team members?

BUILD is a five-step framework for creating AI team members: Begin with your goal, Unpack the skills needed, Identify the knowledge required, Layout the instructions, and Debug and improve. It works for both quick wins (like a social media assistant) and complex projects (like a brand voice checker).

How many people can use one AI team member?

Tone Tara is used by over 80 editors across HAN University. An AI team member can scale to any number of users because it follows the same instructions consistently every time. The knowledge base is built once and serves everyone.

What made the biggest difference in AI output quality?

Adding bad examples with the editor's corrections was the single biggest quality jump. The AI could then see the gap between wrong and right, and how to close it. Every version improvement came from better data, not a change in technology.

Want to build your AI team member?

Start with the BUILD framework, or book a hands-on training where you build your first AI team member in a single session.

Book a training
Read the Maik case study