From intake to graph: how we model brand DNA
URLs, PDFs, Figma, and social each say something different
Ingest without losing nuance
A brand is not a palette. It is not a font stack. It is not a tagline. It is a network of relationships: voice shifts by context, colors have roles, messages ladder from broad to specific, and rules constrain behavior differently across channels.
When you flatten that into a spreadsheet (hex codes here, font names there, tagline at the top), you lose the grammar. You keep the nouns but discard the verbs.
BrandMythos starts from a different premise: brand is a graph, not a document. Every source contributes a slice of how the brand shows up in the world, and the job is to model relationships, not just extract data points. The goal is to build something that machines can query: "What tone should I use here? What color? What message shape?"
Five sources, five perspectives
URLs - the public face
Your website is the most visible expression of your brand. We crawl it and extract:
- Visual patterns: which colors appear where, which typography is used for what, and which accents show up in which contexts
- Voice patterns: how headlines differ from body copy, how CTAs are phrased, sentiment and formality across different page types
- Structural patterns: navigation hierarchy, page templates, component reuse, information architecture
- Behavioral patterns: what links are emphasized, what words are used in button copy, how urgency is conveyed
A homepage hero section tells us something different from a support FAQ page. Both are brand. Both matter. By analyzing actual usage across your site, we infer how voice shifts from context to context.
PDFs - the canonical reference
Brand guidelines in PDF form contain the most explicit rules: "Use this color for CTAs." "Never use this word in marketing." "This is our messaging hierarchy." These rules are the source of truth for what the brand intends to be.
We parse the document structure:
- Extract numbered rules and constraints
- Identify voice guidelines with tone descriptors
- Find color roles and usage constraints
- Parse typography rules and size relationships
- Identify messaging hierarchies and key phrases
We then cross-reference PDF rules against what the website actually does. Discrepancies are flagged because often the guide says one thing and the site does another. This gives us confidence in both sources: the guide shows intent, the site shows reality.
Figma - the design system
Figma libraries contain design tokens in their purest form: named colors, type styles, spacing scales, component variants. We import these directly and map them to CSS custom properties and Tailwind config values.
The advantage of Figma as a source is precision and structure:
- Colors are already named (Primary, Secondary, Accent, Neutral)
- Typography has relationships (Heading XL, Heading LG, Body, Caption)
- Components are organized in variant matrices
- Spacing systems are regular and scalable
- A designer has already organized hierarchy
We preserve that structure rather than guessing from screenshots. We also extract constraints that live in Figma but not in the PDF: component dimensions, minimum touch targets, spacing between elements.
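As a rough sketch of that mapping step, here is how named Figma tokens could be rendered as CSS custom properties. The token names and values are illustrative, not pulled from a real Figma file, and a production importer would go through the Figma API rather than a hand-written dict:

```python
# Sketch: mapping Figma-style design tokens to CSS custom properties.
# Token names and values are illustrative, not from a real Figma file.

def to_css_var_name(token_name: str) -> str:
    """'Heading XL' -> '--heading-xl'"""
    return "--" + token_name.strip().lower().replace(" ", "-")

def tokens_to_css(tokens: dict[str, str]) -> str:
    """Render a flat token dict as a :root block of CSS custom properties."""
    lines = [f"  {to_css_var_name(name)}: {value};" for name, value in sorted(tokens.items())]
    return ":root {\n" + "\n".join(lines) + "\n}"

figma_tokens = {
    "Primary": "#9c4221",
    "Accent": "#d97706",
    "Heading XL": "2.5rem",
}

print(tokens_to_css(figma_tokens))
```

The same flat dict can feed a Tailwind `theme.extend` block, which is why preserving the designer's naming matters more than the export format itself.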
Drive folders - the institutional memory
Google Drive often contains brand assets that live nowhere else: logo variations, approved photography, presentation templates, internal style guides. We index these and extract:
- File names (tells us what assets exist and how they are categorized)
- Folder structure (tells us the hierarchy of importance)
- Sharing permissions (tells us what is public vs. internal vs. confidential)
- Metadata and descriptions (brand guidance that lives nowhere else)
These assets are often not in the official brand guide because they are too detailed. They represent institutional decisions: which photography style is approved, which logo layouts work, how the logo interacts with text.
Social feeds - the living voice
Your social media presence is where brand voice is most dynamic. The way you respond to comments, the tone of your captions, the style of your stories. All of this is brand data that no PDF captures.
We analyze recent posts across platforms (Twitter, LinkedIn, Instagram, TikTok, etc.) and extract:
- Tone patterns: formality level, emotion intensity, humor, urgency
- Language patterns: sentence length, word choice, use of jargon vs. plain language
- Engagement patterns: how you respond to comments, how you ask questions, how you handle criticism
- Visual patterns: color usage in graphics, typography choices, emoji usage
- Frequency and timing: how often you post, when you post, what days get engagement
This becomes part of the voice model because it shows how the brand actually communicates in real time, without the filter of formal guidelines.
The extraction pipeline
Converting five sources into a structured graph happens in four stages:
Stage 1: Ingest and parse
- Crawl the website (headings, typography, colors, voice)
- Parse the PDF (rules, constraints, hierarchies)
- Import Figma components and tokens
- Index Drive files and metadata
- Fetch and analyze social media posts
Stage 2: Extract entities
Every source contributes entities:
- Colors: hex value, name, usage contexts, constraints
- Typography: font name, weights, sizes, usage (heading vs. body), line height rules
- Components: name, variants, when to use, constraints
- Values: stated brand values and what they mean in practice
- Personas: who the brand talks to and how
- Voice contexts: support, marketing, sales, legal, social
- Rules: explicit constraints and guidelines
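To make the entity list above concrete, here is a minimal sketch of what two of these entity types might look like as data structures. The field names are assumptions for illustration, not BrandMythos's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Color:
    name: str
    hex: str
    usage: list[str] = field(default_factory=list)   # contexts where this color may appear
    avoid: list[str] = field(default_factory=list)   # contexts where it must not appear

@dataclass
class VoiceContext:
    name: str                                # e.g. "support", "marketing"
    tone: list[str] = field(default_factory=list)
    source: str = ""                         # which intake source asserted this

primary = Color("Primary", "#9c4221", usage=["CTAs"], avoid=["body text"])
support = VoiceContext("support", tone=["empathetic", "concise"], source="pdf")
```

Keeping `usage` and `avoid` as first-class fields, rather than prose in a PDF, is what makes the later validation and query stages possible.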
Stage 3: Infer relationships
Now we build the graph by connecting entities:
- Color Primary -> used in CTAs (from PDF)
- Color Primary -> used in hero sections (from website)
- Color Primary -> never on light backgrounds (from PDF)
- Voice context support -> tone: empathetic (from PDF)
- Voice context support -> tone: concise (from website analysis)
- Typography headline -> font: Instrument Serif (from Figma)
- Persona product manager -> receives marketing voice, not support voice
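The relationships above can be sketched as typed edges, each carrying its provenance. This is an assumption about representation (a simple in-memory triple store), not the actual storage layer:

```python
# Sketch: the graph as typed (subject, relation, object, source) edges.
Edge = tuple[str, str, str, str]

edges: list[Edge] = [
    ("color:primary", "used_in", "CTAs", "pdf"),
    ("color:primary", "used_in", "hero sections", "website"),
    ("color:primary", "avoid_on", "light backgrounds", "pdf"),
    ("voice:support", "has_tone", "empathetic", "pdf"),
    ("voice:support", "has_tone", "concise", "website"),
]

def query(subject: str, relation: str) -> list[str]:
    """All objects related to `subject` by `relation`, in insertion order."""
    return [obj for s, rel, obj, _src in edges if s == subject and rel == relation]

print(query("color:primary", "used_in"))   # ['CTAs', 'hero sections']
```

Keeping the source on every edge is what lets the next stage ask "does the PDF agree with the website?" instead of merging everything into one undifferentiated blob.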
Stage 4: Validate and cross-reference
Check for contradictions:
- Does the site use colors as described in the PDF?
- Do social posts match the stated voice?
- Are Figma components named consistently with the actual website?
- Does the imagery on the site match approved photography styles?
When we find discrepancies, we flag them. Sometimes the PDF is outdated. Sometimes the website has evolved beyond the guide. The goal is not to enforce the PDF blindly, but to understand the current state.
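The simplest version of this cross-referencing is a set difference. The hex values below are illustrative, not from a real intake:

```python
# Sketch: flag discrepancies between PDF-declared colors and colors seen on the site.
pdf_colors = {"#9c4221", "#d97706", "#1a1a1a"}    # declared in the brand guide
site_colors = {"#9c4221", "#d97706", "#2b6cb0"}   # extracted by the crawler (illustrative)

undocumented = site_colors - pdf_colors   # used on the site, absent from the guide
unused = pdf_colors - site_colors         # in the guide, never seen on the site

for hex_code in sorted(undocumented):
    print(f"FLAG: {hex_code} appears on the site but not in the brand guide")
for hex_code in sorted(unused):
    print(f"FLAG: {hex_code} is in the brand guide but never used on the site")
```

Either flag can be correct: an undocumented color may be drift, or it may be the guide lagging behind a legitimate redesign. That is why the flags are surfaced for review rather than auto-resolved.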
Beyond hex codes: modeling relationships
Extraction is not a color picker exercise. We model when a color is used, how voice shifts by channel, and which messages belong in which context.
The knowledge graph
Below is a simplified JSON-LD shape showing how we model brand entities and their relationships:
```json
{
  "@context": "https://schema.org",
  "@type": "Brand",
  "name": "Example Co",
  "slogan": "Clarity at scale",
  "knowsAbout": ["B2B SaaS", "design systems"],
  "hasBrandValue": [
    { "name": "Clarity", "definition": "No jargon, no fluff. Ideas stated plainly." },
    { "name": "Evidence", "definition": "Every claim backed by data or user research." }
  ],
  "hasOfferCatalog": {
    "@type": "OfferCatalog",
    "name": "Voice contexts",
    "itemListElement": [
      { "@type": "ListItem", "name": "support", "tone": ["Empathetic", "Concise", "Action-oriented"], "example": "We apologize for the issue. Here is how to fix it." },
      { "@type": "ListItem", "name": "marketing", "tone": ["Bold", "Evidence-led"], "example": "87% of teams ship faster with our tool." }
    ]
  },
  "hasBrandColor": [
    { "name": "Primary", "hex": "#9c4221", "usage": ["CTAs", "hero sections", "key links"], "avoid": ["body text", "light backgrounds"] },
    { "name": "Accent", "hex": "#d97706", "usage": ["secondary actions", "highlights"], "contrast": "AA on primary" }
  ],
  "typography": [
    { "name": "Headlines", "font": "Instrument Serif", "weights": ["regular"], "tracking": "0.02em" },
    { "name": "Body", "font": "DM Sans", "size": "16px", "lineHeight": "1.6" }
  ]
}
```
The graph is not flat. Colors have usage rules and constraints. Voice contexts have tone attributes and example messages. Values have definitions that explain what they mean in practice. Every node connects to others through typed relationships.
This is what machines need. They can query:
- "What tone should I use in a support email?" -> Look up VoiceContext(support) -> get tone array
- "Can I use primary color for body text?" -> Look up Color(primary) -> check avoid list
- "What values drive our messaging?" -> Look up BrandValue entities -> understand constraints
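These lookups can be sketched directly against the JSON-LD shape above. The helper names (`tone_for`, `color_allowed`) are hypothetical, and the graph here is trimmed to just the fields the queries touch:

```python
import json

# Sketch: answering two of the queries above against a trimmed version of the graph.
graph = json.loads("""{
  "hasBrandColor": [
    {"name": "Primary", "hex": "#9c4221", "avoid": ["body text", "light backgrounds"]}
  ],
  "hasOfferCatalog": {"itemListElement": [
    {"name": "support", "tone": ["Empathetic", "Concise", "Action-oriented"]}
  ]}
}""")

def tone_for(context: str) -> list[str]:
    """'What tone should I use in a support email?'"""
    for item in graph["hasOfferCatalog"]["itemListElement"]:
        if item["name"] == context:
            return item["tone"]
    return []

def color_allowed(color: str, surface: str) -> bool:
    """'Can I use primary color for body text?'"""
    for c in graph["hasBrandColor"]:
        if c["name"].lower() == color.lower():
            return surface not in c.get("avoid", [])
    return False  # unknown colors are never approved

print(tone_for("support"))                    # ['Empathetic', 'Concise', 'Action-oriented']
print(color_allowed("Primary", "body text"))  # False
```

The point is that every answer is a lookup, not an interpretation: the same question yields the same answer every time.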
For a deeper dive into building these graphs, see our guide to brand knowledge graphs for AI.
How agents use the graph at runtime
When an agent generates content (a support email, a landing page, a code comment), it:
- Loads the graph into memory or context
- Identifies the context (support email = support voice context)
- Queries the graph for rules that apply ("What tone should I use?")
- Applies the rules to generation ("Use empathetic, concise, action-oriented tone")
- Generates output that satisfies the constraints
- Validates output against the graph ("Does this email contain the tone attributes?")
This is deterministic. The agent is not guessing. It is applying explicit rules from a structured source.
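The loop above can be sketched end to end. Here `generate` is a stand-in for a real model call, and the validation is a deliberately crude keyword check; everything else (rule shapes, phrases) is illustrative:

```python
# Sketch of the runtime loop: identify context, query rules, generate, validate.
RULES = {
    "support": {
        "tone": ["empathetic", "concise", "action-oriented"],
        "must_include": ["here is how"],   # crude proxy for "action-oriented"
    }
}

def generate(context: str, rules: dict) -> str:
    # Placeholder for an LLM call conditioned on the queried rules.
    return "We're sorry about the issue. Here is how to fix it."

def validate(output: str, rules: dict) -> bool:
    # Check the output against the graph's constraints before it ships.
    return all(phrase in output.lower() for phrase in rules["must_include"])

rules = RULES["support"]            # steps 2-3: identify context, query the graph
draft = generate("support", rules)  # steps 4-5: apply rules, generate
assert validate(draft, rules)       # step 6: validate output against the graph
```

Real validation would be richer (tone classifiers, banned-word lists, contrast checks), but the shape is the same: generation is bracketed by graph queries on the way in and graph checks on the way out.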
Ship and diff
Every export is versioned like code. When the brand evolves (new color role, voice refinement, updated messaging hierarchy), you see exactly what changed in a pull request.
```diff
 ## Voice - Marketing context
- Bold claims backed by customer stories.
+ Bold claims backed by quantitative data. Customer stories in support copy only.
```
This is how governance works at scale. Not by sending updated PDFs to a distribution list, but by reviewing brand changes the same way you review code changes: with diffs, approvals, and an audit trail.
The output formats
From a single intake, BrandMythos generates seven output formats:
- CLAUDE.md - Universal agent instructions for Claude, ChatGPT, Copilot (narrative rules). This is for language models and tells them your voice, values, constraints, and how to behave.
- .cursorrules - Editor-specific rules for Cursor code completion. These are code-level constraints: naming conventions, component patterns, styling rules.
- AGENTS.md - Governance file defining what agents can and cannot do. This is the scope boundary: what decisions agents can make autonomously, what requires human review, what is forbidden.
- Design tokens - CSS custom properties and Tailwind config. Visual rules as executable code that developers can use directly.
- System prompts - For ChatGPT Custom Instructions and Gemini. Tool-specific instruction formats for other AI platforms.
- JSON-LD knowledge graph - Machine-queryable brand entities and relationships (the graph itself). This is what agents load to query brand rules at runtime.
- HTML brand guide - Shareable, hosted reference for humans. A beautiful, readable version of the brand guide for team members and stakeholders.
Each format serves a different tool, a different team, and a different purpose. Together they replace the PDF not with another document, but with loadable infrastructure.
The magic is that all seven formats are generated from a single extracted graph. When you update the brand, you regenerate all seven. Everything stays in sync automatically.
Why this matters now
The research is clear: 88% of companies use AI daily, and 78% of employees bring their own AI tools. Every one of those interactions is an opportunity for brand drift or brand reinforcement.
The companies winning are not the ones with the best PDFs. They are the ones whose brand is embedded in the tools doing the work. They have structured, queryable brand graphs that agents can load and apply at runtime.
Try BrandMythos with your brand. Enter your URL and see your brand DNA extracted in minutes.