From intake to graph: how we model brand DNA
URLs, PDFs, Figma, and social each say something different
Ingest without losing nuance
A brand is not a palette. It is not a font stack. It is not a tagline. It is a network of relationships: voice shifts by context, colors have roles, messages ladder from broad to specific, and rules constrain behavior differently across channels.
When you flatten that into a spreadsheet (hex codes here, font names there, tagline at the top), you lose the grammar. You keep the nouns but discard the verbs.
BrandMythos starts from a different premise: brand is a graph, not a document. Every source contributes a slice of how the brand shows up in the world, and the job is to model relationships, not just extract data points. The goal is to build something that machines can query: "What tone should I use here? What color? What message shape?"
Five sources, five perspectives
URLs - the public face
Your website is the most visible expression of your brand. We crawl it and extract:
- Visual patterns: which colors appear where, which typography is used for what, and which accents show up in which contexts
- Voice patterns: how headlines differ from body copy, how CTAs are phrased, sentiment and formality across different page types
- Structural patterns: navigation hierarchy, page templates, component reuse, information architecture
- Behavioral patterns: what links are emphasized, what words are used in button copy, how urgency is conveyed
A homepage hero section tells us something different from a support FAQ page. Both are brand. Both matter. By analyzing actual usage across your site, we infer how voice shifts from context to context.
PDFs - the canonical reference
Brand guidelines in PDF form contain the most explicit rules: "Use this color for CTAs." "Never use this word in marketing." "This is our messaging hierarchy." These rules are the source of truth for what the brand intends to be.
We parse the document structure:
- Extract numbered rules and constraints
- Identify voice guidelines with tone descriptors
- Find color roles and usage constraints
- Parse typography rules and size relationships
- Identify messaging hierarchies and key phrases
We then cross-reference PDF rules against what the website actually does. Discrepancies are flagged because often the guide says one thing and the site does another. This gives us confidence in both sources: the guide shows intent, the site shows reality.
Figma - the design system
Figma libraries contain design tokens in their purest form: named colors, type styles, spacing scales, component variants. We import these directly and map them to CSS custom properties and Tailwind config values.
The advantage of Figma as a source is precision and structure:
- Colors are already named (Primary, Secondary, Accent, Neutral)
- Typography has relationships (Heading XL, Heading LG, Body, Caption)
- Components are organized in variant matrices
- Spacing systems are regular and scalable
- A designer has already organized hierarchy
We preserve that structure rather than guessing from screenshots. We also extract constraints that live in Figma but not in the PDF: component dimensions, minimum touch targets, spacing between elements.
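As a rough sketch of that mapping step, here is how named Figma tokens could be rendered as CSS custom properties. The token names and values are illustrative, not pulled from a real Figma file, and a production importer would go through the Figma API rather than a hand-written dict:

```python
# Sketch: mapping Figma-style design tokens to CSS custom properties.
# Token names and values are illustrative, not from a real Figma file.

def to_css_var_name(token_name: str) -> str:
    """'Heading XL' -> '--heading-xl'"""
    return "--" + token_name.strip().lower().replace(" ", "-")

def tokens_to_css(tokens: dict[str, str]) -> str:
    """Render a flat token dict as a :root block of CSS custom properties."""
    lines = [f"  {to_css_var_name(name)}: {value};" for name, value in sorted(tokens.items())]
    return ":root {\n" + "\n".join(lines) + "\n}"

figma_tokens = {
    "Primary": "#9c4221",
    "Accent": "#d97706",
    "Heading XL": "2.5rem",
}

print(tokens_to_css(figma_tokens))
```

The same flat dict can feed a Tailwind `theme.extend` block, which is why preserving the designer's naming matters more than the export format itself.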
Drive folders - the institutional memory
Google Drive often contains brand assets that live nowhere else: logo variations, approved photography, presentation templates, internal style guides. We index these and extract:
- File names (tells us what assets exist and how they are categorized)
- Folder structure (tells us the hierarchy of importance)
- Sharing permissions (tells us what is public vs. internal vs. confidential)
- Metadata and descriptions (brand guidance that lives nowhere else)
These assets are often not in the official brand guide because they are too detailed. They represent institutional decisions: which photography style is approved, which logo layouts work, how the logo interacts with text.
Social feeds - the living voice
Your social media presence is where brand voice is most dynamic. The way you respond to comments, the tone of your captions, the style of your stories. All of this is brand data that no PDF captures.
We analyze recent posts across platforms (Twitter, LinkedIn, Instagram, TikTok, etc.) and extract:
- Tone patterns: formality level, emotion intensity, humor, urgency
- Language patterns: sentence length, word choice, use of jargon vs. plain language
- Engagement patterns: how you respond to comments, how you ask questions, how you handle criticism
- Visual patterns: color usage in graphics, typography choices, emoji usage
- Frequency and timing: how often you post, when you post, what days get engagement
This becomes part of the voice model because it shows how the brand actually communicates in real time, without the filter of formal guidelines.
The extraction pipeline
Converting five sources into a structured graph happens in four stages:
Stage 1: Ingest and parse
- Crawl the website (headings, typography, colors, voice)
- Parse the PDF (rules, constraints, hierarchies)
- Import Figma components and tokens
- Index Drive files and metadata
- Fetch and analyze social media posts
Stage 2: Extract entities
Every source contributes entities:
- Colors: hex value, name, usage contexts, constraints
- Typography: font name, weights, sizes, usage (heading vs. body), line height rules
- Components: name, variants, when to use, constraints
- Values: stated brand values and what they mean in practice
- Personas: who the brand talks to and how
- Voice contexts: support, marketing, sales, legal, social
- Rules: explicit constraints and guidelines
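To make the entity list above concrete, here is a minimal sketch of what two of these entity types might look like as data structures. The field names are assumptions for illustration, not BrandMythos's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Color:
    name: str
    hex: str
    usage: list[str] = field(default_factory=list)   # contexts where this color may appear
    avoid: list[str] = field(default_factory=list)   # contexts where it must not appear

@dataclass
class VoiceContext:
    name: str                                # e.g. "support", "marketing"
    tone: list[str] = field(default_factory=list)
    source: str = ""                         # which intake source asserted this

primary = Color("Primary", "#9c4221", usage=["CTAs"], avoid=["body text"])
support = VoiceContext("support", tone=["empathetic", "concise"], source="pdf")
```

Keeping `usage` and `avoid` as first-class fields, rather than prose in a PDF, is what makes the later validation and query stages possible.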
Stage 3: Infer relationships
Now we build the graph by connecting entities:
- Color Primary -> used in CTAs (from PDF)
- Color Primary -> used in hero sections (from website)
- Color Primary -> never on light backgrounds (from PDF)
- Voice context support -> tone: empathetic (from PDF)
- Voice context support -> tone: concise (from website analysis)
- Typography headline -> font: Instrument Serif (from Figma)
- Persona product manager -> receives marketing voice, not support voice
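The relationships above can be sketched as typed edges, each carrying its provenance. This is an assumption about representation (a simple in-memory triple store), not the actual storage layer:

```python
# Sketch: the graph as typed (subject, relation, object, source) edges.
Edge = tuple[str, str, str, str]

edges: list[Edge] = [
    ("color:primary", "used_in", "CTAs", "pdf"),
    ("color:primary", "used_in", "hero sections", "website"),
    ("color:primary", "avoid_on", "light backgrounds", "pdf"),
    ("voice:support", "has_tone", "empathetic", "pdf"),
    ("voice:support", "has_tone", "concise", "website"),
]

def query(subject: str, relation: str) -> list[str]:
    """All objects related to `subject` by `relation`, in insertion order."""
    return [obj for s, rel, obj, _src in edges if s == subject and rel == relation]

print(query("color:primary", "used_in"))   # ['CTAs', 'hero sections']
```

Keeping the source on every edge is what lets the next stage ask "does the PDF agree with the website?" instead of merging everything into one undifferentiated blob.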
Stage 4: Validate and cross-reference
Check for contradictions:
- Does the site use colors as described in the PDF?
- Do social posts match the stated voice?
- Are Figma components named consistently with the actual website?
- Does the imagery on the site match approved photography styles?
When we find discrepancies, we flag them. Sometimes the PDF is outdated. Sometimes the website has evolved beyond the guide. The goal is not to enforce the PDF blindly, but to understand the current state.
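The simplest version of this cross-referencing is a set difference. The hex values below are illustrative, not from a real intake:

```python
# Sketch: flag discrepancies between PDF-declared colors and colors seen on the site.
pdf_colors = {"#9c4221", "#d97706", "#1a1a1a"}    # declared in the brand guide
site_colors = {"#9c4221", "#d97706", "#2b6cb0"}   # extracted by the crawler (illustrative)

undocumented = site_colors - pdf_colors   # used on the site, absent from the guide
unused = pdf_colors - site_colors         # in the guide, never seen on the site

for hex_code in sorted(undocumented):
    print(f"FLAG: {hex_code} appears on the site but not in the brand guide")
for hex_code in sorted(unused):
    print(f"FLAG: {hex_code} is in the brand guide but never used on the site")
```

Either flag can be correct: an undocumented color may be drift, or it may be the guide lagging behind a legitimate redesign. That is why the flags are surfaced for review rather than auto-resolved.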
Beyond hex codes: modeling relationships
Extraction is not a color picker exercise. We model when a color is used, how voice shifts by channel, and which messages belong in which context.
The knowledge graph
Below is a simplified JSON-LD shape showing how we model brand entities and their relationships:
```json
{
  "@context": "https://schema.org",
  "@type": "Brand",
  "name": "Example Co",
  "slogan": "Clarity at scale",
  "knowsAbout": ["B2B SaaS", "design systems"],
  "hasBrandValue": [
    { "name": "Clarity", "definition": "No jargon, no fluff. Ideas stated plainly." },
    { "name": "Evidence", "definition": "Every claim backed by data or user research." }
  ],
  "hasOfferCatalog": {
    "@type": "OfferCatalog",
    "name": "Voice contexts",
    "itemListElement": [
      { "@type": "ListItem", "name": "support", "tone": ["Empathetic", "Concise", "Action-oriented"], "example": "We apologize for the issue. Here is how to fix it." },
      { "@type": "ListItem", "name": "marketing", "tone": ["Bold", "Evidence-led"], "example": "87% of teams ship faster with our tool." }
    ]
  },
  "hasBrandColor": [
    { "name": "Primary", "hex": "#9c4221", "usage": ["CTAs", "hero sections", "key links"], "avoid": ["body text", "light backgrounds"] },
    { "name": "Accent", "hex": "#d97706", "usage": ["secondary actions", "highlights"], "contrast": "AA on primary" }
  ],
  "typography": [
    { "name": "Headlines", "font": "Instrument Serif", "weights": ["regular"], "tracking": "0.02em" },
    { "name": "Body", "font": "DM Sans", "size": "16px", "lineHeight": "1.6" }
  ]
}
```
The graph is not flat. Colors have usage rules and constraints. Voice contexts have tone attributes and example messages. Values have definitions that explain what they mean in practice. Every node connects to others through typed relationships.
This is what machines need. They can query:
- "What tone should I use in a support email?" -> Look up VoiceContext(support) -> get tone array
- "Can I use primary color for body text?" -> Look up Color(primary) -> check avoid list
- "What values drive our messaging?" -> Look up BrandValue entities -> understand constraints
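These lookups can be sketched directly against the JSON-LD shape above. The helper names (`tone_for`, `color_allowed`) are hypothetical, and the graph here is trimmed to just the fields the queries touch:

```python
import json

# Sketch: answering two of the queries above against a trimmed version of the graph.
graph = json.loads("""{
  "hasBrandColor": [
    {"name": "Primary", "hex": "#9c4221", "avoid": ["body text", "light backgrounds"]}
  ],
  "hasOfferCatalog": {"itemListElement": [
    {"name": "support", "tone": ["Empathetic", "Concise", "Action-oriented"]}
  ]}
}""")

def tone_for(context: str) -> list[str]:
    """'What tone should I use in a support email?'"""
    for item in graph["hasOfferCatalog"]["itemListElement"]:
        if item["name"] == context:
            return item["tone"]
    return []

def color_allowed(color: str, surface: str) -> bool:
    """'Can I use primary color for body text?'"""
    for c in graph["hasBrandColor"]:
        if c["name"].lower() == color.lower():
            return surface not in c.get("avoid", [])
    return False  # unknown colors are never approved

print(tone_for("support"))                    # ['Empathetic', 'Concise', 'Action-oriented']
print(color_allowed("Primary", "body text"))  # False
```

The point is that every answer is a lookup, not an interpretation: the same question yields the same answer every time.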
For a deeper dive into building these graphs, see our guide to brand knowledge graphs for AI.
How agents use the graph at runtime
When an agent generates content (a support email, a landing page, a code comment), it:
- Loads the graph into memory or context
- Identifies the context (support email = support voice context)
- Queries the graph for rules that apply ("What tone should I use?")
- Applies the rules to generation ("Use empathetic, concise, action-oriented tone")
- Generates output that satisfies the constraints
- Validates output against the graph ("Does this email contain the tone attributes?")
This is deterministic. The agent is not guessing. It is applying explicit rules from a structured source.
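The loop above can be sketched end to end. Here `generate` is a stand-in for a real model call, and the validation is a deliberately crude keyword check; everything else (rule shapes, phrases) is illustrative:

```python
# Sketch of the runtime loop: identify context, query rules, generate, validate.
RULES = {
    "support": {
        "tone": ["empathetic", "concise", "action-oriented"],
        "must_include": ["here is how"],   # crude proxy for "action-oriented"
    }
}

def generate(context: str, rules: dict) -> str:
    # Placeholder for an LLM call conditioned on the queried rules.
    return "We're sorry about the issue. Here is how to fix it."

def validate(output: str, rules: dict) -> bool:
    # Check the output against the graph's constraints before it ships.
    return all(phrase in output.lower() for phrase in rules["must_include"])

rules = RULES["support"]            # steps 2-3: identify context, query the graph
draft = generate("support", rules)  # steps 4-5: apply rules, generate
assert validate(draft, rules)       # step 6: validate output against the graph
```

Real validation would be richer (tone classifiers, banned-word lists, contrast checks), but the shape is the same: generation is bracketed by graph queries on the way in and graph checks on the way out.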
Ship and diff
Every export is versioned like code. When the brand evolves (new color role, voice refinement, updated messaging hierarchy), you see exactly what changed in a pull request.
```diff
 ## Voice - Marketing context
- Bold claims backed by customer stories.
+ Bold claims backed by quantitative data. Customer stories in support copy only.
```
This is how governance works at scale. Not by sending updated PDFs to a distribution list, but by reviewing brand changes the same way you review code changes: with diffs, approvals, and an audit trail.
The output formats
From a single intake, BrandMythos generates seven output formats:
- CLAUDE.md - Universal agent instructions for Claude, ChatGPT, Copilot (narrative rules). This is for language models and tells them your voice, values, constraints, and how to behave.
- .cursorrules - Editor-specific rules for Cursor code completion. These are code-level constraints: naming conventions, component patterns, styling rules.
- AGENTS.md - Governance file defining what agents can and cannot do. This is the scope boundary: what decisions agents can make autonomously, what requires human review, what is forbidden.
- Design tokens - CSS custom properties and Tailwind config. Visual rules as executable code that developers can use directly.
- System prompts - For ChatGPT Custom Instructions and Gemini. Tool-specific instruction formats for other AI platforms.
- JSON-LD knowledge graph - Machine-queryable brand entities and relationships (the graph itself). This is what agents load to query brand rules at runtime.
- HTML brand guide - Shareable, hosted reference for humans. A beautiful, readable version of the brand guide for team members and stakeholders.
Each format serves a different tool, a different team, and a different purpose. Together they replace the PDF not with another document, but with loadable infrastructure.
The magic is that all seven formats are generated from a single extracted graph. When you update the brand, you regenerate all seven. Everything stays in sync automatically.
Why this matters now
The research is clear: 88% of companies use AI daily, and 78% of employees bring their own AI tools. Every one of those interactions is an opportunity for brand drift or brand reinforcement.
The companies winning are not the ones with the best PDFs. They are the ones whose brand is embedded in the tools doing the work. They have structured, queryable brand graphs that agents can load and apply at runtime.
Try BrandMythos with your brand. Enter your URL and see your brand DNA extracted in minutes.