How to Set Up a Sparx EA Repository from Scratch: Step-by-Step Guide

Published: 2026-04-18 | Category: How To | Offering relevance: Deploy

Direct Answer

Setting up a Sparx EA repository from scratch involves seven sequential steps: choosing your database platform, installing Pro Cloud Server, creating the initial repository, configuring MDG extensions, setting up user security, establishing naming conventions and package structure, and running a domain pilot before full rollout. Each step affects long-term repository health, because decisions made at setup time are difficult and disruptive to reverse later. The most consequential decisions are naming conventions and package structure, both made in Step 6 and validated by the pilot in Step 7. Get these right upfront. This guide walks through each step with the decision criteria you need. Sparx Services’ Deploy engagement covers all seven steps with expert governance design, for teams that do not want to navigate these decisions alone.

Key Takeaways

Repository setup has seven steps — and the early steps constrain the later ones.
Database choice (SQL Server/MySQL/PostgreSQL) depends on your existing infrastructure, not just EA requirements.
Pro Cloud Server (PCS) is required for team use — it is not optional.
MDG configuration determines what element types your team can model — activate ArchiMate first, add others as needed.
Package structure and naming conventions are the most consequential governance decisions at setup time.
Run a domain pilot (one architecture domain) before rolling out to the full team.
Sparx Services’ Deploy engagement covers all seven steps, including governance design decisions.

Step 1: Choose Your Database

The Sparx EA repository is a relational database. For team use (multi-user, server-hosted), you choose from:

SQL Server (Microsoft):

Best for: organisations with existing SQL Server infrastructure (Windows environments, Microsoft-stack organisations)
Advantages: mature enterprise support, strong tooling, Azure SQL compatibility for cloud hosting, Windows Authentication integration
Considerations: SQL Server licenses add cost if not already in the environment; requires DBA familiarity

MySQL:

Best for: Linux-hosted environments, organisations with existing MySQL capability, lower-cost setups
Advantages: open source, widely deployed, lower cost, good performance for EA repository sizes
Considerations: less native Windows integration than SQL Server; requires MySQL administration knowledge

PostgreSQL:

Best for: organisations already using PostgreSQL, cloud-native environments (AWS RDS PostgreSQL, Azure Database for PostgreSQL)
Advantages: open source, strong JSON/JSONB support, excellent performance, widely supported by cloud providers
Considerations: less common in traditional enterprise Windows environments

Decision criterion: Match your existing DBA capabilities and infrastructure. The Sparx EA repository is a standard relational database — it does not require special database features. If your team already manages SQL Server, use SQL Server. If you are building on Linux or AWS and your team knows PostgreSQL, use PostgreSQL. Do not introduce a new database platform for the EA repository alone.
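The decision criterion above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `choose_database`, the preference order when a team runs more than one platform, and the fallback value are all assumptions, not part of any Sparx tooling.

```python
# Platforms named in this guide. The order is an illustrative assumption
# used only to break ties when a team already runs more than one.
SUPPORTED = ("SQL Server", "MySQL", "PostgreSQL")


def choose_database(team_platforms: set[str], fallback: str = "PostgreSQL") -> str:
    """Return the first supported platform the team already manages.

    The fallback (hypothetical) applies only when the team manages none
    of the supported platforms and a new one must be introduced anyway.
    """
    for platform in SUPPORTED:
        if platform in team_platforms:
            return platform
    return fallback
```

For example, a team that already manages PostgreSQL and Oracle would get `"PostgreSQL"` back, which matches the rule of not introducing a new platform for the EA repository alone.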

Cloud vs on-premises: Cloud-hosted databases (Azure SQL, Amazon RDS, Google Cloud SQL) are recommended for new installations because they provide managed backup, high availability, and scaling without DBA overhead. On-premises deployment is appropriate where data sovereignty or air-gap requirements mandate it.

Step 2: Install Pro Cloud Server

Pro Cloud Server (PCS) is Sparx Systems’ middleware layer that enables multi-user access to the Sparx EA repository. Without PCS, Sparx EA can only access a repository from a local file or direct database connection — neither is appropriate for team use.

What PCS provides:

Managed database connections — architects connect to the repository through PCS, not directly to the database
Floating license management — Sparx EA license pool managed through PCS (architects check out a license when connecting, return it when disconnecting)
Authentication and security — user access managed through PCS, with optional Active Directory integration
Remote access — teams can access the repository over HTTPS, not requiring VPN in all configurations
EA GraphLink integration — required for EA GraphLink Interface A and B (BI and AI connectivity)
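To make the floating license behaviour concrete, here is a conceptual Python model of a license pool with check-out on connect and return on disconnect. It is a sketch of the idea only: the class `LicensePool` and its methods are hypothetical and do not represent the PCS API or its configuration.

```python
class LicensePool:
    """Conceptual model of a floating license pool (not the PCS API)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.holders: set[str] = set()

    def check_out(self, user: str) -> bool:
        """Grant a license if the user already holds one or a seat is free."""
        if user in self.holders:
            return True
        if len(self.holders) >= self.size:
            return False  # pool exhausted; the user must wait for a return
        self.holders.add(user)
        return True

    def release(self, user: str) -> None:
        """Return the user's license to the pool on disconnect."""
        self.holders.discard(user)
```

The practical consequence is that the pool size needs to cover peak concurrent architects, not the total number of named users.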

Installation steps (high-level):

Install PCS on a server (Windows Server recommended for enterprise; Linux is supported)
Configure PCS to connect to your chosen database
Create the initial repository database through PCS
Configure authentication (local PCS users or Active Directory integration)
Test connection from a Sparx EA client
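Before testing from a Sparx EA client, it can help to confirm that the PCS port is reachable from the architect's network at all. The following Python sketch checks plain TCP reachability only; it says nothing about whether PCS itself is healthy or whether credentials work. The host and port you pass in are your own values, and the function name `port_reachable` is an assumption for illustration.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

A call such as `port_reachable("pcs.example.internal", 443)` uses placeholder values; substitute the host name and port configured for your PCS instance.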

Cloud deployment: PCS runs on a VM. For cloud-hosted repositories, PCS on a cloud VM in the same region as the database is the standard configuration. Azure VM + Azure SQL, or AWS EC2 + RDS PostgreSQL, are common patterns.

Step 3: Create the Initial Repository

With PCS running and connected to the database, create the initial Sparx EA repository:

Open Sparx EA and connect to PCS
Create a new project (the repository) in PCS — this creates the Sparx EA schema in the database
Confirm that the new repository is empty, then apply the initial model structure
Set up the initial package hierarchy (see Step 6)
Take a baseline of the empty, correctly-structured repository — your clean starting point

One repository vs multiple: Most organisations run a single production EA repository. Some large organisations run separate repositories by division or geography. The trade-off: one repository enables cross-domain analysis and single-source-of-truth governance; multiple repositories require data federation for enterprise-level analysis. Start with one unless there is a compelling reason for separation (data sovereignty, organisational independence).

Step 4: Configure MDG Extensions

MDG Technology extensions determine which element types, relationship types, and diagram types are available to your modeling team. This configuration is made at the repository level (not per user).

What to activate for a standard EA program:

ArchiMate 3.x MDG (mandatory for EA programs): Activate the built-in ArchiMate MDG. This provides the full ArchiMate notation: the Motivation, Strategy, Business, Application, Technology, Physical, and Implementation and Migration elements, with all relationship types.

BPMN 2.0 MDG (activate if process modeling is in scope): For organisations that will model business processes alongside architecture (common in operational architecture and digital transformation programs).

TOGAF Content Metamodel MDG (activate if TOGAF-aligned): For programs that produce TOGAF deliverables (Principles, Requirements, Gaps, Work Packages). This MDG adds TOGAF-specific element types to the repository.

SysML MDG (activate only if systems engineering is in scope): For programs that intersect with systems engineering — defence, aerospace, engineering-intensive industries. Do not activate SysML for a standard enterprise IT architecture program; it adds complexity that is not needed.

Custom organisational MDG (design and activate early): Your custom MDG — defining organisation-specific stereotypes, tagged values, and validation rules — should be designed (Amplify engagement) and activated early in the repository lifecycle. Retrofitting governance tagged values to an existing repository is painful.

What not to activate: Every MDG extension adds to the toolbox and diagram type list. Activating extensions that your team will not use creates clutter. Apply only what your program needs; add more as scope expands.
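The activation guidance in this step amounts to a mapping from program scope to a short list of extensions. A minimal Python sketch of that mapping follows; `ProgramScope` and `mdgs_to_activate` are hypothetical names, and the strings are labels taken from this guide rather than identifiers used by Sparx EA.

```python
from dataclasses import dataclass


@dataclass
class ProgramScope:
    """Scope flags for an EA program (illustrative, not an EA setting)."""
    process_modeling: bool = False
    togaf_aligned: bool = False
    systems_engineering: bool = False
    custom_mdg_ready: bool = False


def mdgs_to_activate(scope: ProgramScope) -> list[str]:
    """Map program scope to the MDG extensions this guide suggests."""
    mdgs = ["ArchiMate 3.x"]  # always activated for an EA program
    if scope.process_modeling:
        mdgs.append("BPMN 2.0")
    if scope.togaf_aligned:
        mdgs.append("TOGAF Content Metamodel")
    if scope.systems_engineering:
        mdgs.append("SysML")
    if scope.custom_mdg_ready:
        mdgs.append("Custom organisational MDG")
    return mdgs
```

Keeping the list derived from explicit scope flags makes it easier to justify each activated extension and to add more only when scope expands.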

Step 5: Set Up User Security

With PCS managing access, configure role-based security before any architects start modeling:

User roles to define:

Repository Administrator: Full access — can manage packages, change MDG configuration, reset element locks, manage users. One or two people only.

Domain Architect (Read/Write): Read/write access to their assigned domain package; read-only access to other domains. Standard architect role. Example: Business Architect has full access to the Business Architecture package; read-only to Application Architecture and Technology Architecture packages.

Read-Only Stakeholder: Read access to specified packages or the entire repository. For project managers, business analysts, or executives who need to view architecture content but not modify it.

Automated Account: For scripts, EA GraphLink, and automation tools. Read-only access, separate credentials from human users.

Active Directory integration: If your organisation uses Active Directory, configure PCS to authenticate against AD. This allows Sparx EA repository access to be managed through existing identity management — no separate password management for the EA tool.

Package security in Sparx EA: Beyond PCS role definitions, Sparx EA’s model security feature allows package-level locking and protection. Domain architects can lock their packages against modification by others when working on sensitive content.
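Writing the role design down as a matrix before configuring it in the tool makes gaps easier to spot. The Python sketch below models the four roles as per-package grants and answers "can this user write here?". All user names and the `GRANTS` structure are hypothetical; this is a design aid, not how PCS or Sparx EA stores permissions.

```python
from enum import Enum


class Access(Enum):
    NONE = 0
    READ = 1
    WRITE = 2


# Hypothetical grants: user -> {top-level package: access}; "*" is the fallback.
GRANTS = {
    "repo.admin": {"*": Access.WRITE},
    "business.architect": {"Business Architecture": Access.WRITE, "*": Access.READ},
    "pm.viewer": {"*": Access.READ},
    "svc.graphlink": {"*": Access.READ},  # automated account, read-only
}


def access_for(user: str, package: str) -> Access:
    """Resolve a user's access to a package, defaulting to no access."""
    grants = GRANTS.get(user, {})
    return grants.get(package, grants.get("*", Access.NONE))


def can_write(user: str, package: str) -> bool:
    return access_for(user, package) is Access.WRITE
```

Under these assumed grants, the business architect can write to Business Architecture but only read Application Architecture, and an unknown user resolves to no access.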

Step 6: Establish Naming Conventions and Package Structure

These are the most important governance decisions in the repository setup. They are very difficult to change after content is loaded: changing naming conventions mid-flight produces inconsistency, and restructuring package hierarchies breaks relationships and diagram references.

Package structure guidance:

Top-level packages should mirror your EA framework structure. For a TOGAF/ArchiMate-aligned program:

[Repository Root]
├── Architecture Vision
│   ├── Principles
│   └── Requirements
├── Business Architecture
│   ├── Business Capabilities
│   ├── Business Processes
│   ├── Organisation
│   └── Business Information
├── Application Architecture
│   ├── Application Portfolio
│   ├── Application Services
│   └── Integration Architecture
├── Technology Architecture
│   ├── Technology Standards
│   ├── Infrastructure
│   └── Cloud Platform
├── Decision Register
│   ├── Strategic Decisions
│   └── Domain Decisions
└── Working / In Progress

Adapt the package structure to your EA framework and your organisation’s domain model. Do not over-engineer the hierarchy — three levels is usually sufficient. Deep hierarchies become difficult to navigate.
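One way to keep the hierarchy honest is to hold the planned structure as data and check its depth before creating any packages. The Python sketch below encodes the example structure from this step as a nested dictionary and flags any path deeper than a limit. Counting levels beneath the repository root and the default limit of three are assumptions drawn from the guidance above; the function names are hypothetical.

```python
PACKAGES = {
    "Architecture Vision": {"Principles": {}, "Requirements": {}},
    "Business Architecture": {
        "Business Capabilities": {}, "Business Processes": {},
        "Organisation": {}, "Business Information": {},
    },
    "Application Architecture": {
        "Application Portfolio": {}, "Application Services": {},
        "Integration Architecture": {},
    },
    "Technology Architecture": {
        "Technology Standards": {}, "Infrastructure": {}, "Cloud Platform": {},
    },
    "Decision Register": {"Strategic Decisions": {}, "Domain Decisions": {}},
    "Working / In Progress": {},
}


def depth(tree: dict) -> int:
    """Number of package levels contained in the tree (root excluded)."""
    return 0 if not tree else 1 + max(depth(child) for child in tree.values())


def too_deep(tree: dict, limit: int = 3, path: tuple = ()):
    """Yield package paths nested deeper than `limit` levels below the root."""
    for name, children in tree.items():
        here = path + (name,)
        if len(here) > limit:
            yield " / ".join(here)
        yield from too_deep(children, limit, here)
```

The example structure has two levels beneath the root, so `list(too_deep(PACKAGES))` is empty and there is room for one more level inside any domain.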

Naming conventions:

Applications: Use canonical names consistent with how the application is known in procurement and contracts. “Salesforce CRM” not “SF” or “SFDC”. Prefer “Vendor Product” format for commercial software; “BusinessName Function” format for custom applications (“Finance Payment Processing”).

Capabilities: Noun + qualifier format. “Customer Onboarding”, “Real-Time Inventory Visibility”. Never verb-form (“Onboard Customers”).

Processes: Verb + noun format. “Process Insurance Claim”, “Onboard New Employee”. Processes are activities; name them as activities.

Elements generally: Title Case. No abbreviations unless universally understood in your organisation. Consistent vocabulary (use “Application” not alternating “Application” / “System” / “Solution”).

Document naming conventions in a Principles package in the repository — so they are a first-class architecture artefact, not a separate document that gets lost.
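Some of these conventions are mechanical enough to check automatically. The following Python sketch tests a name for Title Case, unapproved abbreviations, and discouraged vocabulary. The allow-list, the discouraged-term mapping, and the list of minor words are hypothetical examples to be replaced with your organisation's own; the noun-versus-verb rules for capabilities and processes are not checked here because they need human judgement.

```python
ALLOWED_ABBREVIATIONS = {"CRM", "ERP", "HR"}  # hypothetical allow-list
DISCOURAGED_TERMS = {"System": "Application", "Solution": "Application"}
MINOR_WORDS = {"and", "of", "for", "the", "in", "to"}  # may stay lowercase


def naming_issues(name: str) -> list[str]:
    """Return a list of convention violations for one element name."""
    issues = []
    # Treat hyphenated compounds such as "Real-Time" as separate words.
    words = name.replace("-", " ").split()
    for index, word in enumerate(words):
        if len(word) > 1 and word.isupper():
            if word not in ALLOWED_ABBREVIATIONS:
                issues.append(f"abbreviation not on allow-list: {word}")
            continue
        if word in DISCOURAGED_TERMS:
            issues.append(f"use '{DISCOURAGED_TERMS[word]}' instead of '{word}'")
        if index > 0 and word.lower() in MINOR_WORDS:
            continue
        if not word[:1].isupper():
            issues.append(f"not Title Case: {word}")
    return issues
```

With these example rules, "Salesforce CRM" and "Real-Time Inventory Visibility" pass, while "SFDC" is flagged as an unapproved abbreviation and "Billing System" is flagged for vocabulary.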

Step 7: Run a Domain Pilot Before Full Rollout

Do not attempt to populate the entire repository simultaneously. A domain pilot — building out one architecture domain fully before extending to others — de-risks the rollout and validates governance decisions before they are applied broadly.

Choosing the pilot domain:

Select a domain that:

Has an engaged, available architect
Has well-understood content (so modeling effort focuses on tool use, not content discovery)
Is representative of your broader architecture scope
Is important enough to demonstrate value to stakeholders

Application Portfolio is often a good pilot domain — it is tangible, executives care about it, and it tests the key MDG tagged values (lifecycle status, ownership, capability linkage) without requiring complex relationship modeling.
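If several domains are plausible pilots, the four selection criteria can be turned into a simple score to make the comparison explicit. In this Python sketch each criterion counts equally, which is an assumption; the class `Candidate` and the functions are hypothetical, and the result is a conversation starter rather than a decision.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    architect_available: bool
    content_well_understood: bool
    representative: bool
    stakeholder_value: bool


def score(candidate: Candidate) -> int:
    """Count how many of the four selection criteria the domain meets."""
    return sum([
        candidate.architect_available,
        candidate.content_well_understood,
        candidate.representative,
        candidate.stakeholder_value,
    ])


def pick_pilot(candidates: list[Candidate]) -> Candidate:
    """Return the highest-scoring domain (first one wins on a tie)."""
    return max(candidates, key=score)
```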

Pilot validation:

At the end of the domain pilot:

Run MDG validation against the pilot package — resolve all violations
Demonstrate the domain content to a stakeholder audience — test whether the package structure and naming conventions communicate effectively
Review the modeling experience with the pilot architect — are the MDG configuration and toolbox setup working smoothly?
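A lightweight completeness check can complement MDG validation during the pilot review. The Python sketch below reports elements that lack the governance tagged values mentioned earlier (lifecycle status, ownership, capability linkage). It works on plain dictionaries, for instance from an export, and is a stand-in for illustration, not Sparx EA's validation engine; the tag names in `REQUIRED_TAGS` are hypothetical.

```python
REQUIRED_TAGS = ("Lifecycle Status", "Owner", "Capability")  # hypothetical tag names


def missing_tags(elements: list[dict]) -> dict[str, list[str]]:
    """Map element name -> required tags that are absent or blank."""
    report = {}
    for element in elements:
        tags = element.get("tags", {})
        missing = [
            tag for tag in REQUIRED_TAGS
            if not str(tags.get(tag) or "").strip()
        ]
        if missing:
            report[element["name"]] = missing
    return report
```

An empty report for the pilot package is a reasonable exit condition before extending the same tagged values to other domains.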

Capture learnings before rollout:

The pilot will reveal gaps in your MDG design, naming convention decisions that need adjustment, and package structure issues. Fix these before rolling out to additional domains. The cost of fixing governance decisions in a one-domain pilot is a fraction of the cost of fixing them across a fully-populated repository.

What Sparx Services Does in Deploy

The Deploy engagement covers all seven steps:

Database selection advisory and configuration
Pro Cloud Server installation and configuration (cloud or on-premises)
Repository creation and initial structure
MDG extension selection and activation
User security design and implementation (including AD integration)
Naming convention design and documentation
Package structure design
Domain pilot execution and validation
Team onboarding and modeling standards training

Sparx Services typically completes a Deploy engagement in 4–8 weeks, depending on environment complexity. At the end of the engagement, the EA team has a production-ready repository with governance foundations in place.

Frequently Asked Questions

Q: Can we start with a file-based repository (.qea) and migrate to a database later? Yes. Sparx EA supports migrating a .qea file-based repository to a database-backed repository via the “Transfer Project to Server” function. The migration transfers all elements, relationships, diagrams, and packages. Some manual post-migration verification is recommended (check that relationships resolved correctly and that diagrams render as expected). For teams starting small and scaling up, this migration path works well. Plan the migration to a database-hosted repository before the repository grows very large, as migration of large repositories takes longer.
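Part of the post-migration verification can be automated by comparing row counts between the source file and the target database. The Python sketch below assumes that the .qea file is a SQLite database and that packages, elements, connectors, and diagrams live in tables named `t_package`, `t_object`, `t_connector`, and `t_diagram`; verify both assumptions against your Sparx EA version before relying on it. The helper names are hypothetical.

```python
import sqlite3

# Core model tables (assumed names; confirm against your EA schema).
TABLES = ("t_package", "t_object", "t_connector", "t_diagram")


def open_qea(path: str) -> sqlite3.Connection:
    """Open a .qea file read-only (assumes it is a SQLite database)."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def row_counts(conn) -> dict[str, int]:
    """Count rows in the core model tables using any DB-API connection."""
    counts = {}
    cursor = conn.cursor()
    for table in TABLES:
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cursor.fetchone()[0]
    return counts


def count_mismatches(source, target) -> dict[str, tuple[int, int]]:
    """Return {table: (source_count, target_count)} where the counts differ."""
    before, after = row_counts(source), row_counts(target)
    return {t: (before[t], after[t]) for t in TABLES if before[t] != after[t]}
```

The target connection can come from whichever DB-API driver matches your server database. Matching counts do not prove the migration is correct, so the manual checks on relationships and diagram rendering still apply.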

Q: Do we need a dedicated server for Pro Cloud Server, or can it run on a workstation? PCS should run on a dedicated server (or cloud VM) rather than an architect’s workstation. Running PCS on a workstation means the repository is unavailable when that workstation is off or restarted, which is unacceptable for a team repository. A modest VM (2 vCPU, 4 GB RAM) is sufficient for most EA teams. Larger teams or high-query-volume environments (with EA GraphLink active) benefit from more resources.

Q: How long does it take to set up a production Sparx EA repository from scratch? With guidance, a production-ready repository setup typically takes 4–8 weeks. The technical installation (database + PCS) can be completed in a day or two. The time investment is in governance design: naming conventions, package structure, MDG configuration, and the domain pilot. These decisions cannot be rushed without risking technical debt. Sparx Services’ Deploy engagement provides structured delivery of all setup steps within this timeframe.

Q: Can we use Azure Active Directory for Sparx EA authentication? Yes. Pro Cloud Server supports Azure Active Directory (now named Microsoft Entra ID) integration via LDAP or SAML, enabling Sparx EA repository access to be managed through Azure AD. This is the recommended configuration for organisations on Microsoft 365 and Azure. Users authenticate with their corporate Azure AD credentials; no separate Sparx EA password management is required. Sparx Services configures Azure AD integration as part of the Deploy engagement.

Q: What is the recommended backup strategy for a Sparx EA repository? Use the backup tools native to your database platform: daily full database backup plus transaction log backup (for SQL Server) or continuous backup (for cloud-hosted databases via Azure Backup or AWS RDS automated backups). Additionally, take Sparx EA baselines before major governance events (ARB sessions, quarterly reviews) — these are internal model snapshots accessible from within Sparx EA, separate from database-level backups. Both layers are important: database backup for disaster recovery; Sparx EA baselines for model version history.

Q: Is there a size limit on a Sparx EA repository? There is no hard element count limit — Sparx EA repositories with tens of thousands of elements and thousands of diagrams are common in large enterprise programs. Performance scales with database resources and Pro Cloud Server configuration. For very large repositories, database indexing and PCS query optimisation may be needed. EA GraphLink’s caching layer also helps with query performance for large repositories. Sparx Services addresses performance design during the Deploy engagement for larger implementations.

Ready to Set Up Your Sparx EA Repository?

Sparx Services’ Deploy engagement covers every step in this guide — from database selection through pilot completion and team onboarding.

We deliver a production-ready repository with the governance foundation that makes every subsequent EA investment worthwhile.