Integrations Inventory and Attributes - The Baseline Integration Representation
What Is an Integration
An Integration — also known as a Data and Information Integration — is a governed connection through which data or information moves between two entities. The word “integration” in this context is deliberately broad: it encompasses any distinct, identifiable flow of data or information between a source and a target, regardless of what technology implements it, what format the data takes, or what kinds of entities are connected. A real-time API call between two applications, a nightly batch file transfer from a database to a file system, a manual data entry process where a human operator moves information from one system to another, and an automated EDI transaction from an enterprise system to an external trading partner are all Integrations in the governance sense of this inventory. What qualifies a flow as a governed integration is that it can be uniquely identified, it has a source and a target, it carries a describable payload, and it matters to the enterprise if it fails.
The longer name, Data and Information Integration, makes explicit that governed integrations carry both highly structured data (relational records, JSON documents, CSV files, EDI transactions, database row sets) and unstructured information (PDF documents, Word files, images, audio recordings, binary attachments). The Integrations Inventory governs all of them. The distinction between structured data and unstructured information is captured in the Integration Payload attribute, not in whether something qualifies as an integration. An enterprise that moves PDF contracts through an SFTP process and JSON customer records through a REST API has two governed integrations, both of which belong in this inventory.
The Entity-Agnostic Model
The most important structural feature of the Integrations Inventory is that it is entity-agnostic. The source and target of any integration can be any recognized Noun Type in the IF4IT Enterprise Inventory Management taxonomy — not just Applications. This is not a minor implementation detail: it is the design decision that makes this inventory genuinely complete as the enterprise’s data flow map.
A traditional “application integration map” only captures Application-to-Application connections. In practice, enterprise data flows involve many other entity types on one or both ends. The full set of supported permutations includes but is not limited to: Application-to-Application (the traditional case), Application-to-Database (an application writing directly to a database it does not own), Database-to-File System (a database export to a file drop), File System-to-Application (a file ingestion process), Application-to-Queue (an application publishing to a message broker), Queue-to-Application (an application consuming from a message broker), Application-to-Human (a report, dashboard, or notification delivered to a person), Human-to-Application (a manual data entry or form submission), Application-to-External Partner (an outbound feed to a third party), External Partner-to-Application (an inbound feed from a third party), Cloud Service-to-Database (a SaaS platform writing to an enterprise data store), and any other combination the enterprise encounters. Every one of these permutations is governed by the same three-attribute Source pattern (Entity Type, Entity ID, Entity Semantic ID) and the same three-attribute Target pattern, making the record structure consistent regardless of what kinds of entities are connected.
This entity-agnostic model means the Integrations Inventory is the enterprise’s complete data flow map — not just the application layer’s integration map. Organizations that have governed only Application-to-Application integrations have documented a fraction of their actual data flow landscape. Database-to-database connections, file-based transfers, and human-mediated data movements are frequently the most fragile and least-governed flows in the enterprise, and they are the ones most likely to harbor compliance risks, key-person dependencies, and undocumented sensitive data movement.
A Reified Relationship
An Integration is a reified relationship — a relationship between two entities that has been promoted to a first-class governed record (a Noun Instance) with its own identity, its own attributes, its own lifecycle, and its own governance. The concept of reification is important for understanding why the Integrations Inventory is structured as it is and what governance capabilities it enables that a simple integration map does not.
In a simple integration map, the connection between Application A and Application B is just a line on a diagram. It has no attributes of its own: no payload, no sensitivity, no technology, no owner, no risk rating, no lifecycle status. Reification transforms that line into a governed Noun Instance: the Integration itself becomes something the enterprise can own, assess, monitor, retire, and query. Once reified, an integration can carry attributes — and those attributes unlock governance capabilities that are impossible with a simple map. The enterprise can query: which integrations carry PII? Which are point-to-point with no middleware? Which have no retry logic? Which are Deprecated but not yet retired? These are answerable questions when the integration is a governed record. They are unanswerable when it is just a line on a diagram.
The Integrations Inventory also enables the derivation of many additional relationships that are not explicitly recorded. Because every integration record names its source and target, the inventory can be used to derive: all integrations for a given application (by querying Source Entity Semantic ID or Target Entity Semantic ID), the full integration dependency graph for any system (by traversing upstream and downstream integration chains), the set of all integrations carrying a specific data type (by querying Integration Payload), and the set of all cross-boundary integrations (by querying for External in Source or Target Environment). These derived views require no additional data entry — they emerge automatically from the governed baseline record.
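The derived views described above can be sketched as simple queries over the baseline records. This is a minimal illustration, assuming the records are held as Python dictionaries whose keys loosely mirror the baseline attributes. The sample records, key names, and function names are hypothetical and are not part of the inventory definition.

```python
from collections import deque

# Hypothetical sample records; keys loosely mirror the baseline attribute names.
INTEGRATIONS = [
    {"source": "Salesforce", "target": "SAP", "payload": "Customer Profile",
     "source_env": "Production", "target_env": "Production"},
    {"source": "SAP", "target": "DataWarehouse", "payload": "Bank Payment",
     "source_env": "Production", "target_env": "Production"},
    {"source": "DataWarehouse", "target": "RegulatorPortal", "payload": "Regulatory Filing",
     "source_env": "Production", "target_env": "External"},
]


def integrations_for(entity, records):
    """All integrations where the entity is the source or the target."""
    return [r for r in records if entity in (r["source"], r["target"])]


def downstream_of(entity, records):
    """Every entity reachable by following integrations from source to target."""
    seen, queue = set(), deque([entity])
    while queue:
        current = queue.popleft()
        for r in records:
            if r["source"] == current and r["target"] not in seen:
                seen.add(r["target"])
                queue.append(r["target"])
    return seen


def cross_boundary(records):
    """Integrations with External at either end."""
    return [r for r in records if "External" in (r["source_env"], r["target_env"])]
```

None of these functions needs data beyond what the baseline record already holds, which is the point the paragraph above makes about derived relationships.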
Cross-Environment Data Flows
One of the most underappreciated governance capabilities of the Integrations Inventory is its ability to capture and govern cross-environment data flows — integrations where the source and target operate in different IT environments. The Source Environment and Target Environment attributes on every integration record make this visible at scale.
The most common and most governance-sensitive cross-environment flow is Production-to-non-Production: a process that copies or streams Production data into a Development, Systems Integration Testing, or User Acceptance Testing environment for testing purposes. This practice is extremely common, often informally established, and frequently the source of data privacy compliance failures. When Production data containing PII, PHI, or PCI information flows into a UAT environment without appropriate data masking, the enterprise has a regulatory exposure that may not be visible through any other governance instrument. The Integrations Inventory makes these flows visible immediately: a query for Source Environment = Production and Target Environment ≠ Production produces the complete list of integrations that require data sanitization governance review.
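The Production-to-non-Production query can be expressed as a one-line filter. The sketch below assumes each record exposes its Source Environment and Target Environment as `source_env` and `target_env`. Those key names and the sample records are hypothetical.

```python
# Hypothetical sample records; only the attributes needed for this query are shown.
records = [
    {"id": "CRM-Billing-CustomerProfile",
     "source_env": "Production", "target_env": "Production"},
    {"id": "CRM-TestCRM-CustomerProfile",
     "source_env": "Production", "target_env": "User Acceptance Testing"},
    {"id": "DevApp-DevDB-OrderRecord",
     "source_env": "Development", "target_env": "Development"},
]


def production_outflows(records):
    """Integrations whose source is Production and whose target is anything else."""
    return [
        r for r in records
        if r["source_env"] == "Production" and r["target_env"] != "Production"
    ]


needs_review = production_outflows(records)
```

Taken literally, this filter also matches integrations whose target is Disaster Recovery or External. Whether those belong in a data sanitization review or in a separate review queue is a local policy decision.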
External is a valid value for both Source Environment and Target Environment, indicating that the entity at that end of the integration exists outside the enterprise boundary — a third-party partner, a regulator, a customer platform, or an external cloud service. When either environment is External, the integration crosses the enterprise boundary and requires enhanced governance: security controls appropriate for external-facing connections, evaluation for cross-border data flow applicability, and compliance assessment against regulatory frameworks that govern data sharing with external parties. The External environment value is not a separate attribute — it is part of the same Source and Target Environment attribute that governs internal environment classification, making the governance model consistent across all integration types.
Data Sensitivities
The Data Sensitivities attribute is a multi-value Crawl-level attribute that captures the sensitivity classifications of data or information transmitted through each integration. It is positioned at Crawl maturity because sensitivity governance is a minimum viable requirement, not an advanced capability — an enterprise that does not know which integrations carry regulated data cannot make defensible decisions about encryption, access control, monitoring priority, or regulatory compliance.
The critical governance insight that motivates this attribute is the per-integration granularity it provides. Two applications may have five integrations between them, but only one of those integrations transmits PII. The others carry operational metadata, status updates, or reference data with no sensitivity implications. Without per-integration sensitivity tagging, the enterprise must either treat all five integrations as sensitive (over-broad, creating unnecessary compliance overhead) or rely on application-level sensitivity classification (insufficient, because the application handling sensitive data does not necessarily transmit it through every integration it participates in). The Data Sensitivities attribute on each individual integration record resolves this ambiguity.
The suggested baseline value set is: PII (Personally Identifiable Information), PHI (Protected Health Information), PCI (Payment Card Industry data), PFI (Protected Financial Information), Confidential (internal sensitive but not regulated), Regulated (subject to regulatory handling requirements not captured by the above categories), and None (no sensitive data transmitted). None is an explicit governance statement: a practitioner who records None is asserting that no sensitive data flows through this integration, and that assertion is auditable. Multiple values are supported and semicolon-delimited: an integration that carries patient identifiers and payment information is tagged PII; PHI; PCI. Organizations are encouraged to extend this value set to match their specific regulatory environment and data classification taxonomy.
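A semicolon-delimited Data Sensitivities value can be parsed and checked against the baseline value set with a few lines of code. The sketch below is illustrative only. The function name is hypothetical, and two of its rules are assumptions rather than stated requirements: it rejects an empty value, and it rejects None combined with any other value.

```python
# Baseline value set as listed in the text; extend it to match local taxonomy.
BASELINE_VALUES = {"PII", "PHI", "PCI", "PFI", "Confidential", "Regulated", "None"}


def parse_sensitivities(raw, allowed=BASELINE_VALUES):
    """Split a semicolon-delimited Data Sensitivities value and validate it."""
    values = [v.strip() for v in raw.split(";") if v.strip()]
    if not values:
        # Assumption: an empty value is treated as unpopulated, not as None.
        raise ValueError("Data Sensitivities is empty; record 'None' explicitly")
    unknown = [v for v in values if v not in allowed]
    if unknown:
        raise ValueError(f"Unrecognized sensitivity values: {unknown}")
    if "None" in values and len(values) > 1:
        # Assumption: None is exclusive of every other classification.
        raise ValueError("'None' cannot be combined with other sensitivities")
    return values


tags = parse_sensitivities("PII; PHI; PCI")
```

Treating an empty value as an error keeps "not yet assessed" distinct from the auditable assertion that None represents.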
The Baseline Integration Record
The Baseline Integration Record is the minimum set of attributes required to create a governance-useful integration record — one that can be used for dependency analysis, sensitivity governance, environment-pair analysis, and integration technology inventory. Every integration record should be populated to this baseline before any Walk attributes are added. An inventory where 100% of known integrations have all 14 Crawl attributes populated is more governable than one where 30% of integrations have 40 attributes each.
| # | Attribute | Maturity | Description |
|---|---|---|---|
| 1 | Integration Semantic ID | Crawl | A human-readable unique name for the integration, typically constructed by concatenating Source Entity Semantic ID, Target Entity Semantic ID, and Integration Payload. Example: Salesforce-SAP-CustomerProfile. |
| 2 | Source Entity Type | Crawl | The type of entity initiating or sending data. Valid values include Application, Database, File System, Queue, API, Human, External Partner, and others. |
| 3 | Source Entity ID | Crawl | A unique identifier for the source entity as recognized within the source system or its governing inventory. |
| 4 | Source Entity Semantic ID | Crawl | The human-readable name for the source entity — drawn from its governing inventory record where one exists. |
| 5 | Source Environment | Crawl | The operating environment of the source entity. Valid values: Development, Systems Integration Testing, User Acceptance Testing, Production, Disaster Recovery, External. |
| 6 | Integration Type | Crawl | The broad category of integration mechanism. Examples: SFTP, ETL, ELT, API, AFT, Event/Message, File Transfer, Database-to-Database, Webhook. |
| 7 | Integration Payload | Crawl | The business-level description of the data or information being exchanged. Examples: Customer Profile, Bank Payment, Claims Record, Regulatory Filing, PDF Document. |
| 8 | Data Sensitivities | Crawl | The sensitivity classifications of data moving through this integration. [Multi-Value]: PII; PHI; PCI; PFI; Confidential; Regulated; None. |
| 9 | Integration Technology | Crawl | The specific technology, tool, or platform implementing this integration. Examples: Informatica IICS, MuleSoft, Python, Azure Data Factory, AWS EventBridge, custom code. |
| 10 | Target Entity Type | Crawl | The type of entity receiving data. Valid values mirror Source Entity Type: Application, Database, File System, Queue, API, Human, External Partner, and others. |
| 11 | Target Entity ID | Crawl | A unique identifier for the target entity as recognized within the target system or its governing inventory. |
| 12 | Target Entity Semantic ID | Crawl | The human-readable name for the target entity — drawn from its governing inventory record where one exists. |
| 13 | Target Environment | Crawl | The operating environment of the target entity. Valid values mirror Source Environment: Development, Systems Integration Testing, User Acceptance Testing, Production, Disaster Recovery, External. |
| 14 | Lifecycle Status | Crawl | The current governance state of this integration. Valid values: Proposed, Active, Under Review, Deprecated, Retired. |
These 14 attributes collectively identify the integration (Semantic ID), describe both ends of the connection (Source and Target Entity Type, ID, Semantic ID, and Environment), characterize what moves through it (Integration Payload, Data Sensitivities), identify how it is implemented (Integration Type, Integration Technology), and establish its governance state (Lifecycle Status). Note that this inventory makes a deliberate exception to the standard IF4IT inventory practice of requiring a Description attribute: for an Integration Noun Instance, the structured attributes — particularly the combination of Source Entity Semantic ID, Target Entity Semantic ID, Integration Payload, Integration Type, and Integration Technology — collectively provide a more precise and less ambiguous description of the integration than any prose paragraph would. The Description attribute is omitted from the Integrations Inventory by design.
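The baseline record can be sketched as a small data structure. This is one possible representation, not a prescribed schema: the class name, the field names, and the entity IDs in the example are hypothetical. The Integration Semantic ID is derived here by joining the source name, target name, and payload with hyphens and removing spaces from the payload, which is an assumption inferred from the Salesforce-SAP-CustomerProfile example in the table.

```python
from dataclasses import dataclass


@dataclass
class IntegrationRecord:
    # Thirteen stored fields plus one derived property cover the 14 attributes.
    source_entity_type: str
    source_entity_id: str
    source_entity_semantic_id: str
    source_environment: str
    integration_type: str
    integration_payload: str
    data_sensitivities: list
    integration_technology: str
    target_entity_type: str
    target_entity_id: str
    target_entity_semantic_id: str
    target_environment: str
    lifecycle_status: str

    @property
    def integration_semantic_id(self):
        """Source, target, and payload joined with hyphens (assumed convention)."""
        payload = self.integration_payload.replace(" ", "")
        return f"{self.source_entity_semantic_id}-{self.target_entity_semantic_id}-{payload}"


# Example instance; the entity IDs are made up for illustration.
record = IntegrationRecord(
    source_entity_type="Application", source_entity_id="APP-001",
    source_entity_semantic_id="Salesforce", source_environment="Production",
    integration_type="API", integration_payload="Customer Profile",
    data_sensitivities=["PII"], integration_technology="MuleSoft",
    target_entity_type="Application", target_entity_id="APP-002",
    target_entity_semantic_id="SAP", target_environment="Production",
    lifecycle_status="Active",
)
```

Deriving the Semantic ID rather than storing it keeps the name in step with its components, at the cost of changing the identifier whenever a source, target, or payload is renamed. An organization that needs stable identifiers may prefer to store it.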
Value to APM, TPM, and the Enterprise Model
The Integrations Inventory delivers three categories of governance value that no other inventory can provide: integration complexity quantification for APM, integration technology visibility for TPM, and data flow graph structure for the Enterprise Model.
For APM, the integration count per application — a Calculated attribute derived from this inventory — is the most reliable leading indicator of application retirement cost and risk. Portfolio rationalization decisions made without integration data systematically underestimate the effort required to retire or replace any system that participates in the integration fabric. An application with 47 integrations requires 47 integration remediation workstreams before it can be retired. The Integrations Inventory makes this complexity visible and quantifiable before investment decisions are made, not after they have been committed. It also enables APM to answer sensitivity-weighted dependency questions: which integrations does this application participate in? Which of those carry PII or PHI? What is the regulatory exposure of retiring this application without a transition plan for those integrations?
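The integration count and the sensitivity-weighted questions above can be sketched as two small functions. The sample records and function names are hypothetical. For brevity the sketch assumes every entity named is an Application, whereas a real implementation would filter on Source and Target Entity Type.

```python
from collections import Counter

# Hypothetical records as (source, target, data sensitivities) tuples.
records = [
    ("Billing", "Ledger", ["PFI"]),
    ("CRM", "Billing", ["PII"]),
    ("Billing", "Reporting", ["None"]),
]


def integration_counts(records):
    """Count integrations per entity, as either source or target."""
    counts = Counter()
    for source, target, _ in records:
        counts[source] += 1
        if target != source:  # do not double-count a self-referencing integration
            counts[target] += 1
    return counts


def regulated_integrations(entity, records, regulated=("PII", "PHI")):
    """Integrations the entity participates in that carry any regulated tag."""
    return [
        (s, t, tags) for s, t, tags in records
        if entity in (s, t) and any(tag in regulated for tag in tags)
    ]


counts = integration_counts(records)
```

In this sample, Billing participates in three integrations, one of which carries PII, so a retirement plan for Billing would need a transition path for that flow in particular.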
For TPM, the Integration Technology attribute across all integration records provides the complete integration tooling landscape. Every distinct value is a nomination for a Software Technology record. Aggregating Integration Technology values reveals how many integration platforms the enterprise operates, which are redundant, which are approaching end of life, and which carry the highest integration load. TPM can then govern the integration technology stack with the same rigor it applies to application technologies — rationalizing platforms, managing vendor dependencies, and sequencing modernization investments based on actual integration data.
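Aggregating Integration Technology values is a straightforward frequency count. The sketch below uses made-up values and a hypothetical function name. It assumes the values are already spelled consistently, because variants such as "Mulesoft" and "MuleSoft" would be counted as separate platforms.

```python
from collections import Counter

# Hypothetical Integration Technology values, one per integration record.
technologies = ["MuleSoft", "Informatica IICS", "MuleSoft", "Python", "MuleSoft"]


def technology_load(values):
    """Map each distinct technology to the number of integrations it implements."""
    return Counter(values)


load = technology_load(technologies)
heaviest, heaviest_count = load.most_common(1)[0]
```

The keys of `load` are the distinct technologies in use, and the counts show which platforms carry the most integrations.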
For the Enterprise Model, each integration record contributes edges to the Enterprise Model graph: a directed connection from a Source entity node to a Target entity node, carrying payload, technology, sensitivity, and environment metadata on the relationship. This structure enables AI-assisted graph traversal that crosses inventory boundaries — from a regulatory obligation, through the integrations that carry regulated data, to the applications and technologies involved, in a single traversal path. The Integrations Inventory is the primary source of data flow lineage in the Enterprise Model graph.
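The edge structure described above can be sketched with a plain adjacency map, with no graph library assumed. The records, key names, and function names are hypothetical. The traversal follows only edges that carry a given sensitivity, as a simplified stand-in for the cross-inventory traversal the paragraph describes.

```python
from collections import defaultdict

# Hypothetical integration records reduced to the fields used here.
records = [
    {"source": "CRM", "target": "Billing",
     "payload": "Customer Profile", "sensitivities": ["PII"]},
    {"source": "Billing", "target": "Warehouse",
     "payload": "Bank Payment", "sensitivities": ["PII", "PCI"]},
    {"source": "Billing", "target": "Monitoring",
     "payload": "Status Update", "sensitivities": ["None"]},
]


def build_graph(records):
    """Adjacency map: source node -> list of (target node, edge metadata)."""
    graph = defaultdict(list)
    for r in records:
        metadata = {"payload": r["payload"], "sensitivities": r["sensitivities"]}
        graph[r["source"]].append((r["target"], metadata))
    return graph


def trace(graph, start, sensitivity):
    """Follow only edges carrying the sensitivity; return the edges visited."""
    visited, path, stack = {start}, [], [start]
    while stack:
        node = stack.pop()
        for target, meta in graph.get(node, []):
            if sensitivity in meta["sensitivities"]:
                path.append((node, target, meta["payload"]))
                if target not in visited:
                    visited.add(target)
                    stack.append(target)
    return path


pii_lineage = trace(build_graph(records), "CRM", "PII")
```

Because the metadata sits on the edge rather than on the node, the same traversal can be filtered by payload, technology, or environment without changing the graph.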
Reverse-Engineering and Reconciliation
The Integrations Inventory is the only inventory that can reverse-engineer and reconcile every other inventory it references, through its Source Entity and Target Entity attributes. Because every integration record references both a source entity and a target entity, with their types, system IDs, and semantic names, the inventory can be used to validate the completeness of every other Noun Type inventory it touches.
Any entity that appears as a Source Entity or Target Entity in an integration record but does not have a corresponding record in its governing Noun Type inventory is a gap in that inventory. An Application Semantic ID appearing in integration records but not in the Applications Inventory signals an ungoverned application. A Database ID appearing in integration records but not in the Data Stores Inventory signals an undocumented data store. This gap-detection capability is bidirectional: the Integrations Inventory can validate the Applications Inventory, and the Applications Inventory can validate the Integrations Inventory. Running an automated reconciliation between the two — matching Source Entity Semantic IDs and Target Entity Semantic IDs against Applications Inventory records — produces a list of integration discovery gaps that manual interviews would miss.
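The bidirectional reconciliation amounts to two set differences. The sketch below assumes the Applications Inventory is available as a set of Semantic IDs and that each integration record exposes entity types and names under the keys shown. All names and sample data are hypothetical.

```python
# Hypothetical Applications Inventory, reduced to Semantic IDs.
applications_inventory = {"Salesforce", "SAP", "Workday"}

# Hypothetical integration records reduced to entity types and Semantic IDs.
integration_records = [
    {"source_type": "Application", "source": "Salesforce",
     "target_type": "Application", "target": "SAP"},
    {"source_type": "Application", "source": "LegacyBilling",
     "target_type": "Database", "target": "OrdersDB"},
]


def referenced_entities(records, entity_type):
    """Semantic IDs of every source or target entity of the given type."""
    found = set()
    for r in records:
        if r["source_type"] == entity_type:
            found.add(r["source"])
        if r["target_type"] == entity_type:
            found.add(r["target"])
    return found


def reconcile(records, inventory, entity_type):
    """Compare entities referenced by integrations against a governing inventory."""
    referenced = referenced_entities(records, entity_type)
    return {
        "missing_from_inventory": sorted(referenced - inventory),
        "no_known_integrations": sorted(inventory - referenced),
    }


gaps = reconcile(integration_records, applications_inventory, "Application")
```

Entries under `missing_from_inventory` point to ungoverned applications. Entries under `no_known_integrations` are candidates for integration discovery, since an inventoried application with no recorded integrations may simply not have been surveyed yet.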
The Integration Technology attribute provides the same reconciliation value for the Software Technologies Inventory. Every distinct Integration Technology value across all integration records is a potential gap in the Software Technologies Inventory. An automated scan of all Integration Technology values compared against Software Technology Semantic IDs produces a list of integration tools in use that have not been formally inventoried or governed. This is how the Integrations Inventory seeds the Software Technologies Inventory — not through manual population, but through automated reconciliation against what practitioners have already documented at the integration level.
Copyright for the International Foundation for Information Technology (IF4IT): 2008 - Present