Terraform Cloud API Integration Design
Executive Summary
This document defines the architecture for integrating Terraform Cloud (TFC) with the foundations-workspace platform for automated workspace discovery, state ingestion, and event-driven updates.
1. Authentication Architecture
1.1 Token Types and Selection
enum TFCTokenType {
ORGANIZATION = 'organization', // Recommended: Read-only, org-scoped
TEAM = 'team', // Alternative: Team-scoped access
USER = 'user' // Avoid: Too permissive
}
interface TFCCredential {
orgName: string;
tokenType: TFCTokenType;
token: string;
scopes: string[];
createdAt: Date;
expiresAt?: Date;
rotationPolicy: TokenRotationPolicy;
}
interface TokenRotationPolicy {
enabled: boolean;
rotationIntervalDays: number;
gracePeriodDays: number;
notificationChannels: string[];
}
1.2 Multi-Organization Credential Management
interface OrganizationCredentialStore {
// Stores credentials per client/organization
credentials: Map<string, TFCCredential[]>;
// Secret backend integration (Vault, AWS Secrets Manager, etc.)
secretBackend: SecretBackend;
getCredential(orgName: string): Promise<TFCCredential>;
rotateCredential(orgName: string): Promise<void>;
validateCredential(credential: TFCCredential): Promise<boolean>;
}
// Declared as an interface so Vault, AWS Secrets Manager, etc. can each supply an implementation
interface SecretBackend {
store(key: string, value: string, metadata: object): Promise<void>;
retrieve(key: string): Promise<string>;
delete(key: string): Promise<void>;
rotate(key: string): Promise<string>;
}
Credential Storage Strategy:
- Store in HashiCorp Vault or AWS Secrets Manager
- Never commit tokens to version control
- Use environment-specific credentials (dev, staging, prod)
- Implement automatic rotation with 30-day grace periods
- Alert on rotation failures
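The rotation bullets above can be made concrete with a small status check. This is a minimal sketch rather than part of the design: `rotationStatus` and its status labels are hypothetical names, and it assumes the grace period starts once `rotationIntervalDays` have elapsed since `createdAt`.

```typescript
// Hypothetical helper: classify a credential against its rotation policy.
interface TokenRotationPolicy {
  enabled: boolean;
  rotationIntervalDays: number;
  gracePeriodDays: number;
  notificationChannels: string[];
}

type RotationStatus = 'disabled' | 'current' | 'due' | 'overdue';

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function rotationStatus(
  createdAt: Date,
  policy: TokenRotationPolicy,
  now: Date = new Date()
): RotationStatus {
  if (!policy.enabled) return 'disabled';
  const ageDays = (now.getTime() - createdAt.getTime()) / MS_PER_DAY;
  if (ageDays < policy.rotationIntervalDays) return 'current';
  // Assumption: the grace period begins when the rotation interval ends.
  if (ageDays < policy.rotationIntervalDays + policy.gracePeriodDays) return 'due';
  return 'overdue';
}
```

A scheduler could alert on `due` and page on `overdue`; which channel receives which status is left to `notificationChannels`.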
1.3 Authentication Flow
2. API Client Architecture
2.1 Core Client Design
interface TerraformCloudClientConfig {
baseUrl: string; // https://app.terraform.io/api/v2
credential: TFCCredential;
rateLimiter: RateLimiter;
retryPolicy: RetryPolicy;
timeout: number; // milliseconds
}
class TerraformCloudClient {
private config: TerraformCloudClientConfig;
private httpClient: AxiosInstance;
private rateLimiter: RateLimiter;
constructor(config: TerraformCloudClientConfig) {
this.config = config;
this.httpClient = this.createHttpClient();
this.rateLimiter = config.rateLimiter;
}
// ==================== AUTHENTICATION ====================
async authenticate(): Promise<void> {
try {
// axios rejects on non-2xx responses, so reaching the next line means the token was accepted
await this.httpClient.get('/account/details');
} catch (error: any) {
throw new AuthenticationError(`Authentication failed: ${error.message}`);
}
}
// ==================== WORKSPACE OPERATIONS ====================
async listWorkspaces(
orgName: string,
filter?: WorkspaceFilter
): Promise<PaginatedResult<Workspace>> {
await this.rateLimiter.acquire();
const params: any = {
'page[size]': filter?.pageSize || 100,
'page[number]': filter?.pageNumber || 1
};
if (filter?.search) {
params['search[name]'] = filter.search;
}
if (filter?.tags) {
params['search[tags]'] = filter.tags.join(',');
}
const response = await this.httpClient.get(
`/organizations/${orgName}/workspaces`,
{ params }
);
return {
data: response.data.data.map(this.parseWorkspace),
meta: response.data.meta,
links: response.data.links
};
}
async getWorkspace(workspaceId: string): Promise<Workspace> {
await this.rateLimiter.acquire();
const response = await this.httpClient.get(
`/workspaces/${workspaceId}`,
{
params: {
'include': 'current-state-version,organization'
}
}
);
return this.parseWorkspace(response.data.data);
}
async *listWorkspacesWithPagination(
orgName: string,
filter?: WorkspaceFilter
): AsyncGenerator<Workspace[], void, unknown> {
let pageNumber = 1;
let hasMore = true;
while (hasMore) {
const result = await this.listWorkspaces(orgName, {
...filter,
pageNumber,
pageSize: 100
});
yield result.data;
// TFC returns the pagination keys in kebab-case
const { pagination } = result.meta;
hasMore = pagination['current-page'] < pagination['total-pages'];
pageNumber++;
}
}
// ==================== STATE VERSION OPERATIONS ====================
async getCurrentStateVersion(workspaceId: string): Promise<StateVersion | null> {
await this.rateLimiter.acquire();
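try {
const response = await this.httpClient.get(
`/workspaces/${workspaceId}/current-state-version`
);
return this.parseStateVersion(response.data.data);
} catch (error: any) {
// axios rejects on non-2xx responses by default, so a 404 arrives here as an error
if (error.response?.status === 404) {
return null; // No state version yet
}
throw error;
}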
}
async downloadState(stateVersionId: string): Promise<TerraformState> {
await this.rateLimiter.acquire();
// Get the download URL
const response = await this.httpClient.get(
`/state-versions/${stateVersionId}`
);
const downloadUrl = response.data.data.attributes['hosted-state-download-url'];
if (!downloadUrl) {
throw new Error('State download URL not available');
}
// Download the actual state (doesn't count against rate limit)
const stateResponse = await axios.get(downloadUrl);
return stateResponse.data;
}
async listStateVersions(
workspaceId: string,
limit: number = 10
): Promise<StateVersion[]> {
await this.rateLimiter.acquire();
const response = await this.httpClient.get(
`/workspaces/${workspaceId}/state-versions`,
{
params: {
'page[size]': limit,
'page[number]': 1
}
}
);
return response.data.data.map(this.parseStateVersion);
}
// ==================== CHANGE DETECTION ====================
async detectNewWorkspaces(
orgName: string,
since: Date
): Promise<Workspace[]> {
const allWorkspaces: Workspace[] = [];
for await (const batch of this.listWorkspacesWithPagination(orgName)) {
const newWorkspaces = batch.filter(ws =>
new Date(ws.attributes.createdAt) > since
);
allWorkspaces.push(...newWorkspaces);
}
return allWorkspaces;
}
async detectStateChanges(
workspaceIds: string[]
): Promise<StateChange[]> {
const changes: StateChange[] = [];
// Batch requests to respect rate limits
const batches = this.createBatches(workspaceIds, 20);
for (const batch of batches) {
const batchPromises = batch.map(async (wsId) => {
const stateVersion = await this.getCurrentStateVersion(wsId);
return { workspaceId: wsId, stateVersion };
});
const results = await Promise.all(batchPromises);
for (const result of results) {
if (result.stateVersion) {
changes.push({
workspaceId: result.workspaceId,
stateVersionId: result.stateVersion.id,
createdAt: result.stateVersion.attributes.createdAt,
serial: result.stateVersion.attributes.serial
});
}
}
}
return changes;
}
// ==================== HELPER METHODS ====================
private createHttpClient(): AxiosInstance {
return axios.create({
baseURL: this.config.baseUrl,
timeout: this.config.timeout,
headers: {
'Authorization': `Bearer ${this.config.credential.token}`,
'Content-Type': 'application/vnd.api+json'
}
});
}
private parseWorkspace(data: any): Workspace {
return {
id: data.id,
type: data.type,
attributes: {
name: data.attributes.name,
createdAt: data.attributes['created-at'],
updatedAt: data.attributes['updated-at'],
terraformVersion: data.attributes['terraform-version'],
locked: data.attributes.locked,
executionMode: data.attributes['execution-mode'],
vcsRepo: data.attributes['vcs-repo'],
workingDirectory: data.attributes['working-directory'],
tags: data.attributes['tag-names'] || []
},
relationships: data.relationships
};
}
private parseStateVersion(data: any): StateVersion {
return {
id: data.id,
type: data.type,
attributes: {
createdAt: data.attributes['created-at'],
serial: data.attributes.serial,
size: data.attributes.size,
hostedStateDownloadUrl: data.attributes['hosted-state-download-url'],
resources: data.attributes.resources || []
}
};
}
private createBatches<T>(items: T[], batchSize: number): T[][] {
const batches: T[][] = [];
for (let i = 0; i < items.length; i += batchSize) {
batches.push(items.slice(i, i + batchSize));
}
return batches;
}
}
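TFC list responses report paging under `meta.pagination` with kebab-case keys (`current-page`, `next-page`, `prev-page`, `total-pages`, `total-count`). The sketch below shows how a pagination loop might read that block. `TfcPagination` and `nextPageNumber` are illustrative names, not part of the client above.

```typescript
// Shape of the pagination block in TFC list responses (kebab-case keys).
interface TfcPagination {
  'current-page': number;
  'prev-page': number | null;
  'next-page': number | null;
  'total-pages': number;
  'total-count': number;
}

// Hypothetical helper: returns the next page to request, or null on the last page.
function nextPageNumber(pagination: TfcPagination): number | null {
  const current = pagination['current-page'];
  return current < pagination['total-pages'] ? current + 1 : null;
}

const firstOfThree: TfcPagination = {
  'current-page': 1,
  'prev-page': null,
  'next-page': 2,
  'total-pages': 3,
  'total-count': 250
};
const next = nextPageNumber(firstOfThree); // 2
```

A loop that stops when this returns `null` avoids requesting a page past the end.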
2.2 Rate Limiting Strategy
interface RateLimiterConfig {
requestsPerSecond: number; // TFC: 30 req/sec
burstCapacity: number;
backoffStrategy: BackoffStrategy;
}
enum BackoffStrategy {
EXPONENTIAL = 'exponential',
LINEAR = 'linear',
EXPONENTIAL_WITH_JITTER = 'exponential_jitter'
}
class RateLimiter {
private tokens: number;
private lastRefill: number;
private config: RateLimiterConfig;
private queue: Array<() => void> = [];
constructor(config: RateLimiterConfig) {
this.config = config;
this.tokens = config.burstCapacity;
this.lastRefill = Date.now();
// Refill tokens once per second; unref() stops the timer from keeping the Node.js process alive
setInterval(() => this.refill(), 1000).unref();
}
async acquire(): Promise<void> {
if (this.tokens > 0) {
this.tokens--;
return;
}
// Wait for token availability
return new Promise<void>((resolve) => {
this.queue.push(resolve);
});
}
private refill(): void {
const now = Date.now();
const elapsed = (now - this.lastRefill) / 1000;
const tokensToAdd = Math.floor(elapsed * this.config.requestsPerSecond);
this.tokens = Math.min(
this.tokens + tokensToAdd,
this.config.burstCapacity
);
this.lastRefill = now;
// Process queued requests
while (this.tokens > 0 && this.queue.length > 0) {
this.tokens--;
const resolve = this.queue.shift()!;
resolve();
}
}
async executeWithBackoff<T>(
fn: () => Promise<T>,
maxRetries: number = 5
): Promise<T> {
let attempt = 0;
while (attempt < maxRetries) {
try {
return await fn();
} catch (error: any) {
if (error.response?.status === 429) {
// Rate limited
const delay = this.calculateBackoff(attempt);
console.log(`Rate limited, waiting ${delay}ms before retry ${attempt + 1}`);
await this.sleep(delay);
attempt++;
} else {
throw error;
}
}
}
throw new Error(`Max retries exceeded after ${maxRetries} attempts`);
}
private calculateBackoff(attempt: number): number {
const baseDelay = 1000; // 1 second
switch (this.config.backoffStrategy) {
case BackoffStrategy.EXPONENTIAL:
return baseDelay * Math.pow(2, attempt);
case BackoffStrategy.LINEAR:
return baseDelay * (attempt + 1);
case BackoffStrategy.EXPONENTIAL_WITH_JITTER:
const exponential = baseDelay * Math.pow(2, attempt);
const jitter = Math.random() * 1000;
return exponential + jitter;
default:
return baseDelay;
}
}
private sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
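To make the three backoff strategies easier to compare, the sketch below reproduces the arithmetic of `calculateBackoff` as a standalone function. The name `backoffDelayMs` and the injectable `random` parameter are assumptions added here so the jitter case can be tested deterministically.

```typescript
type BackoffKind = 'exponential' | 'linear' | 'exponential_jitter';

// Delay in milliseconds before retry number `attempt` (0-based).
function backoffDelayMs(
  kind: BackoffKind,
  attempt: number,
  baseDelayMs: number = 1000,
  random: () => number = Math.random
): number {
  switch (kind) {
    case 'exponential':
      return baseDelayMs * Math.pow(2, attempt);
    case 'linear':
      return baseDelayMs * (attempt + 1);
    case 'exponential_jitter':
      // Up to one extra second of jitter, matching the class above.
      return baseDelayMs * Math.pow(2, attempt) + random() * 1000;
  }
}

// Delays for the five attempts allowed by the default maxRetries.
const exponentialSchedule = [0, 1, 2, 3, 4].map(a => backoffDelayMs('exponential', a));
// exponentialSchedule is [1000, 2000, 4000, 8000, 16000]
```

With the default of five retries, the exponential strategy waits 31 seconds in total before giving up. The jitter variant adds at most five more seconds across those retries.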
3. Workspace Auto-Discovery
3.1 Discovery Service Architecture
interface DiscoveryConfig {
organizations: string[];
schedule: string; // cron expression
filters: DiscoveryFilter[];
mappingRules: MappingRule[];
}
interface DiscoveryFilter {
type: 'name_pattern' | 'tags' | 'vcs_repo' | 'created_after';
pattern?: string;
tags?: string[];
repoPattern?: string;
date?: Date;
}
interface MappingRule {
// Maps TFC workspace → Backstage entity
sourcePattern: string; // Regex for workspace name
targetTemplate: string; // Backstage entity template
tenantExtraction: TenantExtractionStrategy;
businessUnitExtraction: BusinessUnitExtractionStrategy;
}
enum TenantExtractionStrategy {
FROM_ORG_NAME = 'org_name',
FROM_WORKSPACE_PREFIX = 'workspace_prefix',
FROM_TAGS = 'tags',
FROM_VCS_REPO = 'vcs_repo',
STATIC = 'static'
}
enum BusinessUnitExtractionStrategy {
FROM_WORKSPACE_NAME = 'workspace_name',
FROM_TAGS = 'tags',
FROM_VCS_REPO_PATH = 'vcs_repo_path',
STATIC = 'static'
}
class WorkspaceDiscoveryService {
private client: TerraformCloudClient;
private config: DiscoveryConfig;
private cache: WorkspaceCache;
private eventBus: EventBus;
constructor(
client: TerraformCloudClient,
config: DiscoveryConfig,
cache: WorkspaceCache,
eventBus: EventBus
) {
this.client = client;
this.config = config;
this.cache = cache;
this.eventBus = eventBus;
}
async discoverAll(): Promise<DiscoveryResult> {
const startTime = Date.now();
const results: DiscoveryResult = {
totalWorkspaces: 0,
newWorkspaces: 0,
updatedWorkspaces: 0,
errors: [],
duration: 0
};
for (const org of this.config.organizations) {
try {
const orgResult = await this.discoverOrganization(org);
results.totalWorkspaces += orgResult.totalWorkspaces;
results.newWorkspaces += orgResult.newWorkspaces;
results.updatedWorkspaces += orgResult.updatedWorkspaces;
} catch (error: any) {
results.errors.push({
organization: org,
error: error.message,
timestamp: new Date()
});
}
}
results.duration = Date.now() - startTime;
// Emit discovery completed event
await this.eventBus.publish({
type: 'discovery.completed',
payload: results,
timestamp: new Date()
});
return results;
}
async discoverOrganization(orgName: string): Promise<OrganizationDiscoveryResult> {
const result: OrganizationDiscoveryResult = {
organization: orgName,
totalWorkspaces: 0,
newWorkspaces: 0,
updatedWorkspaces: 0,
workspaces: []
};
// Retrieve cached workspaces
const cachedWorkspaces = await this.cache.getWorkspaces(orgName);
const cachedIds = new Set(cachedWorkspaces.map(ws => ws.id));
// Discover all workspaces
for await (const batch of this.client.listWorkspacesWithPagination(orgName)) {
const filteredBatch = this.applyFilters(batch);
for (const workspace of filteredBatch) {
result.totalWorkspaces++;
if (!cachedIds.has(workspace.id)) {
// New workspace
result.newWorkspaces++;
await this.processNewWorkspace(workspace, orgName);
} else {
// Check for updates
const cached = cachedWorkspaces.find(ws => ws.id === workspace.id)!;
if (this.hasChanged(workspace, cached)) {
result.updatedWorkspaces++;
await this.processUpdatedWorkspace(workspace, orgName);
}
}
result.workspaces.push(workspace);
}
}
// Update cache
await this.cache.setWorkspaces(orgName, result.workspaces);
return result;
}
private applyFilters(workspaces: Workspace[]): Workspace[] {
return workspaces.filter(workspace => {
return this.config.filters.every(filter => {
switch (filter.type) {
case 'name_pattern':
return new RegExp(filter.pattern!).test(workspace.attributes.name);
case 'tags':
return filter.tags!.some(tag =>
workspace.attributes.tags.includes(tag)
);
case 'vcs_repo':
if (!workspace.attributes.vcsRepo) return false;
return new RegExp(filter.repoPattern!).test(
workspace.attributes.vcsRepo.identifier
);
case 'created_after':
return new Date(workspace.attributes.createdAt) > filter.date!;
default:
return true;
}
});
});
}
private async processNewWorkspace(
workspace: Workspace,
orgName: string
): Promise<void> {
// Extract metadata
const metadata = this.extractMetadata(workspace, orgName);
// Emit event for downstream processing
await this.eventBus.publish({
type: 'workspace.discovered',
payload: {
workspace,
metadata,
organization: orgName
},
timestamp: new Date()
});
}
private async processUpdatedWorkspace(
workspace: Workspace,
orgName: string
): Promise<void> {
const metadata = this.extractMetadata(workspace, orgName);
await this.eventBus.publish({
type: 'workspace.updated',
payload: {
workspace,
metadata,
organization: orgName
},
timestamp: new Date()
});
}
private extractMetadata(
workspace: Workspace,
orgName: string
): WorkspaceMetadata {
const metadata: WorkspaceMetadata = {
tenant: this.extractTenant(workspace, orgName),
businessUnit: this.extractBusinessUnit(workspace),
tags: workspace.attributes.tags,
vcsRepo: workspace.attributes.vcsRepo?.identifier,
environment: this.extractEnvironment(workspace)
};
return metadata;
}
private extractTenant(workspace: Workspace, orgName: string): string {
for (const rule of this.config.mappingRules) {
if (new RegExp(rule.sourcePattern).test(workspace.attributes.name)) {
switch (rule.tenantExtraction) {
case TenantExtractionStrategy.FROM_ORG_NAME:
return orgName;
case TenantExtractionStrategy.FROM_WORKSPACE_PREFIX:
// Extract prefix before first hyphen
const match = workspace.attributes.name.match(/^([^-]+)-/);
return match ? match[1] : orgName;
case TenantExtractionStrategy.FROM_TAGS:
const tenantTag = workspace.attributes.tags.find(tag =>
tag.startsWith('tenant:')
);
return tenantTag ? tenantTag.split(':')[1] : orgName;
case TenantExtractionStrategy.FROM_VCS_REPO:
if (workspace.attributes.vcsRepo) {
// Extract org from repo identifier (e.g., "org/repo")
return workspace.attributes.vcsRepo.identifier.split('/')[0];
}
return orgName;
case TenantExtractionStrategy.STATIC:
return orgName;
}
}
}
return orgName; // Default to org name
}
private extractBusinessUnit(workspace: Workspace): string | undefined {
for (const rule of this.config.mappingRules) {
if (new RegExp(rule.sourcePattern).test(workspace.attributes.name)) {
switch (rule.businessUnitExtraction) {
case BusinessUnitExtractionStrategy.FROM_WORKSPACE_NAME:
// Extract segment from name (e.g., "client-bu-env" → "bu")
const parts = workspace.attributes.name.split('-');
return parts.length > 1 ? parts[1] : undefined;
case BusinessUnitExtractionStrategy.FROM_TAGS:
const buTag = workspace.attributes.tags.find(tag =>
tag.startsWith('business-unit:') || tag.startsWith('bu:')
);
return buTag ? buTag.split(':')[1] : undefined;
case BusinessUnitExtractionStrategy.FROM_VCS_REPO_PATH:
if (workspace.attributes.vcsRepo && workspace.attributes.workingDirectory) {
// Extract from path (e.g., "terraform/business-units/marketing")
const match = workspace.attributes.workingDirectory.match(/business-units\/([^\/]+)/);
return match ? match[1] : undefined;
}
return undefined;
case BusinessUnitExtractionStrategy.STATIC:
return undefined; // No static value in this case
}
}
}
return undefined;
}
private extractEnvironment(workspace: Workspace): string {
// Try tags first
const envTag = workspace.attributes.tags.find(tag =>
tag.startsWith('env:') || tag.startsWith('environment:')
);
if (envTag) {
return envTag.split(':')[1];
}
// Try workspace name
const name = workspace.attributes.name.toLowerCase();
if (name.includes('prod')) return 'production';
if (name.includes('staging') || name.includes('stage')) return 'staging';
if (name.includes('dev')) return 'development';
return 'unknown';
}
private hasChanged(current: Workspace, cached: Workspace): boolean {
return (
current.attributes.updatedAt !== cached.attributes.updatedAt ||
current.attributes.terraformVersion !== cached.attributes.terraformVersion ||
JSON.stringify(current.attributes.tags) !== JSON.stringify(cached.attributes.tags)
);
}
}
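A worked example may help when choosing extraction strategies. The sketch below applies the same rules as `FROM_WORKSPACE_PREFIX`, `FROM_WORKSPACE_NAME`, and the name-based fallback in `extractEnvironment` to a workspace name. `parseWorkspaceName` is a hypothetical standalone function, and the `client-bu-env` naming convention is taken from the comment in the code above.

```typescript
interface ParsedWorkspaceName {
  tenant: string;
  businessUnit?: string;
  environment: string;
}

// Hypothetical helper mirroring the name-based extraction rules.
function parseWorkspaceName(name: string, orgName: string): ParsedWorkspaceName {
  // Tenant: prefix before the first hyphen, falling back to the org name.
  const prefix = name.match(/^([^-]+)-/);
  // Business unit: second hyphen-separated segment, if present.
  const parts = name.split('-');

  const lower = name.toLowerCase();
  let environment = 'unknown';
  if (lower.includes('prod')) environment = 'production';
  else if (lower.includes('staging') || lower.includes('stage')) environment = 'staging';
  else if (lower.includes('dev')) environment = 'development';

  return {
    tenant: prefix ? prefix[1] : orgName,
    businessUnit: parts.length > 1 ? parts[1] : undefined,
    environment
  };
}

const parsed = parseWorkspaceName('acme-marketing-prod', 'acme-org');
// parsed is { tenant: 'acme', businessUnit: 'marketing', environment: 'production' }
```

Because the environment check is a substring match, a name such as `acme-data-nonprod` is also classified as `production`. Tag-based extraction (`env:` or `environment:`) avoids that ambiguity.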
3.2 Workspace Caching Strategy
interface WorkspaceCache {
getWorkspaces(orgName: string): Promise<Workspace[]>;
setWorkspaces(orgName: string, workspaces: Workspace[]): Promise<void>;
getWorkspace(workspaceId: string): Promise<Workspace | null>;
invalidate(orgName?: string): Promise<void>;
}
class RedisWorkspaceCache implements WorkspaceCache {
private redis: Redis;
private ttl: number = 3600; // 1 hour
constructor(redis: Redis) {
this.redis = redis;
}
async getWorkspaces(orgName: string): Promise<Workspace[]> {
const key = `tfc:workspaces:${orgName}`;
const cached = await this.redis.get(key);
if (!cached) {
return [];
}
return JSON.parse(cached);
}
async setWorkspaces(orgName: string, workspaces: Workspace[]): Promise<void> {
const key = `tfc:workspaces:${orgName}`;
await this.redis.setex(key, this.ttl, JSON.stringify(workspaces));
// Also cache individual workspaces
for (const workspace of workspaces) {
const wsKey = `tfc:workspace:${workspace.id}`;
await this.redis.setex(wsKey, this.ttl, JSON.stringify(workspace));
}
}
async getWorkspace(workspaceId: string): Promise<Workspace | null> {
const key = `tfc:workspace:${workspaceId}`;
const cached = await this.redis.get(key);
return cached ? JSON.parse(cached) : null;
}
async invalidate(orgName?: string): Promise<void> {
if (orgName) {
const key = `tfc:workspaces:${orgName}`;
await this.redis.del(key);
} else {
// Invalidate all. KEYS blocks Redis while it walks the keyspace; prefer SCAN on large datasets
const keys = await this.redis.keys('tfc:*');
if (keys.length > 0) {
await this.redis.del(...keys);
}
}
}
}
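For unit tests, where a Redis instance may not be available, the expiry behaviour of the cache can be approximated in memory. The sketch below is one possible shape: `TtlMap` is a hypothetical class, and the explicit `nowMs` parameters exist only so that expiry can be tested without waiting.

```typescript
interface TtlEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical in-memory stand-in for SETEX/GET/DEL with a fixed TTL.
class TtlMap<V> {
  private entries = new Map<string, TtlEntry<V>>();
  private ttlMs: number;

  constructor(ttlSeconds: number = 3600) {
    this.ttlMs = ttlSeconds * 1000;
  }

  set(key: string, value: V, nowMs: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
  }

  get(key: string, nowMs: number = Date.now()): V | null {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (nowMs >= entry.expiresAt) {
      // Expired: drop the entry lazily on read.
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  // Removes every key starting with `prefix` and returns how many were removed.
  deleteByPrefix(prefix: string): number {
    let removed = 0;
    this.entries.forEach((_entry, key) => {
      if (key.startsWith(prefix)) {
        this.entries.delete(key);
        removed++;
      }
    });
    return removed;
  }
}
```

An in-memory `WorkspaceCache` for tests could wrap two such maps, one keyed by organization and one by workspace ID, and return the results as resolved promises.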
4. Event-Driven Updates
4.1 Webhook Receiver Architecture
interface WebhookConfig {
endpoint: string; // /webhooks/terraform-cloud
signingSecret: string;
eventFilters: string[];
}
interface WebhookPayload {
payload_version: number;
notification_configuration_id: string;
run_url: string;
run_id: string;
run_message: string;
run_created_at: string;
run_created_by: string;
workspace_id: string;
workspace_name: string;
organization_name: string;
notifications: TFCNotification[];
}
interface TFCNotification {
message: string;
trigger: string; // "run:completed", "run:errored", etc.
run_status: string; // "applied", "errored", "planned", etc.
run_updated_at: string;
run_updated_by: string;
}
class TerraformCloudWebhookReceiver {
private config: WebhookConfig;
private eventQueue: EventQueue;
private verifier: SignatureVerifier;
constructor(
config: WebhookConfig,
eventQueue: EventQueue,
verifier: SignatureVerifier
) {
this.config = config;
this.eventQueue = eventQueue;
this.verifier = verifier;
}
async handleWebhook(
request: Request
): Promise<WebhookResponse> {
try {
// 1. Verify signature
// TFC sends the HMAC-SHA512 signature in the X-TFE-Notify-Signature header
const signature = request.headers.get('X-TFE-Notify-Signature');
const body = await request.text();
if (!this.verifier.verify(body, signature, this.config.signingSecret)) {
return {
status: 401,
message: 'Invalid signature'
};
}
// 2. Parse payload
const payload: WebhookPayload = JSON.parse(body);
// 3. Filter events
if (!this.shouldProcess(payload)) {
return {
status: 200,
message: 'Event filtered'
};
}
// 4. Queue for processing
await this.eventQueue.enqueue({
type: 'tfc.webhook',
payload,
receivedAt: new Date()
});
return {
status: 202,
message: 'Event queued for processing'
};
} catch (error: any) {
console.error('Webhook processing error:', error);
return {
status: 500,
message: 'Internal server error'
};
}
}
private shouldProcess(payload: WebhookPayload): boolean {
// Only process successful applies
const successfulApply = payload.notifications.some(n =>
n.trigger === 'run:completed' &&
n.run_status === 'applied'
);
if (!successfulApply) {
return false;
}
// Check event filters
if (this.config.eventFilters.length === 0) {
return true;
}
return this.config.eventFilters.some(filter =>
payload.notifications.some(n => n.trigger === filter)
);
}
}
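import * as crypto from 'crypto';
class SignatureVerifier {
verify(payload: string, signature: string | null, secret: string): boolean {
// A missing header can never match
if (!signature) return false;
const expectedSignature = crypto.createHmac('sha512', secret).update(payload).digest('hex');
const received = Buffer.from(signature);
const expected = Buffer.from(expectedSignature);
// timingSafeEqual throws when the buffers differ in length, so compare lengths first
return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}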
}
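Signature checking can be exercised locally by signing a sample body with the same secret and confirming that tampered input is rejected. The sketch below assumes a hex-encoded HMAC-SHA512 over the raw request body. `signPayload` and `isValidSignature` are illustrative names for test fixtures, not part of the receiver.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Produces the hex HMAC-SHA512 of a raw request body.
function signPayload(body: string, secret: string): string {
  return createHmac('sha512', secret).update(body).digest('hex');
}

// Constant-time comparison that tolerates missing or malformed signatures.
function isValidSignature(
  body: string,
  signature: string | null,
  secret: string
): boolean {
  if (!signature) return false;
  const expected = Buffer.from(signPayload(body, secret));
  const received = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

const sampleBody = JSON.stringify({ payload_version: 1, workspace_id: 'ws-example' });
const sampleSignature = signPayload(sampleBody, 'test-secret');
```

The signature must be computed over the exact bytes received. Verifying against a re-serialized JSON object can fail even for a genuine request.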
4.2 Event Processing Pipeline
interface EventQueue {
enqueue(event: QueuedEvent): Promise<void>;
dequeue(): Promise<QueuedEvent | null>;
acknowledge(eventId: string): Promise<void>;
requeue(eventId: string, delay: number): Promise<void>;
}
interface QueuedEvent {
id?: string;
type: string;
payload: any;
receivedAt: Date;
attempts?: number;
maxAttempts?: number;
}
class WebhookEventProcessor {
private queue: EventQueue;
private client: TerraformCloudClient;
private stateIngestionService: StateIngestionService;
private running: boolean = false;
constructor(
queue: EventQueue,
client: TerraformCloudClient,
stateIngestionService: StateIngestionService
) {
this.queue = queue;
this.client = client;
this.stateIngestionService = stateIngestionService;
}
async start(): Promise<void> {
this.running = true;
while (this.running) {
try {
const event = await this.queue.dequeue();
if (!event) {
await this.sleep(1000);
continue;
}
await this.processEvent(event);
await this.queue.acknowledge(event.id!);
} catch (error: any) {
console.error('Event processing error:', error);
}
}
}
stop(): void {
this.running = false;
}
private async processEvent(event: QueuedEvent): Promise<void> {
const payload = event.payload as WebhookPayload;
try {
// Get workspace details
const workspace = await this.client.getWorkspace(payload.workspace_id);
// Get current state version
const stateVersion = await this.client.getCurrentStateVersion(payload.workspace_id);
if (!stateVersion) {
console.log(`No state version for workspace ${payload.workspace_id}`);
return;
}
// Download and ingest state
const state = await this.client.downloadState(stateVersion.id);
await this.stateIngestionService.ingestState({
workspace,
stateVersion,
state,
trigger: 'webhook',
runId: payload.run_id
});
console.log(`Successfully processed state for workspace ${workspace.attributes.name}`);
} catch (error: any) {
console.error(`Failed to process event for workspace ${payload.workspace_id}:`, error);
// Requeue with exponential backoff
const attempts = (event.attempts || 0) + 1;
const maxAttempts = event.maxAttempts || 5;
if (attempts < maxAttempts) {
const delay = Math.pow(2, attempts) * 1000; // Exponential backoff
await this.queue.requeue(event.id!, delay);
} else {
console.error(`Max attempts reached for event ${event.id}, discarding`);
}
throw error;
}
}
private sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
4.3 Queue Implementation (BullMQ)
import { Queue, Worker } from 'bullmq';
import Redis from 'ioredis';
class BullMQEventQueue implements EventQueue {
private queue: Queue;
private connection: Redis;
constructor(redisUrl: string) {
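// BullMQ workers require maxRetriesPerRequest to be null on a shared ioredis connection
this.connection = new Redis(redisUrl, { maxRetriesPerRequest: null });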
this.queue = new Queue('tfc-events', {
connection: this.connection
});
}
async enqueue(event: QueuedEvent): Promise<void> {
await this.queue.add(
event.type,
event.payload,
{
attempts: event.maxAttempts || 5,
backoff: {
type: 'exponential',
delay: 1000
},
removeOnComplete: {
age: 86400, // Keep for 24 hours
count: 1000
},
removeOnFail: {
age: 604800 // Keep failures for 7 days
}
}
);
}
async dequeue(): Promise<QueuedEvent | null> {
// Handled by Worker in BullMQ
return null;
}
async acknowledge(eventId: string): Promise<void> {
// Handled automatically by Worker
}
async requeue(eventId: string, delay: number): Promise<void> {
// Handled automatically by Worker with backoff
}
createWorker(processor: (event: QueuedEvent) => Promise<void>): Worker {
return new Worker(
'tfc-events',
async (job) => {
await processor({
id: job.id,
type: job.name,
payload: job.data,
receivedAt: new Date(job.timestamp),
attempts: job.attemptsMade
});
},
{
connection: this.connection,
concurrency: 5
}
);
}
}
5. Multi-Organization Support
5.1 Organization Configuration
interface OrganizationConfig {
id: string;
name: string;
tfcOrgName: string;
credentialRef: string; // Reference to credential in secret store
tenantId: string; // Maps to Backstage tenant
discoveryEnabled: boolean;
webhookEnabled: boolean;
discoverySchedule: string; // Cron expression
filters: DiscoveryFilter[];
mappingRules: MappingRule[];
}
class MultiOrgManager {
private configs: Map<string, OrganizationConfig>;
private clients: Map<string, TerraformCloudClient>;
private credentialStore: OrganizationCredentialStore;
constructor(credentialStore: OrganizationCredentialStore) {
this.configs = new Map();
this.clients = new Map();
this.credentialStore = credentialStore;
}
async loadConfigurations(configPath: string): Promise<void> {
// Load from configuration file or database
const configs = await this.readConfigFile(configPath);
for (const config of configs) {
this.configs.set(config.id, config);
// Initialize client for each org
await this.initializeClient(config);
}
}
private async initializeClient(config: OrganizationConfig): Promise<void> {
// Get credential from secure store
const credential = await this.credentialStore.getCredential(config.tfcOrgName);
const client = new TerraformCloudClient({
baseUrl: 'https://app.terraform.io/api/v2',
credential,
rateLimiter: new RateLimiter({
requestsPerSecond: 25, // Conservative limit
burstCapacity: 25, // Keep a full burst under TFC's 30 req/sec limit
backoffStrategy: BackoffStrategy.EXPONENTIAL_WITH_JITTER
}),
retryPolicy: {
maxRetries: 3,
initialDelay: 1000,
maxDelay: 10000
},
timeout: 30000
});
await client.authenticate();
this.clients.set(config.id, client);
}
getClient(orgId: string): TerraformCloudClient {
const client = this.clients.get(orgId);
if (!client) {
throw new Error(`No client found for organization ${orgId}`);
}
return client;
}
getConfig(orgId: string): OrganizationConfig {
const config = this.configs.get(orgId);
if (!config) {
throw new Error(`No configuration found for organization ${orgId}`);
}
return config;
}
listOrganizations(): OrganizationConfig[] {
return Array.from(this.configs.values());
}
async enforceIsolation(
userId: string,
orgId: string
): Promise<boolean> {
// Check if user has access to this organization
// This would integrate with your authorization system
const config = this.getConfig(orgId);
const tenantId = config.tenantId;
// Check user's tenant membership
const userTenants = await this.getUserTenants(userId);
return userTenants.includes(tenantId);
}
private async getUserTenants(userId: string): Promise<string[]> {
// Integration with your user management system
// This is a placeholder
return [];
}
private async readConfigFile(path: string): Promise<OrganizationConfig[]> {
// Implementation depends on config format (YAML, JSON, etc.)
return [];
}
}
6. Error Handling & Recovery
6.1 Error Types and Strategies
enum TFCErrorType {
AUTHENTICATION = 'authentication',
RATE_LIMIT = 'rate_limit',
NOT_FOUND = 'not_found',
API_UNAVAILABLE = 'api_unavailable',
NETWORK = 'network',
STATE_DOWNLOAD = 'state_download',
VALIDATION = 'validation'
}
class TFCError extends Error {
constructor(
message: string,
public type: TFCErrorType,
public recoverable: boolean,
public retryAfter?: number,
public originalError?: Error
) {
super(message);
this.name = 'TFCError';
}
}
class ErrorHandler {
async handle(error: Error, context: ErrorContext): Promise<ErrorHandlingResult> {
const tfcError = this.categorizeError(error);
switch (tfcError.type) {
case TFCErrorType.AUTHENTICATION:
return this.handleAuthenticationError(tfcError, context);
case TFCErrorType.RATE_LIMIT:
return this.handleRateLimitError(tfcError, context);
case TFCErrorType.API_UNAVAILABLE:
return this.handleApiUnavailableError(tfcError, context);
case TFCErrorType.STATE_DOWNLOAD:
return this.handleStateDownloadError(tfcError, context);
default:
return this.handleGenericError(tfcError, context);
}
}
private categorizeError(error: Error): TFCError {
if (error instanceof TFCError) {
return error;
}
if (axios.isAxiosError(error)) {
const status = error.response?.status;
switch (status) {
case 401:
case 403:
return new TFCError(
'Authentication failed',
TFCErrorType.AUTHENTICATION,
true,
undefined,
error
);
case 429:
const retryAfter = parseInt(error.response?.headers['retry-after'] || '60', 10);
return new TFCError(
'Rate limit exceeded',
TFCErrorType.RATE_LIMIT,
true,
retryAfter,
error
);
case 404:
return new TFCError(
'Resource not found',
TFCErrorType.NOT_FOUND,
false,
undefined,
error
);
case 502:
case 503:
case 504:
return new TFCError(
'API unavailable',
TFCErrorType.API_UNAVAILABLE,
true,
60,
error
);
default:
return new TFCError(
error.message,
TFCErrorType.NETWORK,
true,
undefined,
error
);
}
}
return new TFCError(
error.message,
TFCErrorType.VALIDATION,
false,
undefined,
error
);
}
private async handleAuthenticationError(
error: TFCError,
context: ErrorContext
): Promise<ErrorHandlingResult> {
console.error(`Authentication error for org ${context.organization}:`, error);
// Attempt credential rotation
try {
await context.credentialStore.rotateCredential(context.organization);
return {
action: ErrorAction.RETRY,
delay: 1000,
message: 'Credential rotated, retrying'
};
} catch (rotationError) {
return {
action: ErrorAction.FAIL,
message: 'Credential rotation failed',
requiresManualIntervention: true
};
}
}
private async handleRateLimitError(
error: TFCError,
context: ErrorContext
): Promise<ErrorHandlingResult> {
const delay = (error.retryAfter || 60) * 1000;
console.warn(`Rate limited for org ${context.organization}, waiting ${delay}ms`);
return {
action: ErrorAction.RETRY,
delay,
message: `Rate limited, retry after ${delay / 1000}s`
};
}
private async handleApiUnavailableError(
error: TFCError,
context: ErrorContext
): Promise<ErrorHandlingResult> {
// Check if circuit breaker should open
const failures = await context.circuitBreaker.recordFailure(context.organization);
if (failures > 5) {
await context.circuitBreaker.open(context.organization);
return {
action: ErrorAction.FAIL,
message: 'Circuit breaker opened due to repeated API failures',
circuitBreakerOpen: true
};
}
return {
action: ErrorAction.RETRY,
delay: (error.retryAfter || 60) * 1000,
message: 'API temporarily unavailable, retrying'
};
}
private async handleStateDownloadError(
error: TFCError,
context: ErrorContext
): Promise<ErrorHandlingResult> {
// State download failures are often transient
return {
action: ErrorAction.RETRY,
delay: 5000,
maxRetries: 3,
message: 'State download failed, retrying'
};
}
private async handleGenericError(
error: TFCError,
context: ErrorContext
): Promise<ErrorHandlingResult> {
if (error.recoverable) {
return {
action: ErrorAction.RETRY,
delay: 5000,
maxRetries: 3,
message: `Recoverable error: ${error.message}`
};
}
return {
action: ErrorAction.FAIL,
message: `Unrecoverable error: ${error.message}`
};
}
}
interface ErrorContext {
organization: string;
workspaceId?: string;
operation: string;
credentialStore: OrganizationCredentialStore;
circuitBreaker: CircuitBreaker;
}
interface ErrorHandlingResult {
action: ErrorAction;
delay?: number;
maxRetries?: number;
message: string;
requiresManualIntervention?: boolean;
circuitBreakerOpen?: boolean;
}
enum ErrorAction {
RETRY = 'retry',
SKIP = 'skip',
FAIL = 'fail'
}
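The handlers above only return an ErrorHandlingResult; something still has to act on it. The following is a minimal sketch of such a driver loop, not the platform's actual implementation. The names `executeWithHandling`, `HandlingResult`, and the injectable `sleep` parameter are hypothetical and introduced only for illustration; the result shape mirrors ErrorHandlingResult.

```typescript
// Hypothetical driver loop. HandlingResult mirrors ErrorHandlingResult above.
type HandlingAction = 'retry' | 'skip' | 'fail';

interface HandlingResult {
  action: HandlingAction;
  delay?: number;      // milliseconds to wait before the next attempt
  maxRetries?: number; // per-error retry budget, if the handler sets one
  message: string;
}

async function executeWithHandling<T>(
  operation: () => Promise<T>,
  handle: (error: unknown) => Promise<HandlingResult>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  defaultMaxRetries = 3 // assumed default when the handler sets no budget
): Promise<T | undefined> {
  let retries = 0;
  for (;;) {
    try {
      return await operation();
    } catch (error) {
      const result = await handle(error);
      if (result.action === 'skip') {
        return undefined; // caller treats undefined as "skipped"
      }
      const limit = result.maxRetries ?? defaultMaxRetries;
      if (result.action === 'fail' || retries >= limit) {
        throw new Error(result.message);
      }
      retries++;
      await sleep(result.delay ?? 0);
    }
  }
}
```

Injecting `sleep` keeps the loop testable without real delays; in production the default timer-based implementation would be used.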
class CircuitBreaker {
private failures: Map<string, number> = new Map();
private openCircuits: Set<string> = new Set();
private resetTimers: Map<string, NodeJS.Timeout> = new Map();
async recordFailure(key: string): Promise<number> {
const current = this.failures.get(key) || 0;
const updated = current + 1;
this.failures.set(key, updated);
return updated;
}
async recordSuccess(key: string): Promise<void> {
this.failures.delete(key);
this.close(key);
}
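async open(key: string): Promise<void> {
this.openCircuits.add(key);
// Clear any pending reset so that re-opening restarts the 5-minute window
const existing = this.resetTimers.get(key);
if (existing) {
clearTimeout(existing);
}
// Auto-close after 5 minutes
const timer = setTimeout(() => {
this.close(key);
}, 5 * 60 * 1000);
this.resetTimers.set(key, timer);
}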
close(key: string): void {
this.openCircuits.delete(key);
this.failures.delete(key);
const timer = this.resetTimers.get(key);
if (timer) {
clearTimeout(timer);
this.resetTimers.delete(key);
}
}
isOpen(key: string): boolean {
return this.openCircuits.has(key);
}
}
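The CircuitBreaker above relies on setTimeout to close circuits, which keeps timers alive in the process. As an alternative sketch (a different technique, not the design above), the breaker can store the time it opened and compare against a clock on each check. The class name `TimestampCircuitBreaker`, its constructor parameters, and the injectable `now` clock are assumptions made for illustration.

```typescript
// Alternative sketch: timestamp-based cooldown instead of timers.
class TimestampCircuitBreaker {
  private failures = new Map<string, number>();
  private openedAt = new Map<string, number>();

  constructor(
    private readonly threshold = 5,              // open once failures exceed this
    private readonly cooldownMs = 5 * 60 * 1000, // matches the 5-minute auto-close
    private readonly now: () => number = () => Date.now()
  ) {}

  recordFailure(key: string): number {
    const count = (this.failures.get(key) ?? 0) + 1;
    this.failures.set(key, count);
    if (count > this.threshold) {
      this.openedAt.set(key, this.now());
    }
    return count;
  }

  recordSuccess(key: string): void {
    this.failures.delete(key);
    this.openedAt.delete(key);
  }

  isOpen(key: string): boolean {
    const opened = this.openedAt.get(key);
    if (opened === undefined) {
      return false;
    }
    if (this.now() - opened >= this.cooldownMs) {
      this.recordSuccess(key); // cooldown elapsed: close and reset the counter
      return false;
    }
    return true;
  }
}
```

A caller would check `isOpen(org)` before issuing a request and call `recordSuccess(org)` after any successful response.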
6.2 Partial Processing Strategy
class PartialProcessingManager {
async processWorkspacesBatch(
workspaces: Workspace[],
processor: (workspace: Workspace) => Promise<void>
): Promise<BatchResult> {
const results: BatchResult = {
total: workspaces.length,
succeeded: 0,
failed: 0,
errors: []
};
// Process in batches of 10
const batches = this.createBatches(workspaces, 10);
for (const batch of batches) {
const promises = batch.map(async (workspace) => {
try {
await processor(workspace);
results.succeeded++;
} catch (error: any) {
results.failed++;
results.errors.push({
workspaceId: workspace.id,
workspaceName: workspace.attributes.name,
error: error.message,
timestamp: new Date()
});
}
});
await Promise.allSettled(promises);
}
return results;
}
private createBatches<T>(items: T[], batchSize: number): T[][] {
const batches: T[][] = [];
for (let i = 0; i < items.length; i += batchSize) {
batches.push(items.slice(i, i + batchSize));
}
return batches;
}
}
interface BatchResult {
total: number;
succeeded: number;
failed: number;
errors: BatchError[];
}
interface BatchError {
workspaceId: string;
workspaceName: string;
error: string;
timestamp: Date;
}
7. Implementation Examples
7.1 Complete Discovery Flow
// Example: Full discovery and ingestion flow
async function runDiscoveryAndIngestion() {
// 1. Initialize components
const credentialStore = new VaultCredentialStore(vaultClient);
const multiOrgManager = new MultiOrgManager(credentialStore);
await multiOrgManager.loadConfigurations('./config/organizations.yaml');
// 2. Create the shared cache and event bus
const cache = new RedisWorkspaceCache(redisClient);
const eventBus = new EventBusImpl();
// 3. Register the ingestion handler before discovery starts, so that
// 'workspace.discovered' events published during discovery are not missed
eventBus.subscribe('workspace.discovered', async (event) => {
const { workspace, metadata, organization } = event.payload;
// Ingest workspace metadata
await database.workspaces.upsert({
id: workspace.id,
name: workspace.attributes.name,
organization,
tenant: metadata.tenant,
businessUnit: metadata.businessUnit,
environment: metadata.environment,
vcsRepo: metadata.vcsRepo,
tags: metadata.tags
});
// Trigger state ingestion
// Assumes event.payload.organization is the configuration id that getClient() expects
const client = multiOrgManager.getClient(organization);
const stateVersion = await client.getCurrentStateVersion(workspace.id);
if (stateVersion) {
const state = await client.downloadState(stateVersion.id);
await stateIngestionService.ingestState({
workspace,
stateVersion,
state,
trigger: 'discovery'
});
}
});
// 4. Run discovery for all organizations
for (const orgConfig of multiOrgManager.listOrganizations()) {
if (!orgConfig.discoveryEnabled) continue;
const client = multiOrgManager.getClient(orgConfig.id);
const discoveryService = new WorkspaceDiscoveryService(
client,
{
organizations: [orgConfig.tfcOrgName],
schedule: orgConfig.discoverySchedule,
filters: orgConfig.filters,
mappingRules: orgConfig.mappingRules
},
cache,
eventBus
);
try {
const result = await discoveryService.discoverAll();
console.log(`Discovery completed for ${orgConfig.name}:`);
console.log(` Total: ${result.totalWorkspaces}`);
console.log(` New: ${result.newWorkspaces}`);
console.log(` Updated: ${result.updatedWorkspaces}`);
console.log(` Duration: ${result.duration}ms`);
} catch (error) {
console.error(`Discovery failed for ${orgConfig.name}:`, error);
}
}
}
7.2 Webhook Processing Flow
// Example: Webhook receiver and processor
async function setupWebhookProcessing() {
// 1. Initialize queue
const eventQueue = new BullMQEventQueue(process.env.REDIS_URL!);
// 2. Create webhook receiver
const webhookConfig: WebhookConfig = {
endpoint: '/webhooks/terraform-cloud',
signingSecret: process.env.TFC_WEBHOOK_SECRET!,
eventFilters: ['run:completed']
};
const verifier = new SignatureVerifier();
const receiver = new TerraformCloudWebhookReceiver(
webhookConfig,
eventQueue,
verifier
);
// 3. Setup HTTP endpoint
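app.post(webhookConfig.endpoint, async (req, res) => {
try {
const response = await receiver.handleWebhook(req);
res.status(response.status).json({ message: response.message });
} catch (error) {
// Respond explicitly so a rejected promise does not leave the request hanging
console.error('Webhook handling failed:', error);
res.status(500).json({ message: 'Internal error' });
}
});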
// 4. Start event processor
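// vaultClient is assumed to be in scope, as in section 7.1
const credentialStore = new VaultCredentialStore(vaultClient);
const multiOrgManager = new MultiOrgManager(credentialStore);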
await multiOrgManager.loadConfigurations('./config/organizations.yaml');
const worker = eventQueue.createWorker(async (event) => {
const payload = event.payload as WebhookPayload;
// Find the client for this organization
const orgConfig = multiOrgManager
.listOrganizations()
.find(c => c.tfcOrgName === payload.organization_name);
if (!orgConfig) {
console.error(`No configuration for org ${payload.organization_name}`);
return;
}
const client = multiOrgManager.getClient(orgConfig.id);
// Process the state update
const workspace = await client.getWorkspace(payload.workspace_id);
const stateVersion = await client.getCurrentStateVersion(payload.workspace_id);
if (stateVersion) {
const state = await client.downloadState(stateVersion.id);
await stateIngestionService.ingestState({
workspace,
stateVersion,
state,
trigger: 'webhook',
runId: payload.run_id
});
}
});
console.log('Webhook processing started');
}
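The receiver above delegates signature checking to SignatureVerifier, which is defined elsewhere in this document. For orientation, a minimal sketch of the check is shown here. It assumes, per Terraform Cloud's notification documentation as understood at the time of writing, that the signature is the HMAC-SHA512 hex digest of the raw request body keyed with the notification token and sent in the X-TFE-Notification-Signature header; confirm this against the current documentation. The function name `verifyNotificationSignature` is hypothetical.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical helper: returns true only when the header matches the
// HMAC-SHA512 hex digest of the raw (unparsed) request body.
function verifyNotificationSignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string
): boolean {
  if (!signatureHeader) {
    return false;
  }
  const expected = createHmac('sha512', secret).update(rawBody).digest('hex');
  const expectedBytes = Buffer.from(expected, 'utf8');
  const receivedBytes = Buffer.from(signatureHeader, 'utf8');
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return (
    expectedBytes.length === receivedBytes.length &&
    timingSafeEqual(expectedBytes, receivedBytes)
  );
}
```

The digest must be computed over the exact bytes received, so the HTTP framework has to expose the raw body rather than a re-serialized JSON object.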
8. Monitoring and Observability
interface MetricsCollector {
recordAPICall(org: string, endpoint: string, duration: number): void;
recordRateLimit(org: string): void;
recordError(org: string, errorType: TFCErrorType): void;
recordDiscovery(org: string, result: DiscoveryResult): void;
recordStateIngestion(org: string, workspaceId: string, duration: number): void;
}
class PrometheusMetricsCollector implements MetricsCollector {
private apiCallDuration: Histogram;
private rateLimitCounter: Counter;
private errorCounter: Counter;
private discoveryGauge: Gauge;
private stateIngestionDuration: Histogram;
constructor(registry: Registry) {
this.apiCallDuration = new Histogram({
name: 'tfc_api_call_duration_seconds',
help: 'Duration of TFC API calls',
labelNames: ['organization', 'endpoint'],
registers: [registry]
});
this.rateLimitCounter = new Counter({
name: 'tfc_rate_limit_total',
help: 'Total rate limit occurrences',
labelNames: ['organization'],
registers: [registry]
});
this.errorCounter = new Counter({
name: 'tfc_errors_total',
help: 'Total errors by type',
labelNames: ['organization', 'error_type'],
registers: [registry]
});
this.discoveryGauge = new Gauge({
name: 'tfc_discovery_workspaces',
help: 'Number of workspaces discovered',
labelNames: ['organization', 'type'],
registers: [registry]
});
this.stateIngestionDuration = new Histogram({
name: 'tfc_state_ingestion_duration_seconds',
help: 'Duration of state ingestion',
// Note: a per-workspace label can create high-cardinality series in large organizations
labelNames: ['organization', 'workspace'],
registers: [registry]
});
}
recordAPICall(org: string, endpoint: string, duration: number): void {
this.apiCallDuration.observe({ organization: org, endpoint }, duration / 1000);
}
recordRateLimit(org: string): void {
this.rateLimitCounter.inc({ organization: org });
}
recordError(org: string, errorType: TFCErrorType): void {
this.errorCounter.inc({ organization: org, error_type: errorType });
}
recordDiscovery(org: string, result: DiscoveryResult): void {
this.discoveryGauge.set(
{ organization: org, type: 'total' },
result.totalWorkspaces
);
this.discoveryGauge.set(
{ organization: org, type: 'new' },
result.newWorkspaces
);
this.discoveryGauge.set(
{ organization: org, type: 'updated' },
result.updatedWorkspaces
);
}
recordStateIngestion(org: string, workspaceId: string, duration: number): void {
this.stateIngestionDuration.observe(
{ organization: org, workspace: workspaceId },
duration / 1000
);
}
}
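The collector above expects durations in milliseconds and converts them to seconds for the histogram. One way to feed it, sketched here under assumptions, is a small wrapper that times each API call and records the duration whether the call succeeds or fails. The names `withApiMetrics` and `ApiCallRecorder` are hypothetical; `ApiCallRecorder` is just the subset of MetricsCollector that the wrapper needs.

```typescript
// Subset of MetricsCollector used by the wrapper.
interface ApiCallRecorder {
  recordAPICall(org: string, endpoint: string, duration: number): void;
}

// Hypothetical wrapper: times `call` and records the duration in milliseconds.
async function withApiMetrics<T>(
  recorder: ApiCallRecorder,
  org: string,
  endpoint: string,
  call: () => Promise<T>,
  now: () => number = () => Date.now() // injectable clock for tests
): Promise<T> {
  const start = now();
  try {
    return await call();
  } finally {
    // Runs on success and on failure, so failed calls are timed as well
    recorder.recordAPICall(org, endpoint, now() - start);
  }
}
```

A client method could then wrap its request, for example `withApiMetrics(metrics, org, '/workspaces', () => fetchWorkspaces())`, where `fetchWorkspaces` stands in for whatever performs the HTTP call.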
9. Configuration Example
# config/organizations.yaml
organizations:
- id: client-a
name: "Client A"
tfcOrgName: "client-a-prod"
credentialRef: "vault:secret/tfc/client-a"
tenantId: "client-a"
discoveryEnabled: true
webhookEnabled: true
discoverySchedule: "0 */6 * * *" # Every 6 hours
filters:
- type: tags
tags:
- "managed:true"
- "environment:production"
- type: name_pattern
pattern: "^(prod|staging)-.*"
mappingRules:
- sourcePattern: "^([^-]+)-([^-]+)-.*"
targetTemplate: "infrastructure-resource"
tenantExtraction: FROM_WORKSPACE_PREFIX
businessUnitExtraction: FROM_WORKSPACE_NAME
- id: client-b
name: "Client B"
tfcOrgName: "client-b-infrastructure"
credentialRef: "vault:secret/tfc/client-b"
tenantId: "client-b"
discoveryEnabled: true
webhookEnabled: false
discoverySchedule: "0 0 * * *" # Daily at midnight
filters:
- type: created_after
date: "2024-01-01T00:00:00Z"
mappingRules:
- sourcePattern: ".*"
targetTemplate: "infrastructure-resource"
tenantExtraction: FROM_ORG_NAME
businessUnitExtraction: FROM_TAGS
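A configuration file like the one above is easy to get subtly wrong (duplicate ids, malformed cron strings, invalid regular expressions). The following is a minimal validation sketch that could run after the YAML has been parsed into objects by whatever loader `loadConfigurations` uses. The function `validateOrganizations` and the `*Entry` interfaces are hypothetical and cover only a subset of the fields shown; the cron check only counts fields and does not validate their values.

```typescript
// Hypothetical shapes covering a subset of the YAML fields above.
interface FilterEntry {
  type: string;
  pattern?: string;
  tags?: string[];
  date?: string;
}

interface MappingRuleEntry {
  sourcePattern: string;
  targetTemplate: string;
}

interface OrganizationEntry {
  id: string;
  name: string;
  tfcOrgName: string;
  credentialRef: string;
  discoveryEnabled: boolean;
  discoverySchedule?: string;
  filters?: FilterEntry[];
  mappingRules?: MappingRuleEntry[];
}

function isValidRegex(source: string): boolean {
  try {
    new RegExp(source);
    return true;
  } catch {
    return false;
  }
}

// Returns a list of human-readable problems; an empty list means no issues found.
function validateOrganizations(entries: OrganizationEntry[]): string[] {
  const problems: string[] = [];
  const seenIds = new Set<string>();
  for (const entry of entries) {
    if (seenIds.has(entry.id)) {
      problems.push(`${entry.id}: duplicate organization id`);
    }
    seenIds.add(entry.id);
    if (entry.discoveryEnabled) {
      const fields = (entry.discoverySchedule ?? '').trim().split(/\s+/);
      if (fields.length !== 5) {
        problems.push(`${entry.id}: discoverySchedule must have 5 cron fields`);
      }
    }
    for (const filter of entry.filters ?? []) {
      if (filter.type === 'name_pattern' && !isValidRegex(filter.pattern ?? '(')) {
        problems.push(`${entry.id}: invalid name_pattern filter`);
      }
    }
    for (const rule of entry.mappingRules ?? []) {
      if (!isValidRegex(rule.sourcePattern)) {
        problems.push(`${entry.id}: invalid sourcePattern "${rule.sourcePattern}"`);
      }
    }
  }
  return problems;
}
```

Running this at startup and refusing to start on a non-empty result would surface configuration mistakes before any discovery run is scheduled.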
10. Next Steps
- Implementation Priority:
  - ✅ API client with authentication
  - ✅ Rate limiting and error handling
  - ✅ Workspace discovery service
  - ✅ Multi-organization support
  - ✅ Webhook receiver
  - ✅ Event processing pipeline
- Integration Points:
  - State ingestion service (from previous design)
  - Backstage catalog (workspace → entity mapping)
  - Secrets management (Vault, AWS Secrets Manager)
  - Monitoring (Prometheus, Grafana)
  - Message queue (BullMQ, RabbitMQ)
- Testing Strategy:
  - Unit tests for API client
  - Integration tests with TFC sandbox
  - Load testing for rate limiter
  - End-to-end tests for discovery flow
  - Chaos testing for error handling
- Security Considerations:
  - Credential rotation automation
  - Webhook signature verification
  - Multi-tenant data isolation
  - Audit logging for all operations
  - Least-privilege API tokens
- Operational Readiness:
  - Runbook for common failures
  - Alerting rules (rate limits, auth failures)
  - Capacity planning (queue size, cache limits)
  - Backup and recovery procedures