Public cloud — Analytics
Managed Kafka for event streaming. OpenSearch and ClickHouse for storage and query. Managed dashboards and a logs platform on top. Hosted on intSignal infrastructure — and connected to your data wherever it lives.
Event Streaming + Observability & Analytics under one on-call team.
Stream from and to AWS, Azure, GCP, or on-premises sources.
TLS on every connection, AES-256 at rest.
Access logs, retention controls, evidence cadence.
The data flow
Analytics isn't one product — it's five stages, and every stage has gotchas. We operate all five so you debug your data, not your data pipeline.
STAGE 01
Pull data in from applications, hyperscaler services, change-data-capture, and external systems.
STAGE 02
Buffer, replicate, and route events with durability and per-partition ordering guarantees. Replicate across regions.
STAGE 03
Index for search, store columnar for OLAP, tier logs by retention. Pick the engine that fits the query.
STAGE 04
Full-text search, sub-second OLAP, log search across retention tiers. Standard interfaces, no lock-in.
STAGE 05
Managed dashboards across all data sources. Embed in your app or hand them to stakeholders.
Sub-platform 01 · Event Streaming
Managed Apache Kafka with the surrounding ecosystem: Connect for ingest and egress, MirrorMaker for cross-cluster replication, schema registry, and TLS/SASL everywhere. Same operating model whether your producers live on intSignal or in a hyperscaler.
EVENT BUS
Multi-broker cluster with rack awareness, replication factor tuned per topic, and schema registry. Producers and consumers connect with standard Kafka clients.
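How a per-topic replication factor translates into durability depends on two Kafka settings working together: the producer's `acks=all` and the topic's `min.insync.replicas`. The sketch below is a plain-Java model of that rule, not Kafka client code. The class and method names are illustrative, and the example values (replication factor 3, `min.insync.replicas` 2) are assumptions rather than intSignal defaults.

```java
/** Toy model of Kafka's acks=all write rule; names are illustrative only. */
final class DurabilitySketch {

    /** With acks=all, a write is accepted only while enough replicas are in sync. */
    static boolean writeAccepted(int inSyncReplicas, int minInSyncReplicas) {
        return inSyncReplicas >= minInSyncReplicas;
    }

    /** Broker failures a topic can absorb while still accepting acks=all writes. */
    static int toleratedFailures(int replicationFactor, int minInSyncReplicas) {
        return Math.max(0, replicationFactor - minInSyncReplicas);
    }

    public static void main(String[] args) {
        int replicationFactor = 3;  // assumed example value
        int minInSyncReplicas = 2;  // assumed example value
        System.out.println("tolerated failures: "
                + toleratedFailures(replicationFactor, minInSyncReplicas));
        System.out.println("accepted with 2 in sync: " + writeAccepted(2, minInSyncReplicas));
        System.out.println("accepted with 1 in sync: " + writeAccepted(1, minInSyncReplicas));
    }
}
```

Under these assumed values, one broker can fail without blocking producers; losing a second makes `acks=all` writes fail rather than silently reducing durability.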
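java · producer.java
// Standard Kafka client. TLS + SASL.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import io.confluent.kafka.serializers.KafkaAvroSerializer;

Properties props = new Properties();
props.put("bootstrap.servers", "kafka.intsignal.io:9093");
props.put("security.protocol", "SASL_SSL");
props.put("sasl.mechanism", "SCRAM-SHA-512");
// SCRAM also needs credentials; replace the placeholders with your own
props.put("sasl.jaas.config",
    "org.apache.kafka.common.security.scram.ScramLoginModule required "
        + "username=\"<user>\" password=\"<password>\";");
props.put("acks", "all"); // wait for all in-sync replicas
props.put("key.serializer", StringSerializer.class.getName());
props.put("value.serializer", KafkaAvroSerializer.class.getName());
props.put("schema.registry.url", "https://sr.intsignal.io");
try (var producer = new KafkaProducer<String, Order>(props)) {
    producer.send(new ProducerRecord<>("orders", order.id(), order));
}
// → acked by all in-sync replicas, schema validated
INGEST · EGRESS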
Pull data from databases (PostgreSQL CDC, MySQL binlog), object storage, hyperscaler queues, or external APIs. Push to sinks like ClickHouse, OpenSearch, or S3. We run connectors, version them, and monitor them.
json · postgres-source.json
// PostgreSQL CDC → Kafka topic (Debezium)
{
  "name": "orders-cdc",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db.intsignal.io",
    "database.dbname": "orders",
    "plugin.name": "pgoutput",
    "slot.name": "kafka_orders",
    "table.include.list": "public.orders,public.line_items",
    "topic.prefix": "orders"
  }
}
// → orders.public.orders, orders.public.line_items
REPLICATION
Replicate topics across clusters, regions, or clouds. Disaster-recovery clusters, multi-region active-active, or migration from an existing Kafka deployment — same tool, different shapes.
properties · mm2.properties
# Replicate from on-prem Kafka to intSignal
clusters = onprem, intsignal
onprem.bootstrap.servers = kafka.onprem.lan:9093
intsignal.bootstrap.servers = kafka.intsignal.io:9093
# Topics to replicate, regex supported
onprem->intsignal.enabled = true
onprem->intsignal.topics = orders.*, payments.*, audit.*
# Translate consumer-group offsets across clusters
sync.group.offsets.enabled = true
emit.checkpoints.enabled = true
# → ready for failover or migration
Cross-cloud · cross-region
Most analytics workloads aren't greenfield. The data is already somewhere — in an existing Kafka, in an RDS instance, in Cloud Pub/Sub, in Azure Event Hubs, in object storage logs. We connect to it.
Kafka Connect provides hundreds of source and sink connectors. MirrorMaker handles cluster-to-cluster replication for HA, migration, or active-active topologies across intSignal-hosted and hyperscaler-resident Kafka.
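To make the replication flow concrete, the sketch below models two MirrorMaker 2 behaviours in plain Java: selecting source topics by regex, and naming the replicated topic `<source alias>.<topic>` as MirrorMaker's default replication policy does. It is an illustration, not MirrorMaker code. The class name, the sample topic list, and the assumption that each pattern must match the whole topic name are ours.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Illustrative model of MirrorMaker 2 topic selection and remote-topic naming. */
final class MirrorTopicSketch {

    /** Returns the remote topic names for every source topic matched by a pattern. */
    static List<String> remoteTopics(String sourceAlias,
                                     List<String> patterns,
                                     List<String> sourceTopics) {
        List<Pattern> compiled = patterns.stream()
                .map(Pattern::compile)
                .collect(Collectors.toList());
        return sourceTopics.stream()
                // assumption: a pattern must match the full topic name
                .filter(t -> compiled.stream().anyMatch(p -> p.matcher(t).matches()))
                // default policy prefixes the source cluster alias
                .map(t -> sourceAlias + "." + t)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> patterns = List.of("orders.*", "payments.*", "audit.*");
        // hypothetical topics on the source cluster
        List<String> topics = List.of(
                "orders.public.orders", "payments", "inventory", "audit.logins");
        remoteTopics("onprem", patterns, topics).forEach(System.out::println);
    }
}
```

The alias prefix is what keeps a replicated `onprem.payments` distinct from a locally produced `payments` topic, which matters for active-active topologies where both clusters accept writes.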
Sub-platform 02 · Observability & Analytics
OpenSearch for search and log workloads. ClickHouse for sub-second analytical queries on huge tables. A logs data platform that knows about retention tiers and access policy. Managed dashboards on top of all of it.
SEARCH · INDEX
Full-text search, log analytics, and aggregations. Managed cluster with hot/warm/cold node tiers, Index State Management (ISM) lifecycle policies, and snapshot-based backups. API-compatible with Elasticsearch 7.10, so most 7.x-era clients connect unchanged.
shell · search.sh
# Error counts per service, last hour
# Assumes errors carry level=error; adjust the field to your log schema
$ curl -X POST "https://os.intsignal.io/logs-*/_search" \
    -H "Content-Type: application/json" -d '{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "@timestamp": { "gte": "now-1h" } } },
        { "term": { "level": "error" } }
      ]
    }
  },
  "aggs": {
    "errors_by_service": {
      "terms": { "field": "service.keyword", "size": 10 }
    }
  }
}'
# → top 10 services by error count, last hour
COLUMNAR OLAP
Sub-second analytical queries over billions of rows. Distributed tables, materialized views, and TTL-driven retention. Standard SQL — your analysts and BI tools already speak it.
sql · events.sql
-- Real-time aggregation over a billion rows
SELECT
    toStartOfMinute(event_time) AS minute,
    country,
    uniqExact(user_id) AS users,
    sum(amount) AS revenue
FROM events_distributed
WHERE event_time >= now() - INTERVAL 1 HOUR
  AND event_type = 'purchase'
GROUP BY minute, country
ORDER BY minute DESC, revenue DESC
LIMIT 100;
-- → Elapsed: 0.2s, 1.2B rows scanned, 14MB read
LOG PLATFORM
End-to-end log management: ingest with Fluent Bit or Vector, route into OpenSearch or object storage based on retention class, search across tiers, and apply access policies per tenant or per source.
conf · fluentbit.conf
# Ship app logs to intSignal logs platform
[INPUT]
    Name    tail
    Path    /var/log/app/*.log
    Tag     app.*
    Parser  json
[FILTER]
    Name    modify
    Match   *
    Add     environment prod
    Add     cluster prod-aws
[OUTPUT]
    Name    http
    Match   *
    Host    logs.intsignal.io
    Port    443
    TLS     on
    Header  Authorization Bearer ${LOGS_TOKEN}
VISUALIZATION
Grafana and OpenSearch Dashboards, hosted and operated. Data sources pre-wired to OpenSearch, ClickHouse, Prometheus, and your logs platform. SSO integration and per-team folder permissions.
yaml · dashboard.yaml
# Dashboards as code, version-controlled in Git
apiVersion: grafana.com/v1
kind: Dashboard
metadata:
  name: api-latency-overview
  folder: platform-team
spec:
  datasource:
    - name: "clickhouse-prod"
      type: "grafana-clickhouse-datasource"
  panels:
    - title: "p99 latency by endpoint"
      type: "timeseries"
      query: |
        SELECT toStartOfMinute(ts) AS t,
               endpoint,
               quantile(0.99)(latency_ms)
        FROM api_logs
        WHERE ts >= now() - INTERVAL 1 HOUR
        GROUP BY t, endpoint
What people build
If your project looks like one of these, the right stack is mostly already chosen. We'll walk you through it in the consultation.
REAL-TIME PRODUCT ANALYTICS
App emits clickstream events. Kafka ingests them. ClickHouse stores them columnar. Grafana shows live conversion funnels and cohort metrics that update as users click.
CENTRALIZED OBSERVABILITY
Fluent Bit ships logs from every cluster. OpenSearch indexes them with retention tiers. Dashboards expose error rates, p99 latencies, and saturation across services.
CDC INTO ANALYTICS
Kafka Connect Debezium captures PostgreSQL row changes. ClickHouse materializes them into wide analytical tables. Reports run on a copy that's seconds behind production.
SECURITY EVENT PIPELINE
Audit and access events stream into Kafka. They route to hot OpenSearch indices for active investigation and to cold storage for long-tail retention required by compliance.
MULTI-CLOUD DR
MirrorMaker replicates topics between a primary cluster on AWS and a DR cluster on intSignal. Consumer offsets are translated so failover is a configuration change, not a code change.
EMBEDDED CUSTOMER ANALYTICS
ClickHouse stores customer event data with per-tenant isolation. Grafana panels are embedded into your app via signed URLs. Customers see their own data, scoped automatically.
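One common way to build signed URLs for embedded panels is an HMAC over the tenant and an expiry time, so the embedding app can hand out links that cannot be altered to show another tenant's data. The sketch below shows that general technique using only the JDK. It is not Grafana's API or intSignal's actual scheme: the parameter names (`tenant`, `expires`, `sig`), the URL, and the secret are all hypothetical, and the tenant value is assumed to be URL-safe.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Generic HMAC-signed URL sketch; parameter names are hypothetical. */
final class SignedUrlSketch {

    /** HMAC-SHA256 of the payload, encoded as URL-safe Base64 without padding. */
    static String sign(String secret, String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }

    /** Builds a URL whose tenant and expiry are covered by the signature. */
    static String signedUrl(String secret, String baseUrl, String tenant, long expiresEpochSec) {
        String payload = "tenant=" + tenant + "&expires=" + expiresEpochSec;
        return baseUrl + "?" + payload + "&sig=" + sign(secret, payload);
    }

    /** Rejects expired links and any link whose tenant or expiry was altered. */
    static boolean verify(String secret, String tenant, long expiresEpochSec,
                          String sig, long nowEpochSec) {
        if (nowEpochSec > expiresEpochSec) {
            return false;
        }
        String expected = sign(secret, "tenant=" + tenant + "&expires=" + expiresEpochSec);
        // constant-time comparison avoids leaking how many characters matched
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                sig.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        long expires = System.currentTimeMillis() / 1000 + 300; // valid for 5 minutes
        System.out.println(
                signedUrl("demo-secret", "https://example.invalid/panel", "acme", expires));
    }
}
```

The signature only proves which tenant the link was issued for; the query layer still has to apply that tenant as a filter, for example through per-tenant row policies.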
Hardening and operating practices aligned to the frameworks your assessors recognize. intSignal is not the certified entity for most of these — we deliver the controls and evidence that make your audit possible. Where required, we partner with FedRAMP-authorized providers for federal scoping.
HARDENING
Engine hardening with documented exceptions.
SOC 2
Controls and evidence cadence ready for audit.
ISO
Cloud-services control narratives.
HIPAA
Encryption, access, audit; BAA via partner.
FEDERAL
Authorized hyperscaler regions integrated.
DATACENTER
Hosting facility carries its own attestations.
FAQ
If yours isn't here, ask in the consultation — we'd rather flag the awkward bits early than discover them in production.
On intSignal infrastructure, in our hosting facility. But the data doesn't have to live there — Kafka Connect and MirrorMaker pull from and push to your hyperscaler workloads, on-premises systems, or external SaaS, so we're rarely the only place your data sits.
Yes. Kafka Connect has source connectors for AWS MSK, Kinesis, Azure Event Hubs (via Kafka API), Google Pub/Sub, plus database CDC and object-storage watchers. MirrorMaker handles full cluster replication when you have an existing Kafka deployment.
OpenSearch is best for full-text search, log analytics, and ad-hoc exploration of semi-structured data. ClickHouse is best for high-cardinality, aggregation-heavy analytical SQL over very large tables. Many customers run both — we'll help you pick during the consultation. There's no penalty for running both since the operator is the same.
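OpenSearch was forked from Elasticsearch 7.10.2 and remains largely API-compatible with that release. Clients, Beats, Logstash, and integrations built for Elasticsearch 7.10 generally work without changes. Newer Elasticsearch client releases check that the server is Elasticsearch and may refuse to connect, so pin client versions or switch to the OpenSearch clients. If you're on Elasticsearch 8 or rely on X-Pack features, we'll review the migration path with you.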
Several layers. Kafka topics support ACLs per principal. OpenSearch document-level security and ClickHouse row policies enforce tenant scoping at query time. Dashboards apply additional scope via signed URLs or SSO group membership. We design the isolation model with you during onboarding.
The logs platform supports hot / warm / cold / frozen tiers with automatic transitions. Hot for active investigation (full search, fast), cold and frozen for compliance retention (searchable, slower, cheaper). You set the policy per log source; we operate the transitions.
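A per-source retention policy of this kind boils down to mapping a log's age onto a tier. The sketch below shows that mapping in plain Java. The thresholds (7, 30, 90, and 365 days), the `EXPIRED` state after the frozen tier, and all names are assumptions for illustration, not the platform's actual defaults or API.

```java
/** Illustrative age-to-tier mapping; thresholds and names are hypothetical. */
final class RetentionTierSketch {

    enum Tier { HOT, WARM, COLD, FROZEN, EXPIRED }

    /**
     * Each threshold is the age in days at which data leaves that tier.
     * Thresholds are expected in ascending order.
     */
    static Tier tierFor(long ageDays, long hotDays, long warmDays,
                        long coldDays, long frozenDays) {
        if (ageDays < hotDays) return Tier.HOT;
        if (ageDays < warmDays) return Tier.WARM;
        if (ageDays < coldDays) return Tier.COLD;
        if (ageDays < frozenDays) return Tier.FROZEN;
        return Tier.EXPIRED; // assumption: data past the last tier is deleted
    }

    public static void main(String[] args) {
        long[] sampleAges = {3, 7, 45, 200, 400};
        for (long age : sampleAges) {
            // assumed policy: hot 7d, warm 30d, cold 90d, frozen 365d
            System.out.println(age + " days -> " + tierFor(age, 7, 30, 90, 365));
        }
    }
}
```

A compliance-heavy source might stretch the frozen threshold to several years while keeping the hot window short, since only the hot tier carries the cost of fast full search.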
Yes. The managed dashboards are an option, not a requirement. OpenSearch, ClickHouse, and the logs platform all expose standard APIs. Bring Metabase, Superset, Tableau, Power BI, Looker, or anything else that speaks HTTP or SQL.
Tell us where your data lives today, what you want to know about it, and your retention and compliance constraints. We'll propose a stack and walk you through the operating model.