RapidStream

Mark Schmeiser

In a microservice landscape, reliable communication and efficient debugging are essential. Among the tools available to us, correlation IDs stand out, especially in event-driven architectures.

These IDs act like a thread connecting different actions that come from the same source, whether it’s an HTTP call or a message moving through the system. In this blog post, we explore how we manage correlation IDs in our Node.js microservice environment, providing insights into our approach and its wider implications.

The Importance of Correlation IDs

In an event-driven architecture, a correlation ID marks the path of each request as it moves through various services. By assigning a unique identifier to every transaction, we let developers follow a request’s journey across the system. Whether examining logs or dissecting event chains, the correlation ID supplies the context that ties the individual entries together, which improves observability and shortens troubleshooting.

Implementation in Node.js

Our strategy for incorporating correlation IDs into our Node.js setup involves careful integration at every stage of the transaction lifecycle:

1. HTTP Requests

To smoothly insert correlation IDs into HTTP requests, we’ve designed an Express middleware based upon cls-rtracer. This middleware intercepts incoming requests, checks for a correlation ID header, and assigns a UUIDv4 if one isn’t present. It then propagates this correlation ID in the response header, ensuring continuity throughout the request-response cycle.

import { expressMiddleware } from 'cls-rtracer'
import { v4 as uuidv4 } from 'uuid'

type HandlerArgs = {
  correlationIdHeaderName?: string
}

const requestCorrelationIdMiddleware = (args?: HandlerArgs) => {
  const correlationIdHeaderName = args?.correlationIdHeaderName || 'x-correlation-id'

  return expressMiddleware({
    // Respect request header flag (default: false).
    // If set to true, the middleware/plugin will always use a value from
    // the specified header (if the value is present).
    useHeader: true,
    // Add request id to response header (default: false).
    // If set to true, the middleware/plugin will add request id to the specified
    // header. Use headerName option to specify header name.
    echoHeader: true,
    // Request/response header name, case-insensitive (default: 'X-Request-Id').
    // Used if useHeader/echoHeader is set to true.
    headerName: correlationIdHeaderName,
    // A custom function to generate your request ids (default: UUID v1).
    requestIdFactory: uuidv4,
  })
}

export default requestCorrelationIdMiddleware
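
To make the header handling concrete, here is a minimal, dependency-free sketch of the decision made for each request as described above: reuse the incoming ID if one is present, otherwise generate a UUIDv4. The names resolveCorrelationId and IncomingHeaders are hypothetical and not part of cls-rtracer; the sketch assumes header names arrive lowercased, as they do on Node.js’s incoming HTTP requests.

```typescript
import { randomUUID } from 'node:crypto'

// Hypothetical shape of incoming headers (keys lowercased, as Node.js delivers them).
type IncomingHeaders = Record<string, string | string[] | undefined>

// Reuse the caller's correlation ID if present, otherwise create a new UUIDv4.
export const resolveCorrelationId = (
  headers: IncomingHeaders,
  headerName = 'x-correlation-id',
): string => {
  const raw = headers[headerName.toLowerCase()]
  const incoming = Array.isArray(raw) ? raw[0] : raw
  return incoming !== undefined && incoming.trim() !== '' ? incoming : randomUUID()
}

// An upstream service already set the header: the ID is kept as-is.
const reused = resolveCorrelationId({ 'x-correlation-id': 'order-42' })

// No header present: a fresh UUIDv4 is generated.
const generated = resolveCorrelationId({})
```

Keeping an ID that a caller already supplied is what lets one identifier span several services instead of restarting at every hop.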

2. Logging Mechanism

Logging is crucial for system monitoring and diagnostics. Using Winston, a versatile logging library, we’ve developed a custom log format that integrates with cls-rtracer, which uses Node.js’s AsyncLocalStorage API to keep the correlation ID available throughout a request’s asynchronous call chain. Because the format embeds the correlation ID directly into every log statement, all entries that belong to one request can be found with a single search.

import { runWithId } from 'cls-rtracer'
import type { Job } from 'bullmq'

// Outside of HTTP requests, the correlation id has to be put
// into the context explicitly before anything is logged.

// for BullMQ job consumption
const consumeJob = async (job: Job) => {
  // make sure you set the correlation id into all jobs
  const correlationId = job.data?.correlationId

  try {
    await runWithId(async () => {
      // your normal job execution - here all log output
      // will be enriched with the correlation id
      // ...
    }, correlationId)
  } finally {
    // here we normally do some performance logging
    // and put it to Prometheus metrics
    // ...
  }
}

// for RabbitMQ it is very similar
// (channel, queueName and noAck come from your RabbitMQ client setup)
await channel.basicConsume(queueName, { noAck }, async (message) => {
  // make sure you set the correlation id on publishing
  // rabbit supports it as message property
  // you do not need to put it into the message body
  const correlationId = message.properties.correlationId

  try {
    await runWithId(async () => {
      // your message handling
      // ...
    }, correlationId)
  } finally {
    // here we normally do some performance logging
    // and put it to Prometheus metrics
    // ...
  }
})
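
The custom log format itself is not listed in this section. As a rough illustration of the mechanism it relies on, the following sketch uses Node.js’s AsyncLocalStorage directly in place of cls-rtracer, and a plain function in place of a Winston format, so that it runs without dependencies. All names (correlationStore, addCorrelationId, runWithCorrelationId, demo) are hypothetical; in a real setup the ID would be read from cls-rtracer and the enrichment step would live inside a Winston format.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

type LogInfo = { level: string; message: string; correlationId?: string }

// Stand-in for the context that cls-rtracer manages internally.
const correlationStore = new AsyncLocalStorage<string>()

// Stand-in for a log format step: attach the ID of the current context, if any.
export const addCorrelationId = (info: LogInfo): LogInfo => {
  const correlationId = correlationStore.getStore()
  return correlationId ? { ...info, correlationId } : info
}

// Run a function inside a context that carries the given correlation ID.
export const runWithCorrelationId = <T>(id: string, fn: () => T): T =>
  correlationStore.run(id, fn)

export const demo = async (): Promise<[LogInfo, LogInfo]> => {
  const inside = await runWithCorrelationId('abc-123', async () => {
    // The context survives asynchronous boundaries such as timers and awaits.
    await new Promise<void>((resolve) => setTimeout(resolve, 1))
    return addCorrelationId({ level: 'info', message: 'job started' })
  })
  // Outside of any context there is no ID to attach.
  const outside = addCorrelationId({ level: 'info', message: 'no context' })
  return [inside, outside]
}
```

The point of this design is that the logging code never receives the ID as an argument: it looks it up from the surrounding asynchronous context, so business code stays free of tracing parameters.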

3. Message and Job Consumption

Keeping the context intact during message or job consumption is harder, because there is no incoming HTTP request for the middleware to hook into. cls-rtracer’s runWithId method closes this gap: by running message and job handlers inside a context that carries the correlation ID, we keep internal processing traceable and its log output enriched with the same ID.

import { runWithId } from 'cls-rtracer'
import type { Job } from 'bullmq'

const consumeJob = async (job: Job) => {
  // make sure you set the correlation id into all jobs
  const correlationId = job.data?.correlationId

  try {
    await runWithId(async () => {
      // your normal job execution - here all log output
      // will be enriched with the correlation id
      // ...
    }, correlationId)
  } finally {
    // here we normally do some performance logging
    // and put it to Prometheus metrics
    // ...
  }
}
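
The consumer above can only restore the context if the producer stored the ID in the first place. Here is a minimal sketch of that producer-side step, assuming the job payload is a plain object. The helper name stampCorrelationId is hypothetical, and in a service using cls-rtracer the currentId argument would typically be the value returned by its id() function at the time the job is created.

```typescript
import { randomUUID } from 'node:crypto'

type WithCorrelationId<T> = T & { correlationId: string }

// Attach the current correlation ID to a job payload before it is enqueued.
// If the job is created outside of any request context, start a new ID.
export const stampCorrelationId = <T extends object>(
  payload: T,
  currentId: string | undefined,
): WithCorrelationId<T> => ({
  ...payload,
  correlationId: currentId ?? randomUUID(),
})

// Created while handling a request: the request's ID travels with the job.
const jobData = stampCorrelationId({ orderId: 1337 }, 'order-42')

// Created by a background task without a context: a new ID is started.
const detachedJobData = stampCorrelationId({ orderId: 1338 }, undefined)
```

The stamped object is what would be handed to the queue as job data. For RabbitMQ, the ID would go into the message’s correlationId property instead, so the message body can stay untouched.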

Data privacy regulations make it essential to keep sensitive information out of logs. To align with GDPR guidelines, we’ve implemented a format called cleanSensibleData within our centralized logging format. It masks personally identifiable information such as IP addresses and the local part of email addresses, reducing the risk of inadvertently exposing data in logs.

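import winston from 'winston'

// with these regular expressions we filter out sensitive data
// you can also add checks for access tokens or other secrets
// keep in mind: the JSON.parse + JSON.stringify combination is expensive
const cleanSensibleData = winston.format((info) => {
  try {
    const cleaned = JSON.parse(
      JSON.stringify(info)
        .replace(REGEX_IP_ADDRESS, 'xxx.xxx.xxx.xxx')
        .replace(REGEX_EMAIL, 'xxx@$<domain>'),
    )
    // JSON.stringify skips symbol-keyed properties, which winston sets on
    // info internally, so copy the cleaned fields back instead of replacing info
    return Object.assign(info, cleaned)
  } catch (e) {
    // if serialization fails, pass the entry through unchanged
    return info
  }
})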

Here are our regular expressions, in case you are interested.

export const REGEX_IP_ADDRESS = /(\d{1,3}\.){3}\d{1,3}/g

export const REGEX_EMAIL =
  /(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?<domain>(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]))/g
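
To show the stringify, replace, parse approach end to end without Winston, here is a small self-contained sketch. It reuses the IPv4 pattern from above but swaps in a deliberately simplified, hypothetical email pattern (SIMPLE_EMAIL_PATTERN) with a numbered capture group for the domain; the function name redact is also an assumption made for this example.

```typescript
// Same IPv4 pattern as above.
const IP_PATTERN = /(\d{1,3}\.){3}\d{1,3}/g

// Simplified stand-in for the full email pattern; group 1 captures the domain.
const SIMPLE_EMAIL_PATTERN = /[a-z0-9._%+-]+@([a-z0-9-]+(?:\.[a-z0-9-]+)+)/gi

// Mask IP addresses and the local part of email addresses anywhere in a value,
// including nested objects, by working on its JSON representation.
export const redact = <T>(value: T): T => {
  try {
    return JSON.parse(
      JSON.stringify(value)
        .replace(IP_PATTERN, 'xxx.xxx.xxx.xxx')
        .replace(SIMPLE_EMAIL_PATTERN, 'xxx@$1'),
    ) as T
  } catch {
    // Values that cannot be serialized are returned unchanged.
    return value
  }
}

const cleaned = redact({
  message: 'login from 192.168.1.20',
  user: { email: 'jane.doe@example.com' },
})
// cleaned.message    -> 'login from xxx.xxx.xxx.xxx'
// cleaned.user.email -> 'xxx@example.com'
```

Note that the IPv4 pattern matches any four dot-separated groups of digits, so a four-part version string such as 10.2.3.4 would be masked as well.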

Conclusion

In a microservice architecture, managing correlation IDs well is a key factor for smooth communication and effective debugging. Through careful integration into our Node.js ecosystem, we’ve improved the visibility and traceability of transactions, so developers can navigate our distributed system with confidence. As we continue to refine our practices, we remain committed to sharing our insights and supporting fellow developers. If you need guidance on implementing similar strategies in your Node.js setup or navigating GDPR compliance challenges, don’t hesitate to reach out.