Optimizing Large Payloads for Sync/Async Communications

Arvind Kumar
5 min read · Jan 11, 2025


When microservices exchange data over the network, there are two broad ways to do it — Sync and Async.

Synchronous Communication: In synchronous communication, the sender waits for the receiver to process the message and respond, creating a real-time, blocking interaction. Examples include HTTP REST APIs or RPCs.

Asynchronous Communication: In asynchronous communication, the sender sends the message and continues processing without waiting for an immediate response. Examples include message queues (Kafka, RabbitMQ) or event-driven architectures.


Let’s dive deeper into both of these strategies →

Sync Communications

In Spring Boot (latest versions), sending a large payload as a response can be optimized using several strategies, depending on the nature of the data, its size, and the client-server requirements. Here’s a detailed guide:

1. Use HTTP Compression

  • Enable GZIP compression to reduce the payload size.
  • Configure application.yml or application.properties to enable compression:
server:
  compression:
    enabled: true
    min-response-size: 1024  # Minimum response size to trigger compression (in bytes)
    mime-types: application/json,application/xml,text/html,text/plain  # MIME types to compress

2. Paginate Data

  • For large datasets (e.g., lists), avoid sending the entire payload at once. Use pagination to send data in chunks.
  • Use Spring’s Pageable with @RequestParam:
@GetMapping("/data")
public Page<MyEntity> getData(Pageable pageable) {
return myEntityRepository.findAll(pageable);
}
  • Example URL: /data?page=0&size=50

3. Stream Data

  • Stream large data sets directly to the client to minimize memory usage.
  • Use Spring WebFlux for reactive streaming, or a traditional ResponseEntity combined with StreamingResponseBody.
  • Example with StreamingResponseBody:
@GetMapping("/stream-data")
public ResponseEntity<StreamingResponseBody> streamLargeData() {
StreamingResponseBody responseBody = outputStream -> {
List<MyEntity> data = fetchLargeData(); // Fetch data in batches
for (MyEntity entity : data) {
outputStream.write(entity.toString().getBytes());
outputStream.flush();
}
};
return ResponseEntity.ok()
.contentType(MediaType.APPLICATION_JSON)
.body(responseBody);
}
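
The fetchLargeData() call above is just a placeholder. One way to actually fetch in batches without holding everything in memory is to stream from the repository — a minimal sketch, assuming Spring Data JPA; MyEntityRepository, streamAll, and writeAll are names made up for illustration:

import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.stream.Stream;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.transaction.annotation.Transactional;

public interface MyEntityRepository extends JpaRepository<MyEntity, Long> {

    // Stream rows from the database instead of materializing the whole result list
    @Query("select e from MyEntity e")
    Stream<MyEntity> streamAll();
}

// In a service: keep the stream open inside a read-only transaction while writing
@Transactional(readOnly = true)
public void writeAll(OutputStream outputStream) throws IOException {
    try (Stream<MyEntity> entities = myEntityRepository.streamAll()) {
        Iterator<MyEntity> it = entities.iterator();
        while (it.hasNext()) {
            outputStream.write((it.next().toString() + "\n").getBytes(StandardCharsets.UTF_8));
        }
    }
}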

4. Use Chunked Transfer Encoding

  • For HTTP 1.1, enable chunked transfer to send data in chunks instead of buffering the entire response in memory.
  • Example using ResponseBodyEmitter:
@GetMapping("/chunked-data")
public ResponseBodyEmitter getChunkedData() {
ResponseBodyEmitter emitter = new ResponseBodyEmitter();
new Thread(() -> {
try {
List<MyEntity> data = fetchLargeData();
for (MyEntity entity : data) {
emitter.send(entity);
}
emitter.complete();
} catch (Exception e) {
emitter.completeWithError(e);
}
}).start();
return emitter;
}

5. Use File Downloads

  • If the payload is extremely large, consider saving it as a file (e.g., CSV, JSON) on the server and provide a download link.
  • Example:
@GetMapping("/download")
public ResponseEntity<Resource> downloadFile() {
File file = generateLargeFile(); // Generate a file containing the large payload
InputStreamResource resource = new InputStreamResource(new FileInputStream(file));
return ResponseEntity.ok()
.contentType(MediaType.APPLICATION_OCTET_STREAM)
.header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=large-data.json")
.body(resource);
}

6. Asynchronous Processing

  • Use @Async methods or Spring WebFlux to handle large responses without blocking the request-handling thread (a sketch of the @Async option follows the WebFlux example below).
  • For WebFlux:
@GetMapping(value = "/large-data", produces = MediaType.APPLICATION_NDJSON_VALUE)
public Flux<MyEntity> getLargeData() {
return myEntityService.fetchLargeData(); // Returns a Flux stream
}
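
For the @Async option with plain Spring MVC, a minimal sketch could look like this — assuming @EnableAsync is configured, and with the service method and names below made up for illustration:

// In a service bean
@Async
public CompletableFuture<List<MyEntity>> fetchLargeDataAsync() {
    // Runs on the async executor instead of the request thread
    return CompletableFuture.completedFuture(fetchLargeData());
}

// In the controller
@GetMapping("/large-data-async")
public CompletableFuture<List<MyEntity>> getLargeDataAsync() {
    // Spring MVC completes the response when the future completes, freeing the servlet thread
    return myEntityService.fetchLargeDataAsync();
}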

7. Optimize Serialization

  • Use efficient serialization: Jackson or Gson for JSON, or a compact binary format such as Protobuf, for large payloads.
  • Configure Jackson for better performance:
@Bean
public ObjectMapper objectMapper() {
    ObjectMapper mapper = new ObjectMapper();
    mapper.configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false);
    mapper.setSerializationInclusion(JsonInclude.Include.NON_NULL);
    return mapper;
}

8. Use Content Negotiation

  • Support multiple response formats (e.g., JSON, XML, or binary) to allow clients to request the most efficient format.
  • Enable Spring’s ContentNegotiation:
@Configuration
public class WebConfig implements WebMvcConfigurer {

    @Override
    public void configureContentNegotiation(ContentNegotiationConfigurer configurer) {
        // Path-extension negotiation is deprecated in recent Spring versions; prefer a query parameter
        configurer.favorParameter(true)
                .parameterName("format")          // e.g. /data?format=xml
                .ignoreAcceptHeader(false)
                .defaultContentType(MediaType.APPLICATION_JSON)
                .mediaType("xml", MediaType.APPLICATION_XML);
    }
}

Key Considerations:

  1. Memory Usage: Ensure the solution doesn’t cause OutOfMemoryError for large payloads.
  2. Timeouts: Large responses might exceed client/server timeouts; tune configurations accordingly.
  3. Security: Always validate and sanitize large responses to avoid overexposure of sensitive data.
  4. Client Compatibility: Ensure the client application can handle the response format and size.

Async Communications

Sending large payloads over Kafka/RabbitMQ as events requires careful consideration to ensure efficient delivery, reliability, and scalability. Kafka is designed for high throughput and low latency, but large messages can strain brokers, producers, and consumers. Here are optimized strategies:

1. Split Large Payloads into Chunks

Strategy: Divide the payload into smaller chunks, send them as individual events, and reassemble them at the consumer end.

Implementation:

  • Add metadata (e.g., chunkId, totalChunks, messageId) to identify and reconstruct the message.
  • Example payload structure:
{
  "messageId": "1234",
  "chunkId": 1,
  "totalChunks": 5,
  "data": "chunk data here..."
}

Producer:

for (int i = 0; i < chunks.size(); i++) {
    kafkaTemplate.send("large-topic", new ChunkPayload(messageId, i, totalChunks, chunks.get(i)));
}

Consumer:

  • Accumulate chunks until all parts are received.
  • Reassemble the message in the correct order.
@KafkaListener(topics = "large-topic")
public void consumeChunk(ChunkPayload payload) {
// Store chunks and reassemble
chunkStore.addChunk(payload);
}
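
The chunkStore above is a hypothetical helper. A minimal in-memory sketch of the reassembly logic might look like this — it assumes the producer’s 0-based chunk ids and getter names matching the payload fields; a real implementation would also need timeouts/eviction for incomplete messages:

import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ChunkStore {

    // Chunks received so far, keyed by messageId and then chunkId
    private final Map<String, Map<Integer, String>> chunks = new ConcurrentHashMap<>();

    public Optional<String> addChunk(ChunkPayload payload) {
        Map<Integer, String> parts =
                chunks.computeIfAbsent(payload.getMessageId(), id -> new ConcurrentHashMap<>());
        parts.put(payload.getChunkId(), payload.getData());

        if (parts.size() < payload.getTotalChunks()) {
            return Optional.empty(); // still waiting for more chunks
        }

        // All chunks received: stitch them together in order and clean up
        String assembled = IntStream.range(0, payload.getTotalChunks())
                .mapToObj(parts::get)
                .collect(Collectors.joining());
        chunks.remove(payload.getMessageId());
        return Optional.of(assembled);
    }
}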

2. Compress the Payload

Strategy: Compress the payload before sending to reduce message size.

Implementation:

Use compression libraries like GZIP, Snappy, or Zstandard.

  • Compress data in the producer:
String payload = largePayload.toString();
byte[] compressedData = compress(payload);
kafkaTemplate.send("large-topic", compressedData);
  • Decompress data in the consumer:
@KafkaListener(topics = "large-topic")
public void consume(byte[] compressedData) {
    String decompressedPayload = decompress(compressedData);
}
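
The compress and decompress helpers used above are not shown in the snippets; a minimal GZIP-based sketch could be:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipUtil {

    // Compress a String payload into GZIP-encoded bytes
    public static byte[] compress(String payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Decompress GZIP-encoded bytes back into the original String
    public static String decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}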

Kafka Configuration:

  • Enable compression for Kafka producers:
compression.type=gzip  # or snappy, lz4, zstd

3. Store the Payload Externally

Strategy: Store the large payload in an external storage system (e.g., S3, HDFS, or database) and send only a reference (e.g., URL or key) via Kafka.

Implementation:

  • Producer: Store the payload externally and publish the reference (e.g., URL or file key) as the Kafka message.
String fileReference = externalStorageService.store(largePayload);
kafkaTemplate.send("large-topic", fileReference);
  • Consumer: Retrieve the data from external storage using the reference.
@KafkaListener(topics = "large-topic")
public void consume(String fileReference) {
    String largePayload = externalStorageService.retrieve(fileReference);
}
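
The externalStorageService in these snippets is hypothetical; a minimal sketch backed by S3 (AWS SDK v2) might look like this — the bucket name and key scheme are assumptions:

import java.util.UUID;
import org.springframework.stereotype.Service;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

@Service
public class ExternalStorageService {

    private final S3Client s3 = S3Client.create();
    private final String bucket = "large-payloads"; // assumed bucket name

    // Store the payload and return the key that will be published to Kafka
    public String store(String largePayload) {
        String key = UUID.randomUUID().toString();
        s3.putObject(PutObjectRequest.builder().bucket(bucket).key(key).build(),
                RequestBody.fromString(largePayload));
        return key;
    }

    // Retrieve the payload using the key received from Kafka
    public String retrieve(String key) {
        return s3.getObjectAsBytes(GetObjectRequest.builder().bucket(bucket).key(key).build())
                .asUtf8String();
    }
}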

4. Optimize Kafka Broker Configuration

Increase Kafka broker, producer, and consumer limits to handle larger payloads:

  • Broker (message size limits):
message.max.bytes=10485760  # 10MB (default: ~1MB)
replica.fetch.max.bytes=10485760
  • Producer:
max.request.size=10485760
  • Consumer:
max.partition.fetch.bytes=10485760

Note: Avoid very large messages as they may degrade Kafka’s performance.

5. Batch Events

Strategy: Break the payload into smaller records and send them as a batch. Consumers process them together.

Implementation:

  • Producer sends multiple small records and lets the Kafka producer batch them (per batch.size / linger.ms):
List<String> records = breakIntoSmallerRecords(largePayload);
for (String record : records) {
    kafkaTemplate.send("large-topic", record);
}
  • Consumer processes the batch:
@KafkaListener(topics = "large-topic", containerFactory = "batchFactory")
public void consumeBatch(List<String> records) {
    // Process all records
    process(records);
}
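
The batchFactory referenced in the listener is a container factory configured for batch consumption; a sketch with spring-kafka (the consumerFactory wiring is assumed to exist elsewhere) could be:

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> batchFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setBatchListener(true); // deliver records to the listener as a List
    return factory;
}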

6. Use Avro/Protobuf for Serialization

Strategy: Use compact serialization formats to reduce payload size.

Implementation:

  • Define the schema (e.g., Avro or Protobuf).
  • Serialize the payload before sending and deserialize it at the consumer.
  • Example (Avro):
SpecificRecord record = new MyAvroRecord(largePayload);
byte[] serializedData = serializeAvro(record);
kafkaTemplate.send("large-topic", serializedData);

7. Leverage Kafka Streams for Real-time Processing

Strategy: Use Kafka Streams to process and reconstruct large payloads dynamically.

Implementation:

  • Split and process payloads in a distributed fashion.
  • Use a KStream to aggregate chunks.
KStream<String, ChunkPayload> stream = streamsBuilder.stream("large-topic");
stream.groupByKey()
      .aggregate(
          () -> new ArrayList<>(),
          (key, value, aggregate) -> {
              aggregate.add(value);
              return aggregate;
          },
          Materialized.with(Serdes.String(), new JsonSerde<>(ArrayList.class))
      );

8. Use Kafka Headers for Metadata

  • Strategy: Include metadata in Kafka headers for chunk tracking or compression details.
  • Implementation:
ProducerRecord<String, String> record = new ProducerRecord<>("large-topic", payload);
record.headers().add("isCompressed", "true".getBytes());
kafkaTemplate.send(record);
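
On the consumer side, the header can be read with spring-kafka’s @Header annotation; a brief sketch (the decompression step is assumed to use the helpers shown earlier):

@KafkaListener(topics = "large-topic")
public void consume(String payload,
                    @Header(value = "isCompressed", required = false) byte[] isCompressed) {
    boolean compressed = isCompressed != null && "true".equals(new String(isCompressed));
    // If the flag is set, decompress the payload before processing it
}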

9. Error Handling and Retry

  • Use Kafka’s retry mechanism for fault-tolerant delivery.
  • Tune consumer polling and retry-related properties, for example:
retry.backoff.ms=500
max.poll.records=10
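
Beyond these properties, spring-kafka’s DefaultErrorHandler gives explicit control over retries; a brief sketch, where the backoff values are arbitrary examples:

@Bean
public DefaultErrorHandler errorHandler() {
    // Retry a failed record 3 more times, 500 ms apart, before giving up
    // (optionally combine with a DeadLetterPublishingRecoverer to route failures to a dead-letter topic)
    return new DefaultErrorHandler(new FixedBackOff(500L, 3));
}

Register it on the listener container factory with factory.setCommonErrorHandler(errorHandler).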

Recommendations:

  • Split + External Storage is ideal for very large payloads (>10MB).
  • Compression works best for moderately large payloads (~1–10MB).
  • Use Protobuf/Avro to reduce size and maintain compatibility.


Comment for any feedback/thoughts and clap if you liked the article.
