Increase initial buffer size for stroom.data.store.impl.fs.standard.BlockGZIPOutputFile #5316

@at055612

Description

Currently stroom.data.store.impl.fs.standard.BlockGZIPOutputFile has two underlying BlockByteArrayOutputStreams, each of which is initialised with a byte[] of 32 bytes. Each time there is insufficient space in the byte[], ByteArrayOutputStream creates a new byte[] double the size and array-copies the contents over. At most the byte[] will be the Block GZip block size (1_000_000). Most Event and Raw Event streams will hit this limit. It takes 15 array copies to get from 32 bytes to 1_000_000 bytes (32 × 2^14 = 524_288 is still too small, 32 × 2^15 = 1_048_576 is enough).
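The copy count can be checked with a small probe. This is only a sketch: it uses the plain JDK java.io.ByteArrayOutputStream as a stand-in for BlockByteArrayOutputStream (whose internals are not shown here), and CapacityProbe and countReallocations are hypothetical names made up for the illustration. Because the stand-in has no block-size cap, its backing array ends at 1_048_576 bytes rather than 1_000_000.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical probe that exposes the length of the JDK stream's backing array. */
final class CapacityProbe extends ByteArrayOutputStream {

    CapacityProbe(final int initialSize) {
        super(initialSize);
    }

    /** Current length of the protected backing array, not the number of bytes written. */
    int capacity() {
        return buf.length;
    }

    /**
     * Writes totalBytes one byte at a time and counts how many times
     * the backing array is replaced by a larger one.
     */
    static int countReallocations(final int initialSize, final int totalBytes) {
        final CapacityProbe out = new CapacityProbe(initialSize);
        int reallocations = 0;
        int lastCapacity = out.capacity();
        for (int i = 0; i < totalBytes; i++) {
            out.write(0);
            if (out.capacity() != lastCapacity) {
                reallocations++;
                lastCapacity = out.capacity();
            }
        }
        return reallocations;
    }

    public static void main(final String[] args) {
        // Starting at 32 bytes: prints 15
        System.out.println(countReallocations(32, 1_000_000));
        // Pre-sized to the block size: prints 0
        System.out.println(countReallocations(1_000_000, 1_000_000));
    }
}
```

Single-byte writes are used deliberately so that every growth is a plain doubling. Larger writes can jump straight to the required capacity and would give a different count.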

It would be better to initialise the BlockByteArrayOutputStreams with a more appropriate initial size. The initial size is largely dependent on stream type, so this is best done in stroom.data.store.impl.fs.standard.FsPathHelper#getOutputStream. We could have a stroom prop that is a Map<String, Integer> (streamType => initialSize).

Events and Raw Events could probably have a default prop value of 1_000_000, while Meta and Errors would need far less (see the list below).

Based on looking at the file sizes in a stroom env, the default prop values could be:

  • Raw Events - 1_000_000
  • Raw Reference - 500_000
  • Events - 1_000_000
  • Reference - 500_000
  • Records - 8_192
  • Detections - 8_192
  • Errors - 1_024
  • Meta - 1_024
  • Context - 1_024
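A minimal sketch of how the proposed Map<String, Integer> prop might be consulted, assuming the defaults listed above. The class name InitialBufferSizes, the method initialSizeFor and the 8_192 fallback for unlisted stream types are all hypothetical and not part of stroom. How the value would actually be passed through FsPathHelper#getOutputStream is left open.

```java
import java.util.Map;

/** Hypothetical holder for the proposed streamType => initialSize defaults. */
final class InitialBufferSizes {

    // Assumed fallback for stream types that have no entry in the map.
    static final int FALLBACK_SIZE = 8_192;

    // Default prop values proposed in this issue, keyed by stream type name.
    static final Map<String, Integer> DEFAULTS = Map.of(
            "Raw Events", 1_000_000,
            "Raw Reference", 500_000,
            "Events", 1_000_000,
            "Reference", 500_000,
            "Records", 8_192,
            "Detections", 8_192,
            "Errors", 1_024,
            "Meta", 1_024,
            "Context", 1_024);

    private InitialBufferSizes() {
    }

    /**
     * Returns the configured initial size for the stream type, or the
     * fallback if the type is missing or the configured value is not positive.
     */
    static int initialSizeFor(final String streamType,
                              final Map<String, Integer> configured) {
        final Integer size = configured.get(streamType);
        return (size != null && size > 0) ? size : FALLBACK_SIZE;
    }
}
```

For example, InitialBufferSizes.initialSizeFor("Meta", InitialBufferSizes.DEFAULTS) gives 1_024, while a stream type that is not in the map gets the assumed fallback of 8_192.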

And for the index buffer, which is currently created without an explicit initial size:

        this.indexBuffer = new BlockByteArrayOutputStream();

This should probably use 8_192.

Metadata

Labels

refactor: Code refactor, tidy up, dependency uplift, etc.
