Skip to content

Limit maximum request size#15

Open
michaJlS wants to merge 1 commit into
masterfrom
resolving-too-large-request-issue
Open

Limit maximum request size#15
michaJlS wants to merge 1 commit into
masterfrom
resolving-too-large-request-issue

Conversation

@michaJlS

@michaJlS michaJlS commented Mar 4, 2019

Copy link
Copy Markdown
Contributor

We've run into issue when the single request consisting of several records exceeded AWS imposed request size limit, which is 5MB. To overcome that issue we can be dividing buffered records into several requests instead of sending just single one.

My approach won't help if a single record exceeds AWS limits.

I am making as well the buffer size configurable for the client.

- sending several requests instead of single if the messages exceed AWS
  max size limit
@michaJlS michaJlS requested a review from kcaipq March 4, 2019 16:05
@@ -21,11 +21,15 @@ object KinesisGraphStage {
val ProvisionedThroughputExceededExceptionCode = "ProvisionedThroughputExceededException"

private[KinesisGraphStage] val maxBufferSize = 500 // hard limit imposed by AWS

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call number of bytes and number of records "size" - that can be confusing, but nothing better has came to my mind yet.

* The trick is that it then puts any failed items back into the buffer
*/
class KinesisGraphStage[A : ToPutRecordsRequest](putRecords: PutRecords, streamName: String)
class KinesisGraphStage[A : ToPutRecordsRequest](putRecords: PutRecords, streamName: String, sendingThreshold: Int)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made sendingThreshold. Not sure if I should expose a setter method for it as well.

val zero = 0 -> List.empty[PutRecordsRequestEntry] -> List.empty[List[PutRecordsRequestEntry]]
entries.foldLeft(zero) { case (((total, newList), lists), entry) =>
val size = entry.getData.limit() // an approximation, as in the request it will have different size
if (total + size > maxRequestSize)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so here is a caveat. Size of string != size of record. We may be adding some metadata later to the message, so it can be bigger in the end? Then we shouldn't pack up to maxRequestSize of bytes, and instead like 90% of maxRequestSize? Should I just hardcode it or make it configurable?

/**
* Split
*/
private def splitBySize(entries: List[PutRecordsRequestEntry]): List[List[PutRecordsRequestEntry]] = {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will work if and only if each record size < maxRequestSize. If not, then our request will be rejected by aws anyway.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you meant to say the request size < maxRequestSize

Comment thread src/main/scala/com/timeout/KinesisGraphStage.scala
private def splitBySize(entries: List[PutRecordsRequestEntry]): List[List[PutRecordsRequestEntry]] = {
val zero = 0 -> List.empty[PutRecordsRequestEntry] -> List.empty[List[PutRecordsRequestEntry]]
entries.foldLeft(zero) { case (((total, newList), lists), entry) =>
val size = entry.getData.limit() // an approximation, as in the request it will have different size

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also need a check for 1MB per record

val zero = 0 -> List.empty[PutRecordsRequestEntry] -> List.empty[List[PutRecordsRequestEntry]]
entries.foldLeft(zero) { case (((total, newList), lists), entry) =>
val size = entry.getData.limit() // an approximation, as in the request it will have different size
if (total + size > maxRequestSize)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can turn total + size into a variable, so no need to evaluate twice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants