How to configure DynamoDBExport READ_THROUGHPUT For On-Demand Billing Mode #176

Open
SomeKidXD opened this issue Mar 21, 2023 · 0 comments


Hi there, my team is running into an issue with this library due to how READ_THROUGHPUT is set for On-Demand tables. The code defaults to the DynamoDB default of 40,000 for On-Demand tables:

// If not specified at the table level, set a hard coded value of 40,000
jobConf.set(DynamoDBConstants.READ_THROUGHPUT,
    DynamoDBConstants.DEFAULT_CAPACITY_FOR_ON_DEMAND.toString());
jobConf.set(DynamoDBConstants.WRITE_THROUGHPUT,
    DynamoDBConstants.DEFAULT_CAPACITY_FOR_ON_DEMAND.toString());

However, we have had AWS raise the quota for our account above this value. Is there a way to make the library use the higher READ_THROUGHPUT?

One possible workaround I found is to set readRatio to a value greater than 1:

jobConf.set(DynamoDBConstants.THROUGHPUT_READ_PERCENT, readRatio.toString());
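
To make the workaround concrete, here is a minimal sketch of how the ratio could be derived from a target throughput. The class, the `readRatioFor` helper, and the 100,000 quota are hypothetical and only for illustration; the 40,000 constant is assumed to match DynamoDBConstants.DEFAULT_CAPACITY_FOR_ON_DEMAND as described above.

```java
public class ReadRatioForQuota {
    // Assumption: same value as DynamoDBConstants.DEFAULT_CAPACITY_FOR_ON_DEMAND.
    static final long DEFAULT_ON_DEMAND_CAPACITY = 40_000L;

    // Hypothetical helper (not part of the library): the ratio that would scale
    // the hard-coded default up to the desired read throughput.
    static double readRatioFor(long targetReadThroughput) {
        if (targetReadThroughput < 1) {
            throw new IllegalArgumentException(
                "Target read throughput must be at least 1: " + targetReadThroughput);
        }
        return (double) targetReadThroughput / DEFAULT_ON_DEMAND_CAPACITY;
    }

    public static void main(String[] args) {
        long raisedQuota = 100_000L; // hypothetical account quota, for illustration only
        System.out.println(readRatioFor(raisedQuota)); // prints 2.5
    }
}
```

The resulting value is what would be passed as readRatio in the call above.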

I believe the library allows this: getSplits only rejects a read percentage of 0 or less, then multiplies READ_THROUGHPUT by it, so the resulting throughput looks correct:

public InputSplit[] getSplits(JobConf conf, int desiredSplits) throws IOException {
  double readPercentage = Double.parseDouble(conf.get(DynamoDBConstants
      .THROUGHPUT_READ_PERCENT, DynamoDBConstants.DEFAULT_THROUGHPUT_PERCENTAGE));
  if (readPercentage <= 0) {
    throw new RuntimeException("Invalid read percentage: " + readPercentage);
  }
  log.info("Read percentage: " + readPercentage);
  double maxReadThroughputAllocated = ((double) conf.getLong(DynamoDBConstants.READ_THROUGHPUT,
      1));
  double maxWriteThroughputAllocated = ((double) conf.getLong(DynamoDBConstants
      .WRITE_THROUGHPUT, 1));
  if (maxReadThroughputAllocated < 1.0) {
    throw new RuntimeException("Read throughput should not be less than 1. Read throughput "
        + "percent: " + maxReadThroughputAllocated);
  }
  int configuredReadThroughput = (int) Math.floor(maxReadThroughputAllocated * readPercentage);
  if (configuredReadThroughput < 1) {
    configuredReadThroughput = 1;
  }
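
To check the arithmetic, here is a standalone sketch that mirrors only the read-throughput steps quoted above, with no Hadoop dependencies. The class and method names are hypothetical, and the inputs (40,000 with ratios of 2.5 and 0.5) are illustrative values, not measurements.

```java
public class EffectiveReadThroughput {
    // Mirrors the quoted getSplits logic for reads: validate, scale, floor, clamp to 1.
    static int configuredReadThroughput(long readThroughput, double readPercentage) {
        if (readPercentage <= 0) {
            throw new IllegalArgumentException("Invalid read percentage: " + readPercentage);
        }
        double maxReadThroughputAllocated = (double) readThroughput;
        if (maxReadThroughputAllocated < 1.0) {
            throw new IllegalArgumentException(
                "Read throughput should not be less than 1: " + maxReadThroughputAllocated);
        }
        int configured = (int) Math.floor(maxReadThroughputAllocated * readPercentage);
        return Math.max(configured, 1);
    }

    public static void main(String[] args) {
        // A ratio above 1 scales the 40,000 default upward.
        System.out.println(configuredReadThroughput(40_000L, 2.5)); // prints 100000
        // A ratio below 1 behaves as a percentage in the usual sense.
        System.out.println(configuredReadThroughput(40_000L, 0.5)); // prints 20000
    }
}
```

Under these assumptions, nothing in the quoted code path caps the ratio at 1, which is why the workaround appears to take effect.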

My question: is this an acceptable use of the readRatio parameter, or should the library provide a more sustainable way to override the default READ_THROUGHPUT of 40,000 for On-Demand tables?
