Add QOS API's #2148
PTAL @klauspost need your feedback here.
@poornas Why is this in
This is a bucket-level setting, and it's implemented at the same level as the S3 API. Just like the bucket inventory APIs, which are also not AWS S3 specific, we have extended it extensively.
Sounds much like an admin API to me. But whatever. Are there any docs on the YAML? This doesn't tell too much - maybe it should be here as well?
@poornas ^^
@klauspost, added the YAML docs to the aistor PR - didn't expect claude to generate such comprehensive docs on the feature
@vadmeste @klauspost, PTAL
type QOSMetric struct {
    APIName            string        `json:"apiName"`
    Rule               QOSRule       `json:"rule"`
    Totals             CounterMetric `json:"totals"`
    Throttled          CounterMetric `json:"throttleCount"`
    ExceededRateLimit  CounterMetric `json:"exceededRateLimitCount"`
    ClientDisconnCount CounterMetric `json:"clientDisconnectCount"`
    ReqTimeoutCount    CounterMetric `json:"reqTimeoutCount"`
}
So in terms of overall design, I am still a bit unsure how this ties into the total system.
Rates can only realistically be applied per node. So are these numbers divided by the node count? And each bucket has its own settings.
So given a server's capacity 'n', for this to be effective it will be divided by the node count and divided by the bucket count.
So each bucket would end up with a very small req/s... So I feel like I'm probably missing the bigger picture. Is there a per-node QoS setting as well?
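To put numbers on the concern above, here is a minimal sketch of the division being described. It assumes the capacity is a single req/s budget split evenly across nodes and then across buckets; the function name `perBucketPerNodeRate` and the figures are hypothetical and do not come from the PR.

```go
package main

import "fmt"

// perBucketPerNodeRate splits a request budget evenly across nodes and
// then across buckets. Even splitting is an assumption made for
// illustration; it is not taken from the QOS implementation.
func perBucketPerNodeRate(capacityReqPerSec float64, nodes, buckets int) float64 {
	if nodes <= 0 || buckets <= 0 {
		return 0
	}
	return capacityReqPerSec / float64(nodes) / float64(buckets)
}

func main() {
	// Hypothetical figures: 10,000 req/s, 8 nodes, 500 buckets.
	rate := perBucketPerNodeRate(10000, 8, 500)
	fmt.Printf("%.2f req/s per bucket per node\n", rate) // prints "2.50 req/s per bucket per node"
}
```

With these made-up figures each bucket ends up with only a few requests per second on each node, which is the "very small req/s" outcome the comment worries about.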
The design was initially to rate limit at the node level, the way you are describing. However, QOS was defined more like a template, and depending on the number of buckets the rate limits set would be arbitrary, making it hard for users to decide on sane limits.
The scope has since been changed to make QOS bucket-centric: it is more useful to throttle specific APIs based on the workload seen in bucket-specific metrics.
The QOS config can be set per bucket, on an as-needed basis. A rule limiting the PutObject API to x concurrent requests for a bucket implies a total limit of x * n, where n is the number of nodes. Bucket-level QOS allows taxing callers/APIs that are known to be problematic to system performance, based on metrics, rather than admins setting a node-level limit that may be harder to control.
The drawback of bucket QOS is that the overall limit is harder to infer; Prometheus metrics and QOS status will likely help with this.
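The x * n relationship can be sketched as follows. This is an illustration only, assuming each node enforces its concurrency cap independently with no cross-node coordination; `nodeLimiter`, `tryAcquire`, `release`, and `clusterWideLimit` are hypothetical names, not the PR's implementation.

```go
package main

import "fmt"

// nodeLimiter caps in-flight requests for one API on one bucket on a
// single node. Hypothetical type, used only to illustrate the idea.
type nodeLimiter struct {
	slots chan struct{}
}

func newNodeLimiter(maxConcurrent int) *nodeLimiter {
	return &nodeLimiter{slots: make(chan struct{}, maxConcurrent)}
}

// tryAcquire reserves a slot, or returns false when the node-local
// limit is already reached (the request would be throttled).
func (l *nodeLimiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot previously reserved by tryAcquire.
func (l *nodeLimiter) release() { <-l.slots }

// clusterWideLimit is the effective total when every node applies the
// same per-node limit independently: x * n.
func clusterWideLimit(perNodeLimit, nodes int) int {
	return perNodeLimit * nodes
}

func main() {
	l := newNodeLimiter(2) // x = 2 concurrent PutObject calls on this node
	fmt.Println(l.tryAcquire(), l.tryAcquire(), l.tryAcquire()) // true true false
	l.release()
	fmt.Println(l.tryAcquire())         // true
	fmt.Println(clusterWideLimit(2, 4)) // 8
}
```

Because each node only sees its own traffic in this sketch, the cluster-wide figure is an upper bound that is reached only when load is spread evenly across nodes.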
I am not going to leave this hanging when someone clearly thinks this is a good idea.
Can this be merged?
This is for the bucket-level Quality of Service feature.