The purpose of this repository is to provide a set of scalability and performance tests for Red Hat Quay and Project Quay. These tests are not necessarily intended to push Quay to its limits, but rather to collect metrics on various operations; those metrics are used to determine how changes to Quay affect its performance. A side effect of these tests is the ability to identify bottlenecks and slow operations.
The test suite is designed to run in-cluster and is fairly simple to start. There are a few prerequisites, explained below.
- Create an Elasticsearch instance. The results will be stored here.
- Deploy a Kubernetes environment. The tests will run within the cluster.
- Deploy Quay itself.
- In Quay, as a superuser (this is important), create an organization for testing purposes. Within that organization, create an application, and within that application create an OAuth token with all permissions (checkboxes) granted. Hold on to this token, as it will be used later.
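Before moving on, you can sanity-check the token against the Quay API. A minimal sketch, assuming your Quay instance is reachable at `quay.example.com` (a placeholder hostname):

```sh
# Fetch the authenticated user to confirm the OAuth token works.
# Replace quay.example.com with your actual Quay hostname.
curl -s -H "Authorization: Bearer $QUAY_OAUTH_TOKEN" \
  https://quay.example.com/api/v1/user/
```

A JSON description of the token's user should come back; a 401 suggests the token or its permissions are wrong.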
The test suite will run as a collection of jobs within the Kubernetes cluster. It is recommended to use a separate namespace for the tests.
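For example (the namespace name `quay-perf` here is arbitrary):

```sh
# Create a dedicated namespace and reuse it in the commands below.
kubectl create namespace quay-perf
export NAMESPACE=quay-perf
```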
In this repository, there is a Job YAML file which will deploy the performance tests. This YAML file specifies some environment variables which should be overridden for your specific environment. The deployment file also creates a service account used for the Job(s) and deploys a Redis instance used as a central queue.
- Ensure the Job can run privileged. In OpenShift, you may have to run `oc adm policy add-scc-to-user privileged system:serviceaccount:$NAMESPACE:default`.
- Edit the deployment file `deploy/test.job.yaml` (a dry-run check is sketched after this list):
  - Change `QUAY_HOST` to the value of your Quay deployment's URL. This should match the value of `SERVER_HOSTNAME` in Quay's `config.yaml`.
  - Change `QUAY_OAUTH_TOKEN` to the value of the token you created for your application during the prerequisites.
  - Change `QUAY_ORG` to the name of the organization you created during the prerequisites. Example: `test`.
  - Change `ES_HOST` to the hostname of your Elasticsearch instance.
  - Change `ES_PORT` to the port number your Elasticsearch instance is listening on.
- Deploy the performance tests Job: `kubectl create -f deploy/test.job.yaml -n $NAMESPACE`
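Before running that `kubectl create`, a client-side dry run will catch YAML mistakes introduced while editing:

```sh
# Validate the edited manifest without submitting it to the cluster.
kubectl apply -f deploy/test.job.yaml -n $NAMESPACE --dry-run=client -o name
```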
At this point, a Job with a single pod should be running. The Job writes a fair amount of information to its logs if you'd like to watch its progress. Eventually, it reaches the point where it tests the registry aspects of Quay (using podman) and creates other Jobs to execute those operations.
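To follow along, something like the following works; the Job name below is a placeholder for whatever `metadata.name` is set to in `deploy/test.job.yaml`:

```sh
# Watch the orchestrator Job and the child Jobs it spawns.
kubectl get jobs -n $NAMESPACE -w

# Stream the orchestrator's logs (replace quay-perf-test with the
# actual Job name from deploy/test.job.yaml).
kubectl logs -f job/quay-perf-test -n $NAMESPACE
```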
The following environment variables can be specified in the Job's deployment file to change the behavior of the tests.
| Key | Type | Required | Description |
|---|---|---|---|
| QUAY_HOST | string | y | Hostname of the Quay instance to test. |
| QUAY_OAUTH_TOKEN | string | y | Quay Application OAuth Token. Used for authentication purposes on certain API endpoints. |
| QUAY_ORG | string | y | The organization which will contain all created resources during the tests. |
| ES_HOST | string | y | Hostname of the Elasticsearch instance used to store the test results. |
| ES_PORT | string | y | Port of the Elasticsearch instance used for storing test results. |
| BATCH_SIZE | string | n | Number of items to pop off the queue in each batch. This primarily applies to the registry push and pull tests. Do not exceed 400 until the known podman fuse issue (see below) is resolved. |
| CONCURRENCY | int | n | The number of requests or test executions to perform in parallel. Defaults to 4. |
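Once a run completes, a quick way to confirm that results reached Elasticsearch is to list its indices. A sketch, assuming an unauthenticated HTTP endpoint:

```sh
# List indices to confirm the test results were written.
curl -s "http://$ES_HOST:$ES_PORT/_cat/indices?v"
```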
v0.0.2
changes:
- Python is used for orchestrating and defining all tests.
- Tests now run within a Kubernetes cluster.
- Registry tests are executed concurrently using parallel Kubernetes Jobs.
- Reduced the number of steps required to run the tests.
known issues:
- The orchestrator Job does not clean up the other Jobs it creates. No owner reference is specified, so they are not cleaned up when the main Job is deleted either. A manual cleanup command is sketched after this list.
- The image used for registry operations has an issue where `podman build` leaves fuse processes running after it has completed. This can exhaust all available threads. Due to this issue, the batch size for each Job in the "podman push" tests is limited to 400.
- The container image uses alpine:edge, the only version of Alpine that includes podman. Alpine was chosen because it appeared to eliminate some of the complications that arise from performing build/push/pull operations within Kubernetes and OpenShift. Eventually, a stable image should be used instead.
- The output logging of some subprocesses is broken and creates very long lines.
- The primary Job does not watch for the failure of its child Jobs.
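Until the orchestrator cleans up after itself, leftover child Jobs must be removed manually, for example by deleting all Jobs in the test namespace (assuming nothing else runs there):

```sh
# Remove the orchestrator Job and every child Job it created.
kubectl delete jobs --all -n $NAMESPACE
```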
v0.0.1
- The original implementation.
(TODO) This section still needs to be written.