Benchmarking
============

In the ``benchmarking`` folder is a benchmarking script and associated Dockerfile.
The docker image is published at ``https://quay.io/repository/n1analytics/entity-benchmark``

The container/script is configured via environment variables.

- ``SERVER``: (required) the url of the server.
- ``EXPERIMENT``: json file containing a list of experiments to run. Schema of experiments is
  defined in ``./schema/experiments.json``.
- ``DATA_PATH``: path to a directory to store test data (useful to cache).
- ``RESULTS_PATH``: full filename to write results file.
- ``SCHEMA``: path to the linkage schema file used when creating projects. If not provided it is assumed
  to be in the data directory.
- ``TIMEOUT``: time to wait for the result of a run, in seconds. Default is 1200 (20 min).

Run Benchmarking Container
--------------------------

Run the container directly with docker, substituting configuration information as required::

    docker run -it -e SERVER=https://testing.es.data61.xyz \
        -e RESULTS_PATH=/app/results.json \
        quay.io/n1analytics/entity-benchmark:latest

By default the container will pull synthetic datasets from an S3 bucket and run default benchmark experiments
against the configured ``SERVER``. The default experiments (listed below) are set in
``benchmarking/default-experiments.json``.

The output will be printed and saved to the file pointed to by ``RESULTS_PATH`` (e.g. ``/app/results.json``).

Cache Volume
~~~~~~~~~~~~

To speed up repeated benchmark runs you may wish to mount a volume at the ``DATA_PATH``
to store the downloaded test data. Note the container runs as user ``1000``, so any mounted volume must be readable
and writable by that user.
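
A cache directory that the container user cannot write to tends to fail only once the download starts.
The following is a minimal sketch of a pre-flight check and is not part of the benchmarking script: the helper
name ``cache_dir_is_usable`` is hypothetical, and the ``.`` fallback is chosen only for this example. It reports
whether the user running the process can read, write and traverse the directory, so it is only meaningful when
run as the same user as the benchmark (user ``1000`` inside the container).

```python
import os


def cache_dir_is_usable(path):
    """Return True if `path` is an existing directory that the current
    user can read, write and enter (hypothetical helper)."""
    if not os.path.isdir(path):
        return False
    # R_OK: list contents, W_OK: create files, X_OK: traverse the directory.
    return os.access(path, os.R_OK | os.W_OK | os.X_OK)


if __name__ == "__main__":
    # DATA_PATH is the variable documented above; "." is only a fallback
    # chosen for this sketch, not the script's real default.
    data_path = os.environ.get("DATA_PATH", ".")
    print(data_path, "usable:", cache_dir_is_usable(data_path))
```
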

To create a volume using docker::

    docker volume create linkage-benchmark-data

To copy data from a local directory and change owner::

    docker run --rm -v `pwd`:/src \
        -v linkage-benchmark-data:/data busybox \
        sh -c "cp -r /src/linkage-bench-cache-experiments.json /data; chown -R 1000:1000 /data"

To run the benchmarks using the cache volume::

    docker run \
        --name ${benchmarkContainerName} \
        --network ${networkName} \
        -e SERVER=${localserver} \
        -e DATA_PATH=/cache \
        -e EXPERIMENT=/cache/linkage-bench-cache-experiments.json \
        -e RESULTS_PATH=/app/results.json \
        --mount source=linkage-benchmark-data,target=/cache \
        quay.io/n1analytics/entity-benchmark:latest

Experiments
-----------

Experiments to run can be configured as a simple json document. The default is::

    [
      {
        "sizes": ["100K", "100K"],
        "threshold": 0.95
      },
      {
        "sizes": ["100K", "100K"],
        "threshold": 0.80
      },
      {
        "sizes": ["100K", "1M"],
        "threshold": 0.95
      }
    ]

The schema of the experiments can be found in ``benchmarking/schema/experiments.json``.
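
The shape of an experiments document can be checked before starting a run. The sketch below is an illustration
under stated assumptions, not the validation the benchmarking script performs: the authoritative rules live in
``benchmarking/schema/experiments.json``, whereas the hypothetical ``check_experiments`` function only encodes
what is visible in the default document (a list of objects, each with a two-element ``sizes`` list of strings and
a numeric ``threshold``). The requirement that a threshold lies between 0 and 1 is an assumption of this example.

```python
import json


def check_experiments(experiments):
    """Return a list of problems found; an empty list means the document
    matches the shape of the default experiments (hypothetical check)."""
    if not isinstance(experiments, list):
        return ["top-level value must be a list"]
    problems = []
    for i, exp in enumerate(experiments):
        if not isinstance(exp, dict):
            problems.append(f"experiment {i}: must be an object")
            continue
        sizes = exp.get("sizes")
        if (not isinstance(sizes, list) or len(sizes) != 2
                or not all(isinstance(s, str) for s in sizes)):
            problems.append(f"experiment {i}: 'sizes' must be two strings")
        threshold = exp.get("threshold")
        # bool is a subclass of int, so exclude it explicitly.
        if (isinstance(threshold, bool)
                or not isinstance(threshold, (int, float))
                or not 0 <= threshold <= 1):
            problems.append(
                f"experiment {i}: 'threshold' must be a number in [0, 1]")
    return problems


# The default experiments, copied from the document.
DEFAULT = json.loads("""
[
  {"sizes": ["100K", "100K"], "threshold": 0.95},
  {"sizes": ["100K", "100K"], "threshold": 0.80},
  {"sizes": ["100K", "1M"], "threshold": 0.95}
]
""")

print(check_experiments(DEFAULT))  # prints []
```

To check a custom file, load it with ``json.load`` and pass the result to ``check_experiments`` before pointing
``EXPERIMENT`` at it.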