Commit a1a52d1 - Merge branch 'release/1.15.0'
2 parents: d9e0332 + 2856414

68 files changed: 7,173 additions and 1,092 deletions

Note: this large-commit view hides most changed files by default; one file shown below also appears without its name.

CHANGELOG.md

Lines changed: 24 additions & 1 deletion
@@ -13,6 +13,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
+## [1.15.0] - 2022-02-28
+
+### Added
+
+- Add modal to allow organization admins to create templates from projects [#1107](https://github.com/PublicMapping/districtbuilder/pull/1107)
+
+### Changed
+
+- Stop loading archived topology [#1132](https://github.com/PublicMapping/districtbuilder/pull/1132)
+- Change formatting of the PVI chart y-axis [#1136](https://github.com/PublicMapping/districtbuilder/pull/1136)
+- Refactor backend to use workers and improve performance [#1149](https://github.com/PublicMapping/districtbuilder/pull/1149)
+
+### Fixed
+
+- parameterize REINDEX rules so both staging and prod can exist [#1130](https://github.com/PublicMapping/districtbuilder/pull/1130)
+- Duplicated maps keep reference layers [#1137](https://github.com/PublicMapping/districtbuilder/pull/1137)
+- Fix bounds for AK mini-map [#1145](https://github.com/PublicMapping/districtbuilder/pull/1145)
+- Fix TopologyService health check [#1148](https://github.com/PublicMapping/districtbuilder/pull/1148)
+- Fix sidebar styling on firefox [#1155](https://github.com/PublicMapping/districtbuilder/pull/1155)
+- Updated PlanScore integration to use multi-step workflow [#1152](https://github.com/PublicMapping/districtbuilder/pull/1148)
+- Fix concurrent Jenkins builds breaking node dependencies [#1146](https://github.com/PublicMapping/districtbuilder/pull/1146)
+
 ## [1.14.0] - 2022-02-09
 
 ### Added

@@ -432,7 +454,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 - Initial release.
 
-[unreleased]: https://github.com/publicmapping/districtbuilder/compare/1.14.0...HEAD
+[unreleased]: https://github.com/publicmapping/districtbuilder/compare/1.15.0...HEAD
+[1.15.0]: https://github.com/publicmapping/districtbuilder/compare/1.14.0...1.15.0
 [1.14.0]: https://github.com/publicmapping/districtbuilder/compare/1.13.0...1.14.0
 [1.13.0]: https://github.com/publicmapping/districtbuilder/compare/1.12.1...1.13.0
 [1.12.1]: https://github.com/publicmapping/districtbuilder/compare/1.12.0...1.12.1

deployment/README.md

Lines changed: 2 additions & 1 deletion
@@ -93,8 +93,9 @@ This will attempt to apply the plan assembled in the previous step using Amazon'
 - Select **Actions** -> **Run Task**
 - Below the warning, click the blue link: "Switch to launch type"
 - Select the following
-  - **Launch type**: `Fargate`
   - **Cluster**: (See table below)
+  - **Launch type**: `Fargate`
+  - **Operating system family**: `Linux`
   - **Cluster VPC**: (See table below)
   - **Subnets**: Select any named `PrivateSubnet`
   - **Select existing security group**: (See table below)

deployment/terraform/scheduled_tasks.tf

Lines changed: 2 additions & 2 deletions
@@ -1,11 +1,11 @@
 resource "aws_cloudwatch_event_rule" "reindex_task" {
-  name = "reindex-scheduled-ecs-event-rule"
+  name = "reindex-scheduled-ecs-event-rule-${var.environment}"
   // Run monthly at 5AM UTC (midnight EST)
   schedule_expression = "cron(0 5 1 * ? *)"
 }
 
 resource "aws_cloudwatch_event_target" "reindex_task" {
-  target_id = "reindex-scheduled-ecs-event-rule"
+  target_id = "reindex-scheduled-ecs-event-rule-${var.environment}"
   rule = aws_cloudwatch_event_rule.reindex_task.name
   arn = aws_ecs_cluster.app.arn
   role_arn = aws_iam_role.ecs_task_role.arn
(new file - name hidden in this view)

Lines changed: 68 additions & 0 deletions
# Server concurrency strategy

## Context

CPU-intensive work on the main event thread is hurting our performance and causing our health checks to fail. Because Node is single-threaded by default, CPU-intensive work on the main thread blocks us from responding to other requests.
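The blocking behavior described here can be demonstrated with a few lines of plain Node - a sketch, not project code; the timer is a stand-in for a health-check handler:

```javascript
// Minimal sketch: synchronous CPU work on the main thread delays every
// queued callback, including health-check responses.
const start = Date.now();

// A timer that should fire after ~10ms (stand-in for a health-check handler).
setTimeout(() => {
  console.log(`timer fired after ${Date.now() - start}ms instead of ~10ms`);
}, 10);

// Simulate a CPU-intensive merge that blocks the event loop for ~200ms;
// the timer above cannot fire until this loop finishes.
const blockUntil = Date.now() + 200;
while (Date.now() < blockUntil) {
  // busy-wait: nothing else can run on this thread
}
// Only once this synchronous work ends can the timer callback run,
// well past its 10ms deadline.
```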
There is an exploratory branch, [`c214bac6`](https://github.com/PublicMapping/districtbuilder/commit/c214bac64111615e7a213972d512a7d765c68821), that does some initial work toward moving the TopoJSON merge operation off the main thread (we'll likely want to move imports and exports as well), but this is still unacceptably slow for large regions due to the need to `deserialize` the TopoJSON data for each merge operation.

The Node.js memory model prevents us from directly sharing the TopoJSON object as-is between threads or processes: for the most part Node uses an actor model, passing messages to copy data without directly sharing memory.

Node does have two exceptions to this: an `ArrayBuffer` can be _transferred_ between threads instead of copied, and a `SharedArrayBuffer` can be used on multiple threads at once. Both, however, can only store typed numeric arrays, not generic objects.
### Options considered

#### Improve on-demand deserialization performance further

The threading strategy pursued in [`c214bac6`](https://github.com/PublicMapping/districtbuilder/commit/c214bac64111615e7a213972d512a7d765c68821) involved keeping the binary serialized data in a `SharedArrayBuffer` and deserializing it in the worker for each request.

I did some benchmarking on some of our largest regions using the current `v8.deserialize` method, which showed that deserializing the PA LRC or TX regions (the two largest) took about 6-9 seconds on average.
I attempted to swap `v8` for Protobuf by using `geobuf` version `1.0.1`, the last version to support TopoJSON, but saw no significant difference in performance ([`c90a760`](https://github.com/PublicMapping/districtbuilder/commit/c90a7602a440c1aea035b8ec676cdc71b15ef086)). I _did_, however, note a significant size reduction, so we should seriously consider switching to Protobuf as part of #1117.

I also tried using [`mmap-object`](https://github.com/allenluce/mmap-object), but that caused a performance decline relative to just using `v8` serialization.
#### Fork the TopoJSON library / merge algorithm to store data in SharedArrayBuffer

I did some basic work towards this, but I think we would need to _radically_ rework how the TopoJSON merge operation works and how the format is stored for this to be worthwhile. My experiments involved swapping out the `arcs` array in `Topology` for an [`ndarray`](https://www.npmjs.com/package/ndarray) backed by a `SharedArrayBuffer`, which performed *worse* than the `v8` (de)serialize route, even after going further and doing the same to the `arcs` array for each feature.

My guess is that transferring hundreds of thousands of `SharedArrayBuffer` objects is still pretty slow, so the only way to really push this further would be a radical restructuring of how we access the GeometryCollection: storing the data as a struct-of-arrays, which would limit the number of `SharedArrayBuffer`s to a fixed count determined by the number of fields.
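A sketch of that struct-of-arrays layout; the field names are illustrative, not the real Topology format:

```javascript
// Hedged sketch of the struct-of-arrays idea: instead of one JS object (and
// one SharedArrayBuffer) per arc, each *field* gets a single typed array over
// shared memory, so the number of buffers is fixed regardless of arc count.
const arcCount = 3;

// One shared buffer per field, not per arc (8 bytes per Float64 slot).
const xs = new Float64Array(new SharedArrayBuffer(arcCount * 8));
const ys = new Float64Array(new SharedArrayBuffer(arcCount * 8));

// Writing arc i touches index i of each field array.
[[0, 10], [1, 11], [2, 12]].forEach(([x, y], i) => {
  xs[i] = x;
  ys[i] = y;
});

// Any worker holding these two buffers can reconstruct arc i without
// any deserialization step.
const arc = i => [xs[i], ys[i]];
console.log(arc(1)); // [ 1, 11 ]
```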
#### Rewrite the CPU-intensive tasks in C++

I think this is a viable option. Node.js has good bindings for C++, and C++ is not subject to the constraints of the Node.js memory model, so we could easily share the read-only Topology data between threads.

The downsides are significant, though: we'd have to rewrite not just the TopoJSON `mergeArcs` method, but also all of our own business logic that uses TopoJSON data - the district merge operation and project import and export. That is a significant amount of code, and while there is a working test harness, we don't have many tests; we'd need a much more substantial test suite, validated against the current codebase, to feel confident the rewrite was implemented correctly.

We also don't have any real C++ expertise on the team, so we would either need off-team resources to build and maintain this part of the codebase, or expect a significant amount of training and learning beforehand.
#### Cache data load in worker threads

For this approach we keep the buffer data available on the main thread and deserialize it as needed, as in the first approach, but we also add an LRU cache in each worker thread. This lets us skip the deserialization penalty after the first request, and setting the `maxSize` parameter on the LRU cache ensures we don't exceed the memory limits of the system. I made a test branch, [`09df46a5`](https://github.com/PublicMapping/districtbuilder/commit/09df46a50f9ec54cc6e0af8a26fd3ce30302172c), that showed acceptable speeds, and I believe there is room for further improvements in data caching.
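The shape of that per-worker cache might look like this sketch; a real implementation would presumably use a library such as `lru-cache` with its `maxSize`/`sizeCalculation` options, but a hand-rolled version shows the size-bounded eviction:

```javascript
// Illustrative stand-in for the per-worker LRU cache described above:
// eviction is by total "size" so memory stays bounded, mirroring a
// `maxSize` budget. Region codes and sizes below are made up.
class SizeBoundedLRU {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.totalSize = 0;
    this.entries = new Map(); // insertion order doubles as recency order
  }
  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    // Refresh recency by re-inserting at the end.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }
  set(key, value, size) {
    if (this.entries.has(key)) {
      this.totalSize -= this.entries.get(key).size;
      this.entries.delete(key);
    }
    this.entries.set(key, { value, size });
    this.totalSize += size;
    // Evict least-recently-used entries until we fit the budget.
    while (this.totalSize > this.maxSize) {
      const [oldestKey, oldest] = this.entries.entries().next().value;
      this.entries.delete(oldestKey);
      this.totalSize -= oldest.size;
    }
  }
}

// Each worker would deserialize a region once, then serve later
// requests for it from the cache.
const cache = new SizeBoundedLRU(100);
cache.set("PA", { arcs: "..." }, 60);
cache.set("TX", { arcs: "..." }, 60); // 120 > 100, so "PA" is evicted
console.log(cache.get("PA")); // undefined
console.log(cache.get("TX")); // { arcs: '...' }
```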
We can route requests to worker threads that already have the data loaded, or to another worker if those are busy, allowing for a degree of concurrency.

The main downsides are that we're being somewhat duplicative with the data we keep around (it exists in both serialized and deserialized forms, and is potentially duplicated between threads), which could increase our memory requirements. We also go from our largest datasets being mostly static to constantly serializing and deserializing as the set of data cached in each worker thread changes, which could alter the runtime behavior of the application in unexpected ways.

This option becomes much more attractive if we can reduce the time spent loading the buffer (see #1117) to under a few seconds: in that case we could load the buffer in the worker thread before caching the deserialized data, avoiding the memory cost of keeping the deserialized files around on the main thread.
#### Load data in worker threads

A variation of the caching approach above: instead of loading the binary data on the main thread, each worker thread loads its data directly, with each thread owning a subset of states.

This is conceptually simpler than the caching approach - it keeps the memory footprint of our large datasets static and predictable, and avoids the potential for a cache miss to make a request slower - but it limits our concurrency potential: with only one thread capable of handling any given region, concurrent requests for that region from other users would have to block.
## Decision

We'll use the caching approach outlined above as option four.

This should be relatively quick to implement, and it allows for some degree of concurrent data processing for the same state. We should evaluate our options for improving file-loading performance, and consider whether we can skip the initialization step where we load and cache the binary data; regardless, we'll still use an LRU cache in each worker thread.

Switching to a PBF format as part of this would be prudent if we need to keep the pre-caching, as that seemed to reduce the binary size by about 50%. Doing so would require maintaining our own Protobuf bindings for TopoJSON; we could fork `geobuf` version `1.0.1`, the last version that supported TopoJSON.
## Consequences

We'll need to monitor application performance carefully after making this change, as it will likely affect our steady-state memory usage. Consequently, we should wait to reduce our instance size until after this change has shipped, deferring that reduction to a later release.

Long-term, the C++ approach might offer the best possible performance, but sticking to an entirely Node.js backend seems worthwhile in terms of long-term maintenance and support.

docker-compose.ci.yml

Lines changed: 10 additions & 0 deletions
@@ -2,9 +2,19 @@ version: "2.4"
 services:
   server:
     image: "districtbuilder:${GIT_COMMIT:-latest}"
+    volumes:
+      - "/var/cache/district-builder-${EXECUTOR_NUMBER}-server-node-modules:/home/node/app/server/node_modules"
+
+  client:
+    image: node:16-slim
+    volumes:
+      - "/var/cache/district-builder-${EXECUTOR_NUMBER}-client-node-modules:/home/node/app/node_modules"
 
   manage:
     image: "districtbuilder-manage:${GIT_COMMIT:-latest}"
+    volumes:
+      - "/var/cache/district-builder-${EXECUTOR_NUMBER}-server-node-modules:/home/node/app/server/node_modules"
+      - "/var/cache/district-builder-${EXECUTOR_NUMBER}-manage-node-modules:/home/node/app/manage/node_modules"
 
   shellcheck:
     image: koalaman/shellcheck:stable

docker-compose.yml

Lines changed: 4 additions & 2 deletions
@@ -1,7 +1,7 @@
 version: "2.4"
 services:
   database:
-    image: quay.io/azavea/postgis:3-postgres12.2-slim
+    image: quay.io/azavea/postgis:3-postgres12.4-slim
     environment:
       - POSTGRES_USER=districtbuilder
       - POSTGRES_PASSWORD=districtbuilder

@@ -28,13 +28,15 @@ services:
       - JWT_SECRET=insecure
       - JWT_EXPIRATION_IN_MS=604800000 # 1 week
       - CLIENT_URL=http://localhost:3003
+      - TOPOLOGY_CACHE_DIRECTORY=/opt/topology
     build:
       context: ./src/server
       dockerfile: Dockerfile
     volumes:
       - ./src/server:/home/node/app/server
       - ./src/shared:/home/node/app/shared
       - /var/cache/district-builder-server-node-modules:/home/node/app/server/node_modules
+      - /var/cache/district-builder-server-topology:/opt/topology
       - $HOME/.cache/district-builder:/usr/local/share/.cache/yarn
       - $HOME/.aws:/root/.aws:ro
     working_dir: /home/node/app/server

@@ -44,7 +46,7 @@ services:
     command: start:dev
 
   client:
-    image: node:16-slim
+    image: node:16-bullseye-slim
     environment:
       - BASE_URL=${BASE_URL:-http://server:3005}
       - PORT=3003

scripts/cibuild

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
     if [ "${1:-}" = "--help" ]; then
         usage
     else
-        ./scripts/update
+        CI=1 ./scripts/update
 
         # Build tagged container images
         GIT_COMMIT="${GIT_COMMIT}" docker-compose \

scripts/update

Lines changed: 14 additions & 8 deletions
@@ -17,35 +17,41 @@ if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
     if [ "${1:-}" = "--help" ]; then
         usage
     else
+        if [[ -n "${CI}" ]]; then
+            CONFIG_ARGS=( "-f" "docker-compose.yml" "-f" "docker-compose.ci.yml" )
+        else
+            CONFIG_ARGS=( "-f" "docker-compose.yml" )
+        fi
         # Ensure container images are current
-        docker-compose build --pull
+        docker-compose "${CONFIG_ARGS[@]}" build --pull
 
         # Clean dist directory
-        docker-compose \
+        docker-compose "${CONFIG_ARGS[@]}" \
             run --rm --no-deps server \
             clean
 
         # Update frontend, Yarn dependencies and build static asset bundle
-        docker-compose \
+        docker-compose "${CONFIG_ARGS[@]}" \
             run --rm --no-deps client \
             bash -c "yarn install && yarn compile && yarn build"
 
         # Update backend, Yarn dependencies and build server
-        docker-compose \
+        docker-compose "${CONFIG_ARGS[@]}" \
             run --rm --no-deps --entrypoint "bash -c" server \
             "yarn install && npm rebuild && yarn build"
 
         # Update manage, Yarn dependencies and build
-        docker-compose \
+        docker-compose "${CONFIG_ARGS[@]}" \
             run --rm --no-deps manage \
             bash -c "yarn install && yarn build"
 
         # Bring up PostgreSQL and NestJS in a way that respects
         # configured service health checks.
-        docker-compose \
-            -f docker-compose.yml \
+        docker-compose "${CONFIG_ARGS[@]}" \
            up -d database server
 
-        ./scripts/migration
+        docker-compose "${CONFIG_ARGS[@]}" \
+            run --rm server \
+            migration:run
     fi
 fi

scripts/yarn

Lines changed: 22 additions & 6 deletions
@@ -17,12 +17,28 @@ if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
     if [[ "${1:-}" == "--help" ]] || [[ ! "${1}" == "client" && ! "${1}" == "server" && ! "${1}" == "manage" ]]; then
         usage
     elif [[ "${1}" == "server" ]]; then
-        docker-compose \
-            run --rm "${1}" \
-            "${@:2}"
+        if [[ -n "${CI}" ]]; then
+            docker-compose \
+                -f docker-compose.yml \
+                -f docker-compose.ci.yml \
+                run --rm "${1}" \
+                "${@:2}"
+        else
+            docker-compose \
+                run --rm "${1}" \
+                "${@:2}"
+        fi
     else
-        docker-compose \
-            run --rm --no-deps "${1}" \
-            bash -c "yarn ${*:2}"
+        if [[ -n "${CI}" ]]; then
+            docker-compose \
+                -f docker-compose.yml \
+                -f docker-compose.ci.yml \
+                run --rm --no-deps "${1}" \
+                bash -c "yarn ${*:2}"
+        else
+            docker-compose \
+                run --rm --no-deps "${1}" \
+                bash -c "yarn ${*:2}"
+        fi
     fi
 fi

src/client/actions/projectModals.ts

Lines changed: 0 additions & 1 deletion
@@ -6,5 +6,4 @@ export const showAdvancedEditingModal = createAction(
   "Show advanced editing warning modal"
 )<boolean>();
 export const showCopyMapModal = createAction("Show copy map modal")<boolean>();
-export const showConvertMapModal = createAction("Show convert map modal")<boolean>();
 export const setImportFlagsModal = createAction("Show import flags modal")<boolean>();
