The Internet Censorship Dashboard is a project that aggregates data fetched from the OONI API to provide an overview of the current state of Internet censorship experienced by users, mainly in Southeast Asia. The current form was built a couple of years ago, and the project recently received funding to update it to work better with the new APIs.
I will likely cross-post the technical report once it is finalized, so this post is mainly about the #TMI tech side of the project, as a note2self.
So currently there are 3 different components of the project, namely
There were a couple of security warnings from GitHub over the years regarding the frontend part of the project. However, upgrading dependencies in the frontend was never straightforward for some reason. Therefore, one of the main goals of the current iteration was to get everything updated properly.
First things first, we updated Yarn to Yarn 2
$ yarn set version berry
I spent quite some time rebuilding the project from scratch, only to find out that the new PnP (Plug'n'Play) mode was the reason the project wouldn't build (some dependencies had not been made compatible at the time).
Then we needed to upgrade all the dependencies. First, we had to install a plugin:
$ yarn plugin import interactive-tools
After that, the upgrade-interactive command became available:
$ yarn upgrade-interactive
ReactJS has seen quite a number of huge changes in recent years. One of them is the introduction of effect hooks. As I rely mostly on Redux for state management, it helps that Redux also provides hooks similar to the ones in ReactJS through Redux Toolkit. So there is no more confusion over container widgets and all the horrible class definitions. Not only is the markup/code for a widget greatly simplified, a lot of boilerplate is gone too.
Also, there is no more complicated code to rebuild state in a functional style, which greatly helps readability.
The backend, on the other hand, went through a slightly more involved reorganization. In the previous iteration I was still using my own venv wrapper script (and was transitioning to pipenv), so the dependencies were written directly inside the Dockerfile. Now that I use Poetry for nearly everything, I ported my crawler and backend API into sub-projects managed by Poetry.
The main benefit of using Poetry is that the virtual environment can be reused while building the project. I used the opportunity to figure out how to do multistage building while learning how to properly build a container from a project managed by Poetry.
I ended up contributing an answer to the linked question above, which is also posted below
FROM python:3.9-slim as base
ENV PYTHONFAULTHANDLER=1 \
    PYTHONHASHSEED=random \
    PYTHONUNBUFFERED=1
RUN apt-get update && apt-get install -y gcc libffi-dev g++
WORKDIR /app

FROM base as builder
ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1 \
    POETRY_VERSION=1.1.3
RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv
COPY pyproject.toml poetry.lock ./
RUN . /venv/bin/activate && poetry install --no-dev --no-root
COPY . .
RUN . /venv/bin/activate && poetry build

FROM base as final
COPY --from=builder /venv /venv
COPY --from=builder /app/dist .
COPY docker-entrypoint.sh ./
RUN . /venv/bin/activate && pip install *.whl
CMD ["./docker-entrypoint.sh"]
This line was added
RUN . /venv/bin/activate && poetry install --no-dev --no-root
because without it Poetry would attempt to create a virtualenv in a folder named with a hash string. Therefore, to override that behavior, we needed to create the venv and activate it before calling a
--no-dev can be removed if a developer-friendly image is required (with a formatter, a linter and pytest). The benefit of using a virtualenv in a multistage build became more apparent in the final image creation. Instead of having to install all the dependencies with pip again, we just had to copy the whole environment over.
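As a quick sanity check that a container really runs from the copied /venv rather than a Poetry-generated one, something like the sketch below can be run inside the image. It only uses the standard library; the function name `describe_environment` is made up for this note and is not part of the project.

```python
import os
import sys


def describe_environment():
    """Report which Python environment the running interpreter belongs to."""
    return {
        # True when running from a venv: its prefix differs from the base install.
        "in_venv": sys.prefix != sys.base_prefix,
        # Directory of the environment in use (expected to be /venv in the image).
        "prefix": sys.prefix,
        # Set by the `activate` script; None if no environment was activated.
        "virtual_env": os.environ.get("VIRTUAL_ENV"),
    }
```

Calling `describe_environment()` after `. /venv/bin/activate` should report `/venv` for both `prefix` and `virtual_env`, assuming the image was built as in the Dockerfile above.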
Also, as Poetry made exporting a script easier, the docker-entrypoint.sh could be just as simple as
#!/bin/sh
set -e
. /venv/bin/activate
exec crawler
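For context, the `crawler` command in that entrypoint is just a Python function exposed through Poetry's `[tool.poetry.scripts]` table. The sketch below shows roughly what that looks like; the module path `crawler/main.py`, the function name `main` and the `--country` option are assumptions for illustration, not the project's actual layout.

```python
# crawler/main.py (hypothetical module path)
#
# Assumed pyproject.toml entry that exposes main() as the `crawler` command:
#
#   [tool.poetry.scripts]
#   crawler = "crawler.main:main"

import argparse


def main(argv=None):
    """Entry point that the installed `crawler` executable would call."""
    parser = argparse.ArgumentParser(prog="crawler")
    # Hypothetical option; the real crawler's arguments are not shown in this post.
    parser.add_argument("--country", default="MY", help="country code to crawl")
    args = parser.parse_args(argv)
    print(f"crawling measurements for {args.country}")
    return 0
```

Once the wheel is installed into /venv with pip, an executable named `crawler` lands in /venv/bin, which is why activating the environment and running `exec crawler` is enough.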
Even for the hug web application, it was just
#!/bin/sh
set -e
. /venv/bin/activate
exec gunicorn -w 4 --bind 0.0.0.0:8000 backend.index:__hug_wsgi__
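The `backend.index:__hug_wsgi__` argument tells gunicorn to import the `backend.index` module and serve the WSGI callable stored in its `__hug_wsgi__` attribute, which hug generates for the module. To show what gunicorn expects on the other side of that colon, here is a hand-written stand-in using only the standard library. It is not hug's implementation, and the JSON payload is invented for the example.

```python
import json


def application(environ, start_response):
    """Minimal WSGI callable, the kind of object gunicorn serves for `module:attribute`."""
    # Hypothetical payload; the real backend's routes are not shown in this post.
    payload = {"path": environ.get("PATH_INFO", "/"), "status": "ok"}
    body = json.dumps(payload).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    # WSGI responses are iterables of bytes.
    return [body]
```

If this lived in a module called `example.py` (again, a made-up name), `gunicorn example:application` would serve it the same way the entrypoint above serves the hug app.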
While it works for now, I am sure that in a couple of years I will discover new things and apply them to this project given the opportunity (feel free to contact SinarProject if you are interested in sponsoring the project).