refactoring sprint #358

Draft
wants to merge 47 commits into
base: development
47 commits
a5adaa7
create 'src' and 'tests' directories and added '__init__.py' in each
ioduok Jan 22, 2025
21d92f9
copy routines and helpers codes from sip_assembly folder
ioduok Jan 22, 2025
ac68606
update imports
ioduok Jan 23, 2025
df33c6c
redo, copy routines files to new src, create blank py file
ioduok Jan 23, 2025
14c0444
deleted a line
ioduok Jan 23, 2025
26a2cd5
base and test builds template text added
ioduok Jan 23, 2025
1c7df59
updating testing infrastructure
ioduok Jan 24, 2025
fe7598b
update new routines py file
ioduok Jan 24, 2025
8790195
tox in test.yml
ioduok Jan 24, 2025
77fcf48
dockerfile and tox updates
ioduok Jan 27, 2025
51d9d36
workflows updates
ioduok Jan 28, 2025
765605c
undo spacing typos
ioduok Jan 28, 2025
924c8c8
update documentation
ioduok Jan 29, 2025
cc54c0c
documentation formatting
ioduok Jan 29, 2025
9b1a605
license copyright year update
ioduok Jan 29, 2025
c07cc42
workflow updates to match ursa-major
ioduok Jan 29, 2025
925ba3c
remove env variables
ioduok Jan 31, 2025
bcbd728
update python version in requirements.txt
ioduok Jan 31, 2025
338d8ae
update readme
ioduok Jan 31, 2025
b7d4c70
update readme
ioduok Jan 31, 2025
e917ddf
Merge branch 'serverless' of github.com:RockefellerArchiveCenter/forn…
helrond Feb 4, 2025
46fa0b6
write processing config in routine
helrond Feb 4, 2025
f1e9c1d
write rights data in routine
helrond Feb 4, 2025
266482e
add package data test
helrond Feb 4, 2025
fd5b7ff
add extract test
helrond Feb 4, 2025
9837947
add tests for restructuring
helrond Feb 4, 2025
6b83f19
add archive test
helrond Feb 4, 2025
ad44b6d
add cleanup success test
helrond Feb 4, 2025
549bbab
add failed cleanup tests
helrond Feb 4, 2025
cb571f5
moving coveragerc file
helrond Feb 4, 2025
b02dbe1
add data tests
helrond Feb 5, 2025
a1b007a
consistently handle directory creation
helrond Feb 5, 2025
6cbda95
use helper methods and attributes
helrond Feb 5, 2025
bc59cee
ensure none value are replaced with empty strings
helrond Feb 5, 2025
2eb5e8e
improve docstrings
helrond Feb 5, 2025
b4fd316
ignore tox
helrond Feb 5, 2025
d99d0ca
remove unused code
helrond Feb 5, 2025
39140b1
add logging
helrond Feb 5, 2025
aca7595
pass correct args to am client
helrond Feb 5, 2025
1c89cfc
return rights data as dict
helrond Feb 5, 2025
3e240b0
add test dependencies to dependency updates
helrond Feb 5, 2025
49f1339
set variables from env
helrond Feb 5, 2025
c850fb7
actually rstrip
helrond Feb 6, 2025
a97cd90
pin requirements
helrond Feb 13, 2025
0ff88bd
add archivesspace uri to identifiers
helrond Feb 14, 2025
282d342
update fixtures
helrond Feb 14, 2025
671151b
move zodiac values to config
helrond Feb 17, 2025
10 changes: 10 additions & 0 deletions .coveragerc
@@ -0,0 +1,10 @@
[run]
omit =
*test_*
*__init__.py


[report]
omit =
*test_*
*__init__.py
5 changes: 4 additions & 1 deletion .github/workflows/dependencies.yml
@@ -29,7 +29,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
python-version: '3.11'
cache: pip

- name: Install pre-commit and pip-tools
@@ -41,6 +41,9 @@ jobs:
- name: Run pip-compile
run: pip-compile --upgrade

- name: Update test requirements
run: pip-compile --upgrade tests/test_requirements.in -o tests/test_requirements.txt

- name: Create Pull Request
uses: peter-evans/[email protected]
with:
57 changes: 15 additions & 42 deletions .github/workflows/deploy.yml
@@ -15,8 +15,11 @@ jobs:

env:
APPLICATION_NAME: fornax
CONTAINER: fornax-web
APPLICATION_PORT: 8003

services:
docker:
image: docker:stable
options: --privileged

steps:
- name: Checkout code
@@ -30,40 +33,9 @@ jobs:
- name: Clone deploy scripts if not present
run: git clone https://github.com/RockefellerArchiveCenter/deploy_scripts.git;

- name: Substitute environment variables
uses: tvarohohlavy/[email protected]
with:
files: |
$APPLICATION_NAME/config.py.deploy
appspec.yml.deploy
deploy_scripts/create_apache_config.sh.deploy
deploy_scripts/curl_index.sh.deploy
deploy_scripts/curl_status_endpoint.sh.deploy
deploy_scripts/install_dependencies_django.sh.deploy
deploy_scripts/restart_apachectl.sh.deploy
deploy_scripts/run_management_commands_django.sh.deploy
deploy_scripts/set_permissions.sh.deploy
deploy_scripts/stop_cron.sh.deploy

- name: Rename deploy files
run: |
mv $APPLICATION_NAME/config.py.deploy $APPLICATION_NAME/config.py
mv appspec.yml.deploy appspec.yml
mv deploy_scripts/create_apache_config.sh.deploy deploy_scripts/create_apache_config.sh
mv deploy_scripts/curl_index.sh.deploy deploy_scripts/curl_index.sh
mv deploy_scripts/curl_status_endpoint.sh.deploy deploy_scripts/curl_status_endpoint.sh
mv deploy_scripts/install_dependencies_django.sh.deploy deploy_scripts/install_dependencies_django.sh
mv deploy_scripts/restart_apachectl.sh.deploy deploy_scripts/restart_apachectl.sh
mv deploy_scripts/run_management_commands_django.sh.deploy deploy_scripts/run_management_commands_django.sh
mv deploy_scripts/set_permissions.sh.deploy deploy_scripts/set_permissions.sh
mv deploy_scripts/stop_cron.sh.deploy deploy_scripts/stop_cron.sh

- name: Make deploy scripts executable
run: chmod +x deploy_scripts/*.sh

- name: Create deployment zip
run: sudo deploy_scripts/make_zip_django.sh $DEPLOY_ZIP_DIR $DEPLOY_ZIP_NAME

- name: Configure AWS Credentials
uses: aws-actions/[email protected]
with:
@@ -74,13 +46,14 @@ jobs:
role-duration-seconds: 900
aws-region: ${{ secrets.AWS_REGION }}

- name: Deploy to S3
run: aws s3 cp $DEPLOY_ZIP_DIR s3://$AWS_BUCKET_NAME --recursive
- name: Build docker image
if: ${{ github.ref_name == 'development' }}
run: docker build -t ${{ env.APPLICATION_NAME }} --target build .

- name: Push Docker image to ECR
if: ${{ github.ref_name == 'development' }}
run: bash deploy_scripts/containers/push_image_to_ecr.sh ${{ env.APPLICATION_NAME }}

- name: Deploy to AWS CodeDeploy
run: aws deploy create-deployment
--region ${{ secrets.AWS_REGION }}
--application-name $APPLICATION_NAME
--deployment-config-name CodeDeployDefault.OneAtATime
--deployment-group-name $DEPLOYMENT_GROUP
--s3-location bucket=$AWS_BUCKET_NAME,bundleType=zip,key=$DEPLOY_ZIP_NAME
- name: Retag Docker image
if: ${{ github.ref_name == 'base' }}
run: bash deploy_scripts/containers/add_tag_to_image.sh ${{ env.APPLICATION_NAME }} dev prod
37 changes: 5 additions & 32 deletions .github/workflows/tests.yml
@@ -12,45 +12,18 @@ jobs:
environment:
name: development

env:
APPLICATION_NAME: fornax
CONTAINER: fornax-web
APPLICATION_PORT: 8003

services:
docker:
image: docker:stable
options: --privileged

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Set up Python and cache pip
uses: actions/setup-python@v5
with:
python-version: '3.10'
python-version: '3.11'
cache: 'pip'

- name: Copy config file
run: cp ${{ env.APPLICATION_NAME }}/config.py.example ${{ env.APPLICATION_NAME }}/config.py

- name: Login to Docker
run: echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin

- name: Start Docker containers
run: docker compose up -d

- name: Wait for services to be ready
run: ./wait-for-it.sh $CONTAINER:$APPLICATION_PORT -- echo "$CONTAINER is ready"

- name: Install pre-commit
run: |
pip install "pre-commit===2.13.0"
pre-commit install

- name: Run pre-commit checks
run: pre-commit run --all-files --show-diff-on-failure

- name: Install tox
run: pip install tox

- name: Run tests
run: docker compose exec -T $CONTAINER python manage.py test --exclude-tag=integration
run: tox
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@
*.pickle
archivematica_transfer_source/*
.coverage
.tox

### Django ###
*.log
26 changes: 11 additions & 15 deletions Dockerfile
@@ -1,17 +1,13 @@
FROM python:3.10

ENV PYTHONUNBUFFERED 1
FROM python:3.11-slim-buster AS base
WORKDIR /code
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY src src

RUN apt-get update \
&& echo 'slapd/root_password password password' | debconf-set-selections \
&& echo 'slapd/root_password_again password password' | debconf-set-selections \
&& DEBIAN_FRONTEND=noninteractive apt-get -y install sudo \
rsync \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
FROM base AS test
COPY .coveragerc ./
COPY tests tests
RUN pip install -r tests/test_requirements.txt

RUN mkdir /code
WORKDIR /code
ADD requirements.txt /code/
RUN pip install --upgrade pip && pip install -r requirements.txt
ADD . /code/
FROM base AS build
CMD ["python", "src/sip_creator.py"]
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2018 Rockefeller Archive Center
Copyright (c) 2025 Rockefeller Archive Center

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
86 changes: 19 additions & 67 deletions README.md
@@ -4,82 +4,34 @@ A microservice to create Archivematica-compliant Submission Information Packages

fornax is part of [Project Electron](https://github.com/RockefellerArchiveCenter/project_electron), an initiative to build sustainable, open and user-centered infrastructure for the archival management of digital records at the [Rockefeller Archive Center](http://rockarch.org/).

## Setup
## Getting Started

Install [git](https://git-scm.com/) and clone the repository
If you have [git](https://git-scm.com/) and [Docker](https://store.docker.com/search?type=edition&offering=community) installed, using this repository is as simple as:

$ git clone git@github.com:RockefellerArchiveCenter/fornax.git

git clone https://github.com/RockefellerArchiveCenter/fornax.git
cd fornax
docker build -t fornax .
docker run fornax


Install [Docker](https://store.docker.com/search?type=edition&offering=community) and run docker-compose from the root directory
## Usage

$ cd fornax
$ docker-compose up
This repository is intended to be deployed as an ECS Task in AWS infrastructure.

Once the application starts successfully, you should be able to access the application in your browser at `http://localhost:8003`
### License

When you're done, shut down docker-compose

$ docker-compose down

Or, if you want to remove all data

$ docker-compose down -v

### Configuration

You will need to edit configuration values in `fornax/config.py` to point to your instance of Archivematica.

## Services

fornax has six services, all of which are exposed via HTTP endpoints (see [Routes](#routes) section below):

* Store SIPs - Creates a SIP object.
* SIP Assembly - This is the main service for this application, and consists of the following steps:
* Moving the SIP to the processing directory (SIPS are validated before and after moving).
* Restructuring the SIP for Archivematica compliance by:
* Moving objects in the `data` directory to `data/objects`.
* Adding an empty `logs` directory.
* Adding a `metadata` directory containing a `submissionDocumentation` subdirectory.
* Creating `rights.csv` and adding it to the `metadata` directory.
* Creating submission documentation and adding to the `metadata/submissionDocumentation` subdirectory.
* Adding an identifier to `bag-info.txt` using the `Internal-Sender-Identifier` field.
* Adding a `processingMCP.xml` file which sets processing configurations for Archivematica.
* Updating bag manifests to account for restructuring and changes to files.
* Delivering the SIP to the Archivematica Transfer Source (SIPS are validated before and after moving).
* Create Transfer - starts and approves the next assembled transfer in Archivematica.
* Remove Completed Transfers/Ingests - hides completed transfers or ingests in the Archivematica Dashboard to avoid performance issues.
* Cleanup - removes files from the destination directory.
* Request Cleanup - sends a POST request to another service requesting cleanup of the source directory. fornax only has read access for this directory.
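
The directory restructuring described in the SIP Assembly steps above can be sketched roughly as follows. This is a minimal illustration, not fornax's actual code: the function name `restructure_bag` and the staging directory name are hypothetical, and placing `logs` and `metadata` under `data/` alongside `objects` is an assumption the list does not spell out. The sketch covers only the directory moves; creating `rights.csv`, writing `processingMCP.xml`, and updating the bag manifests are left out.

```python
import tempfile
from pathlib import Path


def restructure_bag(bag_dir: Path) -> None:
    """Rearrange a bag's payload into the layout described in the list above.

    Hypothetical helper: the name, signature, and the placement of `logs`
    and `metadata` under `data/` are assumptions, not fornax's real code.
    """
    data_dir = bag_dir / "data"
    # Snapshot the payload before any new directories are created.
    payload = list(data_dir.iterdir())

    # Move the payload into a staging directory, then rename it to
    # `objects`, so a payload item that is itself named "objects" is safe.
    staging = data_dir / ".objects_staging"
    staging.mkdir()
    for item in payload:
        item.rename(staging / item.name)
    staging.rename(data_dir / "objects")

    # Add the empty `logs` directory and `metadata/submissionDocumentation`.
    (data_dir / "logs").mkdir()
    (data_dir / "metadata" / "submissionDocumentation").mkdir(parents=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp) / "example_bag"
        (bag / "data").mkdir(parents=True)
        (bag / "data" / "report.pdf").write_text("placeholder")
        restructure_bag(bag)
        # Print the resulting tree relative to the bag root.
        for path in sorted(bag.rglob("*")):
            print(path.relative_to(bag))
```

Because the payload files change location, any real implementation would need to regenerate the bag manifests afterwards, which is the "Updating bag manifests" step in the list.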

For an example of the data fornax expects to receive (both bags and JSON), see the `fixtures/` directory


### Routes

| Method | URL | Parameters | Response | Behavior |
|--------|-----|---|---|---|
|GET|/sips| |200|Returns a list of SIPs|
|GET|/sips/{id}| |200|Returns data about an individual SIP|
|POST|/sips||200|Creates a SIP object from a transfer in Aurora.|
|POST|/assemble||200|Runs the SIPAssembly routine.|
|POST|/start||200|Starts and approves the next transfer in Archivematica.|
|POST|/remove-transfers||200|Hides transfers in the Archivematica Dashboard.|
|POST|/remove-ingests||200|Hides ingests in the Archivematica Dashboard.|
|POST|/cleanup||200|Removes files from destination directory.|
|POST|/request-cleanup||200|Notifies another service that processing is complete.|
|GET|/status||200|Return the status of the microservice|
|GET|/schema.json||200|Returns the OpenAPI schema for this application|
This code is released under an [MIT License](LICENSE).

## Contributing

## Archivematica Integration Testing
When migrating Archivematica, it is necessary to test that Fornax can start transfers as expected. To run these integration tests, target the Python environment for this application and pass the `tag` flag to the tests management command: `env/bin/python manage.py test --tag=integration`.
This is an open source project and we welcome contributions! If you want to fix a bug, or have an idea of how to enhance the application, the process looks like this:

Running these tests will start a small package in all configured origins. This package will be set to not store the AIP or the DIP, but some manual cleanup will be required.
1. File an issue in this repository. This will provide a location to discuss proposed implementations of fixes or enhancements, and can then be tied to a subsequent pull request.
2. If you have an idea of how to fix the bug (or make the improvements), fork the repository and work in your own branch. When you are done, push the branch back to this repository and set up a pull request. Automated unit tests are run on all pull requests. Any new code should have unit test coverage, documentation (if necessary), and should conform to the Python PEP8 style guidelines.
3. After some back and forth between you and core committers (or individuals who have privileges to commit to the base branch of this repository), your code will probably be merged, perhaps with some minor changes.

## Development
This repository contains a configuration file for git [pre-commit](https://pre-commit.com/) hooks which help ensure that code is linted before it is checked into version control. It is strongly recommended that you install these hooks locally by installing pre-commit and running `pre-commit install`.


## License

This code is released under an [MIT License](LICENSE).
## Tests
New code should have unit tests. Tests can be run using [tox](https://tox.readthedocs.io/).
23 changes: 0 additions & 23 deletions docker-compose.yml

This file was deleted.

18 changes: 0 additions & 18 deletions entrypoint.sh

This file was deleted.

Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.