ETT-1459: Record first ingest date #185
Open
aelkiss wants to merge 4 commits into
* add first ingest date column to feed_audit table
* record item in feed_audit at the end of collate
* remove record_audit functionality from LocalPairtree (now unused except in development); emit warning (could record to feed_storage for consistency if we want instead, but we aren't really using it..?)
* testing with storage classes in collate is a bit messy because of the distinction between depositing to the repo and reading back from the repo
* add additional logging options in Stage (need to DRY out though)
* additional logging for collate (should log duration; see ETT-824)
* add some notes towards ETT-1687
* Mock depositing item for collate tests with mocked storage
This addresses two issues:
* We are no longer using symlinks to deposit material into the repository.
* When we read from the repo, we just care about the root of the repo (just like e.g. babel apps reading from the repo), not about any symlinks, etc.

Specific changes:
* remove LinkedPairtree
* remove "repository" key in config & references to link_dir / obj_dir; replace with a "repository_root" key
* TempDirs keeps track of what it creates; callers can create additional temp dirs that will get cleaned up at the end of a test.
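The TempDirs behavior described in the last bullet could look roughly like the sketch below. This is not the project's TempDirs class: the package name `Sketch::TempDirs` and the methods `dir` and `cleanup` are hypothetical, and the sketch only assumes the core modules File::Temp and File::Path.

```perl
use strict;
use warnings;

package Sketch::TempDirs;    # hypothetical name, not the real TempDirs class

use File::Temp ();
use File::Path qw(remove_tree);

sub new {
    my ($class) = @_;
    # Every directory handed out is remembered here for later cleanup.
    return bless { dirs => [] }, $class;
}

# Create an additional temp dir and record it.
sub dir {
    my ($self) = @_;
    my $dir = File::Temp::tempdir(CLEANUP => 0);
    push @{ $self->{dirs} }, $dir;
    return $dir;
}

# Remove everything this object created, e.g. at the end of a test.
sub cleanup {
    my ($self) = @_;
    my @dirs = @{ $self->{dirs} };
    remove_tree(@dirs) if @dirs;
    $self->{dirs} = [];
    return;
}

package main;

my $tmp   = Sketch::TempDirs->new;
my $extra = $tmp->dir;       # a caller asks for one more temp dir
print "created $extra\n";
$tmp->cleanup;               # all tracked dirs are removed together
```

Keeping the list of created directories inside the object means a test does not have to remember each extra directory it asked for.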
aelkiss
commented
Jun 4, 2026
  `id` varchar(30) NOT NULL,
  `sdr_partition` tinyint(4) DEFAULT NULL,
  `zip_size` bigint(20) DEFAULT NULL,
  `first_ingest_date` datetime NULL DEFAULT CURRENT_TIMESTAMP,
This is the change that records the first ingest date. The application side doesn't need to handle it directly at all, beyond making sure that something is recorded in feed_audit.
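As a rough illustration of why the application can stay out of it: if the statement that records the item never mentions `first_ingest_date`, the column default fills it in on the first insert and later updates leave it alone. The sketch below only builds such a statement. The function name `feed_audit_upsert` is hypothetical, the column list is limited to the columns visible in this diff, and it assumes that `id` is covered by a unique key and that the database accepts MySQL's `INSERT ... ON DUPLICATE KEY UPDATE`. It is not the statement the application actually runs.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a statement and its bind values.
# first_ingest_date is deliberately absent from both the column list
# and the UPDATE clause, so only the column default ever sets it.
sub feed_audit_upsert {
    my ($id, $sdr_partition, $zip_size) = @_;

    my $sql = <<'SQL';
INSERT INTO feed_audit (id, sdr_partition, zip_size)
VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE
  sdr_partition = VALUES(sdr_partition),
  zip_size      = VALUES(zip_size)
SQL

    return ($sql, $id, $sdr_partition, $zip_size);
}

# Illustrative values only.
my ($sql, @bind) = feed_audit_upsert('example_id', 1, 12345);
print $sql;
print join(', ', @bind), "\n";
```

With DBI, the returned statement and bind values would be passed to something like `$dbh->do($sql, undef, @bind)`.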
aelkiss
commented
Jun 4, 2026
  my $self = shift;

- return $self->{volume}->get_zip_path(get_config('staging', 'zipfile')) . '.gpg';
+ return $self->{volume}->get_zip_path(get_config('staging', 'zipfile')) . "-$self->{name}.gpg";
This avoids collisions with encrypted zips left over from other storages. They should get cleaned up but don't always in practice.
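A small sketch of the effect: once the storage name is part of the file name, two storages staging the same zip get distinct encrypted paths. The helper `encrypted_zip_path`, the zip path, and the storage names below are made up for illustration. The real code gets the path from `$self->{volume}->get_zip_path(...)` and the name from `$self->{name}`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the method in the diff: appends the storage
# name and a .gpg suffix to the staged zip path.
sub encrypted_zip_path {
    my ($zip_path, $storage_name) = @_;
    return $zip_path . "-$storage_name.gpg";
}

my $zip = '/tmp/staging/zipfile/example.zip';    # made-up path

# Made-up storage names; each one yields its own encrypted file.
for my $name (qw(storage_a storage_b)) {
    print encrypted_zip_path($zip, $name), "\n";
}
```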
This change moves the functionality for recording items in `feed_audit` to the end of the Collate stage rather than a particular storage.

It also takes the opportunity to:
* remove references to `link_dir` and `obj_dir`

See comments in more detail on each commit.

I had looked into options for rolling back failed deposits to S3, but the ideas I had didn't work out (see ETT-1483).