Per segment chunks #8272

Open · wants to merge 137 commits into develop

Changes from 91 commits (137 commits in total)

Commits
b32a9eb
Update frame provider and media cache
zhiltsov-max Jul 25, 2024
cb4ff93
t
zhiltsov-max Jul 25, 2024
d49233c
t
zhiltsov-max Jul 30, 2024
146a896
Support static chunk building, fix av memory leak, add caching media …
zhiltsov-max Aug 1, 2024
52d1bac
Refactor static chunk generation - extract function, revise threading
zhiltsov-max Aug 2, 2024
0c53436
Refactor and fix task chunk creation from segment chunks, any storage
zhiltsov-max Aug 2, 2024
c166123
Fix chunk number validation
zhiltsov-max Aug 5, 2024
630c97e
Enable formatting for updated components
zhiltsov-max Aug 5, 2024
8d710e7
Remove the checksum field
zhiltsov-max Aug 5, 2024
654a827
Be consistent about returned task chunk types (allow video chunks)
zhiltsov-max Aug 6, 2024
12e5f2a
Support iterator input in video chunk writing
zhiltsov-max Aug 6, 2024
a79a681
Fix type annotation
zhiltsov-max Aug 6, 2024
d5118a2
Refactor video reader memory leak fix, add to reader with manifest
zhiltsov-max Aug 6, 2024
1b429cf
Disable threading in video reading in frame provider
zhiltsov-max Aug 6, 2024
d512312
Fix keyframe search
zhiltsov-max Aug 6, 2024
167ee12
Return frames as generator in dynamic chunk creation
zhiltsov-max Aug 6, 2024
88a9cb2
Update chunk requests in UI
zhiltsov-max Aug 7, 2024
30bf8fd
Update cache indices in FrameDecoder, enable video play
zhiltsov-max Aug 7, 2024
ee3c905
Fix frame retrieval for video
zhiltsov-max Aug 7, 2024
dc03220
Fix frame reading in updated dynamic cache building
zhiltsov-max Aug 7, 2024
4bb8a74
Fix invalid frame quality
zhiltsov-max Aug 9, 2024
f7d2c4c
Fix video reading in media_extractors - exception handling, frame mis…
zhiltsov-max Aug 9, 2024
34d9ca0
Allow disabling static chunks, add seamless switching
zhiltsov-max Aug 9, 2024
8c97967
Extend code formatting
zhiltsov-max Aug 9, 2024
a0fd0ba
Rename function argument
zhiltsov-max Aug 9, 2024
c0480c9
Rename configuration parameter
zhiltsov-max Aug 9, 2024
5caf283
Add av version comment
zhiltsov-max Aug 12, 2024
efbe3a0
Refactor av video reading
zhiltsov-max Aug 12, 2024
fb1284d
Fix manifest access
zhiltsov-max Aug 12, 2024
8edcfc5
Add migration
zhiltsov-max Aug 12, 2024
51a7f83
Update downloading from cloud storage for packed data in task creation
zhiltsov-max Aug 12, 2024
5a2a746
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 12, 2024
65e4174
Update changelog
zhiltsov-max Aug 12, 2024
61f1735
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Aug 12, 2024
34f972f
Update migration name
zhiltsov-max Aug 12, 2024
2bb2b17
Polish some code
zhiltsov-max Aug 12, 2024
3788917
Fix frame retrieval by id
zhiltsov-max Aug 12, 2024
f695ae1
Remove extra import
zhiltsov-max Aug 12, 2024
14a9033
Fix frame access in gt jobs
zhiltsov-max Aug 12, 2024
e8bebe9
Fix frame access in export
zhiltsov-max Aug 12, 2024
bbef52f
Fix frame iteration for frame step and excluded frames, fix export in…
zhiltsov-max Aug 12, 2024
3d5bb52
Remove unused import
zhiltsov-max Aug 13, 2024
0e9c5c8
Fix error check in test
zhiltsov-max Aug 13, 2024
351bdc8
Fix cleanup in test
zhiltsov-max Aug 13, 2024
a71852c
Add handling for disabled static cache during task creation
zhiltsov-max Aug 13, 2024
d90ca0d
Refactor some code
zhiltsov-max Aug 13, 2024
03e749a
Fix downloading for cloud data in task creation
zhiltsov-max Aug 13, 2024
c0822a0
Fix preview reading for projects
zhiltsov-max Aug 13, 2024
56d413f
Fix failing sdk tests
zhiltsov-max Aug 13, 2024
48f4794
Fix other failing sdk tests
zhiltsov-max Aug 13, 2024
5c0cc1a
Improve logging for migration
zhiltsov-max Aug 14, 2024
5abd891
Fix invalid starting index
zhiltsov-max Aug 14, 2024
749b970
Fix frame reading in lambda functions
zhiltsov-max Aug 14, 2024
9105cd3
Fix unintended frame indexing changes
zhiltsov-max Aug 14, 2024
8dafcbe
Fix various indexing errors in media extractors
zhiltsov-max Aug 14, 2024
4cbf82f
Fix temp resource cleanup in server tests
zhiltsov-max Aug 14, 2024
88c34a3
Refactor some code
zhiltsov-max Aug 15, 2024
b0fd006
Remove duplicated tests
zhiltsov-max Aug 15, 2024
2eac04a
Remove extra change
zhiltsov-max Aug 15, 2024
640518c
Fix method name, remove extra method
zhiltsov-max Aug 15, 2024
3a246b3
Remove some shared code in tests, add temp data cleanup
zhiltsov-max Aug 15, 2024
a0704f4
Add checks for successful task creation in tests
zhiltsov-max Aug 15, 2024
cf026ef
Fix invalid variable access in test
zhiltsov-max Aug 15, 2024
f73cef3
Update default cache location in test checks
zhiltsov-max Aug 15, 2024
258c800
Update manifest validation logic, allow manifest input in any task da…
zhiltsov-max Aug 16, 2024
b3ae317
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 16, 2024
5e89ef4
Add task chunk caching, refactor chunk building
zhiltsov-max Aug 16, 2024
c5edcda
Refactor some code
zhiltsov-max Aug 16, 2024
7f5c722
Refactor some code
zhiltsov-max Aug 16, 2024
daf4035
Improve parameter name
zhiltsov-max Aug 16, 2024
8c1b82c
Fix function call
zhiltsov-max Aug 16, 2024
f172865
Add basic test set for meta, frames, and chunks reading in tasks
zhiltsov-max Aug 16, 2024
aacceee
Move class declaration for pylint compatibility
zhiltsov-max Aug 16, 2024
c8dbb7c
Add missing original chunk type field in job responses
zhiltsov-max Aug 16, 2024
6b9a3e9
Add tests for job data access
zhiltsov-max Aug 16, 2024
f5661e4
Update test assets
zhiltsov-max Aug 16, 2024
754757f
Clean imports
zhiltsov-max Aug 16, 2024
0c001a5
Python 3.8 compatibility
zhiltsov-max Aug 16, 2024
a9390eb
Python 3.8 compatibility
zhiltsov-max Aug 17, 2024
d2b1385
Python 3.8 compatibility
zhiltsov-max Aug 17, 2024
c9a5e31
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 19, 2024
621afa7
Add logging into shell command runs, fix invalid redis-cli invocation…
zhiltsov-max Aug 19, 2024
e40ffd1
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Aug 19, 2024
08c9f01
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 19, 2024
92a19f4
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 21, 2024
441d0e7
Allow calling flushall in redis in helm tests
zhiltsov-max Aug 21, 2024
0963f94
Update comment
zhiltsov-max Aug 21, 2024
0d78e63
Update redis cleanup command
zhiltsov-max Aug 21, 2024
f53948d
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 23, 2024
e69f2b7
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 23, 2024
1a9a813
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 23, 2024
e4db8ad
Reuse _get
zhiltsov-max Aug 28, 2024
b1c54f9
Make get_checksum private
zhiltsov-max Aug 28, 2024
5312b00
Add get_raw_data_dirname to the Data model
zhiltsov-max Aug 28, 2024
3c117fe
Make SegmentFrameProvider available in make_frame_provider
zhiltsov-max Aug 28, 2024
98eff81
Remove extra variable
zhiltsov-max Aug 28, 2024
316ec78
Include both cases of CVAT_ALLOW_STATIC_CACHE in CI checks
zhiltsov-max Aug 28, 2024
ebed825
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Aug 28, 2024
92f6083
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 28, 2024
2b6e987
Remove extra import
zhiltsov-max Aug 28, 2024
f67a1a2
Update changelog
zhiltsov-max Sep 5, 2024
d72fe85
Refactor cache keys in media cache
zhiltsov-max Sep 5, 2024
d5bfb88
Refactor selective segment chunk creation
zhiltsov-max Sep 5, 2024
c5a1197
Remove the breaking change in the chunk retrieval API, add a new inde…
zhiltsov-max Sep 6, 2024
a5cf3b7
Update UI to use the new chunk index parameter
zhiltsov-max Sep 7, 2024
069f48c
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 7, 2024
cfdde3f
Update test initialization
zhiltsov-max Sep 7, 2024
843b957
Update changelog
zhiltsov-max Sep 7, 2024
feb92cd
Add backward compatibility for chunk "number" in GT jobs, remove plac…
zhiltsov-max Sep 9, 2024
2424f2b
Update UI to support job chunks with non-sequential frame ids
zhiltsov-max Sep 9, 2024
fe60bdf
Fix job frame retrieval
zhiltsov-max Sep 9, 2024
6ddb6bf
Fix 3d task chunk writing
zhiltsov-max Sep 9, 2024
4fa7b97
Fix frame retrieval in UI
zhiltsov-max Sep 10, 2024
32f1be2
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 10, 2024
0e95b40
Fix chunk availability check
zhiltsov-max Sep 11, 2024
21135b7
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 11, 2024
b311f1e
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 11, 2024
79bb1f7
Remove array comparisons
zhiltsov-max Sep 12, 2024
55a8424
Update validateFrameNumbers
zhiltsov-max Sep 12, 2024
add5ae6
Use builtins for range and binary search, convert frame step into a c…
zhiltsov-max Sep 12, 2024
643d998
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 12, 2024
df90b33
Fix cached chunk indicators in frame player
zhiltsov-max Sep 12, 2024
6ccb7db
Fix chunk predecode logic
zhiltsov-max Sep 13, 2024
1fb68bc
Rename chunkNumber to chunkIndex where necessary
zhiltsov-max Sep 13, 2024
92d0c7a
Fix potential prefetch problem with reverse playback
zhiltsov-max Sep 13, 2024
67c1650
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 13, 2024
3cdc4dc
Move env variable into docker-compose.yml
zhiltsov-max Sep 16, 2024
716042e
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 16, 2024
19279c7
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 16, 2024
bc5ed39
Fix invalid cached chunk display in GT jobs
zhiltsov-max Sep 17, 2024
08ddd28
Fix invalid task preview generation
zhiltsov-max Sep 17, 2024
1d969bd
Refactor CS previews, context image chunk generation, media cache cre…
zhiltsov-max Sep 17, 2024
e2cba8c
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 17, 2024
d135475
Remove extra import
zhiltsov-max Sep 17, 2024
a1638c9
Fix CS preview in response
zhiltsov-max Sep 17, 2024
fc89c01
Add reverse migration
zhiltsov-max Sep 17, 2024
c6e65f6
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 17, 2024
4 changes: 4 additions & 0 deletions changelog.d/20240812_161617_mzhiltso_job_chunks.md
zhiltsov-max marked this conversation as resolved.
@@ -0,0 +1,4 @@
### Added

- A server setting to disable media chunks on the local filesystem
Member:

Suggested change:
Original: - A server setting to disable media chunks on the local filesystem
Suggested: - A server setting to disable permanent media chunks on local filesystem

Contributor Author (zhiltsov-max, Aug 29, 2024):

There were only permanent chunks on the local filesystem; what did you want to reflect in the change?

Member:

Regular users (even those who may install a self-hosted solution) do not know such implementation details.

Member:

I just want to make it clear that this is specifically about permanent chunks.

Contributor Author (zhiltsov-max):

Ok, but it's just what is passed in the task data endpoint API: storage_method = 'file_system' | 'cache'.

Contributor Author (zhiltsov-max):

Updated the message

(<https://github.com/cvat-ai/cvat/pull/8272>)
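For readers less familiar with this setting, here is a minimal sketch of where the choice discussed above surfaces in the API. The `storage_method` values come from the comment thread; the endpoint shape, credentials, task id, and other fields are assumptions for illustration, not an exact request from this PR:

```python
# Illustrative sketch only. The storage_method values ('file_system' | 'cache')
# are quoted from the discussion above; everything else (URL, auth, task id,
# other fields) is a hypothetical example, not taken from this PR.
import requests

session = requests.Session()
session.auth = ("user", "password")          # hypothetical credentials
task_id = 42                                 # hypothetical task

payload = {
    "image_quality": 70,
    "server_files": ["video.mp4"],
    # 'file_system' asks for permanent (static) chunks on the local filesystem;
    # 'cache' keeps chunks in the dynamic media cache instead.
    "storage_method": "file_system",
}
resp = session.post(f"https://cvat.example.com/api/tasks/{task_id}/data", json=payload)
resp.raise_for_status()
```

Per the commit history, the new server setting (exposed through the CVAT_ALLOW_STATIC_CACHE environment variable) appears to control whether the 'file_system' choice is honored at all; when it is disabled, the server presumably falls back to the cache behaviour.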
4 changes: 4 additions & 0 deletions changelog.d/20240812_161734_mzhiltso_job_chunks.md
@@ -0,0 +1,4 @@
### Changed

- Jobs now have separate chunk ids starting from 0, instead of using ones from the task
Member:

I would suggest formulating this more clearly for the end user (though I'm not sure exactly how). This record does not tell me anything as a CVAT user; it is probably only important for REST API users.

Maybe: Numbering of data chunks in any job always starts with 0

Member:

By the way, this change is breaking, which means we would have to bump the major version of the REST API.
However, I am not sure we want to do it now. Probably worthy of an internal discussion.

Contributor Author (zhiltsov-max, Aug 29, 2024):

> By the way, this change is breaking, which means we would have to bump the major version of the REST API.

SemVer is not used in the server REST API now.

> This record does not tell me anything as a CVAT user; it is probably only important for REST API users.

Ok, do you want me to add some tag to clarify that this is for API users, or do you want it removed from the changelog? REST API users are users as well.

Member:

Why did you decide so?

Contributor Author (zhiltsov-max):

We never established SemVer for the server component. There were discussions a couple of years ago, and we decided to postpone it. Currently, the server API can change in any release; its version is the same as the release itself.

Member (bsekachev, Aug 29, 2024):

As I mentioned in the beginning:

> However, I am not sure we want to do it now. Probably worthy of an internal discussion.

On the other hand, I see strong reasons not to update the version right now.

Member:

Maybe instead of modifying the existing chunk_number parameter, we should add one more and deprecate the previous behaviour.

Contributor Author (zhiltsov-max):

Yes, it's certainly doable.

Contributor Author (zhiltsov-max):

Added the index parameter.
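
A rough sketch of how the two addressing modes differ, assuming the existing job data endpoint. The parameter names `index` and `number` follow this thread and the commit history; the URL, job id, and credentials are illustrative:

```python
# Illustrative sketch. Endpoint and parameter names are assumptions based on
# the discussion above ("index" added, "number" kept for backward compatibility);
# the server URL, job id and credentials are hypothetical.
import requests

session = requests.Session()
session.auth = ("user", "password")                     # hypothetical credentials
url = "https://cvat.example.com/api/jobs/17/data"       # hypothetical job

# New, job-relative addressing: chunk indices always start from 0 within a job.
new_style = session.get(url, params={"type": "chunk", "quality": "compressed", "index": 0})

# Old, task-relative addressing is kept for backward compatibility.
old_style = session.get(url, params={"type": "chunk", "quality": "compressed", "number": 3})
```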

Contributor Author (zhiltsov-max):

Implemented + refactored the UI a little bit. There are two potential improvements still to be done (a sketch of the first one follows the list):

  • add binary search instead of a linear one
  • use lightweight ranges instead of materializing them into number arrays
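
A sketch of the first item, written in Python rather than the UI's TypeScript for brevity; the frame list is made up and only illustrates the lookup the comment refers to:

```python
# Sketch of the binary-search idea from the list above; not the actual UI code.
# The job frame list is a made-up example of non-sequential frame ids.
from bisect import bisect_left

job_frames = list(range(100, 200, 2))    # sorted frame numbers of a hypothetical job

def frame_position(frame: int) -> int:
    # Index of the frame within the job, found in O(log n) instead of a linear scan.
    pos = bisect_left(job_frames, frame)
    if pos == len(job_frames) or job_frames[pos] != frame:
        raise ValueError(f"frame {frame} does not belong to the job")
    return pos

assert frame_position(104) == 2
```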

(<https://github.com/cvat-ai/cvat/pull/8272>)
6 changes: 6 additions & 0 deletions changelog.d/20240812_161912_mzhiltso_job_chunks.md
@@ -0,0 +1,6 @@
### Fixed

- Various memory leaks in video reading on the server
(<https://github.com/cvat-ai/cvat/pull/8272>)
- Job assignees will not receive frames from adjacent jobs in the boundary chunks
(<https://github.com/cvat-ai/cvat/pull/8272>)
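
To make the second fix concrete, here is a small worked example with assumed numbers (chunk size and segment bounds are made up; only the arithmetic is the point):

```python
# Worked example with assumed numbers: why per-segment chunks stop leaking
# frames of adjacent jobs into boundary chunks.
chunk_size = 10
segment = range(8, 23)            # this job owns frames 8..22 (assumed bounds)

# Before: chunk ids were task-relative, so the job's first chunk was task chunk 0,
# which also carried frames 0..7 belonging to the previous job.
task_chunk_ids = sorted({f // chunk_size for f in segment})            # [0, 1, 2]
first_task_chunk = list(range(0, chunk_size))                          # frames 0..9

# After: chunks are built per segment and numbered from 0 inside the job,
# so each chunk contains only the job's own frames.
start = segment.start
job_chunk_ids = sorted({(f - start) // chunk_size for f in segment})   # [0, 1]
first_job_chunk = [f for f in segment if (f - start) // chunk_size == 0]
assert first_job_chunk == list(range(8, 18))                           # no foreign frames
```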
40 changes: 22 additions & 18 deletions cvat-core/src/frames.ts
@@ -25,7 +25,7 @@ const frameDataCache: Record<string, {
latestFrameDecodeRequest: number | null;
latestContextImagesRequest: number | null;
provider: FrameDecoder;
prefetchAnalizer: PrefetchAnalyzer;
prefetchAnalyzer: PrefetchAnalyzer;
decodedBlocksCacheSize: number;
activeChunkRequest: Promise<void> | null;
activeContextRequest: Promise<Record<number, ImageBitmap>> | null;
@@ -208,24 +208,26 @@ export class FrameData {
class PrefetchAnalyzer {
#chunkSize: number;
#requestedFrames: number[];
#startFrame: number;

constructor(chunkSize) {
constructor(chunkSize, startFrame) {
this.#chunkSize = chunkSize;
this.#requestedFrames = [];
this.#startFrame = startFrame;
}

shouldPrefetchNext(current: number, isPlaying: boolean, isChunkCached: (chunk) => boolean): boolean {
if (isPlaying) {
return true;
}

const currentChunk = Math.floor(current / this.#chunkSize);
const currentChunk = Math.floor((current - this.#startFrame) / this.#chunkSize);
const { length } = this.#requestedFrames;
const isIncreasingOrder = this.#requestedFrames
.every((val, index) => index === 0 || val > this.#requestedFrames[index - 1]);
if (
length && (isIncreasingOrder && current > this.#requestedFrames[length - 1]) &&
(current % this.#chunkSize) >= Math.ceil(this.#chunkSize / 2) &&
((current - this.#startFrame) % this.#chunkSize) >= Math.ceil(this.#chunkSize / 2) &&
!isChunkCached(currentChunk + 1)
) {
// is increasing order including the current frame
@@ -262,19 +264,20 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {
imageData: ImageBitmap | Blob;
} | Blob>((resolve, reject) => {
const {
provider, prefetchAnalizer, chunkSize, stopFrame, decodeForward, forwardStep, decodedBlocksCacheSize,
provider, prefetchAnalyzer, chunkSize, startFrame, stopFrame,
decodeForward, forwardStep, decodedBlocksCacheSize,
} = frameDataCache[this.jobID];

const requestId = +_.uniqueId();
const chunkNumber = Math.floor(this.number / chunkSize);
const chunkNumber = Math.floor((this.number - startFrame) / chunkSize);
const frame = provider.frame(this.number);

function findTheNextNotDecodedChunk(searchFrom: number): number {
let firstFrameInNextChunk = searchFrom + forwardStep;
let nextChunkNumber = Math.floor(firstFrameInNextChunk / chunkSize);
let nextChunkNumber = Math.floor((firstFrameInNextChunk - startFrame) / chunkSize);
while (nextChunkNumber === chunkNumber) {
firstFrameInNextChunk += forwardStep;
nextChunkNumber = Math.floor(firstFrameInNextChunk / chunkSize);
nextChunkNumber = Math.floor((firstFrameInNextChunk - startFrame) / chunkSize);
}

if (provider.isChunkCached(nextChunkNumber)) {
@@ -286,15 +289,15 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {

if (frame) {
if (
prefetchAnalizer.shouldPrefetchNext(
prefetchAnalyzer.shouldPrefetchNext(
this.number,
decodeForward,
(chunk) => provider.isChunkCached(chunk),
) && decodedBlocksCacheSize > 1 && !frameDataCache[this.jobID].activeChunkRequest
) {
const nextChunkNumber = findTheNextNotDecodedChunk(this.number);
const predecodeChunksMax = Math.floor(decodedBlocksCacheSize / 2);
if (nextChunkNumber * chunkSize <= stopFrame &&
if (startFrame + nextChunkNumber * chunkSize <= stopFrame &&
nextChunkNumber <= chunkNumber + predecodeChunksMax
) {
frameDataCache[this.jobID].activeChunkRequest = new Promise((resolveForward) => {
@@ -316,8 +319,8 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {
provider.cleanup(1);
provider.requestDecodeBlock(
chunk,
nextChunkNumber * chunkSize,
Math.min(stopFrame, (nextChunkNumber + 1) * chunkSize - 1),
startFrame + nextChunkNumber * chunkSize,
Math.min(stopFrame, startFrame + (nextChunkNumber + 1) * chunkSize - 1),
() => {},
releasePromise,
releasePromise,
@@ -334,7 +337,7 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {
renderHeight: this.height,
imageData: frame,
});
prefetchAnalizer.addRequested(this.number);
prefetchAnalyzer.addRequested(this.number);
return;
}

@@ -355,7 +358,7 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {
renderHeight: this.height,
imageData: currentFrame,
});
prefetchAnalizer.addRequested(this.number);
prefetchAnalyzer.addRequested(this.number);
return;
}

@@ -378,8 +381,8 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {
provider
.requestDecodeBlock(
chunk,
chunkNumber * chunkSize,
Math.min(stopFrame, (chunkNumber + 1) * chunkSize - 1),
startFrame + chunkNumber * chunkSize,
Math.min(stopFrame, startFrame + (chunkNumber + 1) * chunkSize - 1),
(_frame: number, bitmap: ImageBitmap | Blob) => {
if (decodeForward) {
// resolve immediately only if is not playing
@@ -395,7 +398,7 @@ Object.defineProperty(FrameData.prototype.data, 'implementation', {
renderHeight: this.height,
imageData: bitmap,
});
prefetchAnalizer.addRequested(this.number);
prefetchAnalyzer.addRequested(this.number);
}
}, () => {
frameDataCache[this.jobID].activeChunkRequest = null;
@@ -612,9 +615,10 @@ export async function getFrame(
blockType,
chunkSize,
decodedBlocksCacheSize,
startFrame,
dimension,
),
prefetchAnalizer: new PrefetchAnalyzer(chunkSize),
prefetchAnalyzer: new PrefetchAnalyzer(chunkSize, startFrame),
decodedBlocksCacheSize,
activeChunkRequest: null,
activeContextRequest: null,
7 changes: 5 additions & 2 deletions cvat-data/src/ts/cvat-data.ts
@@ -100,11 +100,13 @@ export class FrameDecoder {
private renderHeight: number;
private zipWorker: Worker | null;
private videoWorker: Worker | null;
private startFrame: number;

constructor(
blockType: BlockType,
chunkSize: number,
cachedBlockCount: number,
startFrame: number,
dimension: DimensionType = DimensionType.DIMENSION_2D,
) {
this.mutex = new Mutex();
@@ -118,6 +120,7 @@
this.renderWidth = 1920;
this.renderHeight = 1080;
this.chunkSize = chunkSize;
this.startFrame = startFrame;
this.blockType = blockType;

this.decodedChunks = {};
@@ -203,7 +206,7 @@ }
}

frame(frameNumber: number): ImageBitmap | Blob | null {
const chunkNumber = Math.floor(frameNumber / this.chunkSize);
const chunkNumber = Math.floor((frameNumber - this.startFrame) / this.chunkSize);
if (chunkNumber in this.decodedChunks) {
return this.decodedChunks[chunkNumber][frameNumber];
}
@@ -262,7 +265,7 @@
throw new RequestOutdatedError();
}

const chunkNumber = Math.floor(start / this.chunkSize);
const chunkNumber = Math.floor((start - this.startFrame) / this.chunkSize);
this.orderedStack = [chunkNumber, ...this.orderedStack];
this.cleanup();
const decodedFrames: Record<number, ImageBitmap | Blob> = {};
47 changes: 29 additions & 18 deletions cvat/apps/dataset_manager/bindings.py
@@ -31,8 +31,8 @@

from cvat.apps.dataset_manager.formats.utils import get_label_color
from cvat.apps.dataset_manager.util import add_prefetch_fields
from cvat.apps.engine.frame_provider import FrameProvider
from cvat.apps.engine.models import (AttributeSpec, AttributeType, Data, DimensionType, Job,
from cvat.apps.engine.frame_provider import TaskFrameProvider, FrameQuality, FrameOutputType
from cvat.apps.engine.models import (AttributeSpec, AttributeType, DimensionType, Job,
JobType, Label, LabelType, Project, SegmentType, ShapeType,
Task)
from cvat.apps.engine.rq_job_handler import RQJobMetaField
@@ -240,7 +240,7 @@ def start(self) -> int:

@property
def stop(self) -> int:
return len(self)
return max(0, len(self) - 1)

def _get_queryset(self):
raise NotImplementedError()
@@ -376,7 +376,7 @@ def _export_tag(self, tag):
def _export_track(self, track, idx):
track['shapes'] = list(filter(lambda x: not self._is_frame_deleted(x['frame']), track['shapes']))
tracked_shapes = TrackManager.get_interpolated_shapes(
track, 0, self.stop, self._annotation_ir.dimension)
track, 0, self.stop + 1, self._annotation_ir.dimension)
for tracked_shape in tracked_shapes:
tracked_shape["attributes"] += track["attributes"]
tracked_shape["track_id"] = track["track_id"] if self._use_server_track_ids else idx
@@ -432,7 +432,7 @@ def get_frame(idx):

anno_manager = AnnotationManager(self._annotation_ir)
for shape in sorted(
anno_manager.to_shapes(self.stop, self._annotation_ir.dimension,
anno_manager.to_shapes(self.stop + 1, self._annotation_ir.dimension,
# Skip outside, deleted and excluded frames
included_frames=included_frames,
include_outside=False,
@@ -763,7 +763,7 @@ def start(self) -> int:
@property
def stop(self) -> int:
segment = self._db_job.segment
return segment.stop_frame + 1
return segment.stop_frame

@property
def db_instance(self):
@@ -1333,7 +1333,7 @@ def add_task(self, task, files):

@attrs(frozen=True, auto_attribs=True)
class ImageSource:
db_data: Data
db_task: Task
is_video: bool = attrib(kw_only=True)

class ImageProvider:
Expand Down Expand Up @@ -1362,25 +1362,29 @@ def video_frame_loader(_):
# optimization for videos: use numpy arrays instead of bytes
# some formats or transforms can require image data
return self._frame_provider.get_frame(frame_index,
quality=FrameProvider.Quality.ORIGINAL,
out_type=FrameProvider.Type.NUMPY_ARRAY)[0]
quality=FrameQuality.ORIGINAL,
out_type=FrameOutputType.NUMPY_ARRAY
).data

return dm.Image(data=video_frame_loader, **image_kwargs)
else:
def image_loader(_):
self._load_source(source_id, source)

# for images use encoded data to avoid recoding
return self._frame_provider.get_frame(frame_index,
quality=FrameProvider.Quality.ORIGINAL,
out_type=FrameProvider.Type.BUFFER)[0].getvalue()
quality=FrameQuality.ORIGINAL,
out_type=FrameOutputType.BUFFER
).data.getvalue()

return dm.ByteImage(data=image_loader, **image_kwargs)

def _load_source(self, source_id: int, source: ImageSource) -> None:
if self._current_source_id == source_id:
return

self._unload_source()
self._frame_provider = FrameProvider(source.db_data)
self._frame_provider = TaskFrameProvider(source.db_task)
self._current_source_id = source_id

def _unload_source(self) -> None:
@@ -1396,7 +1400,7 @@ def __init__(self, sources: Dict[int, ImageSource]) -> None:
self._images_per_source = {
source_id: {
image.id: image
for image in source.db_data.images.prefetch_related('related_files')
for image in source.db_task.data.images.prefetch_related('related_files')
}
for source_id, source in sources.items()
}
@@ -1405,7 +1409,7 @@ def get_image_for_frame(self, source_id: int, frame_id: int, **image_kwargs):
source = self._sources[source_id]

point_cloud_path = osp.join(
source.db_data.get_upload_dirname(), image_kwargs['path'],
source.db_task.data.get_upload_dirname(), image_kwargs['path'],
)

image = self._images_per_source[source_id][frame_id]
@@ -1518,11 +1522,18 @@ def __init__(
is_video = instance_meta['mode'] == 'interpolation'
ext = ''
if is_video:
ext = FrameProvider.VIDEO_FRAME_EXT
ext = TaskFrameProvider.VIDEO_FRAME_EXT

if dimension == DimensionType.DIM_3D or include_images:
if isinstance(instance_data, TaskData):
db_task = instance_data.db_instance
elif isinstance(instance_data, JobData):
db_task = instance_data.db_instance.segment.task
else:
assert False

self._image_provider = IMAGE_PROVIDERS_BY_DIMENSION[dimension](
{0: ImageSource(instance_data.db_data, is_video=is_video)}
{0: ImageSource(db_task, is_video=is_video)}
)

for frame_data in instance_data.group_by_frame(include_empty=True):
@@ -1604,13 +1615,13 @@ def __init__(
if self._dimension == DimensionType.DIM_3D or include_images:
self._image_provider = IMAGE_PROVIDERS_BY_DIMENSION[self._dimension](
{
task.id: ImageSource(task.data, is_video=task.mode == 'interpolation')
task.id: ImageSource(task, is_video=task.mode == 'interpolation')
for task in project_data.tasks
}
)

ext_per_task: Dict[int, str] = {
task.id: FrameProvider.VIDEO_FRAME_EXT if is_video else ''
task.id: TaskFrameProvider.VIDEO_FRAME_EXT if is_video else ''
for task in project_data.tasks
for is_video in [task.mode == 'interpolation']
}
23 changes: 13 additions & 10 deletions cvat/apps/dataset_manager/formats/cvat.py
@@ -27,7 +27,7 @@
import_dm_annotations,
match_dm_item)
from cvat.apps.dataset_manager.util import make_zip_archive
from cvat.apps.engine.frame_provider import FrameProvider
from cvat.apps.engine.frame_provider import FrameQuality, FrameOutputType, make_frame_provider

from .registry import dm_env, exporter, importer

@@ -1371,16 +1371,19 @@ def dump_project_anno(dst_file: BufferedWriter, project_data: ProjectData, callb
dumper.close_document()

def dump_media_files(instance_data: CommonData, img_dir: str, project_data: ProjectData = None):
frame_provider = make_frame_provider(instance_data.db_instance)

ext = ''
if instance_data.meta[instance_data.META_FIELD]['mode'] == 'interpolation':
ext = FrameProvider.VIDEO_FRAME_EXT

frame_provider = FrameProvider(instance_data.db_data)
frames = frame_provider.get_frames(
instance_data.start, instance_data.stop,
frame_provider.Quality.ORIGINAL,
frame_provider.Type.BUFFER)
for frame_id, (frame_data, _) in zip(instance_data.rel_range, frames):
ext = frame_provider.VIDEO_FRAME_EXT

frames = frame_provider.iterate_frames(
start_frame=instance_data.start,
stop_frame=instance_data.stop,
quality=FrameQuality.ORIGINAL,
out_type=FrameOutputType.BUFFER,
)
for frame_id, frame in zip(instance_data.rel_range, frames):
if (project_data is not None and (instance_data.db_instance.id, frame_id) in project_data.deleted_frames) \
or frame_id in instance_data.deleted_frames:
continue
@@ -1389,7 +1392,7 @@
img_path = osp.join(img_dir, frame_name + ext)
os.makedirs(osp.dirname(img_path), exist_ok=True)
with open(img_path, 'wb') as f:
f.write(frame_data.getvalue())
f.write(frame.data.getvalue())

def _export_task_or_job(dst_file, temp_dir, instance_data, anno_callback, save_images=False):
with open(osp.join(temp_dir, 'annotations.xml'), 'wb') as f:
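
A condensed usage sketch of the frame provider interface exercised by the diff above. The names make_frame_provider, FrameQuality, FrameOutputType, iterate_frames, and frame.data come from this PR's code; the job id, frame range, and output file names are hypothetical:

```python
# Condensed sketch of the new frame provider API as used in this PR.
# Identifiers are taken from the diffs above; the concrete job, frame range
# and output file names are hypothetical.
from cvat.apps.engine.frame_provider import (
    FrameOutputType, FrameQuality, make_frame_provider,
)
from cvat.apps.engine.models import Job

job = Job.objects.get(pk=17)               # hypothetical job
provider = make_frame_provider(job)        # also accepts tasks/segments, per the commits

frames = provider.iterate_frames(
    start_frame=0,
    stop_frame=10,
    quality=FrameQuality.ORIGINAL,
    out_type=FrameOutputType.BUFFER,
)
for i, frame in enumerate(frames):
    with open(f"frame_{i:06d}.bin", "wb") as f:   # extension choice is illustrative
        f.write(frame.data.getvalue())
```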