Split track metadata from file metadata #1640

Open
pprkut opened this issue Oct 8, 2015 · 10 comments
Labels: big, feature (features we would like to implement)

Comments

@pprkut
Contributor

pprkut commented Oct 8, 2015

Currently beets treats every track as exactly one physical file. That is, track metadata and file metadata end up in one table in the database and are used together in the code (more or less).
Examples of track metadata include artist name, track title, album name, musicbrainz ids, etc, whereas examples for file metadata include format, sample rate, location, etc.

From my perspective it would be beneficial for a number of features to split file metadata from track metadata, but at the core it all revolves around "duplicates".

Not every duplicate is a duplicate. Maybe someone wants to keep both Vorbis/MP3/AAC and FLAC versions of releases in their collection. That means we have different file metadata for every file (obviously), but the track metadata is the same. Keeping it separate could mean we only need to fetch remote metadata (echonest, musicbrainz, etc) once and then write it to multiple files. (I think currently we fetch it for every track, even if we've already fetched it before. I might be wrong though.) It also means more logical name collision handling. Right now if I have a FLAC and a Vorbis incarnation of a release in my collection, album disambiguation kicks in as it sees them as potentially conflicting, although there really shouldn't be a conflict (unless I'm missing something).
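To illustrate the fetch-once idea, here is a minimal Python sketch. Everything in it is hypothetical: `fetch_remote_metadata`, the `track_id` key, and the dict layout are made up for illustration and are not beets APIs.

```python
from collections import defaultdict

# Hypothetical file records: two encodings of one track, plus a second track.
files = [
    {"path": "album/01.flac", "format": "FLAC", "track_id": "t1"},
    {"path": "album/01.ogg", "format": "OGG", "track_id": "t1"},
    {"path": "album/02.flac", "format": "FLAC", "track_id": "t2"},
]

fetch_calls = []


def fetch_remote_metadata(track_id):
    """Stand-in for a remote lookup (e.g. MusicBrainz); records each call."""
    fetch_calls.append(track_id)
    return {"title": f"Title for {track_id}"}


def tag_files(files):
    """Fetch metadata once per track and apply it to every file of that track."""
    by_track = defaultdict(list)
    for f in files:
        by_track[f["track_id"]].append(f)

    tagged = {}
    for track_id, group in by_track.items():
        metadata = fetch_remote_metadata(track_id)  # one lookup per track
        for f in group:
            tagged[f["path"]] = metadata  # the same metadata goes to each file
    return tagged


tagged = tag_files(files)
```

Three files are tagged here, but the stand-in lookup runs only twice, once per distinct track.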

When multiple versions of a track are available, it would be really neat to then play with "suitability", i.e. creating playlists based on what suits the intended use case best. For example, preferring FLAC over lossy when playing audio on the local computer, but preferring lossy over FLAC when streaming, or disregarding FLAC completely when copying to an MP3 player that doesn't support it. I'm sure some people can come up with other use cases :-)
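A rough sketch of what such a "suitability" lookup could look like, assuming a per-use-case preference list. The `AudioFile` type, the `PREFERENCES` table, and the use-case names are all invented for this example.

```python
from typing import NamedTuple, Optional


class AudioFile(NamedTuple):
    path: str
    format: str


# Hypothetical preference orders per use case (most preferred first).
# Formats that are not listed are treated as unsupported for that use case.
PREFERENCES = {
    "local": ["FLAC", "OGG", "MP3"],
    "streaming": ["OGG", "MP3", "FLAC"],
    "mp3_player": ["MP3"],
}


def pick_file(files, use_case) -> Optional[AudioFile]:
    """Return the most suitable file for a use case, or None if none fits."""
    order = PREFERENCES[use_case]
    candidates = [f for f in files if f.format in order]
    if not candidates:
        return None
    # A lower index in the preference list means a better match.
    return min(candidates, key=lambda f: order.index(f.format))


versions = [AudioFile("song.flac", "FLAC"), AudioFile("song.mp3", "MP3")]
```

With these two versions, `pick_file(versions, "local")` selects the FLAC file, while `"streaming"` and `"mp3_player"` both select the MP3.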

There's also a nice side effect for potential future video support. Sometimes it's handy to keep not only the video file itself, but also an audio version of the track at hand. Again the track metadata doesn't change, but splitting off the file metadata would eventually allow us to keep a video file next to an audio file, and you could again play around with when to prefer which incarnation of the track.

@sampsyo added the "feature" (features we would like to implement) and "big" labels on Oct 8, 2015
@sampsyo
Member

sampsyo commented Oct 8, 2015

Interesting! The complement to this would be allowing multiple tracks corresponding to the same file, as in #136.

This would be fun to explore. It would be a big change, though, that would have to start with rethinking the architecture.

@twrightsman
Contributor

It may be useful to follow the architecture of MusicBrainz, since they are already doing the hard conceptual thinking on how best to represent musical metadata. The only thing beets would have to do on top of that is manage the actual music files.

@sampsyo
Member

sampsyo commented Apr 23, 2017

Hi, @twrightsman—what specifically would you want to adopt from the MB data model?

@twrightsman
Contributor

I actually misspoke earlier: beets already does a great job making Items synonymous with recordings and Albums synonymous with releases. My suggestion is that beets has to take the MB data model one step further since it is actually managing the audio data in addition to the metadata. To propose an implementation like @pprkut suggested, the Item object would have to be decoupled from File objects, for example. A given Item object can have many File objects associated with it and these File objects are what store data about the file format, location, bit rate, etc.

Now, the question is how this would work with existing commands/plugins. Unfortunately, I feel like this would have to be addressed per command/plugin because what was only one object before (Item) is now conceptually two (Item or File). For example, beet ls would only need to list Items and not necessarily all of their File objects (although a user might want a flag to list all File objects tied to each track). beet ls -p would need to either have a configuration option to favor a specific format, so that every Item has only one path printed, or maybe just print the paths for all File objects in the database. Does this sound reasonable so far?
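To make the two options for path listing concrete, here is a small sketch. The `Item` and `File` classes below are simplified stand-ins rather than the real beets classes, and `list_paths` with its `prefer` argument is a hypothetical helper, not an existing command option.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    """File-level metadata: where the audio lives and how it is encoded."""
    path: str
    format: str
    bitrate: int


@dataclass
class Item:
    """Track-level metadata, with any number of associated files."""
    title: str
    artist: str
    files: List[File] = field(default_factory=list)


def list_paths(items, prefer: Optional[str] = None):
    """Collect paths for a hypothetical path listing.

    With `prefer` set, return at most one path per item: the file in the
    preferred format if there is one, otherwise the item's first file.
    Without it, return the paths of all files.
    """
    paths = []
    for item in items:
        if prefer is None:
            paths.extend(f.path for f in item.files)
            continue
        if not item.files:
            continue  # nothing to print for an item without files
        chosen = next((f for f in item.files if f.format == prefer), item.files[0])
        paths.append(chosen.path)
    return paths


library = [
    Item("Song A", "Artist", [File("a.flac", "FLAC", 900), File("a.mp3", "MP3", 320)]),
    Item("Song B", "Artist", [File("b.mp3", "MP3", 320)]),
]
```

Calling `list_paths(library)` yields every file path, whereas `list_paths(library, prefer="FLAC")` yields one path per Item and falls back to the MP3 for "Song B".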

@sampsyo
Member

sampsyo commented Apr 24, 2017

Got it; thanks! Yes, you are definitely on the right track. In my view, the main challenge here is keeping things simple: namely, avoiding the complexity of a file/track separation when it's not necessary.

@pprkut
Contributor Author

pprkut commented Apr 24, 2017

@sampsyo and I discussed the implementation of this briefly on IRC a while back. I still have the notes and general plan here, I just needed to get some other things sorted out first before I could tackle this effectively. Almost there though :)
The idea was to split this up into smaller steps. The first step is extending the documentation we have on the database side of things, so that changes to that part become easier in general. Next is implementing the item/file split on the storage layer only, without exposing the switch to the user yet. That way we already store file information separately, but don't allow multiple files per item yet. Allowing that would then be the last step, although there are probably many places to touch, so this last step is likely going to be split up further as well.
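As a rough illustration of the storage-only step, the sketch below uses `sqlite3` with a heavily simplified `items` table. The column set, the `files` table, and `split_items_table` are assumptions made for this example and do not reflect the actual beets schema or migration code.

```python
import sqlite3


def split_items_table(conn):
    """Copy file columns from a combined `items` table into a new `files` table.

    The UNIQUE constraint on item_id keeps the "one file per item" rule for
    now; relaxing it would correspond to the later multi-file step. The old
    columns on `items` are left in place in this sketch.
    """
    conn.executescript(
        """
        CREATE TABLE files (
            id INTEGER PRIMARY KEY,
            item_id INTEGER NOT NULL UNIQUE REFERENCES items(id),
            path TEXT NOT NULL,
            format TEXT,
            samplerate INTEGER
        );
        INSERT INTO files (item_id, path, format, samplerate)
            SELECT id, path, format, samplerate FROM items;
        """
    )


conn = sqlite3.connect(":memory:")
# Simplified combined table: track and file metadata side by side.
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, artist TEXT, "
    "path TEXT, format TEXT, samplerate INTEGER)"
)
conn.execute(
    "INSERT INTO items (title, artist, path, format, samplerate) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Song", "Artist", "/music/song.flac", "FLAC", 44100),
)
split_items_table(conn)

# Track metadata now comes from `items`, file metadata from `files`.
rows = conn.execute(
    "SELECT items.title, files.path, files.format "
    "FROM items JOIN files ON files.item_id = items.id"
).fetchall()
```

Code that reads through such a join would keep seeing one file per item, which is why this step could land without any user-visible change.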

@xthursdayx

@pprkut I know that this issue is now listed as closed, but I was wondering if you'd made any progress on addressing it? Specifically in relation to keeping multiple formats of the same track in a library (e.g. MP3 and FLAC). Thanks!

@pprkut
Contributor Author

pprkut commented Nov 14, 2018

As far as I can see the issue is still open :-)

I didn't make any progress on this, unfortunately, as performance work was more important on my side for now. But it is very high up on my priority list of features to work on. I'll get to this eventually, unless someone else beets (sorry, couldn't resist ;-) ) me to it.

@xthursdayx

> As far as I can see the issue is still open :-)

Oops, I was looking at that "Closed" flag on the issue #3036 :-|

Thanks for the update; I wish I knew Python well enough to be any help!

@exislow

exislow commented Oct 20, 2021

What is the status of this issue? Is this on the roadmap?

5 participants