
Proposal for a 'simple' NILM Metadata schema #16

@JackKelly

NILM Metadata tries to make it possible to capture pretty much any conceivable scenario. But, as more datasets become available, it appears that a large proportion of them could be described using a simpler metadata schema. It would be great to discuss the design of a "Simple NILM Metadata" schema which could exist alongside "NILM Metadata". CSV may well be easier to read than YAML in Matlab, Java etc., so it might be nice if we can use CSV. We'd have a checklist to help people decide whether they require the full expressive power of "NILM Metadata" or whether they can get by with "Simple NILM Metadata".

The simple schema could also be used for adding metadata to the output of disaggregation algorithms (hence helping to simplify NILMTK disaggregation algorithm implementation). It could also describe the training dataset and the responses for any future NILM competition or validation tool. (I'm working with a group of MSc students who aim to produce a proof-of-concept NILM validation tool by the end of this term; here's the project spec.)

So, here's an initial proposal, using REDD as an example:

building1_labels.csv

This looks a little like labels.dat in the REDD format except that:

  • we use a comma as a separator (which is standard for CSV, and also allows us to use spaces in strings without using quotes)
  • we use the file suffix csv not dat (so that spreadsheet applications know how to open the file)
  • we use our NILM Metadata controlled vocabulary for appliance names
  • we give an instance number for each appliance
  • if a meter measures multiple appliances, we separate them with a semicolon, e.g. 6,television#1;light#1
  • we could optionally use a third column to specify the submeter_of property. If this is not specified then we assume that anything that isn't a site meter is downstream of all site meters, and that all site meters should be summed to get the total whole-house power demand. Or maybe we should keep "Simple NILM Metadata" as simple as possible and say that any non-standard wiring hierarchy simply cannot be expressed using "Simple NILM Metadata"?

meter instance,label
1,site meter
2,site meter
3,electric oven#1
4,electric oven#1
5,fridge#1
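
To get a feel for how little code a loader would need, here is a minimal Python sketch that parses a labels file of the form above. It is only an illustration of the proposal, not part of NILM Metadata or NILMTK: the function name `parse_labels` and the returned structure are hypothetical, row 6 is borrowed from the semicolon example in the list above, and treating a label without `#` as instance 1 is my assumption.

```python
import csv
import io

# Example content of building1_labels.csv. Row 6 is taken from the
# semicolon example in the bullet list; it is not part of REDD building 1.
LABELS_CSV = """meter instance,label
1,site meter
2,site meter
3,electric oven#1
4,electric oven#1
5,fridge#1
6,television#1;light#1
"""


def parse_labels(text):
    """Return {meter instance: {"site_meter": bool, "appliances": [(type, instance)]}}."""
    meters = {}
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        meter, label = int(row[0]), row[1].strip()
        if label == "site meter":
            meters[meter] = {"site_meter": True, "appliances": []}
            continue
        appliances = []
        for item in label.split(";"):  # several appliances on one meter
            name, _, instance = item.strip().partition("#")
            # Assumption: a label with no "#<n>" suffix means instance 1.
            appliances.append((name, int(instance) if instance else 1))
        meters[meter] = {"site_meter": False, "appliances": appliances}
    return meters


meters = parse_labels(LABELS_CSV)
site_meters = sorted(m for m, info in meters.items() if info["site_meter"])
```

Under the default wiring assumption described above, `site_meters` lists the meters that would be summed to get the whole-house demand, and every other meter is treated as downstream of them.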

meter_devices.csv

We also need to specify what is measured in each data file. In NILM Metadata this is done in meter_devices.yaml. In "Simple NILM Metadata" this could be done in a meter_devices.csv file. The file would contain three columns; each row would be a <meter device name>,<key>,<value> tuple, e.g.:

meter device name,key,value
site meters,sample period,1
site meters,measurements,active power;apparent power
submeters,sample period,3
submeters,measurements,active power
submeters,model,eMonitor
submeters,manufacturer,Powerhouse Dynamics
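
As a rough sketch of how the three-column layout could be read, the Python below folds the rows into one dictionary per meter device. The function name `parse_meter_devices` is hypothetical, and the type handling is my assumption rather than part of the proposal: `sample period` is converted to an integer and `measurements` is split on semicolons, while every other value is kept as a string.

```python
import csv
import io

# Example content of meter_devices.csv, copied from the proposal above.
METER_DEVICES_CSV = """meter device name,key,value
site meters,sample period,1
site meters,measurements,active power;apparent power
submeters,sample period,3
submeters,measurements,active power
submeters,model,eMonitor
submeters,manufacturer,Powerhouse Dynamics
"""


def parse_meter_devices(text):
    """Return {meter device name: {key: value}} from the three-column CSV."""
    devices = {}
    for row in csv.DictReader(io.StringIO(text)):
        key, value = row["key"], row["value"]
        if key == "measurements":
            # Assumption: multiple measurements are separated by semicolons.
            value = [m.strip() for m in value.split(";")]
        elif key == "sample period":
            # Assumption: the sample period is always a whole number.
            value = int(value)
        devices.setdefault(row["meter device name"], {})[key] = value
    return devices


devices = parse_meter_devices(METER_DEVICES_CSV)
```

One consequence of the key/value layout is that new attributes can be added without changing the number of columns, at the cost of every value arriving as a string that the loader has to interpret.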

The assumption would be that all meters with the label site meter would take attributes from site meters and all other meters would take attributes from submeters. If this is not the case (e.g. if there are several types of submeter) then we could add a mapping file like the one below. We'd only have to list the meters for which the default assumption does not hold.

meter_devices_mapping.csv

building instance,meter instance,meter device name
1,1,Current Cost
1,2,SCPM
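
The lookup rule this implies (use the mapping file if a meter is listed there, otherwise fall back to the default based on the label) could be sketched in Python as follows. The names `parse_mapping` and `device_for_meter` are hypothetical, and the sketch assumes that each (building instance, meter instance) pair appears at most once in the mapping file.

```python
import csv
import io

# Example content of meter_devices_mapping.csv, copied from the proposal above.
MAPPING_CSV = """building instance,meter instance,meter device name
1,1,Current Cost
1,2,SCPM
"""


def parse_mapping(text):
    """Return {(building instance, meter instance): meter device name}."""
    mapping = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (int(row["building instance"]), int(row["meter instance"]))
        mapping[key] = row["meter device name"]
    return mapping


def device_for_meter(building, meter, label, mapping):
    """Pick the meter device name whose attributes apply to one meter."""
    override = mapping.get((building, meter))
    if override is not None:
        return override  # explicit entry in meter_devices_mapping.csv
    # Default assumption from the proposal: decide by the meter's label.
    return "site meters" if label == "site meter" else "submeters"


mapping = parse_mapping(MAPPING_CSV)
```

For example, `device_for_meter(1, 1, "site meter", mapping)` returns the override `"Current Cost"`, while a meter that is not listed, such as `device_for_meter(1, 5, "fridge#1", mapping)`, falls back to `"submeters"`.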

Any thoughts? If you use Matlab / Java / Scala / Julia / C++ etc, would you find it easier to load metadata described using CSV files rather than YAML files? If you maintain a dataset, is there anything in your own dataset that the proposal above cannot express?
