
Disaggregation Using Nilmtk

November 8, 2016

0.1 Update on my Progress

0.2 By: Mengistu Tekalign Tesfaye

0.3 To: Professor Davide Brunelli

In [1]: import numpy as np
        import pandas as pd
        from os.path import join
        from pylab import rcParams
        import matplotlib.pyplot as plt
        %matplotlib inline
        rcParams['figure.figsize'] = (16, 8)
        import nilmtk
        from nilmtk import DataSet, TimeFrame, MeterGroup, HDFDataStore
        from nilmtk.disaggregate import CombinatorialOptimisation, fhmm_exact
        from nilmtk.utils import print_dict
        from nilmtk.metrics import f1_score
        import warnings
        warnings.filterwarnings("ignore")
Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data.
NILMTK uses keys of the form /building{i}/elec/meter{j}, where i and j are integers starting from 1: i is the building instance and j is the meter instance.
For example, the table storing data from meter instance 1 in building instance 1 has the key /building1/elec/meter1.
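As a tiny illustration of this key scheme (the helper `build_key` is my own, not part of NILMTK):

```python
def build_key(building, meter):
    """Return a NILMTK-style HDF key for a given building and meter instance."""
    return '/building{}/elec/meter{}'.format(building, meter)

print(build_key(1, 1))  # /building1/elec/meter1
print(build_key(2, 9))  # /building2/elec/meter9
```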

0.4 Loading data

In [2]: data_dir = '/home/teke/nilmtk/data/d/'
        we = DataSet(join(data_dir, 'redd.h5'))

0.5 Examine dataset metadata

0.6 Examine metadata for a single house

In [3]: building_number = 2

0.7 Examine sub-metered appliances

In [4]: elec = we.buildings[building_number].elec
        elec.appliances
Out[4]: [Appliance(type='fridge', instance=1),
Appliance(type='washer dryer', instance=1),
Appliance(type='dish washer', instance=1),
Appliance(type='light', instance=1),
Appliance(type='electric stove', instance=1),
Appliance(type='sockets', instance=1),
Appliance(type='microwave', instance=1),
Appliance(type='sockets', instance=2),
Appliance(type='waste disposal unit', instance=1)]

This is the data used for training and testing the combinatorial optimisation and factorial hidden Markov model algorithms.
I divided the given data into different kinds of appliances based on their power consumption:

Appliance 1 (average power consumption roughly 40 W)
Appliance 2 (roughly 110 W)
Appliance 3 (roughly 200 W)
Appliance 4 (roughly 1000 W)
Appliance 5 (roughly 1850 W)
Appliance 6 (roughly 2000 W)
Appliance 7 (roughly 2400 W)
Appliance 8 (roughly 3000 W)

The above appliances are used as time series data.

Index column
The index column is a datetime, represented on disk as a nanosecond-precision UNIX timestamp stored as an unsigned 64-bit integer.
In Python, we use timezone-aware numpy.datetime64 values.
The dataframe must be sorted in ascending order on the index (timestamp) column.
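A minimal sketch of such an index in pandas; the timestamp and power values are illustrative, not taken from the dataset:

```python
import numpy as np
import pandas as pd

# Three nanosecond-precision UNIX timestamps (uint64 on disk), turned into a
# timezone-aware index in US/Eastern (the REDD timezone).
ts_ns = np.array([1303082307, 1303082308, 1303082309], dtype=np.uint64) * 10**9
index = pd.to_datetime(ts_ns.astype('int64'), utc=True).tz_convert('US/Eastern')
power = pd.Series([6.0, 6.0, 92.5], index=index)

assert power.index.is_monotonic_increasing  # must be sorted ascending
print(power)
```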

1.1 Wiring hierarchy of meters

In [5]: elec.meters[1].when_on().next().head(10)
Out[5]: 2011-04-17 19:18:27-04:00    True
2011-04-17 19:18:28-04:00    True
2011-04-17 19:18:29-04:00    True
2011-04-17 19:18:30-04:00    True
2011-04-17 19:18:31-04:00    True
2011-04-17 19:18:32-04:00    True
2011-04-17 19:18:33-04:00    True
2011-04-17 19:18:34-04:00    True
2011-04-17 19:18:35-04:00    True
2011-04-17 19:18:54-04:00    True
Name: (power, apparent), dtype: bool

1.2 Select fridge

In [6]: fridges = nilmtk.global_meter_group.select_using_appliances(type='fridge')

1.3 Proportion of energy per fridge

The energy consumed by each appliance can be expressed as a proportion of the household's total energy. Here we find the range of proportions for each fridge.
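A toy illustration of how such proportions are computed; the kWh figures are made up, not measured:

```python
import pandas as pd

# Each submeter's energy as a share of the summed submetered energy.
energy_kwh = pd.Series({'fridge': 22.4, 'lights': 57.3, 'microwave': 14.9})
fractions = energy_kwh / energy_kwh.sum()
print(fractions.round(3))
```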
In [7]: fridges_restricted = MeterGroup(fridges.meters[:9])
In [8]: # How much energy does the largest-consuming fridge consume in kWh?
fridges.select(building=2).total_energy()

Calculating total_energy for ElecMeterID(instance=9, building=2, dataset='REDD') ..


Out[8]: active    22.396646
dtype: float64

This is the fridge that is used for training and testing the disaggregation algorithms.

In [9]: we.set_window(start='2011-04-18',end='2011-04-22')
fridges.select(building=2).plot();

2.1 Plot sub-metered data for a single day

In [10]: we.set_window(start='2011-04-18',end='2011-04-22')
elec.mains().plot();

2.2 Plot fraction of energy consumption of each appliance

In [11]: fraction = elec.submeters().fraction_per_meter().dropna()

9/9 ElecMeter(instance=11, building=2, dataset='REDD', appliances=[Appliance(type='


In [12]: elec.clear_cache()
Removed building2/elec/cache/meter3/
Removed building2/elec/cache/meter4/
Removed building2/elec/cache/meter5/
Removed building2/elec/cache/meter6/
Removed building2/elec/cache/meter7/
Removed building2/elec/cache/meter8/
Removed building2/elec/cache/meter9/
Removed building2/elec/cache/meter10/
Removed building2/elec/cache/meter11/

In [13]: elec.submeters().fraction_per_meter()

9/9 ElecMeter(instance=11, building=2, dataset='REDD', appliances=[Appliance(type='


Out[13]: (3, 2, REDD)     0.101074
(4, 2, REDD)     0.074877
(5, 2, REDD)     0.018925
(6, 2, REDD)     0.074877
(7, 2, REDD)     0.000000
(8, 2, REDD)     0.301417
(9, 2, REDD)     0.258510
(10, 2, REDD)    0.170321
(11, 2, REDD)    0.000000
dtype: float64

In [14]: # Create convenient labels
         #labels = elec.get_appliance_labels(fraction.index)
         plt.figure(figsize=(8,8))
         fraction.plot(kind='pie') # , labels=labels);
Out[14]: <matplotlib.axes._subplots.AxesSubplot at 0x7efe1b977050>

2.3 Select meters on the basis of appliance category

In [15]: # Find all appliances with a particular type of motor
         elec.select_using_appliances(category='single-phase induction motor')

Out[15]: MeterGroup(meters=
ElecMeter(instance=7, building=2, dataset='REDD', appliances=[Appliance(
ElecMeter(instance=9, building=2, dataset='REDD', appliances=[Appliance(
ElecMeter(instance=10, building=2, dataset='REDD', appliances=[Appliance
)
In [16]: top2=elec.submeters().select_top_k(k=2)

9/9 ElecMeter(instance=11, building=2, dataset='REDD', appliances=[Appliance(type='

2.4 Training and disaggregation

In [17]: # Train
co = CombinatorialOptimisation()
co.train(top2)

Training model for submeter 'ElecMeter(instance=9, building=2, dataset='REDD', appl


Training model for submeter 'ElecMeter(instance=8, building=2, dataset='REDD', appl
Done training!

In this training stage of the combinatorial optimisation, the algorithm uses clustering techniques to obtain the possible states of each appliance: cluster(data, max_num_clusters).
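A rough sketch of this clustering idea, using a hand-rolled 1-D k-means rather than NILMTK's own cluster implementation; the power levels are simulated:

```python
import numpy as np

def kmeans_1d(x, k, iters=20):
    """Tiny 1-D k-means: returns the k cluster centres, sorted ascending."""
    centers = np.linspace(x.min(), x.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return np.sort(centers)

rng = np.random.default_rng(0)
power = np.concatenate([rng.normal(2, 0.5, 500),   # ~2 W standby readings
                        rng.normal(90, 5, 500)])   # ~90 W running readings
states = kmeans_1d(power, k=2)  # estimated appliance states, roughly [2, 90]
print(states)
```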

In [18]: for model in co.model:
             print_dict(model)
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>

In [19]: # Disaggregate with Combinatorial Optimization
         disag_filename = join(data_dir, 'redd-disag.h5')
         output = HDFDataStore(disag_filename, 'w')
         co.disaggregate(elec.mains(), output)
         output.close()

Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')


Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 79.8444976807 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 79.1660003662 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 69.5443344116 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 81.7918319702 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
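The search that combinatorial optimisation performs after training can be sketched as follows; the appliance names and state values are assumed for illustration, not taken from REDD:

```python
from itertools import product

# For each mains reading, pick the combination of per-appliance states whose
# sum is closest to the observed power.
appliance_states = {
    'fridge':  [0, 90],      # off / running (watts, assumed)
    'sockets': [0, 40],
    'stove':   [0, 1000],
}

def disaggregate_reading(mains_watts, states):
    names = list(states)
    best = min(product(*(states[n] for n in names)),
               key=lambda combo: abs(mains_watts - sum(combo)))
    return dict(zip(names, best))

print(disaggregate_reading(1085, appliance_states))
# → {'fridge': 90, 'sockets': 0, 'stove': 1000}  (sum 1090, closest to 1085)
```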

3.1 Examine disaggregated data from the combinatorial optimization algorithm

In [20]: disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         disag_elec.plot()
         disag.store.close()


3.2 Calculate accuracy of disaggregation

In [21]: disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         f1 = f1_score(disag_elec, top2)
         f1.index = disag_elec.get_labels(f1.index)
         f1.plot(kind='bar')
         plt.xlabel('appliance');
         plt.ylabel('f-score');
         disag.store.close()

General machine learning metrics: Precision, Recall, F-score

The F1 score, commonly used in information retrieval, measures accuracy using the statistics precision p and recall r. Precision is the ratio of true positives (tp) to all predicted positives (tp + fp). Recall is the ratio of true positives to all actual positives (tp + fn). The F1 score is given by

F-score = 2*p*r / (p + r)

where

p = tp / (tp + fp),    r = tp / (tp + fn)

The F1 metric weights recall and precision equally, and a good retrieval algorithm will maximize both precision and recall simultaneously. Thus moderately good performance on both will be favored over extremely good performance on one and poor performance on the other.
F-measure scores range from 0-100%. (As a rule of thumb from the KantanMT machine-translation documentation: a score below 15% means the engine is not performing optimally and a high level of post-editing is required to reach publishable quality, while a score above 50% is very good and significantly less post-editing is needed.)
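The definitions above can be checked with a few lines of plain Python; `f1_from_counts` is my own helper, unrelated to nilmtk.metrics.f1_score:

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative counts."""
    p = tp / (tp + fp)   # precision
    r = tp / (tp + fn)   # recall
    return 2 * p * r / (p + r)

# e.g. 80 true positives, 20 false positives, 10 false negatives:
print(round(f1_from_counts(80, 20, 10), 3))  # 0.842
```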
In [22]: f1
Out[22]: Sockets    0.000000
Fridge     0.789888
dtype: float64
In [23]: elec.clear_cache()
Removed building2/elec/cache/meter1/
Removed building2/elec/cache/meter3/
Removed building2/elec/cache/meter4/
Removed building2/elec/cache/meter5/
Removed building2/elec/cache/meter6/
Removed building2/elec/cache/meter7/
Removed building2/elec/cache/meter8/
Removed building2/elec/cache/meter9/
Removed building2/elec/cache/meter10/
Removed building2/elec/cache/meter11/

Using Factorial Hidden Markov Model

In [24]: fhmm = fhmm_exact.FHMM()
         fhmm.train(top2)

Training model for submeter 'ElecMeter(instance=9, building=2, dataset='REDD', appl
Training model for submeter 'ElecMeter(instance=8, building=2, dataset='REDD', appl
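A hedged sketch of the FHMM idea, not NILMTK's implementation: each appliance is modelled as its own hidden Markov chain, and the mains reading is (approximately) the sum of the chains' emissions, so exact inference runs on the product of the individual state spaces. The state means below are illustrative:

```python
from itertools import product

fridge_means = [0, 90]      # assumed state means in watts
dryer_means = [0, 2400]

# Product state space: every pairing of one fridge state with one dryer state,
# mapped to the total power that combined state would emit.
combined = {s: sum(s) for s in product(fridge_means, dryer_means)}
print(combined)
# → {(0, 0): 0, (0, 2400): 2400, (90, 0): 90, (90, 2400): 2490}
```

With N appliances of K states each, this space has K**N entries, which is why exact FHMM inference is more expensive than combinatorial optimisation over the same states.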

In [25]: # Disaggregate with Factorial Hidden Markov Model
         disag_filename = join(data_dir, 'redd-disag-fhmm.h5')
         output = HDFDataStore(disag_filename, 'w')
         fhmm.disaggregate(elec.mains(), output)
         output.close()

Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.

In [26]: disag_filename = join(data_dir, 'redd-disag-fhmm.h5')
         disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         disag_elec.plot()
         plt.title("FHMM");
         disag.store.close()

In [27]: disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         f1 = f1_score(disag_elec, top2)
         f1.index = disag_elec.get_labels(f1.index)
         f1.plot(kind='barh')
         plt.ylabel('appliance');
         plt.xlabel('f-score');
         plt.title("FHMM");
         disag.store.close()

In [28]: f1
Out[28]: Sockets    0.000000
Fridge     0.999081
dtype: float64

In [29]: fridges = nilmtk.global_meter_group.select_using_appliances(type='fridge')

From the disaggregation result for the fridge in the graph, and also from the f-score results (f1-co = 0.79 and f1-fhmm = 0.999), we can see that the factorial hidden Markov model is quite accurate, even though it is more complex than combinatorial optimisation.

These days I am also studying the algorithms further; until the next tasks...

