
Disaggregation Using Nilmtk

November 8, 2016

0.1 Update on my Progress

0.2 By: Mengistu Tekalign Tesfaye

0.3 To: Professor Davide Brunelli

In [1]: import numpy as np
        import pandas as pd
        from os.path import join
        from pylab import rcParams
        import matplotlib.pyplot as plt
        %matplotlib inline
        rcParams['figure.figsize'] = (16, 8)
        import nilmtk
        from nilmtk import DataSet, TimeFrame, MeterGroup, HDFDataStore
        from nilmtk.disaggregate import CombinatorialOptimisation, fhmm_exact
        from nilmtk.utils import print_dict
        from nilmtk.metrics import f1_score
        import warnings
        warnings.filterwarnings("ignore")
Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data.
NILMTK uses keys of the form /building{i}/elec/meter{j}, where i and j are integers starting from 1: i is the building instance and j is the meter instance.
For example, the table storing data from meter instance 1 in building instance 1 has the key /building1/elec/meter1.
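As a tiny illustration of this key scheme (the helper `build_key` is my own, not part of NILMTK):

```python
def build_key(building, meter):
    """Return a NILMTK-style HDF key for a given building and meter instance."""
    return '/building{}/elec/meter{}'.format(building, meter)

print(build_key(1, 1))  # /building1/elec/meter1
print(build_key(2, 9))  # /building2/elec/meter9
```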

0.4 Loading data

In [2]: data_dir = '/home/teke/nilmtk/data/d/'
        we = DataSet(join(data_dir, 'redd.h5'))

0.5 Examine dataset metadata

0.6 Examine metadata for a single house

In [3]: building_number = 2

0.7 Examine sub-metered appliances

In [4]: elec = we.buildings[building_number].elec
        elec.appliances
Out[4]: [Appliance(type='fridge', instance=1),
Appliance(type='washer dryer', instance=1),
Appliance(type='dish washer', instance=1),
Appliance(type='light', instance=1),
Appliance(type='electric stove', instance=1),
Appliance(type='sockets', instance=1),
Appliance(type='microwave', instance=1),
Appliance(type='sockets', instance=2),
Appliance(type='waste disposal unit', instance=1)]

This is the data used for training and testing the combinatorial optimisation and factorial hidden Markov model algorithms.
I divided the given data into different kinds of appliances based on their power consumption:

Appliance 1 (average power consumption roughly 40 W)
Appliance 2 (roughly 110 W)
Appliance 3 (roughly 200 W)
Appliance 4 (roughly 1000 W)
Appliance 5 (roughly 1850 W)
Appliance 6 (roughly 2000 W)
Appliance 7 (roughly 2400 W)
Appliance 8 (roughly 3000 W)

The above appliances are used as time series data.

Index column
The index column is a datetime, represented on disk as a nanosecond-precision UNIX timestamp stored as an unsigned 64-bit integer.
In Python, we use timezone-aware numpy.datetime64 values.
The dataframe must be sorted in ascending order on the index (timestamp) column.
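A minimal sketch of such an index in pandas; the timestamp and power values are illustrative, not taken from the dataset:

```python
import numpy as np
import pandas as pd

# Three nanosecond-precision UNIX timestamps (uint64 on disk), turned into a
# timezone-aware index in US/Eastern (the REDD timezone).
ts_ns = np.array([1303082307, 1303082308, 1303082309], dtype=np.uint64) * 10**9
index = pd.to_datetime(ts_ns.astype('int64'), utc=True).tz_convert('US/Eastern')
power = pd.Series([6.0, 6.0, 92.5], index=index)

assert power.index.is_monotonic_increasing  # must be sorted ascending
print(power)
```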

1.1 Wiring hierarchy of meters

In [5]: elec.meters[1].when_on().next().head(10)
Out[5]: 2011-04-17 19:18:27-04:00    True
2011-04-17 19:18:28-04:00    True
2011-04-17 19:18:29-04:00    True
2011-04-17 19:18:30-04:00    True
2011-04-17 19:18:31-04:00    True
2011-04-17 19:18:32-04:00    True
2011-04-17 19:18:33-04:00    True
2011-04-17 19:18:34-04:00    True
2011-04-17 19:18:35-04:00    True
2011-04-17 19:18:54-04:00    True
Name: (power, apparent), dtype: bool

1.2 Select fridge

In [6]: fridges = nilmtk.global_meter_group.select_using_appliances(type='fridge')

1.3 Proportion of energy per fridge

The energy consumed by each appliance can be expressed as a proportion of the household's total energy. Here we find the range of proportions for each fridge.
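A toy illustration of how such proportions are computed; the kWh figures are made up, not measured:

```python
import pandas as pd

# Each submeter's energy as a share of the summed submetered energy.
energy_kwh = pd.Series({'fridge': 22.4, 'lights': 57.3, 'microwave': 14.9})
fractions = energy_kwh / energy_kwh.sum()
print(fractions.round(3))
```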
In [7]: fridges_restricted = MeterGroup(fridges.meters[:9])
In [8]: # How much energy does the largest-consuming fridge consume in kWh?
fridges.select(building=2).total_energy()

Calculating total_energy for ElecMeterID(instance=9, building=2, dataset='REDD') ..


Out[8]: active    22.396646
dtype: float64

This is the fridge that is used for training and testing the disaggregation algorithms.

In [9]: we.set_window(start='2011-04-18',end='2011-04-22')
fridges.select(building=2).plot();

2.1 Plot sub-metered data for a single day

In [10]: we.set_window(start='2011-04-18',end='2011-04-22')
elec.mains().plot();

2.2 Plot fraction of energy consumption of each appliance

In [11]: fraction = elec.submeters().fraction_per_meter().dropna()

9/9 ElecMeter(instance=11, building=2, dataset='REDD', appliances=[Appliance(type='


In [12]: elec.clear_cache()
Removed building2/elec/cache/meter3/
Removed building2/elec/cache/meter4/
Removed building2/elec/cache/meter5/
Removed building2/elec/cache/meter6/
Removed building2/elec/cache/meter7/
Removed building2/elec/cache/meter8/
Removed building2/elec/cache/meter9/
Removed building2/elec/cache/meter10/
Removed building2/elec/cache/meter11/

In [13]: elec.submeters().fraction_per_meter()

9/9 ElecMeter(instance=11, building=2, dataset='REDD', appliances=[Appliance(type='


Out[13]: (3, 2, REDD)     0.101074
(4, 2, REDD)     0.074877
(5, 2, REDD)     0.018925
(6, 2, REDD)     0.074877
(7, 2, REDD)     0.000000
(8, 2, REDD)     0.301417
(9, 2, REDD)     0.258510
(10, 2, REDD)    0.170321
(11, 2, REDD)    0.000000
dtype: float64

In [14]: # Create convenient labels
         #labels = elec.get_appliance_labels(fraction.index)
         plt.figure(figsize=(8,8))
         fraction.plot(kind='pie') # , labels=labels);
Out[14]: <matplotlib.axes._subplots.AxesSubplot at 0x7efe1b977050>

2.3 Select meters on the basis of appliance category

In [15]: # Find all appliances with a particular type of motor
         elec.select_using_appliances(category='single-phase induction motor')

Out[15]: MeterGroup(meters=
ElecMeter(instance=7, building=2, dataset='REDD', appliances=[Appliance(
ElecMeter(instance=9, building=2, dataset='REDD', appliances=[Appliance(
ElecMeter(instance=10, building=2, dataset='REDD', appliances=[Appliance
)
In [16]: top2=elec.submeters().select_top_k(k=2)

9/9 ElecMeter(instance=11, building=2, dataset='REDD', appliances=[Appliance(type='

2.4 Training and disaggregation

In [17]: # Train
co = CombinatorialOptimisation()
co.train(top2)

Training model for submeter 'ElecMeter(instance=9, building=2, dataset='REDD', appl


Training model for submeter 'ElecMeter(instance=8, building=2, dataset='REDD', appl
Done training!

In this training stage of the combinatorial optimisation, the algorithm uses clustering techniques to obtain the possible states of each appliance: cluster(data, max_num_clusters).
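A rough sketch of this clustering idea, using a hand-rolled 1-D k-means rather than NILMTK's own cluster implementation; the power levels are simulated:

```python
import numpy as np

def kmeans_1d(x, k, iters=20):
    """Tiny 1-D k-means: returns the k cluster centres, sorted ascending."""
    centers = np.linspace(x.min(), x.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return np.sort(centers)

rng = np.random.default_rng(0)
power = np.concatenate([rng.normal(2, 0.5, 500),   # ~2 W standby readings
                        rng.normal(90, 5, 500)])   # ~90 W running readings
states = kmeans_1d(power, k=2)  # estimated appliance states, roughly [2, 90]
print(states)
```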

In [18]: for model in co.model:
             print_dict(model)
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>

In [19]: # Disaggregate with Combinatorial Optimization
         disag_filename = join(data_dir, 'redd-disag.h5')
         output = HDFDataStore(disag_filename, 'w')
         co.disaggregate(elec.mains(), output)
         output.close()

Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')


Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 79.8444976807 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 79.1660003662 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 69.5443344116 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Including vampire_power = 81.7918319702 watts to model...
Estimating power demand for 'ElecMeter(instance=9, building=2, dataset='REDD',
Estimating power demand for 'ElecMeter(instance=8, building=2, dataset='REDD',
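The search that combinatorial optimisation performs after training can be sketched as follows; the appliance names and state values are assumed for illustration, not taken from REDD:

```python
from itertools import product

# For each mains reading, pick the combination of per-appliance states whose
# sum is closest to the observed power.
appliance_states = {
    'fridge':  [0, 90],      # off / running (watts, assumed)
    'sockets': [0, 40],
    'stove':   [0, 1000],
}

def disaggregate_reading(mains_watts, states):
    names = list(states)
    best = min(product(*(states[n] for n in names)),
               key=lambda combo: abs(mains_watts - sum(combo)))
    return dict(zip(names, best))

print(disaggregate_reading(1085, appliance_states))
# → {'fridge': 90, 'sockets': 0, 'stove': 1000}  (sum 1090, closest to 1085)
```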

3.1 Examine disaggregated data from the combinatorial optimization algorithm

In [20]: disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         disag_elec.plot()
         disag.store.close()


3.2 Calculate accuracy of disaggregation

In [21]: disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         f1 = f1_score(disag_elec, top2)
         f1.index = disag_elec.get_labels(f1.index)
         f1.plot(kind='bar')
         plt.xlabel('appliance');
         plt.ylabel('f-score');
         disag.store.close()

General machine learning metrics: Precision, Recall, F-score

The F1 score, commonly used in information retrieval, measures accuracy using the statistics precision p and recall r. Precision is the ratio of true positives (tp) to all predicted positives (tp + fp). Recall is the ratio of true positives to all actual positives (tp + fn). The F1 score is given by

F-score = 2*p*r / (p + r)

where

p = tp / (tp + fp),    r = tp / (tp + fn)

The F1 metric weights recall and precision equally, and a good retrieval algorithm will maximize both precision and recall simultaneously. Thus moderately good performance on both will be favored over extremely good performance on one and poor performance on the other.
F-measure scores range from 0-100%. (As a rule of thumb from the KantanMT machine-translation documentation: a score below 15% means the engine is not performing optimally and a high level of post-editing is required to reach publishable quality, while a score above 50% is very good and significantly less post-editing is needed.)
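The definitions above can be checked with a few lines of plain Python; `f1_from_counts` is my own helper, unrelated to nilmtk.metrics.f1_score:

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative counts."""
    p = tp / (tp + fp)   # precision
    r = tp / (tp + fn)   # recall
    return 2 * p * r / (p + r)

# e.g. 80 true positives, 20 false positives, 10 false negatives:
print(round(f1_from_counts(80, 20, 10), 3))  # 0.842
```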
In [22]: f1
Out[22]: Sockets    0.000000
Fridge     0.789888
dtype: float64
In [23]: elec.clear_cache()
Removed building2/elec/cache/meter1/
Removed building2/elec/cache/meter3/
Removed building2/elec/cache/meter4/
Removed building2/elec/cache/meter5/
Removed building2/elec/cache/meter6/
Removed building2/elec/cache/meter7/
Removed building2/elec/cache/meter8/
Removed building2/elec/cache/meter9/
Removed building2/elec/cache/meter10/
Removed building2/elec/cache/meter11/

Using Factorial Hidden Markov Model

In [24]: fhmm = fhmm_exact.FHMM()
         fhmm.train(top2)

Training model for submeter 'ElecMeter(instance=9, building=2, dataset='REDD', appl
Training model for submeter 'ElecMeter(instance=8, building=2, dataset='REDD', appl
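A hedged sketch of the FHMM idea, not NILMTK's implementation: each appliance is modelled as its own hidden Markov chain, and the mains reading is (approximately) the sum of the chains' emissions, so exact inference runs on the product of the individual state spaces. The state means below are illustrative:

```python
from itertools import product

fridge_means = [0, 90]      # assumed state means in watts
dryer_means = [0, 2400]

# Product state space: every pairing of one fridge state with one dryer state,
# mapped to the total power that combined state would emit.
combined = {s: sum(s) for s in product(fridge_means, dryer_means)}
print(combined)
# → {(0, 0): 0, (0, 2400): 2400, (90, 0): 90, (90, 2400): 2490}
```

With N appliances of K states each, this space has K**N entries, which is why exact FHMM inference is more expensive than combinatorial optimisation over the same states.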

In [25]: # Disaggregate with Factorial Hidden Markov Model
         disag_filename = join(data_dir, 'redd-disag-fhmm.h5')
         output = HDFDataStore(disag_filename, 'w')
         fhmm.disaggregate(elec.mains(), output)
         output.close()

Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.
Loading data for meter ElecMeterID(instance=2, building=2, dataset='REDD')
Done loading data all meters for this chunk.

In [26]: disag_filename = join(data_dir, 'redd-disag-fhmm.h5')
         disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         disag_elec.plot()
         plt.title("FHMM");
         disag.store.close()

In [27]: disag = DataSet(disag_filename)
         disag_elec = disag.buildings[building_number].elec
         f1 = f1_score(disag_elec, top2)
         f1.index = disag_elec.get_labels(f1.index)
         f1.plot(kind='barh')
         plt.ylabel('appliance');
         plt.xlabel('f-score');
         plt.title("FHMM");
         disag.store.close()

In [28]: f1
Out[28]: Sockets    0.000000
Fridge     0.999081
dtype: float64

In [29]: fridges = nilmtk.global_meter_group.select_using_appliances(type='fridge')

From the disaggregation result for the fridge in the graph, and also from the f-score results (f1-co = 0.79 and f1-fhmm = 0.999), we can see that the factorial hidden Markov model is quite accurate, even though it is more complex than combinatorial optimisation.

These days I am also studying the algorithms further; until the next tasks...

