Combining data sources for further analysis
ANALYZING IOT DATA IN PYTHON
Matthias Voppichler
IT Developer
Combining data sources
print(temp.head())
                    value
timestamp
2018-10-03 [Link]   16.3
2018-10-03 [Link]   17.7
2018-10-03 [Link]   20.2
2018-10-03 [Link]   20.9
2018-10-03 [Link]   21.8
print(sun.head())
                    value
timestamp
2018-10-03 [Link] 1798.7
2018-10-03 [Link] 1799.9
2018-10-03 [Link] 1798.1
2018-10-03 [Link] 1797.7
2018-10-03 [Link] 1798.0
Naming columns
temp.columns = ["temperature"]
sun.columns = ["sunshine"]
print(temp.head(2))
print(sun.head(2))
temperature
timestamp
2018-10-03 [Link] 16.3
2018-10-03 [Link] 17.7
sunshine
timestamp
2018-10-03 [Link] 1798.7
2018-10-03 [Link] 1799.9
Concat
environ = pd.concat([temp, sun], axis=1)
print(environ.head())
temperature sunshine
timestamp
2018-10-03 [Link] 16.3 1798.7
2018-10-03 [Link] NaN 1799.9
2018-10-03 [Link] 17.7 1798.1
2018-10-03 [Link] NaN 1797.7
2018-10-03 [Link] 20.2 1798.0
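The alignment behind this output can be reproduced on a small made-up example. This is only a sketch: the timestamps, the hourly and half-hourly sampling rates, and the values below are assumptions for illustration, not the course data.

```python
import pandas as pd

# Hypothetical data: temperature every hour, sunshine every 30 minutes
temp = pd.DataFrame(
    {"temperature": [16.3, 17.7]},
    index=pd.date_range("2018-10-03 00:00", periods=2, freq="60min"),
)
sun = pd.DataFrame(
    {"sunshine": [1798.7, 1799.9, 1798.1, 1797.7]},
    index=pd.date_range("2018-10-03 00:00", periods=4, freq="30min"),
)

# axis=1 places the frames side by side and aligns rows on the index;
# timestamps that exist in only one frame get NaN in the other column.
# sort_index() keeps the rows in time order.
environ = pd.concat([temp, sun], axis=1).sort_index()
print(environ)
```

The half-hour rows have no temperature reading, which is where the NaN values come from.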
Resample
agg_dict = {"temperature": "max", "sunshine": "sum"}
env1h = environ.resample("1h").agg(agg_dict)
print(env1h.head())
temperature sunshine
timestamp
2018-10-03 [Link] 16.3 3598.6
2018-10-03 [Link] 17.7 3595.8
2018-10-03 [Link] 20.2 3596.2
2018-10-03 [Link] 20.9 3594.1
2018-10-03 [Link] 21.8 3599.9
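A per-column aggregation like this can be sketched on a few made-up rows. The timestamps and the four readings are assumptions for illustration, and the rule "60min" is used here as an equivalent spelling of the slide's "1h".

```python
import numpy as np
import pandas as pd

# Hypothetical 30-minute readings; temperature is only reported on the hour
index = pd.date_range("2018-10-03 00:00", periods=4, freq="30min")
environ = pd.DataFrame(
    {
        "temperature": [16.3, np.nan, 17.7, np.nan],
        "sunshine": [1798.7, 1799.9, 1798.1, 1797.7],
    },
    index=index,
)

# One aggregation per column: highest temperature and total sunshine per hour
agg_dict = {"temperature": "max", "sunshine": "sum"}
env1h = environ.resample("60min").agg(agg_dict)
print(env1h)
```

Downsampling removes the NaN rows because "max" skips missing values within each hourly bin.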
Fillna
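# Newer pandas releases deprecate fillna(method="ffill"); environ.ffill() is the equivalent call
env30min = environ.fillna(method="ffill")
print(env30min.head())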
temperature sunshine
timestamp
2018-10-03 [Link] 16.3 1798.7
2018-10-03 [Link] 16.3 1799.9
2018-10-03 [Link] 17.7 1798.1
2018-10-03 [Link] 17.7 1797.7
2018-10-03 [Link] 20.2 1798.0
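Forward filling can be sketched on a small made-up frame. The data below is an assumption for illustration, and `DataFrame.ffill()` is used as the forward-fill call.

```python
import numpy as np
import pandas as pd

# Hypothetical 30-minute frame with gaps in the hourly temperature column
index = pd.date_range("2018-10-03 00:00", periods=4, freq="30min")
environ = pd.DataFrame(
    {
        "temperature": [16.3, np.nan, 17.7, np.nan],
        "sunshine": [1798.7, 1799.9, 1798.1, 1797.7],
    },
    index=index,
)

# ffill() carries the last valid observation forward into each gap
env30min = environ.ffill()
print(env30min)
```

A gap at the very start of a column would stay NaN, since there is no earlier value to carry forward.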
Correlation
DataFrame.corr()
print(data.corr())
             temperature  humidity  sunshine  light_veh  heavy_veh
temperature     1.000000 -0.734430  0.611041   0.401997   0.408936
humidity       -0.734430  1.000000 -0.637761  -0.313952  -0.318198
sunshine        0.611041 -0.637761  1.000000   0.408854   0.409363
light_veh       0.401997 -0.313952  0.408854   1.000000   0.998473
heavy_veh       0.408936 -0.318198  0.409363   0.998473   1.000000
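How such a matrix arises can be sketched on made-up numbers. The column names mirror the table above, but the five rows are invented for illustration only.

```python
import pandas as pd

# Hypothetical readings: humidity falls linearly as temperature rises,
# and heavy vehicle counts move together with light vehicle counts
data = pd.DataFrame(
    {
        "temperature": [10.0, 12.0, 14.0, 16.0, 18.0],
        "humidity": [90.0, 86.0, 82.0, 78.0, 74.0],
        "light_veh": [100.0, 150.0, 120.0, 180.0, 160.0],
        "heavy_veh": [10.0, 16.0, 12.0, 18.0, 15.0],
    }
)

# corr() computes pairwise Pearson correlation by default
corr = data.corr()
print(corr.round(2))
```

The matrix is symmetric with ones on the diagonal, so only one triangle carries information.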
Heatmap
sns.heatmap(data.corr(), annot=True)
Pairplot
sns.pairplot(data)
Summary
heatmap
Negative correlation
Positive correlation
Correlation close to 1
Outliers
Reasons why outliers appear in datasets:
Measurement error
Manipulation
Extreme events
Outliers
temp_mean = data["temperature"].mean()
temp_std = data["temperature"].std()
data["mean"] = temp_mean
# Limits at three standard deviations above and below the mean
data["upper_limit"] = temp_mean + (temp_std * 3)
data["lower_limit"] = temp_mean - (temp_std * 3)
print(data.iloc[0]["upper_limit"])
print(data.iloc[0]["mean"])
print(data.iloc[0]["lower_limit"])
29.513933116002725
14.5345
-0.44493311600272456
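Once the limits exist, flagging the outliers is a comparison against them. The sketch below uses invented readings (the repeated 14/15 values and the single 60.0 spike are assumptions) to show the filter step.

```python
import pandas as pd

# Hypothetical readings: 29 plausible values and one sensor glitch
values = [14.0, 15.0] * 14 + [14.5, 60.0]
data = pd.DataFrame({"temperature": values})

temp_mean = data["temperature"].mean()
temp_std = data["temperature"].std()
upper_limit = temp_mean + 3 * temp_std
lower_limit = temp_mean - 3 * temp_std

# Flag every reading outside the three-standard-deviation band
is_outlier = (data["temperature"] > upper_limit) | (data["temperature"] < lower_limit)
print(data[is_outlier])
```

Note that the outlier itself inflates the standard deviation, so in very short series a single extreme value may not cross the three-sigma limit at all.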
Outlier plot
data.plot()
Autocorrelation
from statsmodels.graphics import tsaplots
tsaplots.plot_acf(data['temperature'], lags=50)
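What the autocorrelation plot measures can be checked numerically at a few lags. This sketch uses pandas' `Series.autocorr` on a synthetic, noise-free daily cycle; the series is an assumption for illustration, and the estimator is not identical to the one `plot_acf` draws, so values can differ slightly.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly temperature with a clean 24-hour cycle (no noise)
hours = np.arange(24 * 7)
index = pd.date_range("2018-10-01", periods=hours.size, freq="60min")
temperature = pd.Series(15 + 5 * np.sin(2 * np.pi * hours / 24), index=index)

# autocorr(lag) is the Pearson correlation between the series
# and a copy of itself shifted by `lag` steps
for lag in (1, 12, 24):
    print(lag, round(temperature.autocorr(lag=lag), 3))
```

A strong positive value at lag 24 and a strong negative value at lag 12 are the signature of a daily cycle in hourly data.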
Seasonality and trends
Time series components
Trend
Seasonal
Residual / Noise
series[t] = trend[t] + seasonal[t] + residual[t]
20.2 = 14.9 + 4.39 + 0.91
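The additive relationship above can be illustrated with a hand-rolled decomposition. This is a simplified stand-in built from pandas operations on synthetic data, not the algorithm `seasonal_decompose` runs; the series, the 24-hour window, and the variable names are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series: linear trend + 24-hour cycle + small noise
rng = np.random.default_rng(0)
hours = np.arange(24 * 10)
index = pd.date_range("2018-10-01", periods=hours.size, freq="60min")
values = 10 + 0.02 * hours + 4 * np.sin(2 * np.pi * hours / 24)
series = pd.Series(values + rng.normal(0, 0.3, hours.size), index=index)

# Trend: centered moving average over one full day, which averages out the cycle
trend = series.rolling(window=24, center=True).mean()

# Seasonal: mean detrended value for each hour of the day
detrended = series - trend
seasonal = detrended.groupby(detrended.index.hour).transform("mean")

# Residual: whatever trend and seasonal leave unexplained
residual = series - trend - seasonal

t = index[100]
print(series.loc[t], trend.loc[t] + seasonal.loc[t] + residual.loc[t])
```

The two printed numbers agree because the residual is defined as the remainder. The trend is NaN near both ends of the series, where the moving-average window is incomplete.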
Seasonal decompose
import statsmodels.api as sm
# Run seasonal decompose
decomp = sm.tsa.seasonal_decompose(data["temperature"])
print(decomp.seasonal.head())
decomp.plot()
timestamp
2018-10-01 [Link] -3.670394
2018-10-01 [Link] -3.987451
2018-10-01 [Link] -4.372217
2018-10-01 [Link] -4.534066
2018-10-01 [Link] -4.802165
Freq: H, Name: temperature, dtype: float64
Combined plot
import matplotlib.pyplot as plt
decomp = sm.tsa.seasonal_decompose(data)
# Plot the timeseries
plt.plot(data["temperature"], label="temperature")
# Plot trend and seasonality
plt.plot(decomp.trend["temperature"], label="trend")
plt.plot(decomp.seasonal["temperature"], label="seasonal")
# Show the labels passed above
plt.legend()
plt.show()