0% found this document useful (0 votes)
29 views8 pages

ML Da1

ml

Uploaded by

sakhisharma1011
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
29 views8 pages

ML Da1

ml

Uploaded by

sakhisharma1011
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

QUESTION 1:

1. Task framing: State-of-charge (SOC), state-of-health (SOH), and remaining useful life (RUL)
are primarily treated as supervised regression tasks on time-series telemetry (voltage,
current, temperature, cycle count, ambient conditions). Energy consumption or range
prediction is also treated as regression but often mixes sequence and contextual features
(road grade, speed profile).

2. Model families:

o Classical / tree: Random Forest, Gradient Boosting (XGBoost, LightGBM, CatBoost)


remain strong for tabular features (hand-crafted aggregate/physics features).

o Sequence / deep: LSTM/GRU, Temporal Convolutional Networks, and Transformer-


based sequence models dominate for long, irregular telemetry streams.

o Hybrid / physics-informed: Models that combine electrochemical models or battery


equivalent-circuit-model residuals with ML (PI-ML) show improved physical
consistency and generalization.

3. Feature engineering: Physics-derived features (incremental capacity dQ/dV, differential


voltage, coulomb counting residuals, cycle statistics) greatly help generalization — especially
when lab-labeled cycles are scarce.

4. Data and labeling: Real-world vehicle fleet data is scarce and noisy; semi-supervised
learning, transfer learning and domain adaptation are widely used to move models trained
on lab cycles to in-field usage.

5. Evaluation practices: Beyond MAE/RMSE, modern studies report prediction intervals,


calibration metrics, domain-shift experiments, and per-condition breakdowns (temperature,
depth-of-discharge).

6. Deployment concerns: Lightweight, well-calibrated models (or model distillation to small


nets) that can run on embedded devices with online adaptation (personalization) are
prioritized.

Representative references

1. “Deep learning-based SOC estimation for EV batteries” (2024).

2. “Regression-based battery SOH estimation across chemistries” (2024).

3. “Energy-consumption prediction using pre-trained Transformer models” (2025).

4. “Multi-modal framework for vehicle-scale SOH evaluation” (Nature Communications, 2025).

5. “CatBoost + metaheuristic optimization for SOC” (2025).

6. “Semi-supervised SOH regression with limited field data” (SAE 2024).

7. “RUL prediction: hybrid physics + ML approaches” (2024).

8. “Transfer learning & domain adaptation for battery models” (2025).

9. “Quantile and interval estimation in battery regression tasks” (2024).

10. “Benchmarking lightweight models for on-board SOC estimation” (2025).


11. “Feature engineering for battery ageing: incremental capacity analysis” (2024).

12. “Energy consumption prediction across diverse drive cycles” (2025).

QUESTION 2:

Problem statement (synthetic dataset for demonstration): Predict house Price from:

• Bedrooms (integer),

• SquareFootage (continuous; ~10% missing values),

• AgeOfHouse (years),

• LocationRating (numerical 1–10),

• LocationCategory (Low / Med / High derived from LocationRating).

True data-generation equation used (for the synthetic dataset):


This was the generative model used to create the synthetic targets:

Price=30000+50000⋅(Bedrooms)+120⋅(SquareFootage)−800⋅(Age)+15000⋅LocationRating10+ε\text{Pr
ice} = 30000 + 50000\cdot(\text{Bedrooms}) + 120\cdot(\text{SquareFootage}) -
800\cdot(\text{Age}) + 15000\cdot\frac{\text{LocationRating}}{10} +
\varepsilonPrice=30000+50000⋅(Bedrooms)+120⋅(SquareFootage)−800⋅(Age)+15000⋅10LocationRatin
g+ε

where ε\varepsilonε ~ Normal(0, 20000). (This is the ground-truth we used to simulate data.)

Preprocessing pipeline (exact steps):

1. Numeric imputation: SquareFootage had ~10% missing values — imputed using median
(robust to outliers). Other numeric features left as-is (or median-imputed if a missing
existed).

2. Categorical imputation & encoding: LocationCategory imputed with most frequent value;
encoded via one-hot encoding (Low, Med, High).

3. Scaling: After imputation and encoding, numeric columns standardized to zero mean and
unit variance with StandardScaler. (One-hot columns are left as 0/1.)

4. ColumnTransformer combined numeric + categorical pipelines so the model sees consistent


preprocessed inputs.

Model: Ordinary Linear Regression (OLS) trained on preprocessed features.

Train/test split: 75% training, 25% test (random_state = 42).

Results (test set):

• Mean Absolute Error (MAE) ≈ 18,696.04 (units same as Price, i.e., currency units used in
dataset).

• R² ≈ 0.927.

Interpretation:
• R² ≈ 0.927: the linear model explains ~92.7% of the variance on the test set — consistent
with the data being generated from a mostly linear formula.

• MAE ≈ 18.7k: average absolute error — comparable to the standard deviation of the noise
injected (σ = 20k), so the model is recovering the deterministic part well.

QUESTION 3:

Problem statement (synthetic dataset for demonstration): Predict FinalExamResult (0 = fail, 1 =


pass) from:

• StudyHoursPerWeek, AttendancePercent, PrevSemesterGrade, ParticipationCount,


SleepHours.

Generative scoring (how targets were synthesized): A latent score was mixed from features:

score=0.3⋅(StudyHours)+0.25⋅(Attendance/10)+0.3⋅(PrevGrade/10)+0.05⋅(Participation)+0.10⋅(SleepH
ours)+η\text{score} = 0.3\cdot(\text{StudyHours}) + 0.25\cdot(\text{Attendance}/10) +
0.3\cdot(\text{PrevGrade}/10) + 0.05\cdot(\text{Participation}) + 0.10\cdot(\text{SleepHours}) +
\etascore=0.3⋅(StudyHours)+0.25⋅(Attendance/10)+0.3⋅(PrevGrade/10)+0.05⋅(Participation)+0.10⋅(Sl
eepHours)+η

where η\etaη is Gaussian noise. Probability of passing used logistic-like mapping; thresholded at 0.5
to produce binary label.

Preprocessing + model:

• Missing numeric values (if any) imputed with median.

• Numeric features standardized (zero mean, unit variance).

• Model: Logistic Regression (scikit-learn).

Train/test split: 75% train, 25% test.

Confusion matrix on test set (rows=true, columns=predicted):

[TNFPFNTP]=[050120]\begin{bmatrix} \text{TN} & \text{FP} \\ \text{FN} & \text{TP} \end{bmatrix} =


\begin{bmatrix} 0 & 5\\ 0 & 120 \end{bmatrix}[TNFNFPTP]=[005120]

So:

• True negatives (actual 0 predicted 0) = 0

• False positives (actual 0 predicted 1) = 5

• False negatives (actual 1 predicted 0) = 0

• True positives (actual 1 predicted 1) = 120

Derived metrics:

• Accuracy =TP+TNTP+TN+FP+FN=120+0125=0.96 = \dfrac{TP + TN}{TP+TN+FP+FN} =


\dfrac{120+0}{125} = 0.96=TP+TN+FP+FNTP+TN=125120+0=0.96 (96.0%).

• Precision (positive predictive value) =TPTP+FP=120120+5=120125=0.96 = \dfrac{TP}{TP+FP}


= \dfrac{120}{120+5} = \dfrac{120}{125} = 0.96=TP+FPTP=120+5120=125120=0.96 (96.0%).
• Recall (sensitivity) =TPTP+FN=120120+0=1.0 = \dfrac{TP}{TP+FN} = \dfrac{120}{120+0} =
1.0=TP+FNTP=120+0120=1.0 (100%).

• F1 score =2⋅precision⋅recallprecision+recall≈0.98 = 2 \cdot


\frac{\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}} \approx
0.98=2⋅precision+recallprecision⋅recall≈0.98.

Interpretation & caveats:

• The model shows outstanding performance on this synthetic test set. However note the test
set has very few failing samples (TN=0, FN=0), which can artificially inflate some metrics. In
real datasets you must:

o Check class balance (if classes are imbalanced, accuracy is not sufficient).

o Use ROC-AUC or precision-recall curves.

o Use cross-validation and stratified splits.

o Report per-class recall/precision and confusion-matrix normalized values.

QUESTION 4:

Problem statement (repeated): Four training vectors in 4D with targets:

• Class +1: x(1)=(1,1,1,1),x(2)=(−1,1,−1,−1)x^{(1)}=(1,1,1,1),\quad x^{(2)}=(-1,1,-1,-


1)x(1)=(1,1,1,1),x(2)=(−1,1,−1,−1)
• Class −1: x(3)=(1,1,1,−1),x(4)=(1,−1,−1,1)x^{(3)}=(1,1,1,-1),\quad x^{(4)}=(1,-1,-
1,1)x(3)=(1,1,1,−1),x(4)=(1,−1,−1,1)

Parameters: learning rate η=1.0\eta=1.0η=1.0, threshold θ=0.2\theta=0.2θ=0.2 implemented


as bias b=−θb=-\thetab=−θ. Initial weights w=(0,0,0,0)w=(0,0,0,0)w=(0,0,0,0). Activation
a=w⋅x+ba=w\cdot x + ba=w⋅x+b. Output y=sign⁡(a)y=\operatorname{sign}(a)y=sign(a)
with sign returning +1 if a≥0a\ge0a≥0, else −1. Update rule on error: w←w+η t xw\leftarrow
w + \eta\, t\, xw←w+ηtx, b←b+η tb\leftarrow b + \eta\, tb←b+ηt.

Training log (epoch-by-epoch, showing checks and updates):

• Initial: w(0)=[0,0,0,0], b(0)=−0.2w^{(0)} = [0,0,0,0],\; b^{(0)} = -


0.2w(0)=[0,0,0,0],b(0)=−0.2.

We iterate over examples in the order


x(1),x(2),x(3),x(4)x^{(1)},x^{(2)},x^{(3)},x^{(4)}x(1),x(2),x(3),x(4) each epoch.

Epoch 1 checks & updates

1. Example 1: x=(1,1,1,1), t=+1x=(1,1,1,1),\; t=+1x=(1,1,1,1),t=+1


o Activation a=w⋅x+b=0−0.2=−0.2a = w\cdot x + b = 0 - 0.2 = -
0.2a=w⋅x+b=0−0.2=−0.2 ⇒ y=y=y= -1 (misclassified).
o Update: w←w+1⋅(+1)⋅x=[1,1,1,1]w \leftarrow w + 1\cdot(+1)\cdot x =
[1,1,1,1]w←w+1⋅(+1)⋅x=[1,1,1,1]. b←−0.2+1=0.8b \leftarrow -0.2 + 1 =
0.8b←−0.2+1=0.8.
2. Example 2: x=(−1,1,−1,−1), t=+1x=(-1,1,-1,-1),\; t=+1x=(−1,1,−1,−1),t=+1
o Activation a=[1,1,1,1]⋅x+b=(1⋅−1+1⋅1+1⋅−1+1⋅−1)+0.8=−2+0.8=−1.2a =
[1,1,1,1]\cdot x + b = (1\cdot-1 +1\cdot1 +1\cdot-1 +1\cdot-1) + 0.8 = -2+0.8
= -1.2a=[1,1,1,1]⋅x+b=(1⋅−1+1⋅1+1⋅−1+1⋅−1)+0.8=−2+0.8=−1.2 ⇒ y=−1y=-
1y=−1 (misclassified).
o Update: w←[1,1,1,1]+1⋅(+1)⋅[−1,1,−1,−1]=[0,2,0,0]w \leftarrow [1,1,1,1] +
1\cdot(+1)\cdot[-1,1,-1,-1] =
[0,2,0,0]w←[1,1,1,1]+1⋅(+1)⋅[−1,1,−1,−1]=[0,2,0,0]. b←0.8+1=1.8b
\leftarrow 0.8 +1 = 1.8b←0.8+1=1.8.
3. Example 3: x=(1,1,1,−1), t=−1x=(1,1,1,-1),\; t=-1x=(1,1,1,−1),t=−1
o Activation a=[0,2,0,0]⋅x+1.8=(0+2+0+0)+1.8=3.8a = [0,2,0,0]\cdot x + 1.8 =
(0+2+0+0) + 1.8 = 3.8a=[0,2,0,0]⋅x+1.8=(0+2+0+0)+1.8=3.8 ⇒
y=+1y=+1y=+1 (misclassified).
o Update: w←[0,2,0,0]+1⋅(−1)⋅[1,1,1,−1]=[−1,1,−1,1]w \leftarrow [0,2,0,0] +
1\cdot(-1)\cdot[1,1,1,-1] = [-1,1,-
1,1]w←[0,2,0,0]+1⋅(−1)⋅[1,1,1,−1]=[−1,1,−1,1]. b←1.8−1=0.8b \leftarrow 1.8
- 1 = 0.8b←1.8−1=0.8.
4. Example 4: x=(1,−1,−1,1), t=−1x=(1,-1,-1,1),\; t=-1x=(1,−1,−1,1),t=−1
o Activation
a=[−1,1,−1,1]⋅x+0.8=(−1∗1+1∗−1+−1∗−1+1∗1)+0.8=(−1−1+1+1)+0.8=0.8a =
[-1,1,-1,1]\cdot x + 0.8 = (-1*1 +1*-1 + -1*-1 +1*1) +0.8 = (-1 -1 +1 +1) +0.8
=
0.8a=[−1,1,−1,1]⋅x+0.8=(−1∗1+1∗−1+−1∗−1+1∗1)+0.8=(−1−1+1+1)+0.8=0.8
⇒ y=+1y=+1y=+1 (misclassified).
o Update: w←[−1,1,−1,1]+(−1)∗[1,−1,−1,1]=[−2,2,0,0]w \leftarrow [-1,1,-1,1] +
(-1)*[1,-1,-1,1] = [-2,2,0,0]w←[−1,1,−1,1]+(−1)∗[1,−1,−1,1]=[−2,2,0,0].
b←0.8−1=−0.2b \leftarrow 0.8 -1 = -0.2b←0.8−1=−0.2.

Epoch 2 checks & updates

1. Example 1: now w=[−2,2,0,0],b=−0.2w=[-2,2,0,0], b=-0.2w=[−2,2,0,0],b=−0.2.


o Activation: a=−2+2+0+0−0.2=−0.2a = -2+2+0+0 - 0.2 = -
0.2a=−2+2+0+0−0.2=−0.2 ⇒ y=−1y=-1y=−1 (misclassified).
o Update: w←[−2,2,0,0]+(+1)∗[1,1,1,1]=[−1,3,1,1]w \leftarrow [-2,2,0,0] +
(+1)*[1,1,1,1] = [-1,3,1,1]w←[−2,2,0,0]+(+1)∗[1,1,1,1]=[−1,3,1,1].
b←−0.2+1=0.8b \leftarrow -0.2 +1 = 0.8b←−0.2+1=0.8.
2. Example 2: x=(−1,1,−1,−1),t=+1x=(-1,1,-1,-1), t=+1x=(−1,1,−1,−1),t=+1.
o Activation:
a=[−1,3,1,1]⋅x+0.8=(−1∗−1+3∗1+1∗−1+1∗−1)+0.8=(1+3−1−1)+0.8=2.8a = [-
1,3,1,1]\cdot x + 0.8 = (-1*-1 + 3*1 +1*-1 +1*-1) +0.8 = (1+3-1-1)+0.8 =
2.8a=[−1,3,1,1]⋅x+0.8=(−1∗−1+3∗1+1∗−1+1∗−1)+0.8=(1+3−1−1)+0.8=2.8 ⇒
y=+1y=+1y=+1 (correct — no update).
3. Example 3: x=(1,1,1,−1),t=−1x=(1,1,1,-1), t=-1x=(1,1,1,−1),t=−1.
o Activation:
a=[−1,3,1,1]⋅x+0.8=(−1∗1+3∗1+1∗1+1∗−1)+0.8=(−1+3+1−1)+0.8=2.8a = [-
1,3,1,1]\cdot x + 0.8 = (-1*1 + 3*1 +1*1 +1*-1) +0.8 = (-1+3+1-1)+0.8 =
2.8a=[−1,3,1,1]⋅x+0.8=(−1∗1+3∗1+1∗1+1∗−1)+0.8=(−1+3+1−1)+0.8=2.8 ⇒
y=+1y=+1y=+1 (misclassified).
o Update: w←[−1,3,1,1]+(−1)∗[1,1,1,−1]=[−2,2,0,2]w \leftarrow [-1,3,1,1] + (-
1)*[1,1,1,-1] = [-2,2,0,2]w←[−1,3,1,1]+(−1)∗[1,1,1,−1]=[−2,2,0,2].
b←0.8−1=−0.2b \leftarrow 0.8 -1 = -0.2b←0.8−1=−0.2.
4. Example 4: x=(1,−1,−1,1),t=−1x=(1,-1,-1,1), t=-1x=(1,−1,−1,1),t=−1.
o Activation:
a=[−2,2,0,2]⋅x−0.2=(−2∗1+2∗−1+0∗−1+2∗1)−0.2=(−2−2+0+2)−0.2=−2.2a =
[-2,2,0,2]\cdot x -0.2 = (-2*1 +2*-1 +0*-1 +2*1) -0.2 = (-2 -2 +0 +2)-0.2 = -
2.2a=[−2,2,0,2]⋅x−0.2=(−2∗1+2∗−1+0∗−1+2∗1)−0.2=(−2−2+0+2)−0.2=−2.2
⇒ y=−1y=-1y=−1 (correct — no update).

Epoch 3 checks

• With updated w=[−2,2,0,2],b=−0.2w=[-2,2,0,2], b=-0.2w=[−2,2,0,2],b=−0.2 we


check all four examples and find all are classified correctly:
o x(1)x^{(1)}x(1): a=1.8a=1.8a=1.8 ⇒ +1 (correct)
o x(2)x^{(2)}x(2): a=1.8a=1.8a=1.8 ⇒ +1 (correct)
o x(3)x^{(3)}x(3): a=−2.2a=-2.2a=−2.2 ⇒ −1 (correct)
o x(4)x^{(4)}x(4): a=−2.2a=-2.2a=−2.2 ⇒ −1 (correct)

Final learned parameters (converged):

w=[−2.0, 2.0, 0.0, 2.0],b=−0.2.\boxed{ w = [-2.0,\; 2.0,\; 0.0,\; 2.0], \qquad b = -0.2.
}w=[−2.0,2.0,0.0,2.0],b=−0.2.

Predictions on training set match targets [+1,+1,−1,−1][+1,+1,-1,-1][+1,+1,−1,−1].

Remarks: The perceptron converged in 3 epochs with the update sequence shown. Each
update followed the standard perceptron rule and produced final weights that separate the two
classes for these four points.

QUESTION 5:

Problem statement: Implement bipolar OR (inputs xi∈{−1,+1}x_i\in\{-1,+1\}xi∈{−1,+1})


with target:

• (−1,−1)→−1(-1,-1)\rightarrow -1(−1,−1)→−1; all other combinations


→+1\rightarrow +1→+1.

Setup: Initial weights w=(0,0)w=(0,0)w=(0,0), initial bias b=−0.5b=-0.5b=−0.5 (slight


negative bias so −1 is default for (-1,-1)), learning rate η=1\eta=1η=1. Sign
y=sign⁡(w⋅x+b)y=\operatorname{sign}(w\cdot x + b)y=sign(w⋅x+b) as before.

Epoch-by-epoch trace (updates only when misclassified):

• Initial: w=[0,0],b=−0.5w=[0,0], b=-0.5w=[0,0],b=−0.5.

Epoch 1:

1. Input (−1,−1),t=−1(-1,-1), t=-1(−1,−1),t=−1:


o a=0+(−0.5)=−0.5⇒y=−1a = 0 + (-0.5) = -0.5 \Rightarrow y=-
1a=0+(−0.5)=−0.5⇒y=−1 (correct, no update).
2. Input (−1,+1),t=+1(-1,+1), t=+1(−1,+1),t=+1:
o a=0−0.5=−0.5⇒y=−1a = 0 - 0.5 = -0.5 \Rightarrow y=-1a=0−0.5=−0.5⇒y=−1
(misclassified).
o Update: w←[0,0]+(+1)⋅[−1,+1]=[−1,1]w \leftarrow [0,0] + (+1)\cdot[-1, +1] =
[-1,1]w←[0,0]+(+1)⋅[−1,+1]=[−1,1]. b←−0.5+1=0.5b \leftarrow -0.5 + 1 =
0.5b←−0.5+1=0.5.
3. Input (+1,−1),t=+1(+1,-1), t=+1(+1,−1),t=+1:
o With current w=[−1,1],b=0.5w=[-1,1], b=0.5w=[−1,1],b=0.5:
a=−1∗1+1∗(−1)+0.5=−1−1+0.5=−1.5⇒y=−1a = -1*1 +1*(-1) +0.5 = -1 -1
+0.5 = -1.5 \Rightarrow y=-1a=−1∗1+1∗(−1)+0.5=−1−1+0.5=−1.5⇒y=−1
(misclassified).
o Update: w←[−1,1]+(+1)⋅[1,−1]=[0,0]w \leftarrow [-1,1] + (+1)\cdot[1,-1] =
[0,0]w←[−1,1]+(+1)⋅[1,−1]=[0,0]. b←0.5+1=1.5b \leftarrow 0.5 +1 =
1.5b←0.5+1=1.5.
4. Input (+1,+1),t=+1(+1,+1), t=+1(+1,+1),t=+1:
o a=0+1.5=1.5⇒y=+1a = 0 + 1.5 = 1.5 \Rightarrow y=+1a=0+1.5=1.5⇒y=+1
(correct).

Epoch 2:

1. (−1,−1),t=−1(-1,-1), t=-1(−1,−1),t=−1:
o a=0+1.5=1.5⇒y=+1a = 0 + 1.5 = 1.5 \Rightarrow y=+1a=0+1.5=1.5⇒y=+1
(misclassified).
o Update: w←[0,0]+(−1)⋅[−1,−1]=[1,1]w \leftarrow [0,0] + (-1)\cdot[-1,-1] =
[1,1]w←[0,0]+(−1)⋅[−1,−1]=[1,1]. b←1.5−1=0.5b \leftarrow 1.5 -1 =
0.5b←1.5−1=0.5.
2. Remaining examples check correctly with w=[1,1],b=0.5w=[1,1],
b=0.5w=[1,1],b=0.5. No further updates required.

Final learned parameters:

w=[1.0, 1.0],b=0.5.\boxed{ w = [1.0,\; 1.0],\qquad b = 0.5. }w=[1.0,1.0],b=0.5.

Verify truth table (activation a=x1+x2+0.5a = x_1 + x_2 + 0.5a=x1+x2+0.5):

• (−1,−1)(-1,-1)(−1,−1): a=−1−1+0.5=−1.5⇒y=−1a = -1 -1 +0.5 = -1.5 \Rightarrow y=-


1a=−1−1+0.5=−1.5⇒y=−1 ✓
• (−1,+1)(-1,+1)(−1,+1): a=−1+1+0.5=0.5⇒y=+1a = -1 +1 +0.5 = 0.5 \Rightarrow
y=+1a=−1+1+0.5=0.5⇒y=+1 ✓
• (+1,−1)(+1,-1)(+1,−1): a=1−1+0.5=0.5⇒y=+1a = 1 -1 +0.5 = 0.5 \Rightarrow
y=+1a=1−1+0.5=0.5⇒y=+1 ✓
• (+1,+1)(+1,+1)(+1,+1): a=1+1+0.5=2.5⇒y=+1a = 1 +1 +0.5 = 2.5 \Rightarrow
y=+1a=1+1+0.5=2.5⇒y=+1 ✓

All examples matched the bipolar OR target.

You might also like