🐛 Describe the bug
As highlighted in the screenshot, the avg_inference_latency values don't make sense when compared between the etLLM and optimum-et generated models. Checking the raw results from the CI, the other latency-related metrics, such as generate_time and tokens_per_sec (TPS), are close between etLLM and optimum-et. avg_inference_latency is a separate metric, measured in a separate test from the one that reports generate_time and tokens_per_sec (TPS); it should simply report the latency of the forward() call, regardless of whether the model is an LLM or not. Since the reported generate_time and TPS are very close, I suspect there is a bug in how avg_inference_latency is measured, or in how it is wired to the dashboard.
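For context, this is a minimal sketch of how a forward()-only latency metric is typically computed, independent of the generation loop; it is not the actual benchmark harness, and the `run_forward` callable and parameter names here are hypothetical stand-ins:

```python
import time
import statistics

def measure_avg_inference_latency(run_forward, warmup=3, iters=20):
    """Time only the forward() call, separate from tokenization/generation.

    `run_forward` is a hypothetical zero-arg callable that executes one
    forward() pass on a fixed input; it stands in for whatever the real
    harness invokes.
    """
    for _ in range(warmup):
        run_forward()  # warm-up runs, not counted

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_forward()  # a single forward() call
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    # avg_inference_latency in the sense used in this issue:
    # the mean wall-clock time of a forward() call
    return statistics.mean(latencies_ms)
```

If avg_inference_latency is computed roughly like this, it should track closely with the per-token timings implied by generate_time and TPS for both export paths; a large divergence in avg_inference_latency alone points at the measurement itself or the dashboard wiring rather than the models.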
Versions
trunk