Compile the model online in EP #1111
wujiangGitHub started this conversation in Support for Targets (OS / EPs / Hardware)
Replies: 3 comments 3 replies
-
Is this an onnxruntime-specific issue? Or is there a solution in onnxruntime that genai is not making accessible?
-
Update: ORT GenAI now supports the OpenVINO EP; please use the latest commits from the https://github.com/microsoft/onnxruntime main branch.
-
Hello, how is the performance of large language generation models with this EP?
-
Hello, some onnxruntime backends, such as OpenVINO, require online compilation of models. On CPUs that handle online compilation of large models poorly, this can take more than 30 minutes. Is there any solution for this? Or are there any requirements for supporting such backends?
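One common way to avoid paying the compilation cost on every run is the OpenVINO EP's compilation cache: the first session compiles and writes blobs to a cache directory, and later sessions reload them. A minimal sketch, assuming an onnxruntime build with OpenVINO support; `model.onnx` and `./ov_cache` are placeholder paths, and the exact set of supported provider options depends on the ORT/OpenVINO versions installed:

```python
# Sketch: configure the OpenVINO EP with a cache directory so compiled
# model blobs are reused across runs instead of recompiled each time.
providers = [
    ("OpenVINOExecutionProvider", {
        "device_type": "CPU",       # target device for OpenVINO
        "cache_dir": "./ov_cache",  # reuse compiled blobs on later runs
    }),
    "CPUExecutionProvider",         # fallback for unsupported nodes
]

# Requires onnxruntime built with the OpenVINO EP, so it is left
# commented out here:
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```

With caching enabled, only the first session construction should incur the long compile; subsequent runs load from `./ov_cache`.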