
Panicked with OpenAI-compatible mlx_lm.server #4276

@mkozjak

Description

Describe the bug

goose panics when querying a working OpenAI-compatible mlx_lm.server on macOS 15.6.1.

To Reproduce
Steps to reproduce the behavior:

  1. Run mlx_lm.server with lmstudio-community/Qwen3-4B-MLX-4bit (commands sketched below)
  2. Verify the server responds
  3. Configure goose to use the server
  4. Confirm goose lists the available models
  5. Send any prompt, e.g. "hello"
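
Concretely, the setup looked roughly like this. The flags and the curl check are from memory, and goose's interactive configuration is where the local endpoint gets set, so treat this as a sketch of my environment rather than an exact transcript:

```sh
# Start the OpenAI-compatible server (mlx_lm.server defaults to port 8080).
pip install mlx-lm
mlx_lm.server --model lmstudio-community/Qwen3-4B-MLX-4bit --port 8080

# Step 2: sanity-check the server from another shell.
curl http://localhost:8080/v1/models

# Step 3: point goose's OpenAI provider at the local endpoint
# (goose configure prompts for provider, host, and API key).
goose configure
```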

Expected behavior
I expected to get a response from the goose CLI instead of a panic.

Please provide the following information:

  • OS & Arch: macOS 15.6.1 on MacBook Pro M1 2021 13"
  • Interface: CLI
  • Version: v1.5.0
  • Provider & Model: OpenAI - lmstudio-community/Qwen3-4B-MLX-4bit

Additional context
Output with RUST_BACKTRACE=full:

starting session | provider: openai model: lmstudio-community/Qwen3-4B-MLX-4bit
    logging to /Users/mkozjak/.local/share/goose/sessions/20250822_124435.jsonl
    working directory: /Users/mkozjak

Goose is running! Enter your instructions, or try asking what goose can do.

Context: ○○○○○○○○○○ 0% (0/128000 tokens)
( O)> hello
◓  Mapping context vectors...
thread 'main' panicked at crates/goose/src/providers/formats/openai.rs:478:84:
index out of bounds: the len is 0 but the index is 0
stack backtrace:
   0:        0x1076fa18c - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hca5ddac00d0fe81d
   1:        0x107715744 - core::fmt::write::h345c34c786f611a2
   2:        0x1076c8a1c - std::io::Write::write_fmt::hf3ba17a7aed57370
   3:        0x1076fa084 - std::sys::backtrace::BacktraceLock::print::h5f3c3a4eb4693f7a
   4:        0x1076e0528 - std::panicking::default_hook::{{closure}}::haaa8c3be8e963b10
   5:        0x1076e0424 - std::panicking::default_hook::hf46b416954f6334b
   6:        0x1076e0b04 - std::panicking::rust_panic_with_hook::hae736c323e602f79
   7:        0x1076fa698 - std::panicking::begin_panic_handler::{{closure}}::h0f4c8174a03fed47
   8:        0x1076fa3b0 - std::sys::backtrace::__rust_end_short_backtrace::hc96ad38e8f461380
   9:        0x1076e0628 - __rustc[75b86e412cc6dff0]::rust_begin_unwind
  10:        0x1077db68c - core::panicking::panic_fmt::h6f546b4a6f225fa9
  11:        0x1077db80c - core::panicking::panic_bounds_check::hd382327e3f27e417
  12:        0x1069e0fdc - <async_stream::async_stream::AsyncStream<T,U> as futures_core::stream::Stream>::poll_next::hfdbef9deedd00789
  13:        0x1069dd904 - <async_stream::async_stream::AsyncStream<T,U> as futures_core::stream::Stream>::poll_next::h5f992b557c4d9d6c
  14:        0x1051b93cc - <async_stream::async_stream::AsyncStream<T,U> as futures_core::stream::Stream>::poll_next::hd9f8b3b4fef7cac7
  15:        0x1051fad98 - goose::agents::agent::Agent::reply_internal::{{closure}}::{{closure}}::hc6aa0e48387db047
  16:        0x1051b64a8 - <async_stream::async_stream::AsyncStream<T,U> as futures_core::stream::Stream>::poll_next::h0584de3f3d89745c
  17:        0x104ffb108 - <core::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::h79edfaa660883a4b
  18:        0x104f41208 - goose_cli::session::Session::process_agent_response::{{closure}}::h62f3e2efd1f5db36
  19:        0x104f383d8 - goose_cli::session::Session::interactive::{{closure}}::h9ed6dd2874531469
  20:        0x104f26e48 - goose_cli::cli::cli::{{closure}}::h5a957a9c79e6a21d
  21:        0x104f6ca0c - goose::main::{{closure}}::hf6ef7b5286b6dbd5
  22:        0x10501e5b8 - tokio::runtime::park::CachedParkThread::block_on::hbfb7e8eb15ce5b8d
  23:        0x10525e790 - goose::main::h25eb38abe2385176
  24:        0x10535aa1c - std::sys::backtrace::__rust_begin_short_backtrace::hf584e981f3000ed7
  25:        0x1052aa18c - std::rt::lang_start::{{closure}}::h60bce630afd490e8
  26:        0x1076deb84 - std::rt::lang_start_internal::h7aa982cc60ae55c0
  27:        0x10525ea4c - _main
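
I have not dug into goose's source beyond the panic location, but the message ("the len is 0 but the index is 0" at openai.rs:478) looks like the streaming handler indexes choices[0] on a chunk whose choices array is empty. Some OpenAI-compatible servers do emit such chunks (e.g. a trailing usage-only chunk), and mlx_lm.server may be one of them. A minimal sketch of that suspected failure mode and a defensive alternative, with all type names hypothetical rather than goose's actual ones:

```rust
// Hypothetical types approximating an OpenAI streaming chunk; this is a
// guess at the failure mode, not goose's real code. Requires serde
// (with the "derive" feature) and serde_json.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Delta {
    content: Option<String>,
}

#[derive(Deserialize, Debug)]
struct Choice {
    delta: Delta,
}

#[derive(Deserialize, Debug)]
struct Chunk {
    choices: Vec<Choice>,
}

fn main() {
    // A chunk with no choices, as some OpenAI-compatible servers send
    // at the end of a stream.
    let chunk: Chunk = serde_json::from_str(r#"{"choices": []}"#).unwrap();

    // Direct indexing reproduces the reported panic:
    // "index out of bounds: the len is 0 but the index is 0"
    // let delta = &chunk.choices[0].delta;

    // Defensive version: skip chunks without choices.
    match chunk.choices.first() {
        Some(choice) => println!("delta: {:?}", choice.delta),
        None => println!("chunk had no choices; skipping"),
    }
}
```

If that guess is right, replacing the indexing with .first() (or .get(0)) and skipping empty chunks would turn the panic into a handled case.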

For comparison, the same model works fine via mlx_lm.chat:

[INFO] Starting chat session with lmstudio-community/Qwen3-4B-MLX-4bit.
The command list:
- 'q' to exit
- 'r' to reset the chat
- 'h' to display these commands
>> hello
<think>
Okay, the user said "hello". I need to respond appropriately. Since they're greeting me, I should respond with a friendly greeting. Maybe "Hi there! How can I assist you today?" That sounds good. It's welcoming and opens the door for them to ask questions. I should keep it simple and positive. Let me make sure there's no markdown and that the response is natural. Yep, that should work.
</think>

Hi there! How can I assist you today? 😊
