Old onnx version #39

guyluz11 · 2025-01-16T11:06:38Z

The file coreml_provider_factory.h was updated 2 weeks ago.

Our copy of coreml_provider_factory.h was last updated 2 years ago.

I think this affects performance in my app, as generating a summary takes much longer than in Python (with the same code).

guyluz11 · 2025-01-16T12:20:53Z

It turns out that reading outputs[0].value takes longer than the inference run itself:

final List<OrtValue?>? outputs =
          await session.runAsync(runOptions, inputs);

All of my code runs inside a loop that builds the summary, so 100 iterations * 412 ms = 41.2 seconds.

This is the code I timed:

      // Run the decoder
      final stopwatch1 = Stopwatch()..start(); // Start the stopwatch

      final List<OrtValue?>? outputs =
          await session.runAsync(runOptions, inputs);

      stopwatch1.stop(); // Stop the stopwatch
      // Maximum observed: Execution time: 34 ms
      print('Execution time: ${stopwatch1.elapsedMilliseconds} ms');

      if (outputs == null || outputs.isEmpty) {
        printInDebug('Decoder outputs are empty!');
        break;
      }

      // Extract logits and calculate the next token
      final OrtValue? output0 = outputs[0];
      if (output0 == null) {
        printInDebug('Decoder output[0] is null!');
        break;
      }
      final stopwatch = Stopwatch()..start(); // Start the stopwatch

      final List<List<List<double>>> output0Value =
          output0.value! as List<List<List<double>>>;

      stopwatch.stop(); // Stop the stopwatch
      // Maximum observed: Execution time: around 412 ms
      print('Execution time: ${stopwatch.elapsedMilliseconds} ms');
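Both measurements above print under the same "Execution time" label, which makes the log hard to read. A minimal sketch of how the two phases could be timed under distinct labels, and how the per-iteration cost adds up over the loop, is below. The helper names `timed` and `totalSeconds` are hypothetical and not part of any onnxruntime package, and the workload in `main` is only a stand-in for the `output0.value` read.

```dart
// Hypothetical helper: runs a synchronous step, prints how long it took
// under the given label, and returns the step's result.
T timed<T>(String label, T Function() step) {
  final stopwatch = Stopwatch()..start();
  final result = step();
  stopwatch.stop();
  print('$label: ${stopwatch.elapsedMilliseconds} ms');
  return result;
}

// Hypothetical helper: total loop cost in seconds for a fixed
// per-iteration cost in milliseconds.
double totalSeconds(int iterations, int perIterationMs) {
  return iterations * perIterationMs / 1000;
}

void main() {
  // Stand-in workload (assumption); in the loop above this would be
  // the `output0.value` read.
  final total = timed('value extraction', () {
    var sum = 0;
    for (var i = 0; i < 1000000; i++) {
      sum += i;
    }
    return sum;
  });
  print('stand-in result: $total');

  // Figures reported in this issue: 100 iterations at about 412 ms each.
  print('${totalSeconds(100, 412)} seconds');
}
```

With separate labels, the log shows directly whether the time goes to `runAsync` or to the value read in each iteration.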
