Extend image optimization to support in-memory streams(BytesIO/bytes) and dst_format param #289

Open
cking100 wants to merge 1 commit into openzim:main from cking100:convert-optimize-single-pass

@cking100 cking100 commented Mar 27, 2026

Fixes #284

Changes:

  • Modified optimize_image and optimize_gif to accept io.BytesIO as input
  • Added support for bytes as src inside optimize_image
  • Deprecated the convert parameter of optimize_image; users should now use dst_format instead
  • Delegated format conversion to optimize_xxx methods
  • Added test cases for BytesIO/bytes support and dst_format param
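
As a rough illustration of the first three bullets (not the code in this PR), here is a minimal sketch of how the accepted src types could be normalized to a single stream, and how a deprecated convert argument could be mapped onto dst_format. The helper names as_stream and resolve_dst_format are hypothetical, and treating a string convert value as the target format is an assumption.

```python
import io
import warnings
from pathlib import Path
from typing import Optional, Union


def as_stream(src: Union[Path, bytes, io.BytesIO]) -> io.BytesIO:
    """Return the image data behind src as a BytesIO positioned at offset 0."""
    if isinstance(src, Path):
        return io.BytesIO(src.read_bytes())
    if isinstance(src, bytes):
        return io.BytesIO(src)
    # A caller may have left the cursor at the end after writing.
    src.seek(0)
    return src


def resolve_dst_format(
    dst_suffix: str,
    dst_format: Optional[str] = None,
    convert: Union[bool, str, None] = None,
) -> str:
    """Pick the output format, still honouring a deprecated convert argument."""
    if convert is not None:
        warnings.warn(
            "convert is deprecated, use dst_format instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: a string value of convert names the target format.
        if isinstance(convert, str) and dst_format is None:
            dst_format = convert
    if dst_format is None:
        # Fall back to the format implied by the destination suffix.
        dst_format = dst_suffix.lstrip(".")
    return dst_format.lower()
```

An explicit dst_format wins over both the deprecated argument and the suffix, which keeps the migration path for callers simple.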

@cking100 cking100 changed the title from "convert and optimize image in a single pass" to "[WIP] convert and optimize image in a single pass" Mar 29, 2026
@benoit74 benoit74 marked this pull request as draft March 30, 2026 08:55
@benoit74
Collaborator

I've converted the PR to a draft since it is not ready; please mark it as ready once ... ready 🤣

@benoit74 benoit74 marked this pull request as ready for review March 30, 2026 08:56
@benoit74 benoit74 marked this pull request as draft March 30, 2026 08:56

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 91.37931% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.33%. Comparing base (8c16fa6) to head (4dcf9e7).

Files with missing lines Patch % Lines
src/zimscraperlib/image/optimization.py 91.37% 4 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #289      +/-   ##
==========================================
- Coverage   99.52%   99.33%   -0.20%     
==========================================
  Files          41       41              
  Lines        2516     2545      +29     
  Branches      354      365      +11     
==========================================
+ Hits         2504     2528      +24     
+ Misses          8        6       -2     
- Partials        4       11       +7     


@cking100 cking100 force-pushed the convert-optimize-single-pass branch 3 times, most recently from e5357ec to c10b2eb Compare March 30, 2026 23:01
@cking100
Author

@benoit74 I will add the test cases soon. But, before that, can I please get a review on the changes?

Collaborator

@benoit74 benoit74 left a comment

A few remarks. You are on the right path, thank you.

Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
@cking100 cking100 force-pushed the convert-optimize-single-pass branch 2 times, most recently from 631d2e7 to b0eea3c Compare April 1, 2026 20:02
@cking100
Author

cking100 commented Apr 1, 2026

@benoit74 I have made the requested changes. Can you please take a look? Thanks

Collaborator

@benoit74 benoit74 left a comment

Nice, I realize we still have "details" to sort out ^^

Another general question: do we really need the @overload definitions now? I no longer see what they add in this specific case, since it looks to me like we now support all combinations (src is Path and dst is BytesIO, src is BytesIO and dst is Path, ...)

Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
@cking100
Author

cking100 commented Apr 2, 2026

Regarding the overloads: since these functions are used by multiple scrapers, you might know better than me. We can either remove the overloads and have cleaner code, but then we will have to deal with a union type every time the functions are called, or keep them and not worry about it. @benoit74

@benoit74
Collaborator

benoit74 commented Apr 2, 2026

> Regarding the overloads: since these functions are used by multiple scrapers, you might know better than me. We can either remove the overloads and have cleaner code, but then we will have to deal with a union type every time the functions are called, or keep them and not worry about it. @benoit74

Let's keep it as-is for now and I will double-check this later.
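
To make the trade-off in this exchange concrete, here is a minimal sketch of what overloads can buy at call sites: the return type follows the type of dst, so callers do not have to narrow a union themselves. The function optimize_stub is a hypothetical stand-in that simply copies bytes; whether the overloads in this PR were shaped this way is an assumption.

```python
import io
from pathlib import Path
from typing import Optional, Union, overload


@overload
def optimize_stub(src: bytes, dst: None = None) -> io.BytesIO: ...
@overload
def optimize_stub(src: bytes, dst: io.BytesIO) -> io.BytesIO: ...
@overload
def optimize_stub(src: bytes, dst: Path) -> Path: ...


def optimize_stub(
    src: bytes, dst: Optional[Union[Path, io.BytesIO]] = None
) -> Union[Path, io.BytesIO]:
    """Hypothetical stand-in: writes src to dst unchanged and returns dst."""
    if dst is None:
        dst = io.BytesIO()
    if isinstance(dst, Path):
        dst.write_bytes(src)
        return dst
    dst.write(src)
    dst.seek(0)  # leave the stream ready to be read by the caller
    return dst
```

With the overloads, a type checker infers io.BytesIO for optimize_stub(data) and Path for optimize_stub(data, some_path). Without them, every call returns the union and the caller has to narrow it with isinstance or a cast.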

@cking100 cking100 force-pushed the convert-optimize-single-pass branch from b0eea3c to bb3e1c6 Compare April 2, 2026 12:24
@cking100 cking100 requested a review from benoit74 April 2, 2026 12:34
@cking100 cking100 force-pushed the convert-optimize-single-pass branch from bb3e1c6 to c312f81 Compare April 6, 2026 00:56
Collaborator

@benoit74 benoit74 left a comment

A few more remarks. Also, I now realize we probably do not need to add a convert option to the other optimize_xxx methods. We should just get src_format from src and, if it differs from the method's format, do the conversion.

This also means that we probably shouldn't try to guess src_format in optimize_image and should instead delegate that responsibility to the optimize_xxx methods.
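
The decision described here (detect the source format, convert only when it differs from the method's own format) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it sniffs well-known magic bytes instead of calling the library's own format detection, and the names sniff_format and needs_conversion are hypothetical.

```python
from typing import Optional


def sniff_format(data: bytes) -> Optional[str]:
    """Guess the image format from the leading magic bytes, or return None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    # WebP is a RIFF container whose form type at offset 8 is "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    return None


def needs_conversion(data: bytes, method_format: str) -> bool:
    """True when the source is not already in the format the method produces."""
    src_format = sniff_format(data)
    if src_format is None:
        raise ValueError("unrecognised image data")
    return src_format != method_format.upper()
```

Each format-specific method would then own this check, so the dispatcher never needs to know the source format.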

Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
Comment thread src/zimscraperlib/image/optimization.py Outdated
@cking100
Author

cking100 commented Apr 6, 2026

> I now realize we probably do not need to add a convert option to the other optimize_xxx methods.

Adding a convert option to each optimize method made things complicated. This change would help simplify the code.

> This also means that we probably shouldn't try to guess src_format in optimize_image and should instead delegate that responsibility to the optimize_xxx methods.

So we no longer need to call ensure_matches inside optimize_xxx methods, right? @benoit74
We can guess src_format using format_for inside each of them, and the format parameter of img.save() will handle the conversion. Even if the formats do not match, the image gets converted automatically.

I will update the PR with the requested changes soon. Thank you.

@benoit74
Collaborator

benoit74 commented Apr 7, 2026

> So we no longer need to call ensure_matches inside optimize_xxx methods, right? @benoit74

Yes, since we will now support implicit conversion. I do not see a use case where one would be sad that we've implicitly performed the required conversion. Maybe I am missing some cases, but let's keep it simple for the time being.

@cking100 cking100 force-pushed the convert-optimize-single-pass branch from c312f81 to 1886348 Compare April 7, 2026 09:35
@cking100
Author

cking100 commented Apr 7, 2026

@benoit74 I have updated the PR with the requested changes. I also removed the overloads, as I think it is cleaner this way, and kept ensure_matches in case any scraper is using it.

This change will cause one test case, test_optimize_image_unsupported_format, to fail, but I will fix it and also add new test cases once everything is good to go. Please have a look and let me know. Thank you.

@benoit74
Collaborator

benoit74 commented Apr 7, 2026

Code changes LGTM, well done.

@cking100 cking100 force-pushed the convert-optimize-single-pass branch from 1886348 to a2145cc Compare April 8, 2026 08:48
@cking100 cking100 marked this pull request as ready for review April 8, 2026 08:49
@cking100 cking100 changed the title from "[WIP] convert and optimize image in a single pass" to "Convert and optimize image in a single pass" Apr 8, 2026
@cking100 cking100 force-pushed the convert-optimize-single-pass branch from a2145cc to f3334d2 Compare April 8, 2026 08:57
@cking100
Author

cking100 commented Apr 8, 2026

@benoit74 Done. Added the test cases as well. Please have a look, and thank you

@benoit74
Collaborator

@cking100 I've been out of office most of last week so I'm slowly recovering; I'll get back to you sometime this week. In the meantime, can you please at least fix the PR title and CHANGELOG entry to something more "aligned" with the changes we've made? Thank you.

@cking100
Author

@benoit74 Sure, I will fix the PR title and CHANGELOG entry. And no worries, take your time.

@cking100 cking100 force-pushed the convert-optimize-single-pass branch from 7700cad to 4dcf9e7 Compare April 13, 2026 18:18
@cking100 cking100 changed the title from "Convert and optimize image in a single pass" to "Extend image optimization to support in-memory streams(BytesIO/bytes) and dst_format param" Apr 13, 2026
Collaborator

@benoit74 benoit74 left a comment

LGTM, thanks a lot. This has been tedious work for something quite lean in the end (which I'm happy about; I just wanted to acknowledge the effort you've put into this change, which could look quite minimal in the end).

@rgaudin do you want to have a look as well now that this is ready or not?

@rgaudin
Member

rgaudin commented Apr 16, 2026

I took a quick look at it; looks good.

@benoit74
Collaborator

@cking100 please take a look at the QA issues. We also aim to have 100% code coverage, so please add the required tests. I'm a bit surprised the main branch is not at 100% anymore; I will have to look into it, but at least ensure coverage does not decrease.

If one line is especially stupid to cover, I'm OK with introducing a pragma to ignore it in the coverage report, but I feel like some important things should be covered with proper test cases (e.g. pass convert as boolean, test RGBA conversion to gif, ...)

@cking100
Author

> This has been tedious work for something quite lean in the end (which I'm happy about; I just wanted to acknowledge the effort you've put into this change, which could look quite minimal in the end).

Thank you. I know it turned out to be simple, but it needed to be done to be able to fix it. Happy to help.

> please take a look at the QA issues. We also aim to have 100% code coverage, so please add the required tests. I'm a bit surprised the main branch is not at 100% anymore; I will have to look into it, but at least ensure coverage does not decrease.

I will fix the QA issues. The coverage was around 96% for optimization.py, with a few lines inside optimize_gif that were not covered: those where Gifsicle fails and we clean up the temp files. I will add tests for them and update the PR soon.

> test RGBA conversion to gif

AFAIR this is already covered by the test case test_optimize_gif_with_non_gif_src. I will check and add those too if not already present. @benoit74
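
The uncovered branch mentioned in this comment (the external tool fails and temp files get cleaned up) follows a common pattern that can be exercised without the real tool. Below is a minimal sketch, assuming the temp file is created in a caller-supplied directory; run_with_temp_file and its parameters are hypothetical, and the real optimize_gif may be organized differently.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def run_with_temp_file(command: List[str], payload: bytes, workdir: Path) -> bytes:
    """Write payload to a temp file, run command on it, always remove the file."""
    with tempfile.NamedTemporaryFile(
        dir=workdir, suffix=".gif", delete=False
    ) as tmp:
        tmp.write(payload)
        tmp_path = Path(tmp.name)
    try:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run([*command, str(tmp_path)], check=True, capture_output=True)
        return tmp_path.read_bytes()
    finally:
        # Runs on both success and failure, so no temp file is left behind.
        tmp_path.unlink(missing_ok=True)
```

A test can trigger the failure path by passing a command that is known to exit with a non-zero status (for example a Python one-liner calling sys.exit(1)) and then asserting that the working directory is empty afterwards.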

@cking100 cking100 force-pushed the convert-optimize-single-pass branch from 4dcf9e7 to c8427ef Compare April 17, 2026 14:51
@cking100 cking100 force-pushed the convert-optimize-single-pass branch from c8427ef to 08817bd Compare April 17, 2026 16:36
@cking100
Author

@benoit74 Fixed the QA issues and also added the test cases. The overall coverage is 99% now. Please have a look, thanks

Development

Successfully merging this pull request may close these issues.

How to convert and optimize an image to webp in one step with default settings?

3 participants