-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance EXIF handling #6641
Comments
Isn't this just part of the unfortunate reality that https://exiftool.org/TagNames/EXIF.html lists EXIF tags, and one of the groups listed is an ExifIFD? I would have thought the names were inherently confusing.
I'm not opposed to the idea of named constants. I'd want to see what happens with #6630 first.
You mean adding more entries to ExifTags? Again, not opposed, just would want to see what happens with #6630.
I find this a bit vague. Some more detail on what other tags you're thinking of could be helpful.
#6630 again. Rather than methods to return the description as a string, I wonder if returning IntEnums as the values would be a solution? That would allow the names to be inspected without having to add new arguments or methods. |
I don't think so. I might be totally wrong here, though. As far as I understand, the Exif segment (which is returned by
Yes. I just rechecked. GPSTags, which can only occur in the GPSInfo IFD, can have "duplicate" tag numbers. For all other tags (maybe excluding MakerNotes, not so sure about them), including for example the Interop tags, this does not seem to be the case and Exiftool itself puts all the tag numbers in one big list. So it should be safe for Pillow to just extend the ExifTags dict (or enum in the future) as well.
That should only be the offsets to the next IFD (tag 0xa005 InteropOffset, tag 0x8825 GPSInfo, and a few others). It's not very important though. When I have to write code for "Give me all the information about this image" then the GPS long/lat are important, but it's not really important (obviously depending on the use case) at which byte the GPS Information starts. We extract the information anyway to parse the whole Exif segment, but it would be a convenience function to filter out the offsets.
#6630 should cover at least 99%, yes. One could argue that "Light Source" is a nicer string representation than what we get from But additionally, think for example of tag 0x9208 LightSource. Currently, we could only retrieve an integer as the value of this tag. But for example "1" is not very descriptive. "Daylight" (for which the 1 stands) is much nicer. This should also apply for setting this tag value. Currently, we would have to do something similar to |
yep, many people get confused by this and use |
I've created #6724 to add Interop tags. |
maybe it can help, I have the problem that the examples to obtain the gpsinfo with getexif() is not working in the last version of pillow, because gpsInfo is a number like is described in this question , the only way to obtain the gpsinfo was using _getexif |
Correct. I should check if tag names, that are allowed to appear in IFD0 and IFD1, are still missing. But that would be a minor addition.
@olmerg You can do it by using |
@olmerg if you still think that you can't get the GPS info from an image, open a new issue with more details and a copy of the image. I've created PR #6748 to add "named constants for common IFDs". If that is merged, you would be able to use I've also created PR #6749 to add a "LightSource" enum. "Flash" was also mentioned, but https://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/flash.html lists
as a possible value. I don't think that translates well to an enum. |
We're agreed that IFD0 comes first, and that the other IFDs are linked from that. That is the reality of the data, and I think Pillow is reflecting that. As an aside, Pillow used to flatten the other IFDs in, but that meant you couldn't tell which tags were from IFD0 and which were from another IFD. You're instead suggesting that the IFDs be thought of as being 1-dimensional instead, because automatically loading IFD0 might be unintuitive, and trying all IFDs in the same way might be a simpler way to think about them. I'm personally more inclined to continue to reflect the fact that one IFD links to another, and loading IFD0 automatically seems helpful, rather than forcing the user to specifically choose it. Regardless of the preferred way of thinking about this is though, Pillow values backwards compatibility. I don't think there's a way to implement this without breaking that. This change wouldn't allow for any additional functionality or fix any bugs, it's just trying to improve the clarity of the API. So I don't think the benefits outweigh the costs. |
I've created PR #6762 for this. |
There is a Now only three: Webp, Png and Jpeg file formats support In all three the code almost the same: def _getexif(self):
if "exif" not in self.info:
return None
return self.getexif()._get_merged_dict() Proposal: add optional parameter( Asking, cause i do not know, need I implement this for my heif plugin or not. I do not want to implement protected method, to not angry linters, and do not want to introduce custom flags that will be used only in one type of images. Or are there other plans to get full Exif data instead of |
@radarhere ,I got the GPS information but it was so difficult to find how, the examples that I find in the internet did not work. The only way that I find was to access the private data of the class. I think will be great your proposal about a more easy way to get that type of data. |
Close, but I would still disagree partly here. Pillow provides
I disagree. I can't imagine a scenario where I would want a merged dict, because I can never be sure that the same tag was not stored in multiple IFDs (one example would be XResolution, which can be present in IFD0 for the original image and in IFD1 for the embedded thumbnail). I am still not able to use (only) Pillow to extract the embedded thumbnail and its metadata in IFD1, because I have no way of extracting the offset to IFD1 for use in |
@bigcat88 you're not currently using |
At the moment, even if I created a PR with a Would you mind if I tried to solve this problem with the current API? Could you open a new issue about this, attaching a copy of an image and describing what you're trying to extract? What specification are you referring to? |
@radarhere Thanks for the offer. Please see the linked issue #6777 |
The
getexif()
method ofPIL.Image
could use some enhancements, which most likely require an API bump. I'm just listing some ideas/shortcomings of the current implementation here, which I would love to see addressed (and could possibly contribute to, if the changes are agreed upon).getexif()
isExif
. But without further processing, it's really only IFD0. However, already returning an IFD makes theget_ifd()
method look out of place. Therefore, I recommend to implement theExif
class as a dict linking to all extracted IFDs.Exif.get_ifd0()
) or named constants for common IFDs (e.g.Exif.get_ifd(ExifTags.IFD.IFD0)
) would improve the usage of the API a lot. (Added IFD enum to ExifTags #6748)I can't say anything about the feasability regarding licencing, but maybe the tag names could be scraped from the Exiftool website (the quasi reference implementation of the Exif format): https://exiftool.org/TagNames/EXIF.html
There are already other python libraries for Exif handling, but they all suffer from fragmentation. As the Pillow library already does the heavy lifting of extracting the IFDs anyway, I think providing a convenient and nearly complete view of the data can be a significant improvement to the project.
The text was updated successfully, but these errors were encountered: