Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object view message definitions #11

Merged
merged 5 commits into from
Aug 14, 2020

Conversation

alex-mitrevski
Copy link
Member

@alex-mitrevski alex-mitrevski commented Aug 7, 2020

Summary

Related to b-it-bots/mas_domestic_robotics#26 and b-it-bots/mas_domestic_robotics#233

The PR adds messages that are useful for permanently memorising object views. This is useful, for example, for object and face recognition, where we need to store a (small) set of prototype views of objects/faces.

I particularly added three messages to support the representation of views:

  • ObjectEmbedding: A (low-dimensional) embedding of an object, for instance found by a Siamese network
  • ObjectView: An object view represented by (i) an image, (ii) a point cloud, and (iii) an embedding
  • ObjectViews: A list of individual object views Based on the discussion below, I removed this message and instead added an ObjectView array to the Person, Face, and Object messages

I decided to represent an object view with three different types of information - a cloud, an image, and an embedding - since different modalities may be useful in different contexts: in particular, the cloud could be used for registration, while the image/embedding could be used for image-based recognition.

The image and the embedding are redundant (the message encodes an implicit assumption that the embedding is found from the associated image), but they could serve different purposes: the embedding is useful to have for fast(er) recognition (so that we don't have to recompute it every time we want to recognise an object); the image is primarily useful for transparency (for example, we might want to identify the images that were responsible for recognising an object).

Need for the PR

Currently, the repository contains separate messages for people and objects. The added messages unify objects and people into a single representation that is specifically designed for a visual recognition task. The expected use is that the view messages will be filled from the data in the object/person messages, which are used during online detection.

@minhnh
Copy link
Member

minhnh commented Aug 7, 2020

I'm not actually convinced that objects and people can be the same thing, since the tasks involved with both can differ significantly (anything speech related for example, though I realize it's not relevant here in mas_perception_msgs). An alternative maybe is creating a visual information/categorization message and reusing it in both object and person message.

@alex-mitrevski
Copy link
Member Author

alex-mitrevski commented Aug 7, 2020

I'm not actually convinced that objects and people can be the same thing, since the tasks involved with both can differ significantly

In what sense are they not the same? As I mentioned in the PR description, the messages are added specifically for a visual recognition task; they are not meant to be used for other purposes. To put it in other terms, they are data structures that will make it possible to write a generic visual recognition algorithm.

An alternative maybe is creating a visual information/categorization message and reusing it in both object and person message.

I'm not sure what you mean with this message; could you please elaborate?

@minhnh
Copy link
Member

minhnh commented Aug 7, 2020

In what sense are they not the same?

The tasks are different, with possibly more interactions than just picking and placing. The question is whether this will be affected by how the message is defined. Potentially something like age, gender will be recognition for people only. This can maybe dealt with via a list of attributes, but we have to be careful about mixing types.

I'm not sure what you mean with this message; could you please elaborate?

It may be just the naming issue ('subject' may be better than 'object'?). So is this meant to replace the Object.msg and Person.msg? What I have in mind is essentially instead of having the ObjectViews.msg just add the ObjectView.msg to the Person.msg and Object.msg to replace the other cloud/image fields.

@alex-mitrevski
Copy link
Member Author

The tasks are different, with possibly more interactions than just picking and placing. The question is whether this will be affected by how the message is defined. Potentially something like age, gender will be recognition for people only. This can maybe dealt with via a list of attributes, but we have to be careful about mixing types.

You are thinking in terms of tasks here, but visual recognition doesn't necessarily have anything to do with a task. Nor with attributes (though attributes can be used to update the confidence in the recognition). But in any case, we try to answer questions of the following type: is the face I see Minh, or Alex, or someone else; or is the cup I see Alex's cup, or Minh's cup, or an unknown cup? We try to answer this given a list of "views" of each face, or each cup.

It may be just the naming issue ('subject' may be better than 'object'?). So is this meant to replace the Object.msg and Person.msg?

No, replacement is not what I had in mind. The new messages are supposed to supplement the existing messages (particularly since not all applications need a recognition functionality).

What I have in mind is essentially instead of having the ObjectViews.msg just add the ObjectView.msg to the Person.msg and Object.msg to replace the other cloud/image fields.

This doesn't fully work because of the multiple view assumption (i.e. we may have a collection of images/clouds of the same object/person/face from multiple views). Unless I add the ObjectViews message to both Person and Object; that would work, I suppose.

@minhnh
Copy link
Member

minhnh commented Aug 7, 2020

Unless I add the ObjectViews message to both Person and Object; that would work, I suppose.

I meant adding ObjectView[] to the fields in Person and Object and use name and category from thoses messages.

If it's to supplement those messages then I don't really have any issue with the change.

@alex-mitrevski alex-mitrevski changed the base branch from kinetic to devel August 7, 2020 14:00
@alex-mitrevski
Copy link
Member Author

I meant adding ObjectView[] to the fields in Person and Object and use name and category from thoses messages.

OK, that makes sense. I added an ObjectView array to Person, Face, and Object. I will however change the PR to WIP now since these are breaking changes (requiring changes in at least mas_domestic_robotics and mas_perception_libs).

@sthoduka @deebuls Will these changes break anything in the @work code?

This makes the ObjectViews message obsolete, so I also removed that
@alex-mitrevski alex-mitrevski force-pushed the feature/object-view-msg branch from 289d7b6 to aa247f8 Compare August 7, 2020 14:16
@alex-mitrevski alex-mitrevski changed the title Object view message definitions WIP: Object view message definitions Aug 7, 2020
@mhwasil
Copy link
Member

mhwasil commented Aug 11, 2020

I meant adding ObjectView[] to the fields in Person and Object and use name and category from thoses messages.

OK, that makes sense. I added an ObjectView array to Person, Face, and Object. I will however change the PR to WIP now since these are breaking changes (requiring changes in at least mas_domestic_robotics and mas_perception_libs).

@sthoduka @deebuls Will these changes break anything in the @work code?

Yes, this will break most of our perception codes. We use Object.msg in our perception codes.

@alex-mitrevski
Copy link
Member Author

@mhwasil I was afraid that would be the case. But you will use noetic-devel from now on, right? If that's the case, I could still merge the changes to kinetic-devel since I'll continue working with that branch for the time being.

@mhwasil
Copy link
Member

mhwasil commented Aug 11, 2020

@mhwasil I was afraid that would be the case. But you will use noetic-devel from now on, right? If that's the case, I could still merge the changes to kinetic-devel since I'll continue working with that branch for the time being.

Yes, we will use noetic-devel. Nevertheless, changing our code is an easy task :-), so we do not have a problem with that.
so regarding the naming of the msg ObjectView.msg, would it be better to use ObjectRepresentation.msg instead? or any other name.
ObjectView is somehow confused with an object having different view points.

@alex-mitrevski
Copy link
Member Author

Yes, we will use noetic-devel. Nevertheless, changing our code is an easy task :-), so we do not have a problem with that.

OK, that's great!

so regarding the naming of the msg ObjectView.msg, would it be better to use ObjectRepresentation.msg instead? or any other name. ObjectView is somehow confused with an object having different view points.

Actually, that's exactly what the list of ObjectView messages is supposed to represent - the same object as seen from different view points. That's why I used that name for the message.

@sthoduka
Copy link
Contributor

I thought it was about different representations too, but I didn't read carefully enough. ObjectView sounds fine for different viewpoints

@alex-mitrevski alex-mitrevski changed the base branch from devel to kinetic-devel August 14, 2020 13:17
@alex-mitrevski
Copy link
Member Author

Thanks everyone for your suggestions. I'm going to merge this now so that I can also proceed with b-it-bots/mas_perception_libs#18 and b-it-bots/mas_domestic_robotics#234.

@alex-mitrevski alex-mitrevski changed the title WIP: Object view message definitions Object view message definitions Aug 14, 2020
@alex-mitrevski alex-mitrevski merged commit 18c1dc8 into kinetic-devel Aug 14, 2020
@alex-mitrevski alex-mitrevski deleted the feature/object-view-msg branch August 14, 2020 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants