Add Img2Mesh, Text2Mesh option for inference #133

Open
lalalune opened this issue Oct 24, 2022 · 3 comments

lalalune commented Oct 24, 2022

Is your feature request related to a problem? Please describe.
Many models are coming online from several directions that let users generate meshes unconditionally, from a text prompt, or from an image prior. These projects are hard to coordinate on because they are not well represented in HuggingFace's model hub or inference API. That also affects downstream work such as Microsoft's MII inference pipeline, which is tightly integrated with HuggingFace.

The goal of this feature request is to consider, with an eye to the future, adding 3D mesh tasks as a standard task type.

Example of Img2Mesh
https://github.com/monniert/unicorn

Example of Text2Mesh
https://github.com/ashawkey/stable-dreamfusion

Example of Unconditional Mesh Generation
https://nv-tlabs.github.io/GET3D

Example of text-guided animation with motion diffusion
https://github.com/GuyTevet/motion-diffusion-model

Describe the solution you'd like
Add support for 3D mesh responses. This is similar to image responses, but in some formats the mesh and texture are stored in separate files, so this will need to be considered. Some meshes may also have multiple parts or texture images, although in practice no model has done this so far.

The popular output formats are the following:

  1. .OBJ model with a .PNG texture and a .MTL material description
  2. FBX model with embedded texture
  3. GLB (binary glTF) model with embedded texture
  4. Raw NumPy array (.npy or .npz file)
  5. ZIP file containing custom data or another format
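
To make the "mesh and texture can be separated" point concrete, here is a minimal sketch of what format 1 implies for a response: one mesh becomes up to three files. `MeshOutput` and `to_obj_bundle` are hypothetical names invented for this illustration and are not part of any existing HuggingFace API. The sketch also omits UV coordinates (`vt` lines), which a properly textured mesh would need.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class MeshOutput:
    """Hypothetical in-memory mesh result (illustration only)."""
    vertices: List[Tuple[float, float, float]]
    faces: List[Tuple[int, int, int]]  # zero-based vertex indices
    texture_png: Optional[bytes] = None  # raw PNG bytes, if any


def to_obj_bundle(mesh: MeshOutput, name: str = "mesh") -> Dict[str, bytes]:
    """Serialize a mesh to the OBJ family of files, keyed by filename."""
    textured = mesh.texture_png is not None
    lines = []
    if textured:
        # The OBJ file only references the material file by name.
        lines.append(f"mtllib {name}.mtl")
    for x, y, z in mesh.vertices:
        lines.append(f"v {x} {y} {z}")
    if textured:
        lines.append("usemtl material0")
    for face in mesh.faces:
        # OBJ face indices are one-based.
        lines.append("f " + " ".join(str(i + 1) for i in face))

    files = {f"{name}.obj": ("\n".join(lines) + "\n").encode("ascii")}
    if textured:
        # The MTL file in turn references the texture image by name.
        mtl = f"newmtl material0\nmap_Kd {name}.png\n"
        files[f"{name}.mtl"] = mtl.encode("ascii")
        files[f"{name}.png"] = mesh.texture_png
    return files
```

An untextured triangle yields a single `mesh.obj`, while a textured one yields `mesh.obj`, `mesh.mtl` and `mesh.png`, so a response type for this format would have to carry several named files rather than one blob.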

Narsil (Contributor) commented Oct 25, 2022

Hi @lalalune,

Makes perfect sense.

Would FBX be general enough for most use cases? It seems like a good target (single file, pretty standard).

It's definitely very addable, but I don't know whether any of the existing integrations support it. (We don't support diffusers yet.)

Narsil (Contributor) commented Oct 25, 2022

Tagging @mishig25 for his view on the front end part of such widgets.

StephenHodgson (Contributor) commented Jun 11, 2023

GLB (binary glTF) would probably be best, since it is a single container for all of the embedded content of the 3D model itself.
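
As a rough illustration of the "container" point, the sketch below packs and unpacks the GLB layout as defined in the glTF 2.0 specification: a 12-byte header followed by a JSON chunk and an optional binary chunk, all in one file. The helper names `build_glb` and `read_glb_chunks` are made up for this example, and the code does no validation of the glTF JSON itself.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"
CHUNK_BIN = 0x004E4942   # ASCII "BIN" followed by a zero byte


def build_glb(gltf: dict, binary: bytes = b"") -> bytes:
    """Pack a glTF JSON document and an optional binary buffer into GLB."""
    json_bytes = json.dumps(gltf).encode("utf-8")
    # Chunks are padded to 4-byte boundaries: spaces for JSON, zeros for BIN.
    json_bytes += b" " * (-len(json_bytes) % 4)
    binary += b"\x00" * (-len(binary) % 4)

    body = struct.pack("<II", len(json_bytes), CHUNK_JSON) + json_bytes
    if binary:
        body += struct.pack("<II", len(binary), CHUNK_BIN) + binary
    # Header: magic, container version 2, total length including the header.
    return struct.pack("<III", GLB_MAGIC, 2, 12 + len(body)) + body


def read_glb_chunks(data: bytes):
    """Return (version, [(chunk_type, chunk_bytes), ...]) for a GLB blob."""
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    if length > len(data):
        raise ValueError("truncated GLB file")
    chunks = []
    offset = 12
    while offset < length:
        chunk_len, chunk_type = struct.unpack_from("<II", data, offset)
        offset += 8
        chunks.append((chunk_type, data[offset:offset + chunk_len]))
        offset += chunk_len
    return version, chunks
```

Because geometry and texture bytes both live in the binary chunk, a GLB response could be returned as a single binary payload, much like an image.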
