Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Options to ignore auto conversion id to _id #3184

Open
kbiits opened this issue Oct 22, 2024 · 17 comments
Open

[Feature Request] Options to ignore auto conversion id to _id #3184

kbiits opened this issue Oct 22, 2024 · 17 comments

Comments

@kbiits
Copy link

kbiits commented Oct 22, 2024

Is your feature request related to a problem?

We have a merchant.id field in our mongodb document, and after upgrading to mongodb driver 5.1 we encountered issue that mongodb driver modify the where query from merchant.id to merchant._id which results to unexpected query

Describe the solution you'd like

It would be good if there's mechanism to whitelisting the fields from auto conversion .id to ._id

Describe alternatives you've considered

Additional context

This part of code modify all .id to ._id
image

@Abdullahbutt3434
Copy link

I am looking for the solution here

@Abdullahbutt3434
Copy link

**protected $primaryKey = '_id';**
I tried to put above line in model that is not working as well
if we want to go with _id there must a be 
please guide. 

@kbiits
Copy link
Author

kbiits commented Oct 22, 2024

I am looking for the solution here

What I've done to get rid of the issue is by downgrading the MongoDB driver to version 4.x

@GromNaN
Copy link
Member

GromNaN commented Oct 29, 2024

We have no plan to allow usage of .id fields in embedded documents. For your existing database, you can rename the field with the $rename operator.

DB::collection('products')->raw()->updateMany([], ['$rename' => ['merchant.id' => 'merchant._id']]);

If you need to keep both id and _id fields during update to avoid downtime, you can update in 3 steps:

  1. Duplicate the field merchant.id into merchant._id
  2. Deploy your code updating laravel-mongodb package
  3. Remove merchant.id

@kbiits
Copy link
Author

kbiits commented Oct 29, 2024

That's too scary since we have a lot of services query the same collection and we don't want to break them.
We can still use the v4 version instead of renaming the field. I hope you guys have the bandwidth to support it in the future.

@katjackson
Copy link

I'm also having an issue with this behavior. @GromNaN Would you consider making the aliasIdForQuery method protected, so that it could be overwritten? Renaming the nested id keys in our large code base is just too risky.

@GromNaN
Copy link
Member

GromNaN commented Dec 10, 2024

Could you provide more precision about your data model? Are you using id fields on the main document on only for embedded documents?

@katjackson
Copy link

katjackson commented Dec 10, 2024

They are in embedded documents. In some places, they are structured as EmbedsMany models on a parent model, and in others, they are just arrays of objects stored on a model.

{
    "_id" : ObjectId(),
    "dynamic_content" : {
        "conditions" : [
            {
                "id" : "32C67020",
                "name" : "Deposit",
                "description" : "Deposited Students",
                "active" : true,
                "rules" : [
                    {
                        "content" : {...},
                        "type" : "user_segment"
                    }
                ]
            },
            {
                "id" : "6FED4F05",
                "name" : "Admit",
                "description" : "Admitted students-Nondeposits",
                "active" : true,
                "index_weight" : NumberInt(2),
                "rules" : [
                    {
                        "content" : {...},
                        "type" : "user_segment"
                    },
                    {
                        "content" : {...},
                        "type" : "date_condition"
                    }
                ]
            }
        ]
    }
}

The changes to alias '_id' in results and queries are incredibly problematic for us. We have almost 200 models in our code base. Our API handles about 3.5 million requests a day to ~1000 endpoints. Changing the structure of the data is not a reasonable option for us.

aliasIdForResult is public, so I was able to overwrite it, to continue returning '_id' alongside 'id'. But because aliasIdForQuery is private, I would need to overwrite toMql, insert, insertGetId, delete, performUpdate, and inheritConnectionOptions in order to change the behavior of aliasIdForQuery so that I can manage those models with nested 'id' values.

@leitommi
Copy link

leitommi commented Jan 14, 2025

We have no plan to allow usage of .id fields in embedded documents. For your existing database, you can rename the field with the $rename operator.

@GromNaN Is it known why subfields rename from "id" to "_id" happens in the first place? To be clear, I am talking about subfields in document, not the primary key "_id".

Another note, this change should have been in the release notes. Currently there is mentioned only the primary key.

All in all, it is problematic for my project as well and at least opt out would be lifesaver.

@GromNaN
Copy link
Member

GromNaN commented Jan 14, 2025

We plan to add a setting to disable _id conversion for embedded documents.

@leitommi
Copy link

We plan to add a setting to disable _id conversion for embedded documents.

It is superb to hear that! Is there a target version in mind as well or any other indication about timeline?

@arfar-x
Copy link

arfar-x commented Feb 16, 2025

Hey,
Since this is a problem causing corruptions in data modeling and querying, it indeed must be considered a bug rather than a feature.
Another scenario I have encountered such problem is that, I have a document like the following stored in the database:

{
  "_id": ObjectId('abcd123'),
  "key": "value",
  "data": [
    0: {
      "content": "Lorem ipsum",
      "another_key": "another_value",
      "_id": "a unique ID from another storage",
      "id": "another unique ID from another storage"
    },
    1: {
      "content": "Lorem ipsum II",
      "another_key": "another_value_ii",
      "_id": "a unique ID from another storage II",
      "id": "another unique ID from another storage II"
    },
    ...
  ]
}

When I retrieve this document by Eloquent, all '_id' fields are converted into 'id', event those nested '_id' fields having nothing to do with primary-key. So it would be like this:

{
  "id": ObjectId('abcd123'),
  "key": "value",
  "data": [
    0: {
      "content": "Lorem ipsum",
      "another_key": "another_value",
      "id": "another unique ID from another storage"
    },
    1: {
      "content": "Lorem ipsum II",
      "another_key": "another_value_ii",
      "id": "another unique ID from another storage II"
    },
    ...
  ]
}

After realizing it, I encountered a worse case which was overriding protected $primaryKey = 'id'; to use '_id'. However, nothing affected the final result.

@mshamaseen
Copy link

This _id to id auto conversion is so awkward. I understand the intention and that the maintainers want to unify the package with Laravel way, but introducing such a huge change without any backward care is a huge pain!

Our frontend, API, and Wehbook listeners expect to receive the _id field, now, after upgrading to version 5.x, they all receive id instead, and there is no way to return it to _id. imagine how many applications will that break, and how hard to tell all these parties that they should listen for the id from now on. also it would be unjustifiable change.

would appreciate having a way to stop this conversion, as the features introduced in version 5 are amazing, and we would love to upgrade to v5 without weeks of work and downtime!

@bisht2050
Copy link
Contributor

Thanks all for your feedback.
We plan to start working on PHPORM-255 that addresses this challenge in the next couple of weeks.

@leitommi
Copy link

leitommi commented Mar 5, 2025

@bisht2050
PHPORM-255 seems to suggest to allow whitelisting fields that opt out from the conversion. What most here are looking for is disabling the conversion fully.

@alcaeus
Copy link
Member

alcaeus commented Mar 5, 2025

@leitommi the issue copies the description of this pull request, and only links a pull request received from an external contributor (now closed) for reference.

At the moment, we're favouring an option to disable this automatic conversion for embedded documents. Embedded documents don't necessarily need an ID (but can profit from having one), so we shouldn't mess around with the identifier in embedded documents. However, since we have made the hard BC break for this, the only way away from this is to keep the BC break in 5 (to avoid having to release version 6 with another BC break), but to allow people to opt out from this new behaviour. Essentially, updating to version 5 of the integration requires you to update this.

All in all, we're suffering from an old design decision made here. MongoDB always has used _id as a primary key, and there was never a way to disable this or change this. Since the value of this _id field can be anything, a compound primary key would be represented as an array or an embedded document, even though MongoDB recommends doing that because of potential issues when querying.

Laravel on the other hand has always used id as the primary key. If we were to rewrite this library from scratch, we would always want people to use id in their models, but map it to the _id field in the documents, as that's what satisfies both conventions best. Since software development usually isn't a green field, we're stuck with some decisions made long before we took over the library, and we made a judgment call to change this behaviour without fully understanding the implications and the impact it has on the users.

There's this example from earlier which I want to address:

{
  "content": "Lorem ipsum",
  "another_key": "another_value",
  "_id": "a unique ID from another storage",
  "id": "another unique ID from another storage"
}

I hope for your sanity that this is just an example designed to highlight the issue, not an actual use case. Having _id and id with two separate values is asking for trouble - simply because they're too easy to confuse. If I were to have the use case of needing my own identifier as well as storing an identifier from an external system, I'd use _id for my own identifier (as that's the convention) and some sort of externalId (or even better named for the appropriate system) for the external identifier. However, I'm not going to tell you how to design your documents. As hair raising as that example may be, we want to give users a way to keep working with these documents.

We will however, keep the current behaviour for root documents: when people use id, they usually mean the identifier of the document, which in MongoDB is _id. One simple reason for this change is that a lot of Laravel functionality assumes the primary key field to be id, even when you define something else in the model. We've previously argued for changing this (see laravel/framework#48089 and laravel/framework#52079), but the change was denied both times. So, the only way for us to allow people to use Laravel components with MongoDB was to assume that id refers to _id. Anything other than that would've involved rewriting parts of Laravel specifically for MongoDB, making the developer experience much worse as you'd need separate paginators for something that should be abstracted away by Eloquent.

@katjackson
Copy link

we made a judgment call to change this behaviour without fully understanding the implications and the impact it has on the users.

this admission has honestly given me some peace lol. thank you for addressing this issue.

i spent weeks upgrading our system to v5, and ended up overwriting a lot of the new code to avoid these conversions. but if i had been able to opt out of changing nested _id keys, i might have been able to get away with a simpler solution. so i'm sharing it here in case it helps anyone else.

You can use eloquent's $appends property to ensure '_id' is set on all of your serialized models. And then extend all of your models from the adjusted base model class.

<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Casts\Attribute;
use Illuminate\Database\Eloquent\Model;

class BaseModel extends Model
{
    /**
     * The accessors to append to the model's array form.
     *
     * @var array
     */
    protected $appends = ['_id'];

    /**
     * Append _id to arrays
     */
    protected function id(): Attribute
    {
        return new Attribute(
            get: fn (string $value) => $value ?? $this->id,
        );
    }
}

This way, you don't have to find every place in your code where you expected '_id' to exist on an array. You will still have to deal with the addition of the new 'id' property, and hide it in places where you don't want to expose the primary key.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants