Resyncing the Connector
This page describes when and how to re-sync mongo-connector. The most common reason to re-sync is that mongo-connector could not replicate operations from the oplog fast enough. This can happen during heavy write activity in MongoDB, such as a bulk load with mongoimport. Because the oplog is a capped collection, its oldest records are overwritten once the collection is full, so any operations that mongo-connector has not yet read are lost.
You can make mongo-connector more tolerant of short bursts of high write activity by increasing the oplog size in MongoDB. A larger oplog covers a longer time window, which lets mongo-connector "catch up" when write activity subsides.
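To make the idea of the oplog time window concrete, here is a minimal sketch in Python. It assumes you have already obtained the first and last oplog event times (for example from `rs.printReplicationInfo()` in the mongo shell) and the timestamp of the last operation mongo-connector processed. The function names `oplog_window` and `has_fallen_behind` are hypothetical and are not part of mongo-connector.

```python
from datetime import datetime, timedelta


def oplog_window(first_entry: datetime, last_entry: datetime) -> timedelta:
    """Return the time span currently covered by the oplog."""
    return last_entry - first_entry


def has_fallen_behind(last_processed: datetime, first_entry: datetime) -> bool:
    """Return True if the last processed operation is older than the
    oldest entry still in the oplog, meaning some operations were
    overwritten before they could be read."""
    return last_processed < first_entry
```

If `has_fallen_behind` returns True, the connector has missed operations and a re-sync is needed. A larger oplog increases `oplog_window`, which makes this less likely.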
The only way to ensure that the data in your external system is consistent with what is in MongoDB is to delete and re-index all documents in the target. After all data is removed, you may delete the oplog progress file (usually called "config.txt") and re-start mongo-connector. Mongo-connector will then perform a collection dump, re-indexing all your data. Be careful and double-check that you are deleting only and exactly what you mean to delete.
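Removing the oplog progress file can be scripted. The following is a minimal sketch, assuming the file lives at a path you pass in explicitly. The helper name `remove_progress_file` is hypothetical. It deletes only that one file and reports whether anything was removed.

```python
from pathlib import Path


def remove_progress_file(path: str) -> bool:
    """Delete the oplog progress file at `path`.

    Returns True if a regular file was removed, or False if no such
    file exists. Directories are never touched.
    """
    progress = Path(path)
    if not progress.is_file():
        return False
    progress.unlink()
    return True
```

For example, `remove_progress_file("config.txt")` would remove the progress file from the current directory. Stop mongo-connector first, clear the target, and only then remove the file and restart.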
If your target system is MongoDB, the simplest and fastest way to remove its data is to drop the target database:
mongo
> db.getSisterDB("<database name>").dropDatabase()
{ "dropped" : "<database name>", "ok" : 1 }
Or only drop a collection:
> db.getSisterDB("<database name>").<collection name>.drop()
true
If your target system is Solr, you can remove all data by sending a GET request to the following URL:
http://<hostname>:<port>/solr/<core name>/update?commit=true&stream.body=<delete><query>*:*</query></delete>
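The delete query in that URL contains characters such as `<`, `>`, and `*` that usually need percent-encoding when the request is sent from a script. Below is a minimal sketch that builds the encoded URL with Python's standard library. The function name `solr_delete_all_url` and its parameters are hypothetical.

```python
from urllib.parse import urlencode


def solr_delete_all_url(host: str, port: int, core: str) -> str:
    """Build the Solr update URL that deletes every document in `core`,
    with the query parameters percent-encoded."""
    params = {
        "commit": "true",
        "stream.body": "<delete><query>*:*</query></delete>",
    }
    return f"http://{host}:{port}/solr/{core}/update?{urlencode(params)}"
```

The resulting string can be passed to `urllib.request.urlopen` or pasted into a browser. Check the host, port, and core name carefully before sending it, since the request deletes everything in that core.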
If your target system is Elasticsearch, you can remove all data quickly and efficiently by deleting the index and re-creating it:
curl -XDELETE http://<hostname>:<port>/<index name>
curl -XPUT http://<hostname>:<port>/<index name>
After this, you should refresh the index to make these changes visible:
curl -XPOST http://<hostname>:<port>/<index name>/_refresh
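The same delete, re-create, and refresh sequence can be prepared from Python. This sketch only builds the three requests, so they can be inspected before anything is sent. The function name `es_reset_requests` is hypothetical, and it assumes the index is reachable over plain HTTP as in the curl commands above.

```python
from urllib.request import Request


def es_reset_requests(host: str, port: int, index: str) -> list:
    """Return the requests that delete an index, re-create it,
    and refresh it, in the order they should be sent."""
    base = f"http://{host}:{port}/{index}"
    return [
        Request(base, method="DELETE"),              # drop the index and all its documents
        Request(base, method="PUT"),                 # re-create it empty
        Request(f"{base}/_refresh", method="POST"),  # make the change visible
    ]
```

To execute them, pass each request in turn to `urllib.request.urlopen`, after confirming that every `full_url` points at the index you intend to clear.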
There is no other way to restore a state that is consistent with the source MongoDB replica set or cluster. However, you can get mongo-connector running again without clearing the target: delete the oplog progress file and restart mongo-connector. This causes mongo-connector to perform a collection dump, re-saving the latest versions of all documents, and then start tailing the oplog. This does not bring your target to a consistent state, but it may be suitable for pure insert/update use cases. If any delete operations were lost when the oplog rolled over, the deleted documents remain in the target, and mongo-connector cannot remove them without a proper re-sync (described above).
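The difference between the two approaches can be shown with a toy model. This is not mongo-connector's code. It assumes a collection dump behaves like an upsert of every source document into the target, and it uses plain dictionaries keyed by document id to stand in for the two systems.

```python
def collection_dump(source: dict, target: dict) -> dict:
    """Toy model of a collection dump: upsert every source document
    into a copy of the target. Documents that exist only in the
    target are left in place."""
    result = dict(target)
    result.update(source)
    return result


# Document 2 was deleted in MongoDB, but the delete fell off the oplog.
source = {1: "a-v2", 3: "c"}
target = {1: "a-v1", 2: "b"}

after_quick = collection_dump(source, target)  # target not cleared first
after_full = collection_dump(source, {})       # target cleared first
```

In this model `after_quick` still contains the stale document 2, while `after_full` matches the source exactly, which is why only the full re-sync guarantees consistency.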