Reindexing is needed #35

Open
vkolotov opened this issue Nov 28, 2011 · 13 comments

@vkolotov

We are trying to use Elasticsearch and, as far as we can see, the elasticsearch module has no feature to reindex all entities in the DB. I think it would be quite a useful feature, especially when you are migrating from one DB to another or when you add entities via an SQL script.

It is a simple and straightforward feature which could be implemented as a method in an admin controller.

@bgooren
Contributor

bgooren commented Nov 28, 2011

It is already possible in the current release, but only by using undocumented methods. We can add a documented way to do this.

@vkolotov
Author

Ok great! Can you tell us how to do it with the current release? BTW, we have already implemented this feature ourselves, but we still need an "official" version. We can also share our implementation with you, if you want.

@bgooren
Contributor

bgooren commented Nov 29, 2011

We stop Play, remove the index files, and then do the following for every indexed class:

private static <T extends Model> void updateIndex(Class<T> clazz) {
    // Update index
    ElasticSearchPlugin plugin = Play.plugin(ElasticSearchPlugin.class);
    ModelMapper<T> mapper = MapperFactory.getMapper(clazz);
    ElasticSearchAdapter.startIndex(plugin.client(), mapper);

    List<T> objs = JPQL.instance.findAll(clazz.getName());
    for (T obj : objs) {
        try {
            ElasticSearchAdapter.indexModel(plugin.client(), mapper, obj);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

As you can see, all that is needed to allow a user to index a model is a wrapper method for

ElasticSearchAdapter.indexModel(plugin.client(), mapper, obj);

@bgooren
Contributor

bgooren commented Nov 29, 2011

Haha, seems like I already added that wrapper method. See

ElasticSearch.index()

So what you can do is remove your Elasticsearch index files while Play is stopped, start Play, and feed all the models you want to index to ElasticSearch.index().

@vkolotov
Author

Great! Thank you!

But I just want to say that it is better to implement paging while iterating over the entities, since the number of entities might be very large. Loading them all at once might cause an OutOfMemoryError.

Anyway, this feature could be implemented in the admin console of the ES module, the way it is done in the "search module". Is that planned?

@bgooren
Contributor

bgooren commented Nov 29, 2011

Sure, paging is important. But for now I'd say that is for the user to handle.

Of course it's something which could be added to the admin.
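
To make the paging suggestion concrete, here is a minimal sketch of a page-by-page reindex loop. It is not part of the module: `PagedReindexSketch`, `reindexInPages`, `pageFetcher` and `indexer` are hypothetical names, the page fetcher stands in for whatever paged query the application uses, and the indexer stands in for a call such as ElasticSearch.index(). It also assumes Java 8 or later for the lambdas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

/** Sketch only: hypothetical helper, not part of the elasticsearch module. */
final class PagedReindexSketch {

    /**
     * Fetches entities one page at a time and passes each entity to the indexer,
     * so that only a single page is held in memory at once.
     *
     * @param pageFetcher maps (offset, limit) to the entities in that window
     * @param indexer     called once per entity (e.g. a wrapper around the real index call)
     * @param pageSize    maximum number of entities per page, must be positive
     * @return the number of entities passed to the indexer
     */
    static <T> int reindexInPages(BiFunction<Integer, Integer, List<T>> pageFetcher,
                                  Consumer<T> indexer,
                                  int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int offset = 0;
        while (true) {
            List<T> page = pageFetcher.apply(offset, pageSize);
            for (T entity : page) {
                indexer.accept(entity);
            }
            offset += page.size();
            // An empty or short page means there is nothing left to fetch.
            if (page.size() < pageSize) {
                break;
            }
        }
        return offset;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a database table with 250 rows.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            table.add(i);
        }
        List<Integer> indexed = new ArrayList<>();
        int count = reindexInPages(
                (offset, limit) -> table.subList(
                        Math.min(offset, table.size()),
                        Math.min(offset + limit, table.size())),
                indexed::add,
                100);
        System.out.println(count + " entities indexed");
    }
}
```

In a real application the page fetcher would issue a paged database query, and clearing the persistence context between pages may also be needed to keep memory flat.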

@aheritier

I don't know whether this is due to the lack of paging or to something else, but I see strange behavior when I try to create or update the index from the data in my DB. I'm using the embedded ES and try to index one category of objects (~200, so not many), and each time I index them some objects are missing. The first run indexes only the objects after the first 100 in the list, and each time I call the update again it adds a few of the missing objects. It is as if the loop over the objects were too fast for ES to index them. The problem is that when I use a debugger to analyze the issue, it disappears...

@aheritier

Just adding a "Thread.sleep(5);" in my loop solves the issue :(

@aheritier

Instead of the sleep, adding a request to refresh the index also solves my issue:
ElasticSearch.client().index(Requests.indexRequest().refresh(true));

@bgooren
Contributor

bgooren commented Jun 1, 2012

Hmmm, sounds like this is more of an elasticsearch issue.
If you see everything after you ask ES to refresh its index, it means all updates did arrive, but aren't visible right away.

I haven't encountered such issues myself, but feel free to submit a patch if you find a good solution for this issue.

@isamaru
Contributor

isamaru commented Jul 13, 2012

I have found the root cause of the issue. It is an issue in the Play framework: indexing tasks go through a play.libs.F.EventStream queue, which has a bounded buffer size (100) and discards events without warning on overflow.
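
The effect described above can be reproduced with a small stand-alone sketch. This is not Play's EventStream code: `DroppingBufferSketch` is a hypothetical class, and the choice to evict the oldest event on overflow is an assumption, picked because it matches the earlier report that only the objects after the first 100 were indexed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch only: a bounded buffer that silently drops events, not Play's implementation. */
final class DroppingBufferSketch<T> {
    private final int capacity;
    private final Deque<T> events = new ArrayDeque<>();
    private int dropped = 0;

    DroppingBufferSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Adds an event; if the buffer is full, the oldest event is discarded (assumed policy). */
    void publish(T event) {
        if (events.size() >= capacity) {
            events.pollFirst();
            dropped++; // counted here for the demo; the producer is never told
        }
        events.addLast(event);
    }

    /** Returns the buffered events in order and empties the buffer. */
    List<T> drain() {
        List<T> out = new ArrayList<>(events);
        events.clear();
        return out;
    }

    int droppedCount() {
        return dropped;
    }

    public static void main(String[] args) {
        DroppingBufferSketch<Integer> buffer = new DroppingBufferSketch<>(100);
        // A fast loop publishes 200 index events before any consumer runs.
        for (int id = 0; id < 200; id++) {
            buffer.publish(id);
        }
        List<Integer> survivors = buffer.drain();
        System.out.println("survived: " + survivors.size()
                + ", first surviving id: " + survivors.get(0)
                + ", dropped: " + buffer.droppedCount());
    }
}
```

A slower producer (a sleep in the loop, or a debugger) gives the consumer time to empty the buffer, which would explain why the problem disappeared in those cases.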

It would be nice to have a "drop all indices" feature, and even nicer if it worked automatically with Fixtures.deleteDatabase() and the wipe before entering initial data.

@bgooren
Contributor

bgooren commented Jul 13, 2012

Ah, that sounds like the root cause indeed! I guess the module needs a more reliable way of handling the index events.
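
One possible direction for a more reliable hand-off, sketched with plain java.util.concurrent rather than anything from the module or from Play: a bounded blocking queue makes the producer wait when the buffer is full instead of losing events. `BlockingHandOffSketch` and its `run` method are hypothetical names, and the sentinel value used to stop the consumer is an illustration choice.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: producer/consumer hand-off with back-pressure instead of dropping. */
final class BlockingHandOffSketch {
    private static final int STOP = -1; // sentinel telling the consumer to finish

    /**
     * Publishes eventCount events through a queue of the given capacity and
     * returns how many events the consumer actually received.
     */
    static int run(int eventCount, int capacity) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);
        AtomicInteger consumed = new AtomicInteger();

        Thread consumer = new Thread(() -> {
            try {
                // take() waits for the next event; stop on the sentinel.
                while (queue.take() != STOP) {
                    consumed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int id = 0; id < eventCount; id++) {
                queue.put(id); // blocks while the queue is full, so nothing is lost
            }
            queue.put(STOP);
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while publishing", e);
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println("consumed: " + run(200, 10));
    }
}
```

The trade-off is that a slow consumer now slows down the code that saves models, which may or may not be acceptable inside a request.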

Regarding fixtures: do you know if we can hook into an event or plug a listener somewhere to handle fixture cleanup?

@isamaru isamaru mentioned this issue Jul 17, 2012
@isamaru
Contributor

isamaru commented Jul 17, 2012

I solved it by extending play.test.Fixtures and using my own implementation everywhere in the code:

/**
 * Extension of Fixtures to mass delete and create search indices
 */
public class Fixtures extends play.test.Fixtures {

    /**
     * Flush the entire JDBC database and clear the indices
     */
    public static void deleteDatabase() {
        deleteIndices();
        play.test.Fixtures.deleteDatabase();
    }

    /**
     * Delete all Model instances for the all available types using the underlying persistence mechanisms, clear the indices
     */
    public static void deleteAllModels() {
        deleteIndices();
        play.test.Fixtures.deleteAllModels();
    }

    /**
     * Load Model instances from a YAML file and persist them using the underlying persistence mechanism. The format of the YAML file is constrained, see the Fixtures manual page. Search indices are created synchronously.
     * 
     * @param name
     *            Name of a YAML file somewhere in the classpath (or conf/)
     */
    public static void loadModels(final String name) {
        ElasticSearchPlugin.setBlockEvents(true);
        try {
            play.test.Fixtures.loadModels(name);
        } finally {
            // Always unblock events, even if loading the fixture fails
            ElasticSearchPlugin.setBlockEvents(false);
        }
        ElasticSearchPlugin.batchProcessBlockedOperations();
    }

    /**
     * @see #loadModels(String)
     */
    public static void loadModels(final String... names) {
        for (final String name : names) {
            loadModels(name);
        }
    }

    private static void deleteIndices() {
        final Client client = ElasticSearchPlugin.client();
        client.prepareDeleteByQuery("_all").setQuery(QueryBuilders.matchAllQuery()).execute().actionGet();
        Logger.info("All search indices deleted!");
    }

}
