Transform the parser to CPS. Support pull parsing #51

Open
wants to merge 1 commit into
base: master

Commits on Jan 4, 2016

  1. Transform the parser to CPS. Support pull parsing

    The goal was to be able to do:
    
    > {continue, Continue, Events1} = erlsom:parse_pull(InitialData, ParserOpts),
    %% Events1 is the list of events produced by erlsom so far.
    
    > {continue, Continue2, Events2} = Continue(MoreData),
    ...
    
    > {ok, EventsN, RemainingData} = ContinueN(FinalData).
    
    That is, invert the control with the caller passing the data in chunks
    to the parser, instead of the parser asking for more in a callback.
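    
    The calling pattern above can be sketched as a caller-driven loop. This
    is only an illustration under stated assumptions: parse_pull/1 here is a
    toy stand-in defined in the example (not the erlsom function), its
    "events" are newline-terminated lines, the line END closes the document,
    and the names pull_demo, scan, feed_all and drive are hypothetical.
    
    ```erlang
    -module(pull_demo).
    -export([parse_pull/1, feed_all/1]).
    
    %% Toy stand-in for the proposed pull API (hypothetical, not erlsom):
    %% every complete line is an event; the line <<"END">> ends the document.
    parse_pull(Data) ->
        scan(Data, []).
    
    scan(Data, Acc) ->
        case binary:split(Data, <<"\n">>) of
            [<<"END">>, Rest] ->
                {ok, lists:reverse(Acc), Rest};
            [Line, Rest] ->
                scan(Rest, [{line, Line} | Acc]);
            [Partial] ->
                %% Out of input: hand control back with the events so far and
                %% a fun that resumes once the caller has more data.
                {continue,
                 fun(More) -> scan(<<Partial/binary, More/binary>>, []) end,
                 lists:reverse(Acc)}
        end.
    
    %% Caller-driven loop: feed the chunks one at a time, collecting events.
    feed_all([First | Chunks]) ->
        drive(parse_pull(First), Chunks, []).
    
    drive({ok, Events, Rest}, _Unused, Seen) ->
        {ok, Seen ++ Events, Rest};
    drive({continue, Cont, Events}, [Chunk | Chunks], Seen) ->
        drive(Cont(Chunk), Chunks, Seen ++ Events);
    drive({continue, _Cont, Events}, [], Seen) ->
        %% The chunks ran out before the document ended.
        {incomplete, Seen ++ Events}.
    ```
    
    The caller decides when and where the next chunk comes from (a socket,
    a file, a message), and each step only returns the events completed by
    that chunk.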
    
    This allows simpler integration for long-running parsing (think of
    infinite XML streams coming from the network), and also allows
    hibernating the parsing process, which is not always possible in the
    current version (because hibernating discards the stack).
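    
    A minimal sketch of why a returned continuation fits with hibernation,
    assuming only the return shapes shown earlier ({continue, Fun, Events}
    and {ok, Events, Rest}). The module hib_demo, the {chunk, Data} message
    and the {events, ...} / {done, ...} replies are hypothetical names for
    this illustration. erlang:hibernate/3 throws away the call stack, but
    the continuation is an ordinary fun that can be passed along as an
    argument.
    
    ```erlang
    -module(hib_demo).
    -export([start/1, wait/2]).
    
    %% Cont is any fun of one argument returning {continue, Next, Events}
    %% or {ok, Events, Rest}. Results are sent to the process calling start/1.
    start(Cont) ->
        Owner = self(),
        spawn(fun() -> erlang:hibernate(?MODULE, wait, [Cont, Owner]) end).
    
    %% Entered afresh on every wake-up: the stack was discarded, but Cont
    %% survived because it was passed as an argument to hibernate/3.
    wait(Cont, Owner) ->
        receive
            {chunk, Data} ->
                case Cont(Data) of
                    {continue, Next, Events} ->
                        Owner ! {events, Events},
                        erlang:hibernate(?MODULE, wait, [Next, Owner]);
                    {ok, Events, Rest} ->
                        Owner ! {done, Events, Rest}
                end
        end.
    ```
    
    With a callback that blocks inside the parser waiting for more data,
    the pending work lives on the stack, so hibernating at that point would
    lose it.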
    
    Returning the list of events instead of using SAX callbacks is mostly a
    matter of preference; SAX callbacks can still be used. I found
    returning a list cleaner to work with (and the size of the list is
    bounded if you feed the parser in chunks).
    
    The change makes every potentially "blocking" parsing operation
    continuation-passing (CPS). The continueFun/* functions in
    erlsom_sax_lib then check whether a continuation callback is in use.
    If not, they simply return a function from which the caller can
    continue parsing:
    
    %% No continuation callback: return the events so far and a fun that resumes with the next chunk.
    continueFun2K(T, V1, V2, V3, State = #erlsom_sax_state{continuation_fun = undefined}, ParseFun, K) ->
        {continue,
         fun(Data) ->
             ParseFun(<<T/binary, Data/binary>>, V1, V2, V3,
                      State#erlsom_sax_state{user_state = []}, K)
         end,
         lists:reverse(State#erlsom_sax_state.user_state)};
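    
    To illustrate the idea outside erlsom, here is a small self-contained
    sketch of a CPS parser for input such as <<"a=1;b=2;.">>, where a lone
    "." ends the input. Everything in it (cps_demo, pair, until, the
    key=value format) is hypothetical and much simpler than the real
    parser; it only mirrors the shape of the clause above: when input runs
    out, return the events so far together with a fun that re-enters the
    same step with the leftover bytes, the new data and the same K.
    
    ```erlang
    -module(cps_demo).
    -export([parse/1]).
    
    parse(Data) ->
        pair(Data, []).
    
    %% One "key=value;" pair. Each step that may run out of input takes a
    %% continuation K describing what to do with the token it reads.
    pair(<<".", Rest/binary>>, Events) ->
        {ok, lists:reverse(Events), Rest};
    pair(<<>>, Events) ->
        {continue, fun(More) -> pair(More, []) end, lists:reverse(Events)};
    pair(Data, Events) ->
        until(Data, <<"=">>, Events,
              fun(Key, Rest1, Events1) ->
                  until(Rest1, <<";">>, Events1,
                        fun(Value, Rest2, Events2) ->
                            pair(Rest2, [{Key, Value} | Events2])
                        end)
              end).
    
    %% Read up to Delim, then call K(Token, Rest, Events). If Delim has not
    %% arrived yet, return the events so far (the resumed call starts from an
    %% empty event list) and a fun that retries with the extra data.
    until(Data, Delim, Events, K) ->
        case binary:split(Data, Delim) of
            [Token, Rest] ->
                K(Token, Rest, Events);
            [_NoDelimYet] ->
                {continue,
                 fun(More) ->
                     until(<<Data/binary, More/binary>>, Delim, [], K)
                 end,
                 lists:reverse(Events)}
        end.
    ```
    
    Because K is captured in the returned fun, the parser can stop in the
    middle of a key or a value and still know what comes next when it is
    resumed.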
    
    Initial tests show only a small performance degradation, and any work
    the user does with the parser events is likely to make the difference
    in parsing cost irrelevant, but more complete tests are needed to
    confirm that.
    Pablo Polvorin committed Jan 4, 2016
    5575695