Parse incoming text into wikicode only once #17

Open
yuvipanda opened this issue Jan 11, 2016 · 4 comments

Comments

@yuvipanda
Contributor

It looks like section.py parses the entire wikicode twice: once to get the list of sections (in _generate_flat_list_of_sections(text)) and then again in _load_fields.
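
A minimal sketch of the parse-once idea, not the project's actual code: the caller parses the text a single time and hands the resulting object to both helpers. The names fake_parse, list_sections, load_fields, and process are hypothetical; fake_parse only stands in for an expensive parser such as mwparserfromhell, and the real bodies of _generate_flat_list_of_sections and _load_fields are not shown in this thread.

```python
# Hypothetical stand-ins for illustration; not the project's real functions.
parse_calls = 0


def fake_parse(text):
    """Stand-in for an expensive parse; counts how often it runs."""
    global parse_calls
    parse_calls += 1
    return text.splitlines()


def list_sections(parsed):
    """Accept already-parsed input instead of raw text."""
    return [line for line in parsed if line.startswith("==")]


def load_fields(parsed):
    """Reuse the same parsed object, so no second parse is needed."""
    return {"line_count": len(parsed)}


def process(text):
    parsed = fake_parse(text)  # the only parse in the whole pipeline
    return list_sections(parsed), load_fields(parsed)


sections, fields = process("== A ==\nhello\n== B ==\nworld")
```

The design choice is simply to move parsing up to the caller, so that every helper receives the parsed object rather than the raw text.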

@kjschiroo
Collaborator

It looks like #18 addresses everything raised here

@yuvipanda
Contributor Author

The function that checks indent levels also does a re-parse, which makes everything a lot slower. Can you re-open this so I can take a look at fixing that too?

@kjschiroo
Collaborator

Sure, feel free to try.

I am starting to wonder if it would be better to remove the dependency on mwparserfromhell. We are only using mwparserfromhell for sections and for checking for outdents. Sections can probably be determined using regexes to find headers. As far as I can tell, outdents don't come in that many variations either and could probably be picked up pretty well by a regex too.
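
A rough sketch of what regex-based header detection could look like, using only the standard library. The pattern and the find_headings helper are assumptions made for illustration, not code from this project, and the last lines show one known limitation: a heading-like line inside nowiki tags is still reported.

```python
import re

# Assumed pattern: 1-6 '=' signs, a title, then the same run of '=' signs,
# all on a single line.
HEADING_RE = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def find_headings(text):
    """Return (level, title) pairs for every line that looks like a heading."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(text)]


text = "== Topic ==\nFirst post\n=== Subtopic ===\nReply\n"
headings = find_headings(text)

# Limitation: the regex cannot see that this line sits inside <nowiki>,
# so it is reported as a heading even though it would not render as one.
tricky = "<nowiki>\n== Not a heading ==\n</nowiki>\n"
false_positives = find_headings(tricky)
```

Here headings contains the two real headings, while false_positives is non-empty, which is the kind of edge case a full parser handles and a plain regex does not.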

@kjschiroo kjschiroo reopened this Jan 13, 2016
@yuvipanda
Contributor Author

Using regexes to parse MediaWiki wikitext is the start of a long road to madness, so I would highly recommend not doing it :) Many have done this before and regretted their decision later...

@kjschiroo kjschiroo changed the title Do not re-parse sections Parse incoming text into wikicode only once Jan 14, 2016