Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database test #36

Merged
merged 15 commits into from
May 20, 2014
Merged

Database test #36

merged 15 commits into from
May 20, 2014

Conversation

nyee
Copy link
Contributor

@nyee nyee commented Apr 9, 2014

-Continuation of fixes to databases
-Minor improvements to checkWellFormed: be sure to pull with corresponding pull request in RMG-Py (or this won't work)

nyee and others added 15 commits May 12, 2014 17:17
Found using checkWellFormed

-Cs_H_out_Cs2 adjList in rules changed to match group.py. group.py
adjList was more consistent with siblings.

-in R3H_SS_S, some atoms changed from R to R!H. The atoms had two bonds or
a radical, so they had to be by definition

-Cs_H_out_H/OneDe LogicOr was replaced with an AdjList. This was a strange
case where the parent and children were both Groups (Usually I would
expect a given LogicOr group to have LogicOr parent. With the new
flexibility that we have we have programmed into AdjList, I'm not even
sure we need LogicOr anymore). I'm not sure if we should never have this
weird sandwiched LogicOr, but in this case I found I could come up with an
AdjList that fulfills all the correct relationships.
Previously was calculating Q using natural log, but I want to use log base
10 so that I get orders of magnitude for my comparisons.

Also I was using an averaged temperature of the rule's range for
calculating k. This was when I wanted to calculated the Leave-one-out
value vs the rule's value. Now, I'm pretty sure I want to compare node to
node for future things, so I've just simplified it so that I calculate
everything at 1000K.
A few children errors:

-Changed C to Cs in Cs_H and all its (numerous) descendants
-Changed C to Cd in Cd_H and all its descedents
Additional lines of output given for checkWellFormed family.

One error we check for is if a label in rules.py is misnamed, i.e. if any
part of the label is not a group in group.py. There a decent amount of
these errors due to the recent Java/Py database merge. To make my life
database debugging life easier, I added a new feature which tells me if
there is a group which matches the misnamed label.

For example, let's say we have a rule label, A;B. Somebody changed the
name of A to C in groups.py, but left it unchanged in rules.py. The new
feature would suggest C as the correct name for A, by comparing adj lists
of the rule and those in groups.py

Unfortunately, there is a bug I can't figure out. If somebody changed A to C
AND B to D, it suggest C for A and C for B. The manual work around
is to visually verify which one adjList C matches, make the fix, and run
checkWellFormed again. It will then suggest D for B. I feel like this
isn't worth fixing because it will be unable to suggest groups with the
upgrade to universal database (once all adjlists are removed from
rules.py)
These were found by the checkWellFormed function.

Changes to rule labels rules.py.
For template A-->B (x occurences). 'A' did not exist in groups.py, so the
name was changed to the group 'B' with a matching adjList. 'x' is the
number of rules which used 'A'.

-O_sec_rad --> OJ_sec (1 occurance)
-H_rad does --> HJ (5 occurrences)
-N3t_Ct/H --> N3t_Ct-H (1 occurrence)
-O_pri_rad --> OJ_pri (6 occurrences)
-O_rad/NonDeO --> OJ-Os (13 occurrences)
-Ct/H_N3t --> Ct-H_N3t (2 occurrences)
-N3d/H_N3d/H --> N3d-H_N3d-H (1 occurrence)
-N3d/H_Cds/H2 --> N3d-H_Cds-HH (1 occurrence)
-Cds/H2_N3d --> Cds-HH_N3d (1 occurrence)
-N3t_CtH --> N3t_Ct-H (1 occurrence)

Change to adj lists in rules.py
-Changed radical from 3 to 3Q in "N3t_N3t;CH_quartet" to match group
definition. (CH_quartet only appears in one rule by bbuesser).
-For the top node R_R;YJ, the definitions of R_R and YJ? were both updated
to the more inclusive definition in groups.py (effects top node)
-Ct-Ct_Ct-Cs was found to be wrongly defined in groups.py and rules.py, so
changed all occurrences to the correct definition (19 occurrences)
-Ct-Ct_Ct-H was found to be wrongly defined in groups.py and rules.py, so
changed all occurrences to the correct definition (19 occurrences)

Change to adjlist in groups.py
-CH2_triplet changed to only allow 2T instead of 2S in N3t_N3t;CH2_triplet
(effects 1 rule by bbuesser)
-Ct-Ct_Ct-Cs adj list was definitely wrong. Found because it was a
duplicate of Ct-Cd_Ct-Cs. Changed the Cd in the adj list to a Ct. (appears
in 19 rules)
-Ct-Ct_Ct-H adj list was definitely wrong. Found because it was a
duplicate of Ct-Cd_Ct-H. Changed the Cd in the adj list to a Ct. (appears
in 19 rules)
-Changed Ss atom to more general S atom in SJ. It had the same adj list as
it's only child SsJ. (no rules use Ss)
1) Path was hard-coded to C:\RMG-database. It now runs on whatever folder
   the EvaluateKinetics.py script itself is in.
2) We weren't specifying which kineticsFamilies to load, which is now a
   requirement of the 'load' method (perhaps there should be a default?)
These were found using the checkWellFormed function.

-changed CtJ's had an atom C, that I've changed to  so that it's children
can include nitrogen (1 occurrence in rules.py, The original paper looks
like it only consider C, so I changed the rule from Ct-H_Ct-H;CtJ to
Ct-H_Ct-H;CtJ_Ct
-changed OdR had an atom {C,S} changed to Od-R!H, to include nitrogen in
children (no rules affected)
-reordered atoms in CsJ-OneDeNsCs. Doesn't change graph, but makes Adjlist
more obviously a subgroup of parent
-added nitrogen atoms N3s and N5s into set of possible atoms for CsJ-OneDe
-changed O to Os in OJ-Os. Adj list restricts the atom to Os (and it's in
the name) (also changed in 13 rules)
-changed some C atoms to Cd or Cs according to definition in
Cds-HH_Cds-Cs\H3/H. Also reordered pretty much the entire adj list so it
matched up with it's parent's ordering. (also changed in 3 rules)
-O_atom_triplet allows 2S and 2T. Changed it to only 2T.
-CH_quartet changed to from C 3Q --> Cs 3Q. There were two groups named
CH_quartet, I clarified one of them and deleted the other.
-Two definitions of Y_1centerbirad. Deleted one of them (kept one that
allows nitrogen)
-Removed group N3_atom_quartet which was a duplicate from merge (same as
N_atom_quartet)
-Deleted one instance of OJ-OneDe (they were identical)
-O_rad/OneDe was previously both renamed to OJ-OneDe and the adjlist was
changed to include nitrogen. However, the rule that uses it was created
using AGvandeputte's GAV value without nitrogen. As such, I've re-created
a group O_rad/OneDe with no nitrogen and made it a child of the more
general OJ-OneDe. Name is kinda non-descriptive though...Also gave it a
random index, which are all screwed up anyway.
- Same history and solution as above for O_rad/NonDe to OJ-NonDe. Also had
  to move some of children around to preserve correct relationships.

  Changes to tree:
-L6: Cds-NonDe2_N3d changed to L5. Clearly a typo based on positioning,
  name, and comparing adj list with relatives.
  -Added O_atom_singlet, NH_singlet, and CH2_singlet under Y_1centerbirad.
  They had definitions but did not appear in the tree
  -Changed level of CtJ_Ct and CtJ_N3t from L3: to L4. They actually are
  the child of one of their siblings CtJ! That sounds like the plot of a
  crappy novel.
After discussing with bbuesser and agvandeputte, we concluded a singlet
biradical would react much faster as a 1,2_cycloaddition than a regular
addition. There were no rules which used singlet biradicals (probably
because they do not react through this channel). I am only restricting
singlet biradicals for *3 atoms, I still allow them in R_R tree.

Groups were found by searching for the regex '\*3 .*2', which returned all
lines with *3 atoms that were birad (there were only 9 of them). Then I
manually searched them for biradical singlets.

Removed the following groups:
O_atom_singlet
CH2_singlet
NH_singlet

Changed *3 atom from 2 to 2T in the following groups:
Y_1centerbirad
CO_birad
There has been some confusion as to whether Cdd is a specific case of Cd.
In the current builds, they are exclusive. Thus many groups have had atoms
changed from Cd to {Cd, Cdd}. Here, I have copied over all the changes
which appeared in groups.py to the proper adj lists in rules.py. The
groups that changed were:

R5_SS_D (168 changes in rules)
R4_S_D (168 changes in rules)
R6_SMS_D (168 changes in rules)
R5_SD_D (1 change in rules)
R6_SSS_D (168 changes in rules)
R3_D (1 change in rules)

A few of the more general groups had the above change and nitrogen added
to their definition:

R3_D (1 change in rules)
top node, multiplebond_intra changed to include N and Cdd
-R6_RSR was missing a label, *6, on one of it's atoms. Its parent and all
it's children had the *6 label, so I'm fairly sure it was just a typo. (0
rules affected)

-carbonyl_intra atom changed from O {1,D} to Od {1,D}. Same fix for all
its children: carbonyl_intra_H, carbonyl_intra_Nd, carbonyl_intra_De (0
rules for all groups affected)
-For three rules: changed atom from C to Ct for group Ct_rad/Ct.

-For rule Ct_rad;O_Csrad, the original rule was for C2H + CH2OH --> C2H2 +
CH2O, but the group has been changed to be inclusive for nitrogen. Since
we have an original reaction, I decided to make a training reaction and
remove this rule.

I found the last error serendipitously because the rule's adjList was
mismatched with the groups. I don't have a systematic check for when rules
may not be applicable to expanding rules.
The following groups had their *1 and *2 atom changed from C to Cd. This
is always true based on their adjList. Unless stated, the group appeared
in 0 rules:

Cd/Nd2_Cd/Nd/De
Cd/Nd2_Cd/H/De
Cd/De2_Cd/H/Nd
Cd/H/Nd_Cd/H/Nd
Cd/H/De_Cd/Nd2
Cd/Nd/De_Cd/H/Nd
Cd/H/De_Cd/H/De
Cd/H/Nd_Cd/H/De
Cd/Nd2_Cd/Nd/De
Cd/H/De_Cd/H2
Cd/NdDe_Cd/H2
Cd/Nd2_Cd/De2
Cd/De2_Cd/H/De
Cd/H/Nd_Cd/Nd2
Cd/Nd/De_Cd/H/De
Cd/Nd/De_Cd/Nd2
Cd/H2_Cd/H/Nd (3 rules)
Cd/H2_Cd/H/De
Cd/De2_Cd/Nd2
Cd/Nd2_Cd/H/Nd
Cd/H2_Cd/De2
Cd/H/De_Cd/Nd/De
Cd/H2_Cd/Nd/De
Cd/Nd/De_Cd/De2
Cd/H/Nd_Cd/Nd/De
Cd/De2_Cd/De2
Cd/H2_Cd/Nd2
Cd/De2_Cd/H2
Cd/Nd2_Cd/H2
Cd/Nd2_Cd/Nd2
Cd/De2_Cd/Nd/De
Cd/H/De_Cd/H/Nd
Cd/H2_Cd/Cs2 (3 rules)
Cd/H/Nd_Cd/H/Os (1 rule)
Cd/H/Nd_Cd/H2 (2 rules)
C atoms changed to Cs in CSm;CH2CH3 rule to match the CH2CH3 group
-Moved a comment at the top to the longDesc of the nodes it affects. They
were originally referenced by index, but who knows when that might get
reshuffled/changed

-replaced C with Cs in all instances of 'carbene'

-Many groups had their *2 atom (therefore always in group2 for
group1;group2) changed from C to Cs or Cd to match the definition in
groups.py. Listed below is all the rules affected:
CO_birad;C_pri/NonDeC
carbene;C_pri/Ct
carbene;C_pri/Cd
carbene;Cd/H/OneDe
carbene;Cd_pri
carbene;C_pri/Ct
carbene;ethene (2 atoms changed from C to Cd)
CO_birad;C/H2/NonDeC
carbene;Cd/H/NonDeC
CO_birad;C_methane
First of many changes to H-Abstraction family found by the databaseTester.

-Group definition (groups.py) of C/H/CtCt somehow got mixed up to: '1 *3
R!H 2'. Changed back to correct definition based on its rules and java's
definition

-"Cd_pri" was previously changed to include N. A child underneath it
"Cd/H2/NonDeC", is it's previous definition without N. 48 rules had their
label changed from Cd_pri;X, to Cd/H2/NonDeC;X. I checked the source for
all the rules and they shouldn't have N.

-"Ct_pri" was previously changed to include N. A child underneath it
"Ct_rad/Ct"", is it's previous definition without N. 7 rules had their
label changed from X;Ct_pri, to X;Ct_rad/Ct. I checked the source for all
the rules, and they shouldn't have N.

-Also in the rules with Ct_rad/Ct, I changed C to Ct, which is true by
 definition

-Changed definition of "N3d/H/NonDeC" in the rules to match the group's
definition. The groups.py definition was more specific and corresponded
better with the label. (affected 5 rules)

-Created new group O_rad/OneDeC, underneath O_rad/OneDe because
O_rad/OneDe was expanded to include N, but there are rules that only use
C. Changed the name on the relevant rules, and moved some children
(O_rad/Cd and all it's children) underneath the new O_rad/OneDeC node

-Because of above change, O_rad/Cd and all it's children had C atoms
changed to Cd or {Cd,Cdd}
connie added a commit that referenced this pull request May 20, 2014
@connie connie merged commit a97bbd0 into ReactionMechanismGenerator:master May 20, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants