ERROR: src/db/db/dbPCellHeader.cc,137,v != m_variant_map.end () #1782

Closed
lukasc-ubc opened this issue Jul 8, 2024 · 8 comments · Fixed by #1791

@lukasc-ubc

Hi @klayoutmatthias

I am receiving this error after running a long script that generates a large layout. I am running the script using an external python environment, using "import pya". The error fortunately occurs after I save my layout, upon completion of the script. I have seen it come and go for the past few hours. I'm sorry I don't have a minimum example to share that reproduces this error. I am hoping you may have some suggestion on how to debug. The script includes calling PCells that I have created.

ERROR: src/db/db/dbPCellHeader.cc,137,v != m_variant_map.end ()

thank you

@klayoutmatthias
Collaborator

Hi Lukas,

the message basically means that there is a second incarnation of the same PCell (same parameters) in the library. This should not happen as the PCellHeader manages the variants.

You may be able to spot this case by looking at the raw cell names in the layout and checking if there are two raw cells for the same parameter set.
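A scripted version of that check could group cells by PCell name and parameter set and report any group with more than one cell. The sketch below is plain Python so it runs without KLayout; `find_duplicate_variants`, the record layout, and the sample cell names are hypothetical. Parameter values are keyed by `repr()` so that unhashable values such as lists can be grouped as well.

```python
from collections import defaultdict


def find_duplicate_variants(records):
    """Group (cell_name, pcell_name, params) records by PCell and parameters.

    Returns only the groups that contain more than one cell name.
    """
    groups = defaultdict(list)
    for cell_name, pcell_name, params in records:
        # repr() gives a hashable, order-independent key for the parameter set
        key = (pcell_name, tuple(sorted((k, repr(v)) for k, v in params.items())))
        groups[key].append(cell_name)
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical sample data: two raw cells share one parameter set
records = [
    ("Circle", "Circle", {"r": 2.0, "n": 32}),
    ("Circle$1", "Circle", {"r": 2.0, "n": 32}),
    ("Circle$2", "Circle", {"r": 1.0, "n": 16}),
]

for (pcell_name, params), names in find_duplicate_variants(records).items():
    print(pcell_name, dict(params), "->", names)
```

Inside KLayout the records could probably be collected with `ly.each_cell()`, `cell.is_pcell_variant()`, `cell.pcell_declaration().name()` and `cell.pcell_parameters_by_name()`; treat that as an assumption to verify against the API documentation of the installed version.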

One possible scenario which may trigger the problem is re-registration of a PCell. So if a library for some reason re-registers a PCell while there are PCell instances already present for the original PCell, this error could happen upon cleanup of the layout.

There is no log one could easily enable to detect this problem.

The code responsible for the above scenario is in Layout::register_pcell (dbLayout.cc:2527). I'd throw an exception there and see if that triggers somewhere. If it does, there is a case where a PCell is registered again with the same name in the same library.
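A similar trap can be approximated on the Python side without rebuilding KLayout. This is only a sketch: `register_pcell_once` and `FakeLayout` are hypothetical names, and it assumes that `Layout.pcell_declaration(name)` returns `None` for a name that has not been registered yet.

```python
def register_pcell_once(layout, name, declaration):
    """Register a PCell, but raise if the name is already taken in this layout."""
    # Assumption: pcell_declaration(name) returns None for an unknown name
    if layout.pcell_declaration(name) is not None:
        raise RuntimeError("PCell '%s' is already registered in this library" % name)
    return layout.register_pcell(name, declaration)


class FakeLayout:
    """Stand-in for the library layout so the sketch runs without KLayout."""

    def __init__(self):
        self._decls = {}

    def pcell_declaration(self, name):
        return self._decls.get(name)

    def register_pcell(self, name, declaration):
        self._decls[name] = declaration
        return len(self._decls) - 1


layout = FakeLayout()
register_pcell_once(layout, "Circle", object())
try:
    # The second registration under the same name is reported instead of accepted
    register_pcell_once(layout, "Circle", object())
except RuntimeError as exc:
    print("caught:", exc)
```

In a real library class, `self.layout()` would take the place of `FakeLayout()`, and the traceback of the `RuntimeError` would point at the code path that registers the PCell a second time.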

I have to investigate this scenario. Actually I think this may be responsible for some issues we find when developing PCells inside the application.

However, this is only a hypothesis. It may be something else as well.

Matthias

@lukasc-ubc
Author

Thanks Matthias.

My script only loads the library once, and I don't re-register it.

But indeed, I issue ly.create_cell many times, and there are repeats with the same parameters. The curious thing is that my first loop generates 17 cells, and those 17 are repeated in a second loop which generates 66 cells. Yet, I receive the error message 5 times when closing.

I'll keep an eye out for this, and see if it comes up again, ideally in a smaller project.

@klayoutmatthias
Collaborator

klayoutmatthias commented Jul 9, 2024

Hi Lukas,

Thanks for the explanation. Could you elaborate a little on what you mean by "those 17 are repeated in a second loop"? Does that mean a (deep) copy or just instances of that cell? Or something else?

For now I was able to reproduce the problem with code that deliberately re-registers a PCell:

# test.py
import pya
import math

class Circle(pya.PCellDeclarationHelper):
  def __init__(self):
    super(Circle, self).__init__()
    self.param("l", self.TypeLayer, "Layer", default = pya.LayerInfo(1, 0))
    self.param("r", self.TypeDouble, "Radius", default = 1.0)
    self.param("n", self.TypeInt, "Number of points", default = 16)     

  def display_text_impl(self):
    return "Circle(L=" + str(self.l) + ",R=" + ('%.3f' % self.r) + ")"
  
  def produce_impl(self):
    da = math.pi * 2 / self.n
    pts = [ pya.DPoint(self.r * math.cos(i * da), self.r * math.sin(i * da)) for i in range(0, self.n) ]
    self.cell.shapes(self.l_layer).insert(pya.DPolygon(pts))

class CircleLib(pya.Library):
  def __init__(self, name):
    self.description = "Circle Library"
    self.layout().register_pcell("Circle", Circle())
    self.register(name)

  def reregister_pcell(self):
    self.layout().register_pcell("Circle", Circle())

lib = CircleLib("CircleLib")

ly = pya.Layout()

top = ly.create_cell("TOP")

c = ly.create_cell("Circle", "CircleLib", { "l": pya.LayerInfo(1, 0), "r": 2.0, "n": 32 })
top.insert(pya.DCellInstArray(c, pya.DTrans(0.0, 0.0)))

# This triggers
# ERROR: ../../../src/db/db/dbPCellHeader.cc,137,v != m_variant_map.end ()
lib.reregister_pcell()

c = ly.create_cell("Circle", "CircleLib", { "l": pya.LayerInfo(1, 0), "r": 2.0, "n": 32 })
top.insert(pya.DCellInstArray(c, pya.DTrans(0.0, 10.0)))

ly.write("out.gds")
print("Layout written.")

Which gives:

matthias@beast:~/klayout/testdata/issue-1782$ klayout -b -r test.py
Layout written.
ERROR: ../../../src/db/db/dbPCellHeader.cc,137,v != m_variant_map.end ()
terminate called after throwing an instance of 'tl::InternalException'
addr2line: 'klayout': No such file
ERROR: Signal number: 6
Address: 0x3e800110d27
Program Version: KLayout 0.29.1 (2024-05-04 rf95aef89d)

Backtrace:
/usr/lib/klayout/libklayout_lay.so.0 +0x2ef032 lay::enable_signal_handler_gui(bool) [??:?]
/lib/x86_64-linux-gnu/libc.so.6 +0x42520 __restore_rt [libc_sigaction.c:?]
/lib/x86_64-linux-gnu/libc.so.6 +0x969fc __pthread_kill_implementation [pthread_kill.c:44]
...

The crash is due to an uncaught exception - the error happens in the Python shutdown code which is outside exception handling.

But being able to reproduce it does not mean I know what is going on :(

I can basically make re-registration a valid operation and turn the assertion into a warning. That should prevent the crash. That might not be the real cure however.

Matthias

@lukasc-ubc
Author

Hi Matthias,

  1. "repeated in a second loop": what I am doing is calling cell = ly.create_cell(…, …, {…}) multiple times, and sometimes the same parameters appear in the call. Then I instantiate the cell. I am not doing any deep copying.

  2. Your crash is much more severe than mine. See screenshot — mine is only an error message.

[screenshot]

@lukasc-ubc
Author

Hi Matthias,

I discovered the source of the error.

We discovered this when two PCell calls resulted in the same output, despite having different inputs. One PCell call had a parameter with a numerical value, while the other had the same parameter being a float('nan'). The PCell was expecting a self.TypeDouble, and Python is happy with 'nan' and 'inf'.

I was passing a PCell parameter:
float('nan')
when I wanted to skip a certain feature.

Inside the PCell, I would check. But I notice that you can't compare:

>>> float('nan') == float('nan')
False

So I think the PCell wrapper wasn't handling nan correctly. I replaced 'nan' with 1e6 and check for that instead, and now the errors are gone.
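For reference, a NaN marker cannot be detected with `==`, but `math.isnan` works. A minimal sketch in plain Python; `is_skipped` is a hypothetical helper name, not part of pya:

```python
import math


def is_skipped(value):
    """True if the value is the NaN marker meaning 'skip this feature'."""
    return isinstance(value, float) and math.isnan(value)


nan = float("nan")
print(nan == nan)        # False: NaN never compares equal, not even to itself
print(is_skipped(nan))   # True
print(is_skipped(2.0))   # False
```

This only repairs the check inside the PCell code. The errors reported above come from passing NaN as a parameter in the first place, so a finite sentinel such as 1e6 remains the workaround.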

I also checked for infinity, float('inf'), and it also produces the same errors. So inf is also not handled properly.

Perhaps you could add a check in PCell declaration helper for self.TypeDouble, to disallow 'nan' and 'inf'? Unless there is a way to enable 'nan'?
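Until such a check exists, the calling script can reject non-finite values itself before they reach the PCell. A sketch under that assumption; `check_finite_params` is a hypothetical helper, not part of pya:

```python
import math


def check_finite_params(params):
    """Raise ValueError if any float parameter is NaN or infinite."""
    bad = [name for name, value in params.items()
           if isinstance(value, float) and not math.isfinite(value)]
    if bad:
        raise ValueError("non-finite PCell parameters: " + ", ".join(sorted(bad)))
    return params


print(check_finite_params({"r": 2.0, "n": 32}))
try:
    check_finite_params({"r": float("nan"), "n": 32})
except ValueError as exc:
    print("rejected:", exc)
```

The helper returns the dictionary unchanged, so it can wrap the parameter argument of an existing `ly.create_cell(...)` call.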

thank you

@klayoutmatthias
Collaborator

Hi Lukas,

thanks for this analysis. I guess that pretty well explains the weird effects. I never looked into the compare behavior of NaN in C++, but at least, strict weak ordering - as mandated by the STL sets and maps I use - is probably not a given.
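To illustrate the ordering problem: a strict weak ordering requires that "neither a < b nor b < a" behaves as a transitive equivalence. Plain floating-point `<` loses that property once NaN is involved, in C++ as in Python. The sketch below shows it in Python; it does not model the comparison actually used for the PCell parameters, which is not examined here.

```python
def equivalent(a, b):
    """Equivalence as an ordered container sees it: neither is less than the other."""
    return not (a < b) and not (b < a)


nan = float("nan")
print(equivalent(1.0, nan))   # True
print(equivalent(nan, 2.0))   # True
print(equivalent(1.0, 2.0))   # False, so the equivalence is not transitive
```

An ordered map built on such a comparison can no longer be relied on to find, insert, or erase keys consistently.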

I will try to debug the problem. It is probably easy to fix.

Best regards,

Matthias

@klayoutmatthias
Collaborator

I tried to reproduce the problem with "nan", but without much success. I can confirm however, that different parameter sets give the same cell and maybe that is producing issues once the std::map structure is broken.

So I think that is enough for a tentative fix.

BTW: I noticed, that the GDS file written with the "nan" parameter is broken - reading it errors out with

ERROR: Expected a real number here: nan

So NaN isn't a good idea in general.

Basically, there is not much type checking between PCell client script and PCell code: if you pass parameters by script, you can essentially use "None" as a value of "TypeDouble" and receive "None" inside your code. However, when you edit such a layout, you will see a value of "0" instead of "None". All that is needed is some "optional" attribute for the PCell parameters and there could be an empty edit box for "None" in that case.

Best regards,

Matthias

@klayoutmatthias
Collaborator

Got it. Here is my code to reproduce it:


import pya
import math
import random

class Circle(pya.PCellDeclarationHelper):
  def __init__(self):
    super(Circle, self).__init__()
    self.param("l", self.TypeLayer, "Layer", default = pya.LayerInfo(1, 0))
    self.param("r", self.TypeDouble, "Radius", default = 1.0)
    self.param("n", self.TypeInt, "Number of points", default = 16)     

  def display_text_impl(self):
    r = self.r
    if r is None:
      r = "nil"
    else:
      r = '%.3f' % r
    return "Circle(L=" + str(self.l) + ",R=" + r + ")"
  
  def produce_impl(self):
    r = self.r
    if str(self.r) == 'nan':
      r = 2.0
    da = math.pi * 2 / self.n
    pts = [ pya.DPoint(r * math.cos(i * da), r * math.sin(i * da)) for i in range(0, self.n) ]
    self.cell.shapes(self.l_layer).insert(pya.DPolygon(pts))

class CircleLib(pya.Library):
  def __init__(self, name):
    self.description = "Circle Library"
    self.layout().register_pcell("Circle", Circle())
    self.register(name)

  def reregister_pcell(self):
    self.layout().register_pcell("Circle", Circle())

lib = CircleLib("CircleLib")

ly = pya.Layout()

top = ly.create_cell("TOP")

for i in range(0, 5):
  for j in range(0, 5):

    if random.random() > 0.5:
      r = float('nan')
    else:
      r = random.random() * 3
    n = int(random.random() * 25 + 8)

    c = ly.create_cell("Circle", "CircleLib", { "l": pya.LayerInfo(1, 0), "r": r, "n": n })
    top.insert(pya.DCellInstArray(c, pya.DTrans(i * 10.0, j * 10.0)))

ly.write("out.gds")
print("Layout written.")

ly._destroy

And again I get the Abort signal because of the uncaught exception.

Anyway, there is enough material for debugging. I think the NaN's and inf's need special care. And I will also address the re-registration issue as this is possibly responsible for program crashes during PCell development.

Thanks for your help and for reporting the issue.

Best regards,

Matthias

@klayoutmatthias klayoutmatthias self-assigned this Jul 11, 2024
@klayoutmatthias klayoutmatthias added this to the 0.29.5 milestone Jul 11, 2024
klayoutmatthias pushed a commit that referenced this issue Jul 13, 2024
This patch establishes "nan", "inf" and "-inf" as
valid values for tl::Variant, so corresponding
PCell parameters can be serialized and are
properly managed.
@klayoutmatthias klayoutmatthias linked a pull request Jul 13, 2024 that will close this issue
klayoutmatthias added a commit that referenced this issue Jul 21, 2024