While working on some arch project, I tried copying a window using the draft move/copy tool. This worked before, but after some changes to the window, it stopped working. I suspect the issue was triggered by having a more complicated underlying sketch, which takes longer to copy.
To reproduce:
1. Open the attached file
2. Select the window object
3. Use the draft copy tool to copy the window somewhere (anywhere)
4. Crash
See the complete in the attachment. From the backtrace and looking at the code, I suspect the following happens:
1. Draft's move_object queues a Draft.move command: https://github.com/FreeCAD/FreeCAD/blob ... #L266-L287
2. The GUI tracker adds a delayed call to _removeSwitch using ToDo.delay, which creates a QT timer to process all queued ToDo items later: https://github.com/FreeCAD/FreeCAD/blob ... ers.py#L85 and https://github.com/FreeCAD/FreeCAD/blob ... do.py#L232
2. The actual Draft.move call also gets delayed using ToDo.delayCommit, which creates a *second* QT timer to process all queued ToDo items later:
https://github.com/FreeCAD/FreeCAD/blob ... al.py#L194 and https://github.com/FreeCAD/FreeCAD/blob ... do.py#L261
3. The first QT timer fires and through doCommand the move is executed
4. The move takes a while, so the GUI keeps a progress bar and updates it.
5. When the move takes > 100ms, the progress bar runs QT's processEvents to keep the GUI responsive: https://github.com/FreeCAD/FreeCAD/blob ... r.cpp#L277
6. This triggers the second QT timer, which starts the Draft.Move a second time.
7. Both moves create a App::ExpressionParser::ExpressionImporter instance, but that is intended to be instantiated only once at any time, so this raises an assertion failure.
To trigger this bug, you need to have a queued draft operation that takes sufficiently long for the 100ms processEvents call to trigger, which is why it does not seem to trigger on a simple door, but does trigger on this door with a more complicated sketch (with a bunch of constraints that seem to need a while to settle).
I originally thought that the same timer might be firing twice and this was a QT bug, but I noticed that the "receiver" arguments in the backtrace were different, so with some debug output I verified that indeed two timers were created.
So the actual cause of this bug is in the ToDo class: There are three places that can add a timer (delay, delayCommit and delayAfter) and delay and delayAfter do try to not add a second timer, but only check their own queue (so delay won't add a second timer, but delay *and* delayAfter can both add their own timer, and delayCommit can add any number of duplicate timers). And since the handler for all these timers processes *all* queued events and also does not clear the list of queued events until all events have completed, this allows multiple timers to be scheduled simultaneously, and allows a single queued command to be ran twice.
Fortunately, I think this is actually easy to fix and it can probably be fixed inside the ToDo class completely. I think it should rather than checking if anything is queued in ToDo.itinerary or ToDo.afteritinerary to prevent adding a second timer, just keep a boolean value that says a timer is scheduled, and clear it when the timer finished running. This should ensure that only one timer is ever added, and the doTasks method called by the timer never runs re-entrantly. I think this should already fix the bug.
There is one somewhat subtle corner case: What happens when new items would be added to the ToDo queues while the doTasks method is running? I'm not sure if that happens now, but it would be good if a generic and global thing like ToDo would support this. Since ToDo keeps three lists, which are executed in turn, it is particularly tricky when e.g. adding an item to the first list, while the second list is already being executed. If no additional measures are taken, this means that the item will end up in the first list but not executed until *another* GUI action does a run of ToDo items - not very good.
To fix this, I think that when doTasks starts, it should make copies of all three lists, clear the original lists, and then start executing the copies of these lists. If now another thing is scheduled, it will be added to the now-empty lists for later. To ensure that these new things are actually ran, when doTasks is done, it should check if the lists are still empty and if not, schedule another timer callback (or maybe just run itself again).
I'll see if I can make a PR for this tomorrow.
Crash when copying window - caused by QT timer running twice?
Forum rules
Be nice to others! Respect the FreeCAD code of conduct!
Be nice to others! Respect the FreeCAD code of conduct!
-
- Posts: 72
- Joined: Thu Mar 25, 2021 10:59 am
Crash when copying window - caused by QT timer running twice?
- Attachments
-
- Window copy crash.FCStd
- (31.12 KiB) Downloaded 41 times
-
- backtrace.txt
- (39.22 KiB) Downloaded 34 times
Last edited by matthijskooijman on Sun Jun 13, 2021 7:41 am, edited 1 time in total.
-
- Posts: 72
- Joined: Thu Mar 25, 2021 10:59 am
Re: Crash when copying window - caused by QT timer running twice?
Hm, maybe this should be moved to the Draft/Arch category. When I started writing this, I suspected a generic problem in the handling of QT timers and/or Document-wide copying, but now it seems it's really a Draft-specific bug after all.
Re: Crash when copying window - caused by QT timer running twice?
Done.matthijskooijman wrote: ↑Sat Jun 12, 2021 8:41 pm Hm, maybe this should be moved to the Draft/Arch category.
A Sketcher Lecture with in-depth information is available in English, auf Deutsch, en français, en español.
-
- Posts: 72
- Joined: Thu Mar 25, 2021 10:59 am
Re: Crash when copying window - caused by QT timer running twice?
Thanks!
I've created a pull-request to fix this issue, as well as doing some cleanup of the ToDo class: https://github.com/FreeCAD/FreeCAD/pull/4859
It's still a draft that needs some more work (see PR description for details), but it is already fairly complete, so I wanted to share it already.
Re: Crash when copying window - caused by QT timer running twice?
Just a question. I think you are right when writing that this is a very peculiar Draft problem. If I remember correctly it's the only module that uses this procedure to queue the commands to be executed.
In Arch yorik changed the style and used directly "Gui.doCommand()" inside every command, without the need of another function to manage the queue.
So comes the question:
are we just keeping alive an outdated and too peculiar way of handling commands? Is there any difference between the 2 approaches that worth keeping this alive?
In Arch yorik changed the style and used directly "Gui.doCommand()" inside every command, without the need of another function to manage the queue.
So comes the question:
are we just keeping alive an outdated and too peculiar way of handling commands? Is there any difference between the 2 approaches that worth keeping this alive?
follow my experiments on BIM modelling for architecture design
-
- Posts: 72
- Joined: Thu Mar 25, 2021 10:59 am
Re: Crash when copying window - caused by QT timer running twice?
Yeah, that is a good point. I had wondered a bit about this, but then thought maybe it is because Draft has most GUI in Python, while other workbenches use C++ mostly?
Indeed, seems you are right. That makes me wonder why it was needed for draft in the first place...
Note that Arch *does* use `todo.delay`, just not `todo.commitDelay` (and because it commits right away, also doesn't seem to need `todo.delayAfter`, which I think is used mostly to restart the UI after committing, i.e. "continue" checkbox).
Even though I'm happy with my changes and cleaning up and simplifying code is good, throwing away code is of course better (though backward compatibility is a thing, of course...).
-
- Posts: 72
- Joined: Thu Mar 25, 2021 10:59 am
Re: Crash when copying window - caused by QT timer running twice?
I did a quick test making ToDo execute things immediately instead of queuing them, and clicked around a bit and noticed that creating wires and moving them worked fine, but doing a rotation crashed indeed in some Coin code (see backtrace in attachment, made with Coin 4.0.0 from Ubuntu). I haven't investigated further, but it seems that the problem that ToDo was intended to solve does indeed still exists. However, I do notice that there is no python code in the backtrace, so that suggests that the commit itself does not crash, but maybe the problem is that changes are made to the 3D view inside the event handler, which cause the event handler to later crash when it continues with the next handler or something like that?
If the problem does indeed still exist, then I wonder why it doesn't happen in Arch. Maybe the problem only occurs with particular operations that Arch does not use, or maybe Arch has some other mechanism to prevent this problem somehow? Also, some insight into what the exact problem that ToDo intends to solve would be helpful too.
If the problem does indeed still exist, then I wonder why it doesn't happen in Arch. Maybe the problem only occurs with particular operations that Arch does not use, or maybe Arch has some other mechanism to prevent this problem somehow? Also, some insight into what the exact problem that ToDo intends to solve would be helpful too.
yorik wrote: @Yorik, any comment on this?
- Attachments
-
- gdb.txt
- (7.26 KiB) Downloaded 33 times