Crash when copying window - caused by QT timer running twice?

A forum dedicated to the Draft, Arch and BIM workbenches development.
Forum rules
Be nice to others! Respect the FreeCAD code of conduct!
Post Reply
matthijskooijman
Posts: 72
Joined: Thu Mar 25, 2021 10:59 am

Crash when copying window - caused by QT timer running twice?

Post by matthijskooijman »

While working on some arch project, I tried copying a window using the draft move/copy tool. This worked before, but after some changes to the window, it stopped working. I suspect the issue was triggered by having a more complicated underlying sketch, which takes longer to copy.

To reproduce:
1. Open the attached file
2. Select the window object
3. Use the draft copy tool to copy the window somewhere (anywhere)
4. Crash

See the complete in the attachment. From the backtrace and looking at the code, I suspect the following happens:

1. Draft's move_object queues a Draft.move command: https://github.com/FreeCAD/FreeCAD/blob ... #L266-L287
2. The GUI tracker adds a delayed call to _removeSwitch using ToDo.delay, which creates a QT timer to process all queued ToDo items later: https://github.com/FreeCAD/FreeCAD/blob ... ers.py#L85 and https://github.com/FreeCAD/FreeCAD/blob ... do.py#L232
2. The actual Draft.move call also gets delayed using ToDo.delayCommit, which creates a *second* QT timer to process all queued ToDo items later:
https://github.com/FreeCAD/FreeCAD/blob ... al.py#L194 and https://github.com/FreeCAD/FreeCAD/blob ... do.py#L261
3. The first QT timer fires and through doCommand the move is executed
4. The move takes a while, so the GUI keeps a progress bar and updates it.
5. When the move takes > 100ms, the progress bar runs QT's processEvents to keep the GUI responsive: https://github.com/FreeCAD/FreeCAD/blob ... r.cpp#L277
6. This triggers the second QT timer, which starts the Draft.Move a second time.
7. Both moves create a App::ExpressionParser::ExpressionImporter instance, but that is intended to be instantiated only once at any time, so this raises an assertion failure.

To trigger this bug, you need to have a queued draft operation that takes sufficiently long for the 100ms processEvents call to trigger, which is why it does not seem to trigger on a simple door, but does trigger on this door with a more complicated sketch (with a bunch of constraints that seem to need a while to settle).

I originally thought that the same timer might be firing twice and this was a QT bug, but I noticed that the "receiver" arguments in the backtrace were different, so with some debug output I verified that indeed two timers were created.

So the actual cause of this bug is in the ToDo class: There are three places that can add a timer (delay, delayCommit and delayAfter) and delay and delayAfter do try to not add a second timer, but only check their own queue (so delay won't add a second timer, but delay *and* delayAfter can both add their own timer, and delayCommit can add any number of duplicate timers). And since the handler for all these timers processes *all* queued events and also does not clear the list of queued events until all events have completed, this allows multiple timers to be scheduled simultaneously, and allows a single queued command to be ran twice.

Fortunately, I think this is actually easy to fix and it can probably be fixed inside the ToDo class completely. I think it should rather than checking if anything is queued in ToDo.itinerary or ToDo.afteritinerary to prevent adding a second timer, just keep a boolean value that says a timer is scheduled, and clear it when the timer finished running. This should ensure that only one timer is ever added, and the doTasks method called by the timer never runs re-entrantly. I think this should already fix the bug.

There is one somewhat subtle corner case: What happens when new items would be added to the ToDo queues while the doTasks method is running? I'm not sure if that happens now, but it would be good if a generic and global thing like ToDo would support this. Since ToDo keeps three lists, which are executed in turn, it is particularly tricky when e.g. adding an item to the first list, while the second list is already being executed. If no additional measures are taken, this means that the item will end up in the first list but not executed until *another* GUI action does a run of ToDo items - not very good.

To fix this, I think that when doTasks starts, it should make copies of all three lists, clear the original lists, and then start executing the copies of these lists. If now another thing is scheduled, it will be added to the now-empty lists for later. To ensure that these new things are actually ran, when doTasks is done, it should check if the lists are still empty and if not, schedule another timer callback (or maybe just run itself again).

I'll see if I can make a PR for this tomorrow.
Attachments
Window copy crash.FCStd
(31.12 KiB) Downloaded 41 times
backtrace.txt
(39.22 KiB) Downloaded 34 times
Last edited by matthijskooijman on Sun Jun 13, 2021 7:41 am, edited 1 time in total.
matthijskooijman
Posts: 72
Joined: Thu Mar 25, 2021 10:59 am

Re: Crash when copying window - caused by QT timer running twice?

Post by matthijskooijman »

Hm, maybe this should be moved to the Draft/Arch category. When I started writing this, I suspected a generic problem in the handling of QT timers and/or Document-wide copying, but now it seems it's really a Draft-specific bug after all.
chrisb
Veteran
Posts: 53786
Joined: Tue Mar 17, 2015 9:14 am

Re: Crash when copying window - caused by QT timer running twice?

Post by chrisb »

matthijskooijman wrote: Sat Jun 12, 2021 8:41 pm Hm, maybe this should be moved to the Draft/Arch category.
Done.
A Sketcher Lecture with in-depth information is available in English, auf Deutsch, en français, en español.
matthijskooijman
Posts: 72
Joined: Thu Mar 25, 2021 10:59 am

Re: Crash when copying window - caused by QT timer running twice?

Post by matthijskooijman »

chrisb wrote: Sat Jun 12, 2021 8:57 pm Done.
Thanks!

I've created a pull-request to fix this issue, as well as doing some cleanup of the ToDo class: https://github.com/FreeCAD/FreeCAD/pull/4859

It's still a draft that needs some more work (see PR description for details), but it is already fairly complete, so I wanted to share it already.
carlopav
Veteran
Posts: 2062
Joined: Mon Dec 31, 2018 1:49 pm
Location: Venice, Italy

Re: Crash when copying window - caused by QT timer running twice?

Post by carlopav »

Just a question. I think you are right when writing that this is a very peculiar Draft problem. If I remember correctly it's the only module that uses this procedure to queue the commands to be executed.
In Arch yorik changed the style and used directly "Gui.doCommand()" inside every command, without the need of another function to manage the queue.
So comes the question:
are we just keeping alive an outdated and too peculiar way of handling commands? Is there any difference between the 2 approaches that worth keeping this alive?
follow my experiments on BIM modelling for architecture design
matthijskooijman
Posts: 72
Joined: Thu Mar 25, 2021 10:59 am

Re: Crash when copying window - caused by QT timer running twice?

Post by matthijskooijman »

carlopav wrote: Sun Jun 13, 2021 12:12 pm Just a question. I think you are right when writing that this is a very peculiar Draft problem. If I remember correctly it's the only module that uses this procedure to queue the commands to be executed.
Yeah, that is a good point. I had wondered a bit about this, but then thought maybe it is because Draft has most GUI in Python, while other workbenches use C++ mostly?
carlopav wrote: Sun Jun 13, 2021 12:12 pm In Arch yorik changed the style and used directly "Gui.doCommand()" inside every command, without the need of another function to manage the queue.
Indeed, seems you are right. That makes me wonder why it was needed for draft in the first place...

Note that Arch *does* use `todo.delay`, just not `todo.commitDelay` (and because it commits right away, also doesn't seem to need `todo.delayAfter`, which I think is used mostly to restart the UI after committing, i.e. "continue" checkbox).
carlopav wrote: Sun Jun 13, 2021 12:12 pm are we just keeping alive an outdated and too peculiar way of handling commands? Is there any difference between the 2 approaches that worth keeping this alive?
Even though I'm happy with my changes and cleaning up and simplifying code is good, throwing away code is of course better (though backward compatibility is a thing, of course...).
matthijskooijman
Posts: 72
Joined: Thu Mar 25, 2021 10:59 am

Re: Crash when copying window - caused by QT timer running twice?

Post by matthijskooijman »

I did a quick test making ToDo execute things immediately instead of queuing them, and clicked around a bit and noticed that creating wires and moving them worked fine, but doing a rotation crashed indeed in some Coin code (see backtrace in attachment, made with Coin 4.0.0 from Ubuntu). I haven't investigated further, but it seems that the problem that ToDo was intended to solve does indeed still exists. However, I do notice that there is no python code in the backtrace, so that suggests that the commit itself does not crash, but maybe the problem is that changes are made to the 3D view inside the event handler, which cause the event handler to later crash when it continues with the next handler or something like that?

If the problem does indeed still exist, then I wonder why it doesn't happen in Arch. Maybe the problem only occurs with particular operations that Arch does not use, or maybe Arch has some other mechanism to prevent this problem somehow? Also, some insight into what the exact problem that ToDo intends to solve would be helpful too.
yorik wrote: @Yorik, any comment on this?
Attachments
gdb.txt
(7.26 KiB) Downloaded 33 times
Post Reply