[Fixed] Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Here's the place for discussion related to coding in FreeCAD, C++ or Python. Design, interfaces and structures.
chrisb
Posts: 19635
Joined: Tue Mar 17, 2015 9:14 am

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby chrisb » Mon May 01, 2017 2:26 pm

FreeCAD does not rely on any order?
abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Mon May 01, 2017 8:20 pm

chrisb wrote:FreeCAD does not rely on any order?
It seems to me that you have provided proof it doesn't ;)
chrisb
Posts: 19635
Joined: Tue Mar 17, 2015 9:14 am

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby chrisb » Mon May 01, 2017 9:01 pm

abdullah wrote:
chrisb wrote:FreeCAD does not rely on any order?
It seems to me that you have provided proof it doesn't ;)
Well, I moved some parts to a place known from another working version. So I wouldn't really call this a proof, but I will take it for granted until someone prooves the opposite :mrgreen: .
abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Mon May 22, 2017 8:36 pm

I came back to this:

https://github.com/FreeCAD/FreeCAD/pull/771

I would appreciate some discussion about the solution and intended behaviour.

Until now, when using recompute, on:
1. Redundant constraints
2. Conflicting constraints
3. Solver failure

There is no update of the sketcher geometry. Up to some point, I would be happy with this, as if something is not ok, it is best to fix it first and then continue. Specially as with recompute you will be recomputing anything depending on the sketch (should anything exist). However, not showing added geometry is not nice either.

When not using recompute, solve() updates the geometry except in the case of solver failure. In the latter the situation would be similar as what happens in this bug today.

The PR above (not intending for merge), includes just a geometry update in the three cases.

The objective of this post is to determine what is what we want to do:
1. No geometry update until the problem is solved => no PR => perceived bug is a feature.
2. Geometry update with Redundant and Conflicting, as with solve(), but not with solver failure => modify PR => the perceived bug will happen in the more unlikely case of solver failure.
3. Update Geometry in all three cases => PR as it is now.

I am not sure if this will bring back unwanted behaviours in other cases, as there are many different cases that run execute and I do not have the expectations of every single case in my head.

Let me know.
abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Sat May 27, 2017 10:30 am

abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Sat May 27, 2017 10:40 am

abdullah wrote:To be reconsidered in view of:
https://freecadweb.org/tracker/view.php?id=2836
The solution is not the one proposed in the PR.

#2836 is a generalisation of the problem in #2710. A good solution for #2836 would solve #2710 and probably other difficult to debug crashes.

The problem lays in the Sketcher and the recompute mechanism. The Sketcher works by keeping synchronised the solver geometry with the 3D view geometry. Lack of synchronisation ends up in crash due to SoNode indexing.

I agree that SoNode implementation should be more robust and not crash for any reason. Arguably, all the problems could be solved by making that implementation more robust, as then there will be no crash and eventually the user will solve the problem (or die trying to). However, it seems like a good idea trying to maintain solver and 3D view in sync (so long the approach we have followed in the Sketcher).

Recompute and out of sync

Recompute returns at the first chance that recompute is not successful.

In #2710, it is a premature return in SketchObject::execute() due to redundant/conflicting constraints.

In #2836, it is even more premature, as SketchObject::execute is not even run, as the recompute process stops before that, at Document::Recompute() at:

Code: Select all

 try {
        // this sort gives the execute
        boost::topological_sort(d->DepList, std::front_inserter(make_order));
    }
    catch (const std::exception& e) {
        std::cerr << "Document::recompute: " << e.what() << std::endl;
        return -1;
    }
I think we need a general mechanism to tidy up objects when recompute is interrupted. Something like a signal onRecomputeFailed() children can subscribe to or something similar. I do not mind coding one, but I would appreciate some advice "from above" before I code an unacceptable something.
abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Sat May 27, 2017 9:43 pm

Ok, I have been working further in this issue and have implemented a general mechanism.

This solves both #2710 and #2836.

I think it is quite well explained in the commit message, so I just copy it here. Let me know if this PR is not acceptable:

https://github.com/FreeCAD/FreeCAD/pull/783

Code: Select all


    App:Document/DocumentObject onRecomputeFailed implementation
    
    ============================================================
    
    Why?
    
    Recompute returns on first "problem". This may leave some objects in an unwanted state (for example the sketches).
    
    Mechanism explanation:
    
    Document Layer:
    1. If a general recompute failure, for example a non-DAG dependency happens, then on RecomputeFailed is executed for the current document. This means that:
    a) For each documentobject in the document, it is checked whether must execute flag is set, IF SET, then the documentobject layer onRecomputeFailed() is executed
    2. If a feature recompute returns true (this means that the recompute process must be interrupted), then a) is executed.
    
    DocumentObject Layer:
    1. If the recompute of a feature fails, but it fails in a way that "false" is returned (this means the recompute process is not interrupted), then:
    b) onRecomputeFailed is unconditionally executed for that feature.
    
    DocumentObject level onRecomputeFailed is virtual and empty. It must be overriden if the documentobject must do anything special when recomputing fails.

wmayer
Site Admin
Posts: 14999
Joined: Thu Feb 19, 2009 10:32 am

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby wmayer » Thu Jun 01, 2017 9:45 pm

#2836 is a generalisation of the problem in #2710. A good solution for #2836 would solve #2710 and probably other difficult to debug crashes.
Why do you consider 2836 (btw did you succeed in reproducing the crash?) a generalisation of 2710? To me these are two completely different problems with different causes. 2836 fails to recompute because the graph is not a DAG and thus doesn't even start a recompute but exits immediately. In 2710 the graph is a valid DAG but the recompute of the sketch stops because it handles the redundant constraints as error but leaves the sketch object in an inconsistent state.
The problem lays in the Sketcher and the recompute mechanism.
No, the issue described in 2710 is a sketch-specific issue and has absolutely nothing to do with the recompute mechanism. In 2836 we have a broken linkage of objects of the document. Here the question would be to try at least to recompute the part of the document with correct links. But that's another story.
The Sketcher works by keeping synchronised the solver geometry with the 3D view geometry. Lack of synchronisation ends up in crash due to SoNode indexing.
That's exactly the point. It's up to the sketch object itself to make sure it's in a consistent state. If for any reason it cannot guarantee this in all cases then the methods accessing the geometry/constraints (in this case when updating the SoNodes) should take care of this fact and thus has to check that it can safely access the index values.
I agree that SoNode implementation should be more robust and not crash for any reason. Arguably, all the problems could be solved by making that implementation more robust, as then there will be no crash and eventually the user will solve the problem (or die trying to).
Exactly.
Recompute returns at the first chance that recompute is not successful.
No, that's wrong. The recompute only stops for severe exception like when running out of memory. But for less critical exceptions it continues to recompute the whole document.
I think we need a general mechanism to tidy up objects when recompute is interrupted.
Definitely not. When an object's execute() method is called it must make sure that even if it's in an erroneous state it's not causing a crash.
Something like a signal onRecomputeFailed() children can subscribe to or something similar.
Inside this method you don don't even know in which context an exception occurred and thus don't know what exactly happened and what to clean-up.
I think it is quite well explained in the commit message, so I just copy it here. Let me know if this PR is not acceptable:
Having the new method onRecomputeFailed() weakens the current design of document recompute and opens a whole can of worms mainly because we "hide" the actual problem instead of fixing it.
In bug 2710 we have a sketch-only problem but affect the recompute mechanism. In bug 2836 we have more or less the opposite where each object considered individually is/can be in a perfect consistent state and then a call of onRecomputeFailed() is useless.

So, IMO the first step to fix the crash is harden the code that is responsible for the crash. The next step would be to avoid to let a sketch become inconsistent where possible. As far as I can see the sketch only becomes inconsistent when in the task panel the option for automatic recompute is active. So, this should be doable with changing the code from

Code: Select all

if (autoRecompute)
    Gui::Command::updateActive();
else
    static_cast<Sketcher::SketchObject *>(sketchgui->getObject())->solve();            
to

Code: Select all

static_cast<Sketcher::SketchObject *>(sketchgui->getObject())->solve();            
if (autoRecompute)
    Gui::Command::updateActive();

As said above in 2836 the recompute already stops before it has started. Here we can think about a solution to ignore the objects that cause the non-DAG issue and recompute all other objects. Or we simply call the recompute for each object in a linear order.
abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Fri Jun 02, 2017 9:59 am

Deleted. Wrong information due to expiry of the authentication.
Last edited by abdullah on Fri Jun 02, 2017 10:00 am, edited 1 time in total.
abdullah
Posts: 3174
Joined: Sun May 04, 2014 3:16 pm

Re: Crash Bug #2710 Freecad doesn't show sketch elements added when overconstrained

Postby abdullah » Fri Jun 02, 2017 9:59 am

wmayer wrote:
Thu Jun 01, 2017 9:45 pm
Why do you consider 2836 (btw did you succeed in reproducing the crash?) a generalisation of 2710? To me these are two completely different problems with different causes. 2836 fails to recompute because the graph is not a DAG and thus doesn't even start a recompute but exits immediately. In 2710 the graph is a valid DAG but the recompute of the sketch stops because it handles the redundant constraints as error but leaves the sketch object in an inconsistent state.
Probably the choice of wording was not the best. They are two related cases to which a general solution can be applied so as to solve both. It is not that an error is a generalisation of the other error, but at least one general solution to the ticket 2710 also solves 2836. In any case, they bugs themselves are related at least because they share some commonalities (both arise during the recompute mechanism, both relate to current limitations to the recompute mechanism, both ultimately leave the sketch in an out of sync situation), which is what allows for the general solution.

To the question: I can not remember if I was able to reproduce the crash. I think it was just a problem with non appearing geometry.
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
The problem lays in the Sketcher and the recompute mechanism.

No, the issue described in 2710 is a sketch-specific issue and has absolutely nothing to do with the recompute mechanism. In 2836 we have a broken linkage of objects of the document. Here the question would be to try at least to recompute the part of the document with correct links. But that's another story.
Well, either we are working with different definitions of the same reality or I must disagree.

To the first: it is caused by operating by the choice of returning control directly after the first error when recomputing a DocumentObject. I do not know if this was a decision made by design (I was not here at the time), but it is indeed a pattern that can be seen in most execute() methods of DocumentObject derivatives. If there was not such decision (and therefore there is no such limitation, i.e. to must stop execute on the first error), then admits a higher amount of solutions. In any case, I agree that this ticket can be solved by a sketcher-only solution, only it is a non-sense to do it so, if another mechanism is needed for the other ticket that could solve both. It was done so in the original PR:

https://github.com/FreeCAD/FreeCAD/pull/771/files

To the second: the execute() of each DocumentObject is part of the recompute mechanism in my definition. If I just solve() without recomputing (updateActive), then we do not run into the problem.

While I understand your idea of "trying to recompute part of the document", more and more, and specially considering other tickets, I am of the opinion that when there is a non-DAG problem the smartest solution is to a) clearly notify the user so that he stops working further and b) try to help him providing him with the best information (we have the dependency graph, maybe reusing part of it to provide a summary of what is wrong and referring him to the dependency graph is a good start).
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
Recompute returns at the first chance that recompute is not successful.

No, that's wrong. The recompute only stops for severe exception like when running out of memory. But for less critical exceptions it continues to recompute the whole document.
The recompute of a DocumentObject generally as a pattern as said above returns on the first error. I was never referring to the recompute of the Document itself. As a consequence of an error the DocumentObject interrupts the execution of itself and returns the control to the Document, who may or may not decide to continue as you say.
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
I think we need a general mechanism to tidy up objects when recompute is interrupted.

Definitely not.
You make a good point in that the method does not know what happens and that in some cases, where the object is a perfect state a call to onRecomputeFailed is useless. However, the goal was to ensure an acceptable state, which included to reach one if not in a good state at the expense of not improving if in a good state.

In any case, I can accept your decision. So let's concentrate in what is acceptable:
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
So, IMO the first step to fix the crash is harden the code that is responsible for the crash.
I am not sure if I understood you correctly a couple of weeks ago, but you said you wanted to undertake this on your own. I have put on hold anything relating to the sketcher and representation until that happens, as it makes no sense to interfere or duplicate effort.
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
The next step would be to avoid to let a sketch become inconsistent where possible. As far as I can see the sketch only becomes inconsistent when in the task panel the option for automatic recompute is active. So, this should be doable with changing the code from
That is a very bad solution because there is no need to solve two times and all the rest of the overhead like updating the Geometry in a non-failing case two times. Let's try to clarify the issue with the pattern in the execute functions so that we can find a really good solution and not just an easy one. Below is a fragment of SketchObject::execute() showing only the recompute return paths:

Code: Select all

App::DocumentObjectExecReturn *SketchObject::execute(void)
{
    try {
        App::DocumentObjectExecReturn* rtn = Part2DObject::execute();//to positionBySupport
        if(rtn!=App::DocumentObject::StdReturn)
            //error
            return rtn;
    }
    catch (const Base::Exception& e) {
        return new App::DocumentObjectExecReturn(e.what());
    }

    if (lastDoF < 0) { // over-constrained sketch
        std::string msg="Over-constrained sketch\n";
        appendConflictMsg(lastConflicting, msg);
        return new App::DocumentObjectExecReturn(msg.c_str(),this);
    }
    if (lastHasConflict) { // conflicting constraints
        std::string msg="Sketch with conflicting constraints\n";
        appendConflictMsg(lastConflicting, msg);
        return new App::DocumentObjectExecReturn(msg.c_str(),this);
    }
    if (lastHasRedundancies) { // redundant constraints
        std::string msg="Sketch with redundant constraints\n";
        appendRedundantMsg(lastRedundant, msg);
        return new App::DocumentObjectExecReturn(msg.c_str(),this);
    }
    
    if (lastSolverStatus != 0)
        return new App::DocumentObjectExecReturn("Solving the sketch failed",this);

}
Question: Is it acceptable that I do not return just after the first error has been detected? For example, it is acceptable to define booleans for the different errors and move all the return statements to the end of the function, so that I return the right error message, but I can still leave my sketch in a sync state?
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
As said above in 2836 the recompute already stops before it has started. Here we can think about a solution to ignore the objects that cause the non-DAG issue and recompute all other objects.
I think that such a solution would still be open to similar unwanted behaviour. An object may be part of a DAG problem (e.g. because of a cyclic support dependency) and still need a recompute (a correctly coded execute() that put a solver back in sync).
wmayer wrote:
Thu Jun 01, 2017 9:45 pm
Or we simply call the recompute for each object in a linear order.
This should solve any outstanding issue.

I think we have to combine this with a clear indication of the DAG issue to the user.

Another idea is to add to the Document StatusBits a new flag isNonDAG. This flag then could be used in the DocumentObjects, to enable special modes (e.g. prevent addition of new geometry, targeted reminders of the DAG issue). An example of the latter, if we can identify the DocumentObjects creating the non-DAG problem, we enable DocumentObjects to query whether they are part of the problem, so as properly inform a user. This point though, is probably better discussed when the rest is done.