Accessing Provenance/Registry info within a workflow

Sep 13, 2012 at 12:12 PM

Dear Trident Team,

I would like to access information about a Job instance within an Activity so that I can write that information to an external Provenance database (not a local MS SQL).

  1. I see there is an instance variable of type GUID called WorkflowInstanceID available to an Activity but I can't find GUIDs reported here anywhere in the Trident database. Does it ever get persisted?
  2. Is there a way to access other parameters of a workflow Job within a workflow activity? Specifically, the ActivityID or the ID in the ActivitySequences table would be great to access.

Thanks,

Nick

Sep 14, 2012 at 7:53 AM
Edited Sep 14, 2012 at 10:11 AM

Thanks for your query.

You can use the context object to get the running workflow instance and all its related parameters. ctx.WFInstance gets the job instance. Please visit this thread for details.

Please let me know if this helps you.

Regards,
Trident Support Team

Sep 14, 2012 at 1:11 PM

Hi abhisheks,

Yes, that does help very much! I just figured this out a millisecond before reading your reply.

For others who read this: you need to import TridentContext.dll in the Libraries folder in the Trident install (usually "C:\Program Files\Microsoft Project Trident - A Scientific Workflow Workbench\Libraries") to get a Context object that contains the WFInstance.

Nick

Sep 20, 2012 at 7:03 AM

Hi abhisheks,

Following on from the above: if I want to recreate the tree of Activity objects of the current workflow that I'm in, I can get the top level (Root) Activity by doing something like

TridentContext.Context ctx = Context.Get();

ctx.WFInstance.Activity.Name

 

and the next level down by:

foreach (var a in ctx.WFInstance.Activity.Children)
{
  <print etc.> a.Name;
}

... but how do I recurse through the whole tree? I can't see how to get a Context for a given Activity. Context.Create(x,y,z); seems to need

x: a Microsoft.Research.DataLayer.Connection. Can I use the connection of the main workflow (ctx.Connection in the above snippets)?

y: Microsoft.Research.DataLayer.ActivityInstance. Where do I get this from?

z: Guid jobId. Do I use ctx.WFJobId in the above context?

I guess once I've created a Context for an Activity, I can access that Activity's Children and recurse down from there, right? It seems all that's missing is the ActivityInstance, so how do I get that?

 

Nick

Sep 21, 2012 at 5:17 AM

Actually, I can now find all the Activities in a workflow by recursing through the workflow's structure:

// Field on the activity class; accumulates the indented outline.
private string ActivityClassNames = "";

// In the body of the activity (for example in Execute):
TridentContext.Context ctx = Context.Get();
printActivityClassName(ctx.WFInstance.Activity.Children, 1);
this.StringOutput += this.ActivityClassNames;

// Name and type of the root (workflow-level) Activity.
private string printWorkflowClassName(Context ctx)
{
    return ctx.WFInstance.Activity.Name + "(" + ctx.WFInstance.Activity.Type.ToString() + ")\n";
}

// Appends one line per Activity, indented by depth, and recurses into non-leaf Activities.
private void printActivityClassName(List<Microsoft.Research.DataLayer.ActivitySequence> a, int tabLevel)
{
    foreach (var o in a)
    {
        for (int i = 0; i < tabLevel; i++)
        {
            this.ActivityClassNames += "\t";
        }
        this.ActivityClassNames += o.Activity.Name + " (" + o.Activity.Type.ToString() + ")\n";

        if (o.Activity.Type != Microsoft.Research.DataLayer.ActivityType.Leaf)
        {
            printActivityClassName(o.Children, tabLevel + 1);
        }
    }
}

So the question is really: how do I find the Activity Instances for Activities in this workflow other than the Activity I'm currently in? Within the Execute(ActivityExecutionContext executionContext) method I can get the current Activity Instance (executionContext.Activity.Name) and also any parents of this Activity (executionContext.Activity.Parent.Name), but how about sibling Activities and even child Activities?

Nick

Sep 21, 2012 at 7:27 AM

Hi,

The Workflow is also an activity. You get the context of the Job and then recurse the tree by using the Jobs and Children properties, or hasAssociated jobs. A context is created on the JobId every time a job is run. CreateContext is called by the Execute method of the JobExecutor class.

You can see the details of how this works by downloading the source code and opening it in Visual Studio. The projects and code pages are listed below.

TridentContext project: TridentContext.Context class.

WFExecutor project: Microsoft.Research.eResearch.Execution.JobExecutor class, Execute method and CreateContext private method.

Please let me know if this helps you.

 -Regards

 

 

Sep 28, 2012 at 8:33 AM

Please update us on your issue status. If you are still encountering any problem please let us know.

-Regards

Nov 9, 2012 at 3:07 AM

Hi abhisheks,

OK, thanks for the pointers. I am now able to move up and down the static and dynamic (instance) version of the workflow. I am in the process of exporting the workflow runs into PROV-O, the latest OWL provenance ontology.

One further question: is it possible to programmatically export a XOML file of a workflow from within an activity of that same workflow? This would emulate clicking "Save As" > "file system" within the Composer, but would allow me to export a XOML file with every run (to an external provenance system).

Nick

Nov 9, 2012 at 4:22 AM

I see in the Context of a workflow that there are two variables:

TridentContext.Context ctx = Context.Get();

ctx.WFInstance.Activity.XomlFileName and

ctx.WFInstance.Activity.XomlContents

The first one is reporting a XOML file name, <workflow_name>.xoml, but the second one gives no content. Where is it?

Nov 9, 2012 at 4:54 AM

OK, following through general WF code, I can serialize a workflow as follows:

WorkflowMarkupSerializer ser = new WorkflowMarkupSerializer();
XmlWriter xw = XmlWriter.Create(@"C:\MyWorkflow.xoml");
ser.Serialize(xw, ctx.WFInstance.Activity);
xw.Close(); // flush and release the file

but the endpoint must be incorrect, as the XOML file is largely blank and does not look like a normal XOML file:

<?xml version="1.0" encoding="utf-8"?><ns0:Activity Label="{p1:Null}" Name="GeoProv" FullName="{p1:Null}" IsCondition="False" IsDeleted="False" IsBlackbox="False" IconSize="0" IsInteractive="False" Icon="{p1:Null}" Author="{p1:Null}" Placement="Horizontal" OwnerGroupPrincipal="{p1:Null}" Version="25" Source="{p1:Null}" ActivityClass="{p1:Null}" Comments="{p1:Null}" IsHidden="False" Contacts="{p1:Null}" VersionLabel="" Description="{p1:Null}" DisplayLabel="{p1:Null}" IsBuiltIn="False" XomlContents="{p1:Null}" Type="Root" Keywords="{p1:Null}" xmlns:p1="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:ns0="clr-namespace:Microsoft.Research.DataLayer;Assembly=Microsoft.Research.DataLayer.ServiceRegistry, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null">
    <ns0:Activity.OwnerUserPrincipal>
        <ns0:User Description="{p1:Null}" IsBuiltin="True" Name="car587" Enabled="True" EmployeeStatus="Employee" UncategorizedWorkflows="{p1:Null}" ActivityRoot="{p1:Null}" IsDeleted="False">
            <ns0:User.WorkflowRoot>
                <ns0:Namespace Parent="{p1:Null}" Label="{p1:Null}" User="{p1:Null}" IsEditable="False" IconSize="0" Icon="{p1:Null}" Description="{p1:Null}" EditorColor="0X00000000" Name="UserWorkflowRoot" />
            </ns0:User.WorkflowRoot>
        </ns0:User>
    </ns0:Activity.OwnerUserPrincipal>
    <ns0:Activity.Owner>
        <ns0:User Description="{p1:Null}" IsBuiltin="True" Name="car587" Enabled="True" EmployeeStatus="Employee" UncategorizedWorkflows="{p1:Null}" ActivityRoot="{p1:Null}" IsDeleted="False">
            <ns0:User.WorkflowRoot>
                <ns0:Namespace Parent="{p1:Null}" Label="{p1:Null}" User="{p1:Null}" IsEditable="False" IconSize="0" Icon="{p1:Null}" Description="{p1:Null}" EditorColor="0X00000000" Name="UserWorkflowRoot" />
            </ns0:User.WorkflowRoot>
        </ns0:User>
    </ns0:Activity.Owner>
</ns0:Activity>
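
A note for anyone who hits the same result: the ns0 namespace in the output above points at Microsoft.Research.DataLayer, which suggests that ctx.WFInstance.Activity is the registry record for the workflow rather than a System.Workflow.ComponentModel.Activity, so the serializer is only dumping that record's properties. One thing that might be worth trying is to serialize the WF activity tree instead, starting from executionContext.Activity and walking up Parent to the root. This is only a sketch and has not been tested against Trident; the class and method names (XomlExport, FindRoot, SaveRootAsXoml) are made up for illustration, and whether the result matches what the Composer saves is an open question.

```csharp
using System.Workflow.ComponentModel;
using System.Workflow.ComponentModel.Serialization;
using System.Xml;

// Hypothetical helper, not part of Trident.
public static class XomlExport
{
    // Walk up the Parent chain until the top-level (workflow) activity is reached.
    public static Activity FindRoot(Activity current)
    {
        Activity root = current;
        while (root.Parent != null)
        {
            root = root.Parent;
        }
        return root;
    }

    // Serialize the WF activity tree that contains 'current' to a file.
    public static void SaveRootAsXoml(Activity current, string path)
    {
        var settings = new XmlWriterSettings { Indent = true };
        // The using block closes the writer, so the file is flushed to disk.
        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            new WorkflowMarkupSerializer().Serialize(writer, FindRoot(current));
        }
    }
}
```

From inside Execute(ActivityExecutionContext executionContext) it would be called as XomlExport.SaveRootAsXoml(executionContext.Activity, @"C:\MyWorkflow.xoml");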

 

Jan 28, 2014 at 11:58 PM
I have been doing this, but I intermittently get an exception:
System.Workflow.ComponentModel.WorkflowTerminatedException

Looking at the log, it seems that there is already an open data reader.
Quick investigation suggests that my method may be calling a stored procedure that is later called by something else.

Log is attached below:

Timestamp: 29/01/2014 11:07:40 AM
Message: HandlingInstanceID: af57a87e-f65e-45ee-a1ec-1c91ff86e5db

An exception of type 'System.Workflow.ComponentModel.WorkflowTerminatedException' occurred and was caught.

01/29/2014 11:07:40
Type : System.Workflow.ComponentModel.WorkflowTerminatedException, System.Workflow.ComponentModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
Message : Exception of type 'System.Workflow.ComponentModel.WorkflowTerminatedException' was thrown.
Source :
Help link :
Data : System.Collections.ListDictionaryInternal
TargetSite :
HResult : -2146233088
Stack Trace : The stack trace is unavailable.
Additional Info:

MachineName : HARRIER-BU
TimeStamp : 29/01/2014 12:07:40 AM
FullName : Microsoft.Practices.EnterpriseLibrary.ExceptionHandling, Version=5.0.414.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
AppDomainName : JobExecutorDomain
ThreadIdentity :
WindowsIdentity : NEXUS\smi99c

Category: FileLogger
Severity: Error
Machine: HARRIER-BU
Application Domain: JobExecutorDomain
Process Name: C:\Program Files\Microsoft Project Trident - A Scientific Workflow Workbench\Executor\TridentWorkflowHost.exe

Extended Properties:

Timestamp: 29/01/2014 11:07:40 AM
Message: HandlingInstanceID: 907b5fca-d5ba-4692-89f4-303eecdea272

An exception of type 'Microsoft.Research.DataLayer.BackendStorageException' occurred and was caught.

01/29/2014 11:07:40
Type : Microsoft.Research.DataLayer.BackendStorageException, Microsoft.Research.DataLayer.DataLayerCommon, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null
Message : Read error
Source : Microsoft.Research.DataLayer.DataProviders.Microsoft
Help link :
Data : System.Collections.ListDictionaryInternal
TargetSite : Void ReadOne(System.String, Action, System.Collections.Generic.List`1[Microsoft.Research.DataLayer.Parameter] ByRef)
HResult : -2146233088
Stack Trace : at Microsoft.Research.DataLayer.SQLConnectionWorker.ReadOne(String name, Action action, List`1& paramList)
at Microsoft.Research.DataLayer.SQLConnectionWorker.Object(Storage obj, Action action, List`1& paramList)
at Microsoft.Research.DataLayer.ConnectionSecure.Object(Storage obj, Action action, List`1& paramList)
at Microsoft.Research.DataLayer.Job.Refresh()
at Microsoft.Research.eResearch.Execution.JobMonitor.<>c__DisplayClass9.<MonitorRegistryJob>b__7()
at Microsoft.Research.eResearch.Common.ExceptionHandler.Handle(String policyName, Action action)

Additional Info:

MachineName : HARRIER-BU
TimeStamp : 29/01/2014 12:07:40 AM
FullName : Microsoft.Practices.EnterpriseLibrary.ExceptionHandling, Version=5.0.414.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
AppDomainName : JobExecutorDomain
ThreadIdentity :
WindowsIdentity : NEXUS\smi99c
Inner Exception
---------------
Type : System.InvalidOperationException, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Message : There is already an open DataReader associated with this Command which must be closed first.
Source : System.Data
Help link : 
Data : System.Collections.ListDictionaryInternal
TargetSite : Void ValidateConnectionForExecute(System.Data.SqlClient.SqlCommand)
HResult : -2146233079
Stack Trace :    at System.Data.SqlClient.SqlInternalConnectionTds.ValidateConnectionForExecute(SqlCommand command)
   at System.Data.SqlClient.SqlCommand.ValidateCommand(String method, Boolean async)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader()
   at Microsoft.Research.DataLayer.SQLConnectionWorker.ReadOne(String name, Action action, List`1& paramList)

Category: FileLogger
Severity: Error
Machine: HARRIER-BU
Application Domain: JobExecutorDomain
Process Name: C:\Program Files\Microsoft Project Trident - A Scientific Workflow Workbench\Executor\TridentWorkflowHost.exe

Extended Properties:
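
For what it's worth, the inner exception in the log ("There is already an open DataReader associated with this Command which must be closed first") is the usual SqlClient error when a second command is run on a connection that still has an open reader. Here the second command comes from JobMonitor calling Job.Refresh(). One commonly used setting for this on SQL Server 2005 and later is Multiple Active Result Sets (MARS), switched on in the connection string. The sketch below only shows how to add that flag with SqlConnectionStringBuilder; the helper name EnableMars is made up. It is an assumption that the registry connection string Trident uses can be changed at all, and the thread does not say where it is stored. Note also that MARS lets commands interleave on one connection but does not make the connection safe to use from several threads at once, so using a separate connection for the provenance reads may be the cleaner option.

```csharp
using System.Data.SqlClient;

// Hypothetical helper, not part of Trident.
public static class MarsConnectionString
{
    // Returns a copy of the given SQL Server connection string with MARS enabled.
    public static string EnableMars(string connectionString)
    {
        var builder = new SqlConnectionStringBuilder(connectionString)
        {
            MultipleActiveResultSets = true
        };
        return builder.ConnectionString;
    }
}
```

Example call, with a placeholder server and database name: MarsConnectionString.EnableMars("Data Source=localhost;Initial Catalog=Example;Integrated Security=True");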