[ETL] Native support for 1:m..n rules #123

Open
arcanefoam opened this issue Sep 27, 2024 · 9 comments
Labels
enhancement New feature or request

Comments

@arcanefoam
Contributor

arcanefoam commented Sep 27, 2024

ETL natively supports 1:n transformation rules. However, in some cases we need 1:m..n rules, which I call runtime multiplicity, in which the actual number of output elements cannot be determined until runtime.

An example is a port -> implements relation between Ports and Interfaces, where the interface needs to be transformed m times, where m is the number of ports that use the interface.
Dimitris has provided an alternative:

rule OperationInInterface
	transform sOp:SysML!Operation
	to runbls: Sequence {
	
	var aClntSrvOp = sOp.equivalents().selectOne(eq | ...);
	for (p in portsUsingTheInterface(sOp.owner)) {
		var aSrvComSpec = new AUTOSAR!ServerComSpec();
		aSrvComSpec.`operation` = aClntSrvOp;
		p.equivalent().providedComSpec.add(aSrvComSpec);
		runbls.add(aSrvComSpec);	
	}
	
}

This workaround works, but using the equivalents operation is useless unless you define the rule as lazy.
The execution time of the lazy algorithm is much worse than the base one.

My proposal is to support 1:m..n rules natively.
The proposed syntax change adds a foreach keyword in which you specify a sequence statement.
The size of the returned sequence (m) will be used to instantiate m Tuples, each containing one set of target parameters.

CST Addition

(@abstract)?
(@lazy)?
(@primary)?
rule <name>
    transform <sourceParameterName>:<sourceParameterType>
    to (<targetParameterName>:<targetParameterType>
        ,<targetParameterName>:<targetParameterType>)*
    (foreach <sequence>)?
    (extends <ruleName> (, <ruleName>)*)? {

    (guard (:expression)|({statementBlock}))?

    statement+
	
}

With this the previous example would be

rule OperationInInterface
	transform sOp:SysML!Operation
	to runbls: aSrvComSpec: AUTOSAR!ServerComSpec()
			foreach portsUsingTheInterface(sOp.owner)
	{
	
	var aClntSrvOp = sOp.equivalents().selectOne(eq | ...);
	
	aSrvComSpec.`operation` = aClntSrvOp;
	loopVar.equivalent().providedComSpec.add(aSrvComSpec);
	runbls.add(aSrvComSpec);	
	
	
}

Note that using the foreach keyword would result in a stack variable called loopVar that can be used in the rule.

When using the equivalent(s) operation you would get a List<Tuple<>>. Each tuple will contain all the target parameters plus the loopVar.

In the ETL implementation

Since ETL already creates the output elements in a first pass, I think implementing this should not be too complex.
We would execute the foreach statement and create the list of tuples. When we get to the body we repeat it m times, replacing the stack variables as needed.
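
The two steps described above can be modelled roughly as follows. This is only a Python sketch of the proposed semantics, not Epsilon code: `Target`, `run_foreach_rule`, and the sample operation and port names are hypothetical stand-ins, and the tuples are represented as plain dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Stand-in for a freshly created, still empty target model element."""
    type_name: str
    attrs: dict = field(default_factory=dict)


def run_foreach_rule(source, target_types, foreach, body):
    # First pass: evaluate the foreach sequence once and create one tuple
    # per item, holding the item (loopVar) plus one empty target per
    # declared target parameter.
    tuples = [
        {"loopVar": item,
         **{name: Target(type_name) for name, type_name in target_types.items()}}
        for item in foreach(source)
    ]
    # Second pass: run the rule body m times, rebinding the variables
    # to the contents of each tuple.
    for variables in tuples:
        body(source, **variables)
    return tuples


def body(op, loopVar, aSrvComSpec):
    # Hypothetical body: record the operation and the port it serves.
    aSrvComSpec.attrs["operation"] = op["name"]
    aSrvComSpec.attrs["port"] = loopVar


operation = {"name": "getSpeed", "ports": ["p1", "p2", "p3"]}
tuples = run_foreach_rule(
    operation,
    {"aSrvComSpec": "ServerComSpec"},
    lambda op: op["ports"],
    body,
)
```

Because every tuple exists before any body runs, a lookup of the rule's equivalents would already see all m sets of targets, which is the property the proposal relies on.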

For more complex scenarios, the foreach could be a block, instead of a keyword:

rule <name>
    transform <sourceParameterName>:<sourceParameterType>
    to (<targetParameterName>:<targetParameterType>
        ,<targetParameterName>:<targetParameterType>)*
    (foreach (:expression)|({statementBlock}))?

This would allow breaking down the sequence statement or collecting elements from various places.

If this is of interest, we can discuss more details and I can work on this (if @agarciadom is not too eager to jump and do it :) ).

@arcanefoam arcanefoam added the enhancement New feature or request label Sep 27, 2024
@kolovos
Contributor

kolovos commented Sep 29, 2024

This workaround works, but using the equivalents operation is useless unless you define the rule as lazy.

I thought the same off the top of my head but this doesn't seem to be the case. Calling .equivalent() actually causes the body of the target rule to be executed as shown in https://eclipse.dev/epsilon/playground/?notes2coins

@arcanefoam
Contributor Author

Sadly, in my case I seem to get non-deterministic results, as in, the results of the transformation change between runs. This might be caused by other rules and the complexity of my case. However, I am unable to produce a minimal example that highlights the issue.

Although I cited the equivalents behaviour as a motivation, I also think that the syntax provides a better description of the rule's intent, and simplifies part of the rule body.

@kolovos
Contributor

kolovos commented Oct 1, 2024

I think that the issue might be that in line 4 of your rule you are calling .equivalents() on the element under transformation (sOp). Perhaps refactoring the transformation so that you don't need to do this might help?

rule OperationInInterface
	transform sOp:SysML!Operation
	to runbls: Sequence {
	
	var aClntSrvOp = sOp.equivalents().selectOne(eq | ...);

I agree that adding some dedicated syntax would be useful (ATL provides the distinct keyword for this). However, I would avoid introducing a new keyword and would favour for instead of foreach, i.e.

rule WalletToPouch
    transform wallet : Source!Wallet
    to pouch : Target!Pouch {
    
    pouch.coins ::= wallet.notes;
}

rule NoteToCoins
    transform note : Source!Note
    to coins : Sequence<Target!Coin>
        for : 1.`to`(note.value) {
    
    guard: note.value > 0
    
    for (coin in coins) {
        coin.value = 1;
        coins.add(coin);
    }
}

Also, we should support these for statements for each output parameter i.e.

rule <name>
    transform <sourceParameterName>:<sourceParameterType>
    to (<targetParameterName>:<targetParameterType> (for (:expression)|({statementBlock}))?
        ,<targetParameterName>:<targetParameterType> (for (:expression)|({statementBlock}))?)*

Thoughts?

@agarciadom
Contributor

agarciadom commented Oct 1, 2024

I like the use of the for keyword, but I'm not a big fan of having Sequence<T>s that we add() to. It'd be far too easy for someone to say they're going to create n elements but then have a bug in the rule body which results in n-1 elements or n+1 elements being added to that collection. There wouldn't be much point in providing that expression then.

I'd change the semantics slightly, so the rule populates the Sequence<Coin> in advance, then we just loop over it. We could wrap the coins sequence in an UnmodifiableList so any attempt to change it from the body would raise an exception. In addition, if the for will always be a sequence of numbers, it wouldn't even have to be a list, just an integer representing multiplicity:

rule NoteToCoins
    transform note : Source!Note
    to coins : Sequence<Target!Coin> for : note.value
{
    guard: note.value > 0
    for (coin in coins) {
        coin.value = 1;
    }
}
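
The "populate in advance, then forbid changes" idea can be illustrated with a small Python model. This is a sketch under assumptions, not ETL behaviour: `create_targets` is a hypothetical helper, and a Python tuple stands in for the unmodifiable list, since it offers no way to add elements.

```python
def create_targets(multiplicity):
    # First phase: create exactly `multiplicity` empty coins up front.
    # The tuple plays the role of the unmodifiable list.
    return tuple({"value": None} for _ in range(multiplicity))


coins = create_targets(3)

# The rule body may still populate the elements that already exist...
for coin in coins:
    coin["value"] = 1

# ...but it cannot change how many there are.
try:
    coins.append({"value": 1})
    grew = True
except AttributeError:
    grew = False
```

In Java, a list wrapped with `Collections.unmodifiableList` would signal the same mistake by throwing `UnsupportedOperationException` on `add()`.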

Another option could be something like this, which reuses the for loop syntax of EOL and exposes the collection element as a variable for use within the body and guard (assuming it runs once per (note, i) pair):

rule NoteToCoins
    transform note : Source!Note
    to coin : Target!Coin
    for (i : Integer in 1.`to`(note.value))
{
    guard: note.value > 0
    coin.value = 1;
}

The above option would not allow for different counts for each target type, though.

@arcanefoam
Contributor Author

My idea is more like the second case in Antonio's post. If you look at my initial post, by moving the 'for' to the rule definition the idea is to remove the for loop from the rule's body. Additionally, there is no 'Sequence' type in the 'to' definition. One set of 'to' elements will be created for each iteration. This means that you always get the same number of elements of each 'to' (I would not allow a separate collection for each 'to' element as it would complicate the semantics). Perhaps the coin example is too simple to demonstrate the idea.

This is another one: https://eclipse.dev/epsilon/playground/?0c4cef86

The intention of the change is that the second rule can be changed from this:

rule ClassroomToPC
    transform c : Source!Classroom
    to assets : Sequence {
    
    for (s in c.students) {
        var u = new Target!User;
        u.name = s.name;
        var f = new Target!Folder;
        f.path = "home/" + s.name.toLowerCase();
    }
}

to this:

rule ClassroomToPC
    transform c : Source!Classroom
    to u:Target!User, f:Target!Folder
    for c.students {

    u.name = loopVar.name;
    f.path = "home/" + loopVar.name.toLowerCase();
}

In my head, the two main benefits are the ability to explicitly state what type of elements are created and a simpler rule body.

@kolovos
Contributor

kolovos commented Oct 1, 2024

This is another one: https://eclipse.dev/epsilon/playground/?0c4cef86

@arcanefoam: Wouldn't the following be a more sensible way to implement this transformation?

// Transforms a class room, into a PC
// with 1 account+folder per student
rule ClassroomToPC
    transform c : Source!Classroom
    to pc : Target!PC {
    
    pc.users.addAll(c.students.equivalent()
        .select(e | e.isTypeOf(Target!User)));
    pc.folders.addAll(c.students.equivalent()
        .select(e | e.isTypeOf(Target!Folder)));
}

rule StudentToUserAndFolder
    transform s : Source!Student
    to u : Target!User, f : Target!Folder {
    
    u.name = s.name;
    f.path = "home/" + s.name.toLowerCase();
}

@agarciadom: The call to the add() function should not have been there. What I meant to write was:

rule WalletToPouch
    transform wallet : Source!Wallet
    to pouch : Target!Pouch {
    
    pouch.coins ::= wallet.notes;
}

rule NoteToCoins
    transform note : Source!Note
    to coins : Sequence<Target!Coin> for : 1.`to`(note.value) {
    
    guard: note.value > 0
    
    for (coin in coins) {
        coin.value = 1;
    }
}

My thinking was that the for expression would return a collection and for every item of the collection, we'd create a Coin and add it to coins (we could even trace the target elements back to the source elements).
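
A rough Python model of that behaviour, offered only as a sketch: `instantiate_for` is a hypothetical helper, the coins are plain dictionaries, and the trace is a simple list of (item, target) pairs rather than ETL's actual transformation trace.

```python
def instantiate_for(items, make_target):
    """Create one target per item of the for-expression's result.

    Returns the populated sequence plus (item, target) pairs that
    link each created target back to the item it was created for.
    """
    targets, trace = [], []
    for item in items:
        target = make_target()
        targets.append(target)
        trace.append((item, target))
    return targets, trace


note_value = 3  # stand-in for note.value
coins, trace = instantiate_for(
    range(1, note_value + 1),  # mirrors 1.`to`(note.value)
    lambda: {"type": "Coin", "value": None},
)

# The rule body then only has to populate the existing coins.
for coin in coins:
    coin["value"] = 1
```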

A limitation with this solution is e.g. that we could not fill the coins sequence with instances of different sub-types of Coin if we wanted to. To provide more control to the user, in addition to the for iterator, we could support initialiser expressions i.e.

rule WalletToPouch
    transform wallet : Source!Wallet
    to pouch : Target!Pouch {
    
    pouch.coins ::= wallet.notes;
}

rule NoteToCoins
    transform note : Source!Note
    to coins : Sequence<Target!Coin> = 1.`to`(note.value).collect(i | new Target!Coin(value = 1)) {
    
    guard: note.value > 0
    
}

Initialiser expressions could also be useful for single-valued targets e.g.

rule WalletToPouch
    transform wallet : Source!Wallet
    to pouch : Target!Pouch = wallet.notes.value.sum() > 10 ? new Target!LeatherPouch : new Target!FabricPouch {
    
    pouch.coins ::= wallet.notes;
}

Of course, users should avoid calling equivalent() in initialiser expressions and we could even look into detecting such calls and throwing an exception.

@arcanefoam
Contributor Author

@kolovos
I was just trying to exemplify my idea.

I think you are focusing on getting the count of expected elements in the sequence, that would not work for my case... and I don't see the added value vs what is available atm.

My grudge is with the Sequence as a 'to' type. It makes the rule specification unclear (i.e. you need to read the code to know what is going into the sequence; the coin example with a single 'to' is not a good example to drive the discussion). At least from what I need and perceive as improved syntax/semantics, just giving the number of instances beforehand is not much of an improvement.

@kolovos
Contributor

kolovos commented Oct 1, 2024

ETL's default execution algorithm, which is used when there are no lazy rules, works in two phases. In the first phase, it goes through all the rules and creates empty target model elements from source elements, and in the second phase it executes the bodies of rules to populate the contents of target elements.

I think you are focusing on getting the count of expected elements in the sequence, that would not work for my case... and I don't see the added value vs what is available atm.

At the moment, after the first phase, the coins/assets sequences are empty - which is presumably what causes the issues with equivalents() that you described in your original message. With the proposed extension, the coins/assets sequences would be populated in the first phase.
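
The difference can be seen in a deliberately simplified Python model of the two phases. This is a sketch, not Epsilon's implementation: rules are plain dictionaries, an integer stands in for a note and its value, and at most one rule is assumed to apply per source.

```python
def phase_one(sources, rules):
    """Create the declared targets for every source; no body runs yet."""
    return {
        src: rule["create"](src)
        for src in sources
        for rule in rules
        if rule["applies"](src)
    }


def phase_two(trace, rules):
    """Execute rule bodies to populate the targets created in phase one."""
    for src, targets in trace.items():
        for rule in rules:
            if rule["applies"](src):
                rule["body"](src, targets)


# Current syntax: `to coins : Sequence` yields an empty sequence in
# phase one, and only the body fills it.
current_rule = {
    "applies": lambda note: True,
    "create": lambda note: [],
    "body": lambda note, coins: coins.extend({"value": 1} for _ in range(note)),
}

# Proposed extension: the sequence is already populated in phase one,
# and the body only sets values on existing elements.
proposed_rule = {
    "applies": lambda note: True,
    "create": lambda note: [{"value": None} for _ in range(note)],
    "body": lambda note, coins: [coin.update(value=1) for coin in coins],
}

current_trace = phase_one([3], [current_rule])
current_after_phase_one = len(current_trace[3])

proposed_trace = phase_one([3], [proposed_rule])
proposed_after_phase_one = len(proposed_trace[3])

phase_two(current_trace, [current_rule])
phase_two(proposed_trace, [proposed_rule])
```

In this model, anything that inspects the targets between the two phases finds nothing for `current_rule` but all three coins for `proposed_rule`.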

I was just trying to exemplify my idea.
... the coin example with a single 'to' is not a good example to drive the discussion

It'd be useful to come up with an example that cannot be expressed concisely enough using the current syntax of ETL so that we can use it to discuss different solutions.

@agarciadom agarciadom changed the title Native support for 1:m..n rules [ETL] Native support for 1:m..n rules Oct 9, 2024
@agarciadom
Contributor

We had a chat about this today, and we agree that initializers on to parameters would simplify some common scenarios, so we have filed a separate issue about adding these to ETL:

#125

However, these would not necessarily help with this 1:m..n scenario. Horacio has agreed to produce a minimal working example that illustrates the problem, which we can use to draft proposals to revise ETL to make those scenarios easier to work with.


3 participants