gouttegd
Contributor

@gouttegd gouttegd commented Sep 1, 2025

This PR adds a new method to the MappingSetDataFrame class to automatically determine the minimum version of the SSSOM specification the set is compatible with – that is, the earliest version that defines all the slots and all the enum values present in the set.

This method could later be used to implement the behaviour recommended by the spec: automatically inserting the sssom_version slot when a set is written, if the set requires a version other than 1.0. This could be as simple as:

# Assuming msdf is the mapping set we have to write
min_version = msdf.get_compatible_version()
if min_version != "1.0":
    msdf.metadata["sssom_version"] = min_version
# Now we can actually write the msdf
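To make "the earliest version that defines all the slots" concrete, here is a minimal sketch of the idea. The `ADDED_IN` table, `minimum_version` and `format_version` are hypothetical names, and the table entries are illustrative only. The actual method reads this information from the LinkML schema and also checks enum values, which this sketch does not do.

```python
# Hypothetical table: the version in which each slot was introduced.
# The entries are illustrative only, not taken from the SSSOM schema.
ADDED_IN = {
    "subject_id": (1, 0),
    "object_id": (1, 0),
    "example_new_slot": (1, 1),
}


def minimum_version(slots_in_use):
    """Return the earliest (major, minor) version defining every slot in use."""
    # A set needs at least the newest "added in" version among its slots;
    # unknown slots and empty sets fall back to (1, 0).
    return max((ADDED_IN.get(s, (1, 0)) for s in slots_in_use), default=(1, 0))


def format_version(version):
    """Render a (major, minor) tuple as a string such as "1.1"."""
    return ".".join(str(part) for part in version)
```

A set using only `subject_id` and `object_id` would stay at 1.0 under these assumptions, while adding `example_new_slot` would raise the minimum to 1.1.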

Add a new method to the MappingSetDataFrame class to automatically
determine the minimum version of the SSSOM specification the set is
compatible with -- that is, the earliest version that defines all the
slots and all the enum values present in the set.
@gouttegd gouttegd self-assigned this Sep 1, 2025
@gouttegd
Contributor Author

gouttegd commented Sep 1, 2025

⚠️ This cannot work for now because the latest released version of sssom is still the 1.0.0 from last year, in which the sssom_version_enum does not exist.

Fix wrong slot name when looking for "composed entity expression".

Let Python compare version numbers as tuples of integers.

Use `max(list)` instead of `sorted(list)[-1]`.
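A short sketch of why the two commits above matter. The version strings are hypothetical (no claim that an SSSOM 1.10 exists); they only show where string comparison and integer-tuple comparison disagree.

```python
versions = ["1.0", "1.10", "1.9"]

# Comparing the raw strings is lexicographic, so "1.9" ranks above "1.10".
assert max(versions) == "1.9"

# Converting each version to a tuple of integers gives numeric ordering.
as_tuples = [tuple(int(part) for part in v.split(".")) for v in versions]
highest = max(as_tuples)

# max() yields the same element as sorted(...)[-1], without sorting the list.
assert highest == sorted(as_tuples)[-1] == (1, 10)
```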
matentzn
matentzn previously approved these changes Sep 3, 2025
Collaborator

@matentzn matentzn left a comment

I just want to say a huge THANK YOU for these PRs. Much appreciated.

I don't see any red flags, lgtm!

@gouttegd gouttegd marked this pull request as draft September 3, 2025 18:00
@gouttegd
Contributor Author

gouttegd commented Sep 3, 2025

Marking as draft to prevent inadvertent premature merging.

Amend the SSSOMSchemaView#get_minimum_version() method to return a
(major, minor) tuple, rather than a SssomVersionEnum object. The
SssomVersionEnum class (which is automatically generated from the
LinkML schema) is cumbersome to use, for at least two reasons:

1) obtaining the actual value of the enum requires accessing two levels
   of attributes (SssomVersionEnum.code.text);
2) SssomVersionEnum values cannot be meaningfully compared (e.g. to
   check that a given version number is higher than another). To
   compare them, we must (a) obtain the text value, (b) split that
   value over the middle dot, (c) convert the strings to integers, and
   (d) put the integers into a tuple. This can be done in one line of
   code, but it is cumbersome all the same, and that kind of thing is
   best not left to client code.
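The four steps listed in the commit message can be sketched as follows. `version_tuple` is a hypothetical helper, and the `SimpleNamespace` objects merely stand in for generated enum values exposing `.code.text`. They are not the real SssomVersionEnum class.

```python
from types import SimpleNamespace


def version_tuple(version_obj):
    """Convert an enum-like object exposing `.code.text` to (major, minor)."""
    text = version_obj.code.text      # (a) obtain the text value
    major, minor = text.split(".")    # (b) split over the middle dot
    return (int(major), int(minor))   # (c) and (d): integers, in a tuple


# Stand-ins for generated enum values; not the real SssomVersionEnum class.
v1_0 = SimpleNamespace(code=SimpleNamespace(text="1.0"))
v1_1 = SimpleNamespace(code=SimpleNamespace(text="1.1"))

# Tuples of integers compare element by element, so this is True.
is_newer = version_tuple(v1_1) > version_tuple(v1_0)
```

Returning the tuple directly from the schema view, as this commit does, means callers never have to repeat that conversion.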
Collaborator

@matentzn matentzn left a comment

An optional comment, feel free to merge if you are satisfied.

# Should never happen, schema is incorrect
return None
return (version[0], version[1])
except AttributeError: # No added_in annotation, defaults to 1.0
Collaborator

Are there no other theoretical scenarios in which an AttributeError could be thrown inside the try block?

Contributor Author

No realistic one. AttributeError is thrown when trying to access an attribute that doesn’t exist.

Which attributes are we trying to access here?

  • self.view: this one has to exist, it is defined a few lines above in the same file.
  • self.view.induced_slot: This field is defined in LinkML’s SchemaView class, of which self.view is an instance.
  • slot.annotations: This field is defined in LinkML’s SlotDefinition class, of which slot (as returned by the function above) should be an instance – if it is not, then there must be a pretty serious bug in LinkML.
  • slot.annotations.added_in: This is the field that may not exist, if the slot we are looking at is not tagged with an added_in annotation in the LinkML model.
  • slot.annotations.added_in.value: If the added_in field exists, then it should always have this field – again, if it does not, then something is wrong with LinkML.
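The attribute chain discussed above can be sketched as below. This is not the PR's code: `added_in_version` is a hypothetical function, and the `SimpleNamespace` objects are stand-ins for LinkML SlotDefinition instances, chosen only because a missing attribute on them raises AttributeError in the same way.

```python
from types import SimpleNamespace


def added_in_version(slot):
    """Return the (major, minor) version a slot was added in, or (1, 0)."""
    try:
        text = slot.annotations.added_in.value
    except AttributeError:
        # No added_in annotation: the slot is assumed to date from 1.0.
        return (1, 0)
    major, minor = text.split(".")
    return (int(major), int(minor))


# Stand-in slot objects; real ones would be LinkML SlotDefinition instances.
tagged = SimpleNamespace(
    annotations=SimpleNamespace(added_in=SimpleNamespace(value="1.1"))
)
untagged = SimpleNamespace(annotations=SimpleNamespace())
```

Keeping only the attribute access inside the try block is one way to address the reviewer's concern: an AttributeError raised anywhere else would then not be silently read as "defaults to 1.0".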

@matentzn
Collaborator

Ah we still need the sssom schema version, sorry, just remembered.

@gouttegd
Contributor Author

> Ah we still need the sssom schema version

Yep, that’s why I have not done anything to resolve the conflicts since the PR can’t be merged before the sssom dependency is updated anyway. I’ll address the conflicts then. :)
