- 
                Notifications
    You must be signed in to change notification settings 
- Fork 36
WIP: Bootstrap relational database from fixtures #775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
85a5919    to
    dfdb146      
    Compare
  
            
          
                src/main/java/org/fairdatapoint/config/properties/BootstrapProperties.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | Hi @MarekSuchanek there's a typo in the package name:  | 
| @MarekSuchanek some general comments/requests: 
 | 
        
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/AbstractBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/AbstractBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/AbstractBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | } | ||
|  | ||
| @Override | ||
| protected JpaRepository getRepository() { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
| protected JpaRepository getRepository() { | |
| protected BaseRepository<Membership> getRepository() { | 
and similar for the other bootstrapper implementations?
        
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/AbstractBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/MetadataRecordsBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/MetadataRecordsBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | Another general question: | 
        
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/MetadataRecordsBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/MetadataRecordsBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | @MarekSuchanek Another general question: Looks like the fixture classes in  This also relates to the way the fixtures are stored, e.g. as separate files for each object vs a single file with an array of objects. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @MarekSuchanek,
This looks pretty good, but I've got the feeling it can be simplified considerably.
Perhaps you could take a look at the detailed comments so far and clarify some things?
To summarize:
- Please provide explanatory comments in the code, explaining what things are intended to do and especially why they are done in a certain way (at least indicating the patterns used)
- What would be the general workflow for overriding/extending default fixtures (using docker)?
- Could we use spring data repository populators, so we don't need separate fixture classes and corresponding mapper methods? (basically loading fixtures directly into entities)
- Related to the above, there's the option to define multiple related fixture objects in a json array, as opposed to separate files for each object.
- Related entities should get their own fixtures.
- Would it not be possible to use a single bootstrapper for all fixtures? The trick should be the same every time, except maybe for the records (which use the triple store).
It would be great if you could clarify this.
Also could you check if we can merge #778?
        
          
                src/main/java/org/fairdatapoint/service/bootstrap/fixtures/UserFixture.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/fixtures/UserFixture.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/fixtures/MembershipFixture.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/bootstrap/components/MembershipBootstrapper.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/main/java/org/fairdatapoint/service/membership/MembershipMapper.java
          
            Show resolved
            Hide resolved
        
      | 
 Thanks for the review, some of the things are mistakes (like the typo or  
 | 
| 
 I agree , that  | 
| Thanks for the quick response @MarekSuchanek ! 🙂 
 Not sure about the general consensus, but in my opinion fixtures should be bound to the structure of DB entities. 
 I think the advantage is uniformity. You can do the same trick every time, without needing special handling for objects with relations. Of course you do need to ensure relational integrity yourself when creating the fixture files. Also the order of fixture loading is important. This is also how it is done in Python/Django, and in my experience that works really well. W.r.t "manage more files", with populators you could have an array of objects, so fewer files. E.g. a single  | 
| 
 Ok thanks. I guess it is also a matter of providing good documentation for this. :) | 
| @dennisvang I tried Populators with your example prototype and it would indeed simplify how to reach this functionality and make it more "universal". At the same time, two issues I spot and want to make sure it is OK or if we need to deal with that somehow: 
 | 
| 
 Hi @MarekSuchanek thanks for giving the prototype a try. I'm sure a proper implementation would need a lot more work, but I would be happy to contribute to that. 
 I would prefer only to allow hashed passwords in the fixtures. Of course we would need to provide some instructions for how to generate a hashed password. Currently the FDP is configured to use bcrypt: 
 Afaik bcrypt and argon2id hashes are fully portable, i.e. they contain all information needed for verification. For scrypt you would still need to know the correct parameters. In any case users should be able to use any (offline) tool to generate a hash. This can be done on the linux command line using the appropriate package. Languages and frameworks also have their solutions, such as the spring boot cli encodepassword command. There are online tools as well, but better not use those, I think. Another alternative would be to fire up a local fdp, create the user(s) manually, then dump the resulting database content into a json file. Note While we're at it, perhaps the v2 release would also be a good opportunity to upgrade the hashing algorithm to argon2id, as I believe that is the current best practice. For example,from OWASP: 
 For example we can  echo "mypassword" | argon2 applysomerandomsalthere -id -k 19456 -t 2 -p 1Even better would be to use a DelegatingPasswordEncoder to make this future-proof. Changing the password encoder would be a separate PR of course. 
 I think it would be good to have a clear separation between the loading of relational db fixtures and RDF (Turtle) fixtures. The RDF would need a dedicated loader anyway, I suppose. I discussed this with @luizbonino recently, and we were thinking of storing the actual "metadata schema SHACL definitions" directly in a separate repository in the triple store. The fixtures for the relational db could then just contain the URI of the metadata schema. What are your thoughts on that? 
 Could you clarify? 
 
 EDIT: On second thought, you're right: Indeed it would still be necessary to remember which fixtures have been applied, to prevent overwriting changes. For example, if we populate user accounts from fixtures, then any changes made to those accounts by the user, such as a password reset, would be reverted whenever the app restarts. | 
Co-authored-by: dennisvang <29799340+dennisvang@users.noreply.github.com>
a14e339    to
    b41a40e      
    Compare
  
    | Hi @dennisvang , so I switched from custom to populators with the last commit (and closed the discussions that are no longer relevant due to that): 
 Done, we still need to find suitable spot for declaring how to create the hashed password. Maybe docs? 
 That would be good; however, that should be separate issue as that will get some tricky parts as well probably (esp. with different versions of metadata schemas, updates, etc.). 
 JSON schemas that I tried to prepare for the custom JSON files. Let's keep it simple now. 
 True, but not sure how that should work for metadata records being loaded from fixtures. So, could you check and eventually propose changes wrt populators? And then we need to still resolve the records (aside RDF we need information if it is for draft or main repository (can be decided based on file location?) and we probably want the replacement variable with persistent URL as I suggested somehow... | 
| Hi @MarekSuchanek, thanks for the update. I'll review a.s.a.p. 
 Yes, eventually the docs are probably a good place for this, as we'll need to provide more detailed bootstrapping instructions anyway. However, I think it would also be helpful to include a small section in the readme. I can take care of that if you like. 
 Yes I agree this should be a separate issue. It's quite an invasive change. 
 Agreed. I like simple. So, we'll just refer to the entity classes for now. 
 Indeed, I think it remains a bit of a conundrum. Both approaches have their pros and cons. For now, lets keep this simple as well, while we think about it some more. I'll try to come up with a few use cases to get a clearer picture of the requirements, 
 I'll review the changes and do some manual testing as soon as I have time. | 
| Hi @MarekSuchanek, now that I'm reviewing this again, I think it would make sense to split this up into two PRs: 
 What do you think? | 
| 
 Yes, that makes sense... so we get rid of RDF bootstrapping attempt here and keep just the populators? | 
| 
 Hi @MarekSuchanek , yes, that's what I had in mind. The RDF bootstrapping part can be moved onto a new branch for another PR. I'll limit my review comments to the populators part for now. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Marek,
As discussed, I've limited my review to the files related to the relational db populator, assuming the RDF bootstrapping part will be moved into a new PR.
The populator part looks really good already.
I also like the way you've combined the related objects in the fixture files.
Thanks again! :)
| factory.setResources(resources); | ||
| } | ||
| catch (IOException exception) { | ||
| exception.printStackTrace(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably provide a meaningful message here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add this explicitly?
Is there a specific reason for using tools.jackson.core instead of the spring-boot managed com.fasterxml.jackson.core?
Isn't the latter included already in spring-boot-starter-web?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tried: looks like the app runs properly without this change.
| Member member, | ||
| CustomIdGeneratorCreationContext creationContext | ||
| ) { | ||
| super(createUuidGeneratorFromCustomUuid(config), member, creationContext); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still wondering if there's a cleaner way to do this, without needing the createUuidGeneratorFromCustomUuid()...
@MarekSuchanek do you know of a way to simplify this?
Would be nice if we could just cast as org.hibernate.annotations.UuidGenerator, but I'm not sure how to make that work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this change be reverted completely? Looks like permissionFromDTO is not used.
| } | ||
|  | ||
| private SettingsSearchFilterItem fromSearchFilterItemDTO( | ||
| public SettingsSearchFilterItem fromSearchFilterItemDTO( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change? Is there a specific reason for this, or can it be rolled back?
| "uuid": "7e64818d-6276-46fb-8bb1-732e6e09f7e9", | ||
| "firstName": "Albert", | ||
| "lastName": "Einstein", | ||
| "email": "albert.einstein@example.org", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the switch from example.com to example.org.
It did take me a while to figure out why my login attempts were failing though: I'd already started checking the password hashes before finally noticing the .org. 😅
Probably good to add a warning in the readme or docs to point this out.
| "version": "1.0.0", | ||
| "name": "Resource", | ||
| "description": "", | ||
| "definition": "@prefix : <http://fairdatapoint.org/> .\n@prefix dash: <http://datashapes.org/dash#> .\n@prefix dcat: <http://www.w3.org/ns/dcat#> .\n@prefix dct: <http://purl.org/dc/terms/> .\n@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n:ResourceShape a sh:NodeShape ;\n sh:targetClass dcat:Resource ;\n sh:property [\n sh:path dct:title ;\n sh:nodeKind sh:Literal ;\n sh:minCount 1 ;\n sh:maxCount 1 ;\n dash:editor dash:TextFieldEditor ;\n sh:order 1 ;\n ], [\n sh:path dct:description ;\n sh:nodeKind sh:Literal ;\n sh:maxCount 1 ;\n dash:editor dash:TextAreaEditor ;\n sh:order 2 ;\n ], [\n sh:path dct:publisher ;\n sh:node :AgentShape ;\n sh:minCount 1 ;\n sh:maxCount 1 ;\n dash:editor dash:BlankNodeEditor ;\n sh:order 3 ;\n ], [\n sh:path dcat:version ;\n sh:name \"version\" ;\n sh:nodeKind sh:Literal ;\n sh:minCount 1 ;\n sh:maxCount 1 ;\n dash:editor dash:TextFieldEditor ;\n dash:viewer dash:LiteralViewer ;\n sh:order 4 ;\n ], [\n sh:path dct:language ;\n sh:nodeKind sh:IRI ;\n sh:maxCount 1 ;\n dash:editor dash:URIEditor ;\n dash:viewer dash:LabelViewer ;\n sh:defaultValue <http://id.loc.gov/vocabulary/iso639-1/en> ;\n sh:order 5 ;\n ], [\n sh:path dct:license ;\n sh:nodeKind sh:IRI ;\n sh:maxCount 1 ;\n dash:editor dash:URIEditor ;\n dash:viewer dash:LabelViewer ;\n sh:defaultValue <http://purl.org/NET/rdflicense/cc-zero1.0> ;\n sh:order 6 ;\n ], [\n sh:path dct:rights ;\n sh:nodeKind sh:IRI ;\n sh:maxCount 1 ;\n dash:editor dash:URIEditor ;\n dash:viewer dash:LabelViewer ;\n sh:order 7 ;\n ] .\n\n:AgentShape a sh:NodeShape ;\n sh:targetClass foaf:Agent ;\n sh:property [\n sh:path foaf:name ;\n sh:nodeKind sh:Literal ;\n sh:minCount 1 ;\n sh:maxCount 1 ;\n dash:editor dash:TextFieldEditor ;\n ] .\n", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it's not that important, we could cut back on the white space to restore the 2-space indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for reference:
I assume the json files in the src/main/resources/fixtures dir are based on the original mongo migrations from v1.x src/main/java/nl/dtls/fairdatapoint/database/mongo/migration/production?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps the dev migrations from src/main/resources/dev/db/migration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why this is added here?
using constructor injection instead of field injection
| Hi @MarekSuchanek , I've opened #784 to be merged into this branch. It moves the fixtures into the project root, so it is easier to override individual files by mounting them into the docker container. Could you have a look? Also I think it would be helpful to split the PR as mentioned above, before proceeding. | 
| 
 @MarekSuchanek on second thought I guess you're right: remembering which files have been loaded is still important, otherwise changes made by users to any of the fixture objects will be overwritten again when the app is restarted. Alternatively we could only run the populator when explicitly requested via the command line. I've created a fix/workaround for this in #784. | 
and use @value injection instead of BootstrapProperties this makes it slightly more difficult to see where the properties are defined in code, but easier to see the name of the actual property in the external config file
we want to rely on the default false, and only override by setting environment property BOOTSTRAP_ENABLED on the command line
- Move db fixtures dir from src/main/resources into project root - Modify Dockerfile to copy the fixtures into the image - Add configuration option for the (relational) db fixtures path - Use bootstrap.enabled property to disable the populator by default, so ops person can decide when to load fixtures This allows us to mount fixture files into the container at /fdp/fixtures/<filename> to override the defaults, or extend them.
to be re-applied in a new branch
| Hi @MarekSuchanek, as discussed above, I've split this PR into two parts, keeping the relational-db bootstrap here, and moving the triple-store bootstrap into a new PR #785. | 
| factory.setResources(resources); | ||
| } | ||
| catch (IOException exception) { | ||
| exception.printStackTrace(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| exception.printStackTrace(); | |
| log.error("Failed to load relational database fixtures", exception); | 
| <jwt.version>0.12.6</jwt.version> | ||
| <lombok.version>1.18.38</lombok.version> | ||
| <hypersistence.version>3.9.10</hypersistence.version> | ||
| <jackson.version>3.0.0</jackson.version> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| <jackson.version>3.0.0</jackson.version> | 
| <dependency> | ||
| <groupId>tools.jackson.core</groupId> | ||
| <artifactId>jackson-databind</artifactId> | ||
| <version>${jackson.version}</version> | ||
| </dependency> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| <dependency> | |
| <groupId>tools.jackson.core</groupId> | |
| <artifactId>jackson-databind</artifactId> | |
| <version>${jackson.version}</version> | |
| </dependency> | 
| public MembershipPermission permissionFromDTO(Membership membership, MembershipPermissionDTO permission) { | ||
| return MembershipPermission.builder() | ||
| .membership(membership) | ||
| .code(permission.getCode()) | ||
| .mask(permission.getMask()) | ||
| .build(); | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public MembershipPermission permissionFromDTO(Membership membership, MembershipPermissionDTO permission) { | |
| return MembershipPermission.builder() | |
| .membership(membership) | |
| .code(permission.getCode()) | |
| .mask(permission.getMask()) | |
| .build(); | |
| } | 
Please let me know if there is still a reason to keep this.
Description
A new feature to support bootstrapping the relational database (postgresql) from JSON fixtures.
Also see follow-up #785
Checklist
TODO:
src/main/resources/dev/db/migration(add separate set of dev fixtures, or just use the default ones?)