Can't import an RDF file retrieved via TRKG API to Neo4j using the semantics.importRDF stored procedure

How can I import an RDF file retrieved via the TRKG API into Neo4j using the semantics.importRDF stored procedure?

After retrieving an RDF file via the TRKG API and placing it in the import directory of a Neo4j instance that I set up on my laptop, I tried to import the file into Neo4j by executing the following commands. The import does not complete successfully, even though no error messages appear.

image

image

Best Answer

  • faruk.cay
    Answer ✓

    It appears that the file path you used in the semantics.importRDF call is incorrect. On Linux or similar systems, including Mac, your call should look like

    CALL semantics.importRDF("file:///home/fcay/data/CORE_ENTITIES_value_chains.nt","N-Triples",{})

    On Windows systems it should look like

    CALL semantics.importRDF("file:////C:/Users/u6067304/Documents/content sets/CORE_ENTITIES_value_chains.nt","N-Triples", {})

    Please make sure that you are providing the absolute path to the N-Triples file and that you are using the correct number of slashes right after file: in the call. Please let us know if this resolves the problem you are seeing.
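
    Since the call can return without an error message even when nothing was loaded, one way to confirm the outcome is to count the imported nodes afterwards. This is only a minimal sketch: it assumes that imported resources end up as nodes labelled Resource, which you should verify against your own database.

    ```cypher
    // Assumption: the import stores each RDF resource as a node labelled :Resource.
    // A count of 0 after the import suggests that the file was not read.
    MATCH (r:Resource)
    RETURN count(r) AS importedResources;
    ```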

    Best wishes,

    Faruk

Answers

  • I have forwarded your issue to Faruk Cay, data scientist on the graph team, who can help
    you with loading data into Neo4j. Best wishes, Brian

  • Thank you so much for your swift advice. I was able to import the value chain file successfully!

    I will continue importing the other data sets (e.g. organization, people, metadata, etc.) into the same Neo4j database, and I wonder how to stitch all the content sets together after I import the others.

    Could you advise?

  • I am glad that your problem is resolved. Before I answer your next question, I want to point out that Neo4j is not Data Fusion, and it represents data and relationships in a different way. Depending on your use case, stitching may not be necessary, and appropriate Cypher queries could be sufficient to get the results. If you do have to stitch the data you loaded, you can follow several strategies. One approach is to create bidirectional sameAs relationships between the two nodes that are being stitched. Other approaches require creating new nodes for the stitched entities.
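
    The sameAs approach could be sketched roughly as follows. The parameters $uriA and $uriB are hypothetical placeholders for the uri values of the two nodes to be stitched, and the Resource label, the uri property and the relationship type name sameAs are assumptions that may differ in your database.

    ```cypher
    // Hypothetical parameters: $uriA and $uriB identify the two nodes to stitch.
    // Assumption: both nodes carry the :Resource label and a uri property.
    MATCH (a:Resource {uri: $uriA})
    MATCH (b:Resource {uri: $uriB})
    // MERGE keeps the statement idempotent: running it again does not duplicate the relationships.
    MERGE (a)-[:sameAs]->(b)
    MERGE (b)-[:sameAs]->(a);
    ```

    Creating the relationship in both directions lets a query follow sameAs from either node without having to ignore direction.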

  • Sorry to come back to your advice above.

    Though I could import the files on Windows, I'm having trouble importing them on Linux, as you can see in the attached screenshots.

    Do you have any idea why I can't import the files even though they are located in the directory that I specify in the code?

    image

    image

  • Based on the information in the extraInfo column in the first screenshot, you have not yet created an index on Resource(uri). Please issue the following command before executing semantics.importRDF:

    CREATE INDEX ON :Resource(uri)

    Please let me know if this resolves the problem you are having with importing RDF data on your Linux system.
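
    To check that the index exists before retrying the import, something like the following may help. It is a sketch that assumes a Neo4j version in which the db.indexes() procedure is available, and the file path is simply the Linux example from earlier in this thread, so replace it with your own absolute path.

    ```cypher
    // List the existing indexes; the one on :Resource(uri) should appear in the output.
    CALL db.indexes();

    // Then run the import again (example path from earlier in the thread; adjust to your file).
    CALL semantics.importRDF("file:///home/fcay/data/CORE_ENTITIES_value_chains.nt","N-Triples",{});
    ```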

  • Thanks, @faruk.cay!

    I was able to import several data sets, such as value chains and metadata, successfully, but the import of the organization data has been running for more than 12 hours without finishing, and I couldn't even kill the import process, as shown below.

    Could you please give me your advice on how to resolve this issue?

    image

    image

  • First, make sure that the system you are using is equipped to handle such large datasets. It should have sufficiently large memory, reasonably high CPU speeds and enough free disk space. If you have executed Cypher queries that you expect to return large results, or have closed execution frames before getting the results, this can sometimes cause Neo4j to hang. You may need to shut down Neo4j and restart it before executing the semantics.importRDF call. We have observed that Neo4j can take 5 hours or more to import organization data on a laptop.
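
    Before restarting, it may be worth trying to stop the running import from within Neo4j. The sketch below assumes that the dbms.listQueries() and dbms.killQuery() procedures are available in your Neo4j version and edition, which is not guaranteed. The id 'query-123' is a placeholder, not a real value.

    ```cypher
    // Assumption: these procedures exist in your Neo4j version and edition.
    // List the currently running queries and note the queryId of the import.
    CALL dbms.listQueries();

    // 'query-123' is a placeholder; substitute the queryId returned above.
    CALL dbms.killQuery('query-123');
    ```

    If the query cannot be terminated this way, restarting Neo4j as described above remains the fallback.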