Fixes #1327 replacing py2neo with neo4j#1402
Conversation
Rafiot
left a comment
There was a problem hiding this comment.
Alright, I did a very quick review of the code and I have a few notes.
Just making sure: you're using neo4j regularly and know what you're doing, right? I have not tested your code yet, this review is simply based on a quick read through.
| def __init__(self, host: str='localhost', port: int=7687, username: str='neo4j', password: str='neo4j') -> None: | ||
| if not has_neo4j: | ||
| raise Exception('neo4j is required, please install: pip install neo4j') | ||
| self.driver = GraphDatabase.driver(f"bolt://{host}:{port}", auth=(username, password)) |
There was a problem hiding this comment.
Quick googling and it seems you should use neo4j as a scheme, not bolt
| raise Exception('neo4j is required, please install: pip install neo4j') | ||
| self.driver = GraphDatabase.driver(f"bolt://{host}:{port}", auth=(username, password)) | ||
|
|
||
| def close(self) -> None: |
There was a problem hiding this comment.
Requiring the user to manually call the closing method is prone to mistake and will often lead to never closing the driver.
Best practice is to make it a context, or opening/closing the driver when needed.
| f"CREATE (attr:Attribute:{safe_type} {{uuid: $uuid, category: $category, name: $value}}) " # Create an Attribute node with the sanitized type as a label | ||
| f"CREATE (e)-[:`is member`]->(attr)", # Then create a relationship (is member) between the Event and Attribute | ||
| event_uuid=str(event.uuid), uuid=str(a.uuid), | ||
| category=a.category, value=a.value |
There was a problem hiding this comment.
I don't know anything about the query system used by neo4j, but is it safe to pass an arbitrary value like that?
|
|
||
| event = MISPEvent() | ||
| event.info = "Test Event - Neo4j Import" | ||
| event.uuid = "test-event-uuid-001" |
There was a problem hiding this comment.
It is not a valid UUID, and that's automatically created by PyMISP.
| import os | ||
|
|
||
|
|
||
| def test_neo4j(): |
There was a problem hiding this comment.
That's not a real test, please use unittest.
|
|
||
| brotli = [ "urllib3[broti] (>=2.6.3)" ] | ||
|
|
||
| neo4j = [ "neo4j (>=5.0)" ] |
There was a problem hiding this comment.
You also need to update the poetry.lock file.
| with self.driver.session() as session: | ||
| session.execute_write(self._import_event_tx, event) | ||
|
|
||
| @staticmethod |
There was a problem hiding this comment.
I don´t get the point of a static method there, why not have that code in the import_event directly?
|
Thank you @Rafiot for the valuable feedback. I will look into the comments and respond/make the necessary changes within the course of the week |
Problem Statement
We need to migrate from
py2neoto officialneo4jorneomodelsdriver because py2neo has been deprecated since 2021 as pointed out in issue #1327.Issue analysis & solution
pymisp/tools/neo4j.pywhich migrates events into a Neo4j graph, establishing relationships between the events, attributes and attribute values.neo4jbecause of the simplicity of the script. Using it allows better performance and support because it is the official Neo4j libneo4jthe Cypher queries were defined directly below is a summarized mapping of the old functions and the new ones:authenticate() + Graph()GraphDatabase.driver()with Bolt protocolNode()objectsCREATE (e:Event {$event.uuid})Relationship()objectsCREATE (a)-[:type]->(b)tx.begin()/tx.commit()session.execute_write(callback)tx.merge(subgraph)MERGEkeywordFurther implementation details are as follows:
ip-src,domain|ip). Neo4j label names only accept alphanumeric + underscores. Solution:re.sub(r'[^A-Za-z0-9_]', '_', a.type)withcontext managers to guarantee session/connection cleanup$parametersyntax, preventing Cypher injectionneo4j = [ "neo4j (>=5.0)" ]topyproject.tomlunder optional-dependenciesTesting
test/test_neo4j.pyto test the new version ofneo4j.py.MISPEvent.add_attribute()import_event()which is the main function